Duplicate detection of 2D-NMR Spectra

Hinneburg, Alexander; Egert, Björn; Porzel, Andrea

Please use this identifier to cite or link to this item: http://dx.doi.org/10.25673/118442

Full metadata record

DC Field	Value	Language
dc.contributor.author	Hinneburg, Alexander	-
dc.contributor.author	Egert, Björn	-
dc.contributor.author	Porzel, Andrea	-
dc.date.accessioned	2025-03-05T07:13:50Z	-
dc.date.available	2025-03-05T07:13:50Z	-
dc.date.issued	2007	-
dc.identifier.uri	https://opendata.uni-halle.de//handle/1981185920/120401	-
dc.identifier.uri	http://dx.doi.org/10.25673/118442	-
dc.description.abstract	2D-Nuclear magnetic resonance (NMR) spectra are used in the (structural) analysis of small molecules. In contrast to 1D-NMR spectra, 2D-NMR spectra correlate the chemical shifts of ¹H and ¹³C at the same time. A spectrum consists of several peaks in a two-dimensional space. The most important information of a peak is the location of its center, which captures the bonding relationships of hydrogen and carbon atoms. A spectrum contains much information about the chemical structure of a product, but in most cases the structure cannot be read off in a simple and straightforward manner. Structure elucidation involves a considerable amount of (manual) effort. Using high-field NMR spectrometers, many 2D-NMR spectra can be recorded in a short time. So the common situation is that a lab or company has a repository of 2D-NMR spectra, partially annotated with the structural information. For the remaining spectra the structure is unknown. In case two research labs are collaborating, the repositories will be merged and annotations shared. We reduce that problem to the task of finding duplicates in a given set of 2D-NMR spectra. Therefore, we propose a simple but robust definition of 2D-NMR duplicates, which allows for small measurement errors. We give a quadratic algorithm for the problem, which can be implemented in SQL. Further, we analyze a more abstract class of heuristics, which are based on selecting particular peaks. Such a heuristic works as a filter step on the pairs of possible duplicates and allows false positives. We compare all methods with respect to their run time. Finally, we discuss the effectiveness of the duplicate definition on real data.	eng
dc.language.iso	eng	-
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/4.0/	-
dc.subject.ddc	004	-
dc.title	Duplicate detection of 2D-NMR Spectra	eng
dc.type	Article	-
local.versionType	publishedVersion	-
local.bibliographicCitation.journaltitle	Journal of integrative bioinformatics	-
local.bibliographicCitation.volume	4	-
local.bibliographicCitation.issue	1	-
local.bibliographicCitation.publishername	Walter de Gruyter GmbH	-
local.bibliographicCitation.publisherplace	Berlin	-
local.bibliographicCitation.doi	10.1515/jib-2007-53	-
local.openaccess	true	-
dc.identifier.ppn	584529414	-
cbs.publication.displayform	2007	-
local.bibliographicCitation.year	2007	-
cbs.sru.importDate	2025-03-05T07:12:45Z	-
local.bibliographicCitation	In Journal of integrative bioinformatics - Berlin : Walter de Gruyter GmbH, 2004	-
local.accessrights.dnb	free	-
Appears in Collections:	Open Access Publikationen der MLU
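
Illustrative sketch (not part of the record): the abstract mentions a duplicate definition that tolerates small measurement errors and a quadratic algorithm, but this record does not spell out either. The Python below is only one plausible reading, under stated assumptions: a spectrum is a list of peak centers (¹H shift, ¹³C shift) in ppm, and two spectra count as duplicates when every peak of each has a partner in the other within a tolerance box. The tolerance values and all function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical tolerances in ppm; the abstract gives no concrete values.
TOL_H = 0.05  # allowed deviation of the 1H chemical shift
TOL_C = 0.5   # allowed deviation of the 13C chemical shift


def peaks_match(p, q, tol_h=TOL_H, tol_c=TOL_C):
    """True if two peak centers (1H, 13C) agree within the tolerance box."""
    return abs(p[0] - q[0]) <= tol_h and abs(p[1] - q[1]) <= tol_c


def covered(a, b, tol_h=TOL_H, tol_c=TOL_C):
    """True if every peak of spectrum a has at least one matching peak in b."""
    return all(any(peaks_match(p, q, tol_h, tol_c) for q in b) for p in a)


def is_duplicate(a, b, tol_h=TOL_H, tol_c=TOL_C):
    """Assumed duplicate test: the peak sets cover each other."""
    return covered(a, b, tol_h, tol_c) and covered(b, a, tol_h, tol_c)


def find_duplicates(spectra, tol_h=TOL_H, tol_c=TOL_C):
    """Compare all pairs of spectra, which is quadratic in their number."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(spectra.items(), 2)
        if is_duplicate(a, b, tol_h, tol_c)
    ]


# Made-up example data: s1 and s2 differ only by small shifts.
spectra = {
    "s1": [(1.20, 20.0), (3.50, 55.0)],
    "s2": [(1.22, 20.3), (3.48, 54.8)],
    "s3": [(7.10, 128.0)],
}
print(find_duplicates(spectra))
```

A peak-selection heuristic of the kind the abstract describes could sit in front of is_duplicate as a cheap filter, with the full pairwise test applied only to the pairs that pass it.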

Files in This Item:

File	Description	Size	Format
10.1515_jib-2007-53.pdf		718.56 kB	Adobe PDF
