Abstract

Current state-of-the-art approaches for biological sequence querying and alignment require preprocessing and lack robustness to repetitions in the sequence. In addition, these approaches provide little support for efficiently querying subsequences, a process that is essential for tracking localized database matches. We propose a query-based alignment method for biological sequences that first maps sequences to time-domain waveforms and then processes the waveforms for alignment in the time-frequency plane. The mapping uses waveforms, such as Gaussian functions, with unique sequence representations in the time-frequency plane. The proposed alignment method employs a robust querying algorithm that utilizes a time-frequency signal expansion whose basis function is matched to the basic waveform in the mapped sequences. The resulting WAVEQuery approach was demonstrated for both deoxyribonucleic acid (DNA) and protein sequences using the matching pursuit decomposition as the signal basis expansion. We specifically evaluated the alignment localization of WAVEQuery over repetitive database segments, and we demonstrated its operation in real time without preprocessing. We also demonstrated that WAVEQuery significantly outperformed the biological sequence alignment method BLAST for DNA sequence queries with repetitive segments. A generalized version of the WAVEQuery approach with the metaplectic transform is also described for protein sequence structure prediction.
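
The waveform-mapping idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the WAVEQuery algorithm: each nucleotide is mapped to a unit-energy Gaussian-windowed cosine at its own frequency, and the query step is a sliding normalized correlation (a matched-filter-style check) rather than the matching pursuit decomposition used in the paper. The slot length, the frequency assigned to each base, the threshold, and the function names `atom`, `to_waveform`, and `query_positions` are all assumptions made for illustration.

```python
import numpy as np

SLOT = 64  # samples per symbol (assumed value)
# Cycles per slot assigned to each base (hypothetical choice)
FREQS = {"A": 4, "C": 8, "G": 12, "T": 16}


def atom(base):
    """Unit-energy Gaussian-windowed cosine representing one nucleotide."""
    t = np.arange(SLOT)
    window = np.exp(-0.5 * ((t - SLOT / 2) / (SLOT / 8)) ** 2)
    g = window * np.cos(2 * np.pi * FREQS[base] * t / SLOT)
    return g / np.linalg.norm(g)


def to_waveform(seq):
    """Concatenate one atom per symbol, so symbol n occupies slot n."""
    return np.concatenate([atom(b) for b in seq])


def query_positions(database, query, threshold=0.999):
    """Return symbol offsets where the query waveform matches the database.

    The score is the inner product of the query waveform with the database
    segment, divided by the query length. Because every atom has unit
    energy, an exact match scores 1.0 up to rounding, and any mismatched
    symbol lowers the score. The threshold is an arbitrary choice that
    suits short queries only.
    """
    db = to_waveform(database)
    q = to_waveform(query)
    hits = []
    for k in range(len(database) - len(query) + 1):
        segment = db[k * SLOT:(k + len(query)) * SLOT]
        score = float(np.dot(segment, q)) / len(query)
        if score > threshold:
            hits.append(k)
    return hits
```

For example, `query_positions("ACGTACGTTTACG", "ACG")` returns `[0, 4, 10]`, reporting every occurrence of the query, including those in the repeated segment.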

Original language: English (US)
Article number: 5776708
Pages (from-to): 4210-4224
Number of pages: 15
Journal: IEEE Transactions on Signal Processing
Volume: 59
Issue number: 9
DOI: 10.1109/TSP.2011.2157915
State: Published - Sep 2011

Keywords

  • Chirp signals
  • Gaussian signal
  • matched filter
  • matching pursuit decomposition
  • querying
  • sequence alignment
  • time-frequency analysis

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing

Cite this

Waveform mapping and time-frequency processing of DNA and protein sequences. / Ravichandran, Lakshminarayan; Papandreou-Suppappola, Antonia; Spanias, Andreas; Lacroix, Zoé; Legendre, Christophe.

In: IEEE Transactions on Signal Processing, Vol. 59, No. 9, 5776708, 09.2011, p. 4210-4224.

Research output: Contribution to journal, Article

@article{ee67604a24e846b89a0cd87432131f3d,
title = "Waveform mapping and time-frequency processing of DNA and protein sequences",
keywords = "Chirp signals, Gaussian signal, matched filter, matching pursuit decomposition, querying, sequence alignment, time-frequency analysis",
author = "Lakshminarayan Ravichandran and Antonia Papandreou-Suppappola and Andreas Spanias and Zo{\'e} Lacroix and Christophe Legendre",
year = "2011",
month = "9",
doi = "10.1109/TSP.2011.2157915",
language = "English (US)",
volume = "59",
pages = "4210--4224",
journal = "IEEE Transactions on Signal Processing",
issn = "1053-587X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "9",

}

TY - JOUR
T1 - Waveform mapping and time-frequency processing of DNA and protein sequences
AU - Ravichandran, Lakshminarayan
AU - Papandreou-Suppappola, Antonia
AU - Spanias, Andreas
AU - Lacroix, Zoé
AU - Legendre, Christophe
PY - 2011/9
Y1 - 2011/9
KW - Chirp signals
KW - Gaussian signal
KW - matched filter
KW - matching pursuit decomposition
KW - querying
KW - sequence alignment
KW - time-frequency analysis
UR - http://www.scopus.com/inward/record.url?scp=80051751263&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80051751263&partnerID=8YFLogxK
U2 - 10.1109/TSP.2011.2157915
DO - 10.1109/TSP.2011.2157915
M3 - Article
AN - SCOPUS:80051751263
VL - 59
SP - 4210
EP - 4224
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
SN - 1053-587X
IS - 9
M1 - 5776708
ER -