A machine text-inspired machine learning approach for identification of transmembrane helix boundaries

Betty Yee Man Cheng, Jaime G. Carbonell, Judith Klein-Seetharaman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

In this paper, we adapt a statistical learning approach, inspired by automated topic segmentation techniques in speech-recognized documents, to the challenging protein segmentation problem in the context of G-protein coupled receptors (GPCRs). Each GPCR consists of 7 transmembrane helices separated by alternating extracellular and intracellular loops. If the helices, extracellular loops, and intracellular loops are viewed as 3 different topics, the problem of segmenting the protein amino acid sequence according to its secondary structure is analogous to the problem of topic segmentation. The method presented involves building an n-gram language model for each 'topic' and comparing how well the models predict the current amino acid to determine whether a boundary occurs at the current position. This is a distinctly different approach to protein segmentation from the Markov models that have been used previously, and its commendable results are evidence of the benefit of applying machine learning and language technologies to bioinformatics.
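
The idea of comparing per-topic language models at each position can be illustrated with a minimal sketch. This is not the authors' implementation: the bigram order, the add-one smoothing, the sliding window, and the names `BigramModel`, `best_topic_per_position`, and `boundaries` are all assumptions made for illustration, since the abstract does not state the n-gram order or the exact decision rule.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


class BigramModel:
    """Add-one smoothed bigram model over amino acids (illustrative assumption)."""

    def __init__(self, sequences):
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for seq in sequences:
            for prev, cur in zip(seq, seq[1:]):
                self.pair_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def log_prob(self, prev, cur):
        # Log-probability of `cur` given the preceding residue `prev`.
        num = self.pair_counts[(prev, cur)] + 1
        den = self.context_counts[prev] + len(AMINO_ACIDS)
        return math.log(num / den)


def best_topic_per_position(seq, models, window=3):
    """For each position i >= 1, return the topic whose model best predicts
    the residues in a window ending at i (hypothetical decision rule)."""
    labels = []
    for i in range(1, len(seq)):
        start = max(1, i - window + 1)
        scores = {
            topic: sum(m.log_prob(seq[j - 1], seq[j]) for j in range(start, i + 1))
            for topic, m in models.items()
        }
        labels.append(max(scores, key=scores.get))
    return labels


def boundaries(labels):
    """Sequence indices at which the best-scoring topic changes.
    labels[k] describes sequence position k + 1."""
    return [i + 1 for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Toy training data, invented for this sketch; real models would be trained
# on annotated helix and loop segments.
models = {
    "helix": BigramModel(["LLLLLLLLLL"]),
    "loop": BigramModel(["KKKKKKKKKK"]),
}
labels = best_topic_per_position("LLLLLKKKKK", models, window=1)
print(boundaries(labels))
```

A larger window smooths the per-position decision at the cost of detecting the boundary a little later; the three-topic setting described in the abstract would simply add a third entry to `models`.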

Original language: English (US)
Title of host publication: Foundations of Intelligent Systems - 15th International Symposium, ISMIS 2005, Proceedings
Publisher: Springer Verlag
Pages: 29-37
Number of pages: 9
ISBN (Print): 3540258787, 9783540258780
State: Published - 2005
Externally published: Yes
Event: 15th International Symposium on Methodologies for Intelligent Systems, ISMIS 2005 - Saratoga Springs, NY, United States
Duration: May 25 2005 → May 28 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3488 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th International Symposium on Methodologies for Intelligent Systems, ISMIS 2005
Country/Territory: United States
City: Saratoga Springs, NY
Period: 5/25/05 → 5/28/05

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
