Unsupervised gene function extraction using semantic vectors

Ehsan Emadzadeh, Azadeh Nikfarjam, Rachel E. Ginn, Graciela Gonzalez

    Research output: Contribution to journal › Article

    1 Scopus citation

    Abstract

    Finding gene functions discussed in the literature is an important task of information extraction (IE) from biomedical documents. Automated computational methodologies can significantly reduce the need for manual curation and improve the quality of other related IE systems. We propose an open-IE method for the BioCreative IV GO shared task (subtask b), which focuses on finding gene function terms [Gene Ontology (GO) terms] for different genes in an article. The proposed open-IE approach is based on distributional semantic similarity over the GO terms. The method does not require annotated data for training, which makes it highly generalizable. We achieve an F-measure of 0.26 on the test set in the official submission for the BioCreative-GO shared task, the third highest F-measure among the seven participants in the shared task.
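
The idea of ranking GO terms by distributional semantic similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the co-occurrence window, the additive phrase vectors, the toy corpus, and all function names (`context_vectors`, `phrase_vector`, `rank_go_terms`) are assumptions made for illustration only. It builds word vectors from an unlabeled corpus, sums them into vectors for a sentence and for each GO term name, and ranks the terms by cosine similarity, so no annotated training data is involved.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def context_vectors(corpus, window=2):
    """Build one co-occurrence vector per word from unlabeled sentences.

    Assumption: a symmetric window of `window` tokens defines the context.
    """
    vectors = {}
    for sentence in corpus:
        tokens = tokenize(sentence)
        for i, tok in enumerate(tokens):
            ctx = vectors.setdefault(tok, Counter())
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors


def phrase_vector(text, vectors):
    """Represent a phrase as the sum of its known word vectors."""
    total = Counter()
    for tok in tokenize(text):
        total.update(vectors.get(tok, {}))
    return total


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_go_terms(sentence, go_terms, vectors):
    """Return (score, GO id) pairs sorted from most to least similar."""
    sent_vec = phrase_vector(sentence, vectors)
    scored = [
        (cosine(sent_vec, phrase_vector(name, vectors)), go_id)
        for go_id, name in go_terms.items()
    ]
    return sorted(scored, reverse=True)


# Toy, hand-written data for illustration only (not from the shared task).
corpus = [
    "the protein induces apoptotic cell death in tumor cells",
    "apoptotic process is triggered by caspase activation",
    "the enzyme participates in dna repair after damage",
    "dna repair restores damaged dna strands",
]
go_terms = {
    "GO:0006915": "apoptotic process",
    "GO:0006281": "DNA repair",
}

vectors = context_vectors(corpus)
ranking = rank_go_terms("caspase activation leads to cell death", go_terms, vectors)
for score, go_id in ranking:
    print(f"{go_id}\t{score:.3f}")
```

In a setting closer to the one described, the vectors would be learned from a large biomedical corpus rather than four sentences, and a similarity threshold would decide which GO terms to report for each gene.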

    DATABASE URL: https://code.google.com/p/rainbow-nlp/

    Original language: English (US)
    Journal: Database : the journal of biological databases and curation
    Volume: 2014
    State: Published - 2014

    ASJC Scopus subject areas

    • Information Systems
    • Biochemistry, Genetics and Molecular Biology(all)
    • Agricultural and Biological Sciences(all)
