Unsupervised gene function extraction using semantic vectors

Ehsan Emadzadeh, Azadeh Nikfarjam, Rachel E. Ginn, Graciela Gonzalez

    Research output: Contribution to journal › Article

    1 Citation (Scopus)

    Abstract

    Finding gene functions discussed in the literature is an important information extraction (IE) task for biomedical documents. Automated computational methods can significantly reduce the need for manual curation and improve the quality of other related IE systems. We propose an open-IE method for the BioCreative IV GO shared task (subtask b), which focuses on finding gene function terms [Gene Ontology (GO) terms] for different genes in an article. The proposed open-IE approach is based on distributional semantic similarity over the GO terms. The method does not require annotated training data, which makes it highly generalizable. In the official submission for the BioCreative-GO shared task, we achieved an F-measure of 0.26 on the test set, the third-highest F-measure among the seven participants.

    DATABASE URL: https://code.google.com/p/rainbow-nlp/
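
The idea of matching text to GO terms by distributional semantic similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the window size, the function names, and the choice of summed co-occurrence counts compared by cosine similarity are all assumptions made for illustration. The two GO identifiers are real, but the example sentences are invented.

```python
import math
from collections import Counter, defaultdict


def build_word_vectors(corpus, window=2):
    """Map each word to counts of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def text_vector(text, word_vectors):
    """Represent a text as the sum of the context vectors of its known words."""
    total = Counter()
    for token in text.lower().split():
        total.update(word_vectors.get(token, {}))
    return total


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_go_terms(sentence, go_terms, word_vectors):
    """Return (score, GO id) pairs sorted from most to least similar."""
    sentence_vec = text_vector(sentence, word_vectors)
    scored = [
        (cosine(sentence_vec, text_vector(name, word_vectors)), go_id)
        for go_id, name in go_terms.items()
    ]
    return sorted(scored, reverse=True)


# Toy, unlabeled corpus (invented for illustration only).
corpus = [
    "caspase activation triggers apoptotic process in cells",
    "the polymerase drives dna replication at the fork",
]
go_terms = {
    "GO:0006915": "apoptotic process",
    "GO:0006260": "DNA replication",
}

word_vectors = build_word_vectors(corpus)
ranking = rank_go_terms("caspase activation", go_terms, word_vectors)
for score, go_id in ranking:
    print(go_id, round(score, 3))
```

No labeled examples are used here: the word vectors come only from raw text, which mirrors the abstract's point that the approach needs no annotated training data. A realistic system would need a much larger corpus and a candidate-selection step for genes, neither of which is shown.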

    Original language: English (US)
    Journal: Database : the journal of biological databases and curation
    Volume: 2014
    DOI: 10.1093/database/bau084
    State: Published - 2014

    Fingerprint

    Gene Ontology
    Information Storage and Retrieval
    Semantics
    Genes
    Ontology
    genes
    Information Systems
    methodology

    ASJC Scopus subject areas

    • Medicine (all)

    Cite this

    Unsupervised gene function extraction using semantic vectors. / Emadzadeh, Ehsan; Nikfarjam, Azadeh; Ginn, Rachel E.; Gonzalez, Graciela.

    In: Database : the journal of biological databases and curation, Vol. 2014, 2014.

    @article{0f4d58929e044d0bb103169d8bc80295,
    title = "Unsupervised gene function extraction using semantic vectors",
    author = "Ehsan Emadzadeh and Azadeh Nikfarjam and Ginn, {Rachel E.} and Graciela Gonzalez",
    year = "2014",
    doi = "10.1093/database/bau084",
    language = "English (US)",
    volume = "2014",
    journal = "Database : the journal of biological databases and curation",
    issn = "1758-0463",
    publisher = "Oxford University Press",

    }

    TY - JOUR

    T1 - Unsupervised gene function extraction using semantic vectors

    AU - Emadzadeh, Ehsan

    AU - Nikfarjam, Azadeh

    AU - Ginn, Rachel E.

    AU - Gonzalez, Graciela

    PY - 2014

    Y1 - 2014

    UR - http://www.scopus.com/inward/record.url?scp=84925545091&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=84925545091&partnerID=8YFLogxK

    U2 - 10.1093/database/bau084

    DO - 10.1093/database/bau084

    M3 - Article

    C2 - 25209025

    AN - SCOPUS:84925545091

    VL - 2014

    JO - Database : the journal of biological databases and curation

    JF - Database : the journal of biological databases and curation

    SN - 1758-0463

    ER -