The DIEGO Lab graph based gene normalization system

Ryan Sullivan; Robert Leaman; Graciela Gonzalez

doi:10.1109/ICMLA.2011.140

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Gene entity normalization, the mapping of a gene mention in free text to a unique identifier, is one of the primary subtasks in the biomedical information extraction pipeline. Gene entity normalization provides many challenges, specifically with the high ambiguity of gene names and the many-to-many relationship between gene names and identifiers. Drawing inspiration from recent work in word sense disambiguation, this paper presents a gene entity normalization system based on entity relationship graphs. This system creates a concept graph from the possible entities and their relationships within a full-text document, and takes advantage of a node ranking algorithm to rank and score each potential candidate entity. This system is a prototype to represent a specific approach to gene normalization, and the results reflect this. However, this system demonstrates that the relationship graph-based approach, an approach grounded in a theoretical basis, can potentially be useful for gene normalization and possibly for the normalization of various biomedical entities.
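
The graph-then-rank idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which node ranking algorithm the system uses, so a PageRank-style power iteration stands in for it, and the function names (`rank_candidates`, `normalize`), the damping value, and the `cand:` identifiers are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def rank_candidates(edges, damping=0.85, iterations=50):
    """Score nodes of an undirected candidate graph.

    Assumption: a PageRank-style power iteration is used as a stand-in
    for the (unspecified) node ranking algorithm in the paper.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)
    # Start from a uniform distribution over all candidate entities.
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor shares its score equally among its own neighbors.
            incoming = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = (1 - damping) / len(nodes) + damping * incoming
        score = updated
    return score


def normalize(mentions, edges):
    """Pick the highest-scoring candidate identifier for each mention."""
    score = rank_candidates(edges)
    return {
        mention: max(candidates, key=lambda c: score.get(c, 0.0))
        for mention, candidates in mentions.items()
    }


# Hypothetical document: "p53" is ambiguous between two candidates,
# and edges encode (assumed) relationships between candidate entities.
mentions = {"p53": ["cand:p53-a", "cand:p53-b"], "MDM2": ["cand:mdm2"]}
edges = [
    ("cand:p53-a", "cand:mdm2"),
    ("cand:p53-a", "cand:other"),
    ("cand:p53-b", "cand:other"),
]
print(normalize(mentions, edges))
```

In this toy graph the candidate with more relationships to other candidates in the same document receives the higher score, which is the intuition behind using document-level context to resolve ambiguous gene names.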

Original language	English (US)
Title of host publication	Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011
Pages	78-83
Number of pages	6
DOIs	https://doi.org/10.1109/ICMLA.2011.140
State	Published - Dec 1 2011
Event	10th International Conference on Machine Learning and Applications, ICMLA 2011 - Honolulu, HI, United States; Duration: Dec 18 2011 → Dec 21 2011

Publication series

Name	Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011
Volume	2

Other

Other	10th International Conference on Machine Learning and Applications, ICMLA 2011
Country/Territory	United States
City	Honolulu, HI
Period	12/18/11 → 12/21/11

ASJC Scopus subject areas

Computer Science Applications
Human-Computer Interaction

Cite this

The DIEGO Lab graph based gene normalization system. / Sullivan, Ryan; Leaman, Robert; Gonzalez, Graciela.
Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011. 2011. p. 78-83 6147052 (Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011; Vol. 2).

Sullivan, R, Leaman, R & Gonzalez, G 2011, The DIEGO Lab graph based gene normalization system. in Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011., 6147052, Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011, vol. 2, pp. 78-83, 10th International Conference on Machine Learning and Applications, ICMLA 2011, Honolulu, HI, United States, 12/18/11. https://doi.org/10.1109/ICMLA.2011.140

@inproceedings{a04a27bd30c54f6d889e6b9b2ee02050,
  title = "The DIEGO Lab graph based gene normalization system",
  abstract = "Gene entity normalization, the mapping of a gene mention in free text to a unique identifier, is one of the primary subtasks in the biomedical information extraction pipeline. Gene entity normalization provides many challenges, specifically with the high ambiguity of gene names and the many-to-many relationship between gene names and identifiers. Drawing inspiration from recent work in word sense disambiguation, this paper presents a gene entity normalization system based on entity relationship graphs. This system creates a concept graph from the possible entities and their relationships within a full-text document, and takes advantage of a node ranking algorithm to rank and score each potential candidate entity. This system is a prototype to represent a specific approach to gene normalization, and the results reflect this. However, this system demonstrates that the relationship graph-based approach, an approach grounded in a theoretical basis, can potentially be useful for gene normalization and possibly for the normalization of various biomedical entities.",
  author = "Ryan Sullivan and Robert Leaman and Graciela Gonzalez",
  year = "2011",
  month = dec,
  day = "1",
  doi = "10.1109/ICMLA.2011.140",
  language = "English (US)",
  isbn = "9780769546070",
  series = "Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011",
  pages = "78--83",
  booktitle = "Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011",
  note = "10th International Conference on Machine Learning and Applications, ICMLA 2011 ; Conference date: 18-12-2011 Through 21-12-2011",
}

TY - GEN

T1 - The DIEGO Lab graph based gene normalization system

AU - Sullivan, Ryan

AU - Leaman, Robert

AU - Gonzalez, Graciela

PY - 2011/12/1

Y1 - 2011/12/1

N2 - Gene entity normalization, the mapping of a gene mention in free text to a unique identifier, is one of the primary subtasks in the biomedical information extraction pipeline. Gene entity normalization provides many challenges, specifically with the high ambiguity of gene names and the many-to-many relationship between gene names and identifiers. Drawing inspiration from recent work in word sense disambiguation, this paper presents a gene entity normalization system based on entity relationship graphs. This system creates a concept graph from the possible entities and their relationships within a full-text document, and takes advantage of a node ranking algorithm to rank and score each potential candidate entity. This system is a prototype to represent a specific approach to gene normalization, and the results reflect this. However, this system demonstrates that the relationship graph-based approach, an approach grounded in a theoretical basis, can potentially be useful for gene normalization and possibly for the normalization of various biomedical entities.

AB - Gene entity normalization, the mapping of a gene mention in free text to a unique identifier, is one of the primary subtasks in the biomedical information extraction pipeline. Gene entity normalization provides many challenges, specifically with the high ambiguity of gene names and the many-to-many relationship between gene names and identifiers. Drawing inspiration from recent work in word sense disambiguation, this paper presents a gene entity normalization system based on entity relationship graphs. This system creates a concept graph from the possible entities and their relationships within a full-text document, and takes advantage of a node ranking algorithm to rank and score each potential candidate entity. This system is a prototype to represent a specific approach to gene normalization, and the results reflect this. However, this system demonstrates that the relationship graph-based approach, an approach grounded in a theoretical basis, can potentially be useful for gene normalization and possibly for the normalization of various biomedical entities.

UR - http://www.scopus.com/inward/record.url?scp=84857836615&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84857836615&partnerID=8YFLogxK

U2 - 10.1109/ICMLA.2011.140

DO - 10.1109/ICMLA.2011.140

M3 - Conference contribution

AN - SCOPUS:84857836615

SN - 9780769546070

T3 - Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011

SP - 78

EP - 83

BT - Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011

T2 - 10th International Conference on Machine Learning and Applications, ICMLA 2011

Y2 - 18 December 2011 through 21 December 2011

ER -
