Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences

Robert L. Strausberg, Elise A. Feingold, Lynette H. Grouse, Jeffery G. Derge, Richard D. Klausner, Francis S. Collins, Lukas Wagner, Carolyn M. Shenmen, Gregory D. Schuler, Stephen F. Altschul, Barry Zeeberg, Kenneth H. Buetow, Carl F. Schaefer, Narayan K. Bhat, Ralph F. Hopkins, Heather Jordan, Troy Moore, Steve I. Max, Jun Wang, Florence Hsieh, Luda Diatchenko, Kate Marusina, Andrew A. Farmer, Gerald M. Rubin, Ling Hong, Mark Stapleton, M. Bento Soares, Maria F. Bonaldo, Tom L. Casavant, Todd E. Scheetz, Michael J. Brownstein, Ted B. Usdin, Shiraki Toshiyuki, Piero Carninci, Christa Prange, Sam S. Raha, Naomi A. Loquellano, Garrick J. Peters, Rick D. Abramson, Sara J. Mullahy, Stephanie A. Bosak, Paul J. McEwan, Kevin J. McKernan, Joel A. Malek, Preethi H. Gunaratne, Stephen Richards, Kim C. Worley, Sarah Hale, Angela M. Garcia, Laura J. Gay, Stephen W. Hulyk, Debbie K. Villalon, Donna M. Muzny, Erica J. Sodergren, Xiuhua Lu, Richard A. Gibbs, Jessica Fahey, Erin Helton, Mark Ketteman, Anuradha Madan, Stephanie Rodrigues, Amy Sanchez, Michelle Whiting, Anup Madan, Alice C. Young, Yuriy Shevchenko, Gerard G. Bouffard, Robert W. Blakesley, Jeffrey W. Touchman, Eric D. Green, Mark C. Dickson, Alex C. Rodriguez, Jane Grimwood, Jeremy Schmutz, Richard M. Myers, Yaron S.N. Butterfield, Martin I. Krzywinski, Ursula Skalska, Duane E. Smailus, Angelique Schnerch, Jacqueline E. Schein, Steven J.M. Jones, Marco A. Marra

Research output: Contribution to journal (Article)

1394 Scopus citations


The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see

Original language: English (US)
Pages (from-to): 16899-16903
Number of pages: 5
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 26
State: Published - Dec 24 2002


ASJC Scopus subject areas

  • General

Cite this

Strausberg, R. L., Feingold, E. A., Grouse, L. H., Derge, J. G., Klausner, R. D., Collins, F. S., Wagner, L., Shenmen, C. M., Schuler, G. D., Altschul, S. F., Zeeberg, B., Buetow, K. H., Schaefer, C. F., Bhat, N. K., Hopkins, R. F., Jordan, H., Moore, T., Max, S. I., Wang, J., ... Marra, M. A. (2002). Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proceedings of the National Academy of Sciences of the United States of America, 99(26), 16899-16903.