TY - GEN
T1 - Corpus-guided sentence generation of natural images
AU - Yang, Yezhou
AU - Teo, Ching Lik
AU - Daumé, Hal
AU - Aloimonos, Yiannis
PY - 2011
Y1 - 2011
AB - We propose a sentence generation strategy that describes images by predicting the most likely nouns, verbs, scenes and prepositions that make up the core sentence structure. The inputs are initial noisy estimates of the objects and scenes detected in the image using state-of-the-art trained detectors. As predicting actions from still images directly is unreliable, we use a language model trained on the English Gigaword corpus to obtain their estimates, together with probabilities of co-located nouns, scenes and prepositions. We use these estimates as parameters of an HMM that models the sentence generation process, with hidden nodes as sentence components and image detections as the emissions. Experimental results show that our strategy of combining vision and language produces readable and descriptive sentences compared to naive strategies that use vision alone.
UR - http://www.scopus.com/inward/record.url?scp=80053258778&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80053258778&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:80053258778
SN - 1937284115
SN - 9781937284114
T3 - EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 444
EP - 454
BT - EMNLP 2011 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2011
Y2 - 27 July 2011 through 31 July 2011
ER -