Lessons from Efforts to Automatically Translate English to Knowledge Representation Languages

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Our long-term goal is to develop systems that can “understand” natural language text. By “understand” we mean that the system can take natural language text as input and answer questions with respect to that text. A key component in building such systems is the ability to translate natural language text into appropriate knowledge representation (KR) languages. Our approach is inspired by Montague's path-breaking thesis (1970) of viewing English as a formal language, and by research in natural language semantics. It is based on Probabilistic Combinatory Categorial Grammars (PCCG), λ-calculus, and statistical learning of parameters. In initial work, we start with a vocabulary consisting of λ-calculus representations of a small set of words, together with a training corpus of sentences paired with their representations in a KR language. We develop a learning-based system that learns the λ-calculus representations of words from this corpus and generalizes them to words of the same category. The key and novel aspect of this learning is the development of Inverse Lambda algorithms which, given λ-expressions β and γ, compute an α such that applying α to β (or β to α) yields γ. We augment this with learning of weights associated with multiple meanings of words. Our current system produces improved results on standard corpora for natural language interfaces for robot command and control and for database queries. In follow-up work, we use patterns to make guesses about the initial vocabulary; together with parameter learning, this allows us to develop a fully automated way (requiring no initial vocabulary) to translate English to designated KR languages. In ongoing work, we use Answer Set Programming as the target KR language and focus on (a) solving combinatorial puzzles that are described in English and (b) answering questions with respect to a chapter of a ninth-grade biology book.
The systems that we are building are good examples of the integration of results from multiple sub-fields of AI and computer science, viz. machine learning, knowledge representation, natural language processing, λ-calculus (functional programming), and ontologies. In this presentation we describe our approach and our system, and elaborate on some of the lessons we have learned from this effort.
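The Inverse Lambda step described above can be illustrated with a toy sketch. This is hypothetical illustrative code, not the paper's algorithm: it handles only the simplest case, where α can be obtained by abstracting out occurrences of β inside γ. Here logical terms are modeled as nested tuples, e.g. `("capital", "texas")` for `capital(texas)`.

```python
# Toy illustration of the Inverse-Lambda idea: given beta and gamma,
# find alpha such that applying alpha to beta reproduces gamma.
# Terms are atoms (strings) or applications (nested tuples).

def abstract_out(gamma, beta, var="x"):
    """Replace every occurrence of subterm `beta` in `gamma` with `var`,
    yielding the body of a lambda abstraction."""
    if gamma == beta:
        return var
    if isinstance(gamma, tuple):
        return tuple(abstract_out(t, beta, var) for t in gamma)
    return gamma

def inverse_lambda(gamma, beta):
    """Return alpha = λx. gamma[beta := x]."""
    return ("lam", "x", abstract_out(gamma, beta))

def apply_(alpha, beta):
    """Beta-reduce (λx. body) applied to beta: substitute beta for x in body."""
    _, var, body = alpha

    def subst(t):
        if t == var:
            return beta
        if isinstance(t, tuple):
            return tuple(subst(s) for s in t)
        return t

    return subst(body)

# If beta is the meaning of "Texas" and gamma the meaning of the whole
# phrase capital(texas), Inverse Lambda recovers the missing word meaning:
beta = "texas"
gamma = ("capital", "texas")
alpha = inverse_lambda(gamma, beta)   # ("lam", "x", ("capital", "x"))
assert apply_(alpha, beta) == gamma
```

The general problem solved in the paper is considerably harder — the unknown α may itself be a higher-order function applied to β (the β-to-α direction), which is why dedicated Inverse Lambda algorithms are needed rather than simple subterm abstraction.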

Original language: English (US)
Title of host publication: Logic Programming and Nonmonotonic Reasoning - 11th International Conference, LPNMR 2011, Proceedings
Publisher: Springer Verlag
Number of pages: 1
Volume: 6645 LNAI
ISBN (Print): 9783642208942
DOI: 10.1007/978-3-642-20895-9_3
State: Published - Jan 1 2011
Event: 11th International Conference on Logic Programming and Nonmonotonic Reasoning, LPNMR 2011 - Vancouver, BC, Canada
Duration: May 16, 2011 – May 19, 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6645 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Fingerprint

Knowledge representation, natural language, semantics, formal languages, λ-calculus, functional programming, statistical learning, learning systems, ontologies, Answer Set Programming, robot command and control, question answering, computer science

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Baral, C. (2011). Lessons from Efforts to Automatically Translate English to Knowledge Representation Languages. In Logic Programming and Nonmonotonic Reasoning - 11th International Conference, LPNMR 2011, Proceedings (Vol. 6645 LNAI). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6645 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-642-20895-9_3
