Large-scale grammar induction from linguistic resources

Advisors: Katrien Beuls, Luc Steels

Prerequisites: An interest in natural language processing and grammars

Context:

Fluid Construction Grammar is a reversible grammatical formalism: it can be used to parse natural language sentences into a semantic representation as well as to formulate an utterance that expresses a given conceptualization. A grammar is a large inventory of linguistic rules that interact while processing a given sentence. An example of such a rule, or construction, is the following:
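For concreteness, a construction can be pictured as a pairing of a form pole and a meaning pole, each a bundle of feature-value pairs. The sketch below uses plain Python dictionaries rather than FCG's own Lisp-based notation, and the particular lexical construction shown is a hypothetical stand-in rather than a rule taken from an actual grammar:

# Hypothetical, simplified rendering of a lexical construction as a
# pairing of form constraints and meaning constraints.
reads_lex_cxn = {
    "name": "reads-lex",
    "form": {"string": "reads", "lex-class": "verb"},
    "meaning": {"predicate": "read", "args": ["?reader", "?readable"]},
}

In comprehension the form pole is matched against the input and the meaning pole is merged into the linguistic structure; in formulation the roles of the two poles are reversed, which is what makes the formalism reversible.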

When this rule is used in a comprehension process, it contributes a number of untyped feature-value pairs to a pre-existing linguistic structure through unification. For a sentence such as “John reads the book that Mary likes”, no fewer than 23 constructions apply to produce the following meaning network:

[Figure: meaning network for “John reads the book that Mary likes”]
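A minimal sketch of this merge step, assuming linguistic structures are flat dictionaries of feature-value pairs and logic variables are strings prefixed with "?" (a toy illustration, not FCG's actual unification engine):

def unify(value_a, value_b, bindings):
    """Unify two values; '?'-prefixed strings act as logic variables."""
    if isinstance(value_a, str) and value_a.startswith("?"):
        return bindings.setdefault(value_a, value_b) == value_b
    if isinstance(value_b, str) and value_b.startswith("?"):
        return bindings.setdefault(value_b, value_a) == value_a
    return value_a == value_b

def merge(structure, contribution, bindings):
    """Add a construction's feature-value pairs to the transient structure,
    failing when an existing value conflicts."""
    for feature, value in contribution.items():
        if feature not in structure:
            structure[feature] = value
        elif not unify(structure[feature], value, bindings):
            return None  # unification failure: the construction does not apply
    return structure

# A pre-existing unit for the string "reads" picks up syntactic and
# semantic features contributed by a (hypothetical) lexical construction.
unit = {"string": "reads"}
contribution = {"lex-class": "verb", "predicate": "read", "args": "?args"}
print(merge(unit, contribution, bindings={}))

When a contributed value conflicts with one already present, unification fails and the construction simply does not apply to that structure.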

Goal and research activities:

Typically, these constructions are hand-crafted, which makes creating a grammar time- and labour-intensive. Yet, with the growing availability of annotated linguistic resources such as FrameNet or VerbNet, a large part of the grammar-creation process could be automated. The goal of this project is to investigate automatic grammar induction techniques that go beyond the mere learning of lexical rules. Moreover, novel evaluation metrics will need to be devised to assess the accuracy and the coverage of the machine-learned construction grammars.
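As an indication of the kind of induction and evaluation steps envisaged, the sketch below turns a single FrameNet-style annotation into a candidate argument-structure construction and measures coverage on held-out sentences. The annotation layout, the frame name, and the comprehend routine are hypothetical placeholders, not the actual FrameNet data format or the FCG API:

# Hypothetical FrameNet-style annotation for one sentence.
annotation = {
    "sentence": "John reads the book",
    "target": {"lemma": "read", "pos": "V", "frame": "Reading_activity"},
    "frame_elements": [
        {"name": "Reader", "span": "John", "phrase_type": "NP"},
        {"name": "Text", "span": "the book", "phrase_type": "NP"},
    ],
}

def induce_cxn(annotation):
    """Turn one annotated frame instance into a candidate
    argument-structure construction (form pole + meaning pole)."""
    target = annotation["target"]
    form = {"lemma": target["lemma"],
            "pos": target["pos"],
            "arg-phrases": [fe["phrase_type"] for fe in annotation["frame_elements"]]}
    meaning = {"frame": target["frame"],
               "roles": [fe["name"] for fe in annotation["frame_elements"]]}
    return {"name": target["lemma"] + "-" + target["frame"] + "-cxn",
            "form": form,
            "meaning": meaning}

def coverage(grammar, test_sentences, comprehend):
    """Fraction of held-out sentences for which the induced grammar yields a
    complete meaning network; 'comprehend' stands in for whatever
    comprehension routine the project ends up using."""
    parsed = sum(1 for s in test_sentences if comprehend(grammar, s) is not None)
    return parsed / len(test_sentences)

print(induce_cxn(annotation))

Generalizing over many such induced candidates, and measuring how accuracy and coverage evolve as the grammar grows, is where the actual research challenge of the project lies.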

References:

Steels, Luc. (2016). The Basics of Fluid Construction Grammar. Constructions and Frames.

Example of a precision grammar for Portuguese: https://www.fcg-net.org/demos/propor-2016/

FrameNet: http://framenet.icsi.berkeley.edu