Methods for taking semantic graphs apart and putting them back together again
Thesis, posted on 2022-03-29, authored by Jonas Groschwitz.

This thesis develops the AM dependency parser, a semantic parser for Abstract Meaning Representation (AMR; Banarescu et al., 2013) that owes its strong performance to its effective combination of neural and compositional methods. Neural networks have proven to be enormously effective machine learning tools for natural language processing. Compositionality, as a linguistic principle, has a strong tradition in semantic construction. However, both approaches come with distinct challenges. Pure neural models are data-hungry, since they have no prior knowledge of the inherent structure of language. Compositional approaches have robustness issues and suffer from the ambiguity of the latent structural information in the training data.
This thesis combines the strengths of both worlds to address these challenges. The AM dependency parser drops the restrictive syntactic constraints of classic compositional approaches, instead relying only on semantic types and meaningful semantic operations as structural guides. The ability of neural networks to encode contextual information allows the parser to make correct decisions in the absence of hard syntactic constraints.
Consequently, the thesis focuses on terms for semantic representations, which serve as algebraic 'building instructions'. The thesis first examines the suitability of the HR algebra (a general tool for building graphs; Courcelle and Engelfriet, 2012) for this purpose. It then develops the linguistically motivated AM algebra, which proves much better suited to the task. Representing the terms over the AM algebra as dependency trees further simplifies the semantic construction. In particular, the move from the HR algebra to the AM algebra, and then to AM dependency trees, drastically reduces the ambiguity of the latent structural information required for training the model.
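To give a concrete flavor of such building instructions, the following is a minimal Python sketch of an 'apply'-style operation that fills an open argument slot of one graph fragment with another fragment. The Fragment class, the source name "s", and the app function are hypothetical simplifications for illustration (they ignore, for instance, node-name clashes and type requests); they are not the AM algebra as defined in the thesis.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    """A toy graph constant: a root, labeled nodes, labeled edges,
    and 'sources', i.e. open slots waiting to be filled by arguments."""
    root: str
    nodes: dict    # node id -> concept label (None = unlabeled placeholder)
    edges: list    # (source node, edge label, target node)
    sources: dict  # source name (e.g. "s" for subject) -> node carrying it

def app(head: Fragment, arg: Fragment, source: str) -> Fragment:
    """A simplified 'apply' operation: plug the argument's root into the
    head's open source slot and merge the two graphs."""
    if source not in head.sources:
        raise ValueError(f"head has no open source '{source}'")
    hole = head.sources[source]
    rename = {arg.root: hole}  # identify the argument's root with the slot node
    nodes = dict(head.nodes)
    for n, label in arg.nodes.items():
        nodes[rename.get(n, n)] = label
    edges = head.edges + [(rename.get(s, s), lbl, rename.get(t, t))
                          for s, lbl, t in arg.edges]
    remaining = {k: v for k, v in head.sources.items() if k != source}
    return Fragment(head.root, nodes, edges, remaining)

# "the cat sleeps": the 'sleep-01' fragment has an open subject source "s";
# applying the 'cat' fragment fills it, yielding the complete AMR
# (e / sleep-01 :ARG0 (x / cat)).
sleep = Fragment("e", {"e": "sleep-01", "x": None}, [("e", "ARG0", "x")], {"s": "x"})
cat = Fragment("c", {"c": "cat"}, [], {})
amr = app(sleep, cat, "s")
print(amr.nodes)  # {'e': 'sleep-01', 'x': 'cat'}
```

In the actual AM algebra, each source additionally carries a type request, and apply is only defined when the argument's semantic type matches that request; it is these type constraints, rather than syntactic rules, that guide composition.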
In conclusion, AM dependency trees yield a simple semantic parser, in which neural tagging and dependency models predict interpretable, meaningful operations that construct the AMR.
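Continuing the toy sketch above, the pipeline can be pictured as two prediction tasks followed by deterministic evaluation. The hard-coded dictionaries here are placeholders for the outputs of the neural supertagger and dependency model, not the parser's actual interface.

```python
tokens = ["the", "cat", "sleeps"]

# Supertagging: each content word receives a graph fragment ("the" gets none);
# 'cat' and 'sleep' are the fragments from the sketch above.
supertags = {"cat": cat, "sleeps": sleep}

# Dependency prediction: operation-labelled edges between tokens,
# here a single apply edge on the subject source "s".
tree_edges = [("sleeps", "APP_s", "cat")]

# Evaluating the predicted AM dependency tree bottom-up constructs the AMR.
head, op, dep = tree_edges[0]
amr = app(supertags[head], supertags[dep], op.split("_")[1])
```

Because each predicted fragment and each predicted edge corresponds to a meaningful semantic operation, the parser's decisions remain interpretable even though they are made by neural models.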