Current students


Section: Computer Science and Engineering

Major Research topic:
Aurora: natural language sleeps in an uncanny valley of ambiguous semantic constructions

My thesis highlights the need for shared knowledge between humans and machines in conversational settings. It explores the differences between natural and artificial languages and captures the notion of abstraction as a level of representation. I can exploit the level of representation I build to prevent the rise of an uncanny valley in conversations.

The goal of my thesis is to build a level of representation that characterizes a conversation. This layer may uncover some of the reasons for the gap in the interaction between users and machines. In my thesis I investigate the bond between syntax and semantics in natural language. I consider syntax and semantics to be linked by an intrinsic layer that depicts the conversation in terms of pre-linguistically available “abstractions” (i.e., entities and relations). The abstraction layer provides a blueprint that I can reconstruct for every conversation.

The blueprint is a graph constructed on top of the surface structure by parsing the syntactic dependencies from a semantic point of view. I inspect the dependencies between part-of-speech (POS) tags and derive semantic relations from specific POS tags, which are configurable as arguments of the service I build. As a result, I obtain a graph that constitutes a layer representing the semantic dependencies of a conversation.
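The construction described above can be illustrated with a minimal sketch. It is not the thesis implementation: the `Token` record, the `build_graph` function, the `keep_pos` parameter and the linking rule (each kept token is attached to its nearest kept ancestor in the dependency tree) are all assumptions made for illustration, and the parse is annotated by hand rather than produced by a parser.

```python
from collections import namedtuple

# Hypothetical token record: index, surface text, POS tag, index of the
# syntactic head (-1 for the root), and dependency label.
Token = namedtuple("Token", "i text pos head dep")

def build_graph(tokens, keep_pos=frozenset({"NOUN", "PROPN", "VERB"})):
    """Return (nodes, edges) for the tokens whose POS tag is in keep_pos.

    Assumed rule: each kept token is linked to its nearest kept ancestor
    in the dependency tree; tokens with other POS tags are skipped over.
    """
    nodes = [t.text for t in tokens if t.pos in keep_pos]
    edges = set()
    for t in tokens:
        if t.pos not in keep_pos:
            continue
        h = t.head
        # Climb the tree until a kept token or the root is reached.
        while h != -1 and tokens[h].pos not in keep_pos:
            h = tokens[h].head
        if h != -1:
            edges.add((tokens[h].text, t.dep, t.text))
    return nodes, sorted(edges)

# Hand-annotated parse of "The user asks the machine a question".
sentence = [
    Token(0, "The", "DET", 1, "det"),
    Token(1, "user", "NOUN", 2, "nsubj"),
    Token(2, "asks", "VERB", -1, "ROOT"),
    Token(3, "the", "DET", 4, "det"),
    Token(4, "machine", "NOUN", 2, "iobj"),
    Token(5, "a", "DET", 6, "det"),
    Token(6, "question", "NOUN", 2, "obj"),
]

nodes, edges = build_graph(sentence)
for head, label, child in edges:
    print(f"{head} -[{label}]-> {child}")
```

Changing `keep_pos` changes which tokens become nodes, which mirrors the idea of POS tags being configurable arguments. With `keep_pos={"NOUN"}`, for example, the verb disappears and the three nouns remain as unconnected nodes under this particular linking rule.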

Since it is both a blueprint of the conversation and a graph, I name this level of abstraction the graphprint. The graphprint is the main output of this thesis. Because it respects all the properties of graphs, graph theory can be applied to it directly. I am thus able to analyze the semantic structure of a conversation with one of the most powerful data structures in computer science. As far as semantics is concerned, I reduce natural language understanding to a simpler, interpretable level: a graph.
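To make the point about graph theory concrete, here is a small sketch that applies two standard graph measures, node degree and connected components, to an undirected edge list. The edge list is invented for illustration and the helper names `adjacency` and `connected_components` are hypothetical; the thesis may rely on different algorithms.

```python
from collections import defaultdict, deque

def adjacency(edges):
    """Build an undirected adjacency map from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def connected_components(adj):
    """Return the connected components, found by breadth-first search."""
    seen, comps = set(), []
    for start in sorted(adj):  # sorted for a reproducible order
        if start in seen:
            continue
        comp, queue = [], deque([start])
        seen.add(start)
        while queue:
            n = queue.popleft()
            comp.append(n)
            for m in sorted(adj[n]):
                if m not in seen:
                    seen.add(m)
                    queue.append(m)
        comps.append(sorted(comp))
    return comps

# Invented example: two utterances that share no entity.
edges = [
    ("asks", "user"), ("asks", "machine"), ("asks", "question"),
    ("answers", "bot"),
]
adj = adjacency(edges)
degrees = {n: len(adj[n]) for n in adj}
components = connected_components(adj)
print(degrees["asks"], len(components))
```

Because nodes are visited in sorted order, the same edge list always yields the same components in the same order, so the analysis is reproducible and easy to inspect.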

My thesis yields the following contributions:
  1. The graphprint generation: a conversational blueprint representing the conversation semantics as it arises from syntactic dependencies.
  2. The graphprint interpretation: it uncovers a layer that captures one possible origin for the uncanny valley in conversations.
  3. The graphprint replication: it is deterministic, reproducible and inspectable by means of known graph algorithms.
In conclusion, my thesis provides a configurable tool for generating alternative representations of conversations from a point of view that depends on both syntax and semantics. This suggests a reconciliation between competence and performance, which explain the rules of the language and the use of the language, respectively. As other studies have suggested, both mechanisms should have evolved as a package: the construction of the graphprint provides a hint in favor of a symbiotic evolution of syntax and semantics. The graphprint offers a deterministic metric on conversations and indicates a possible measure for the uncanny valley in conversations.