Projects
Inspire: Interpreting Natural Language using Answer Set Programming, Inconsistency Management, and Relevance Theory
In the Inspire project we studied Answer Set Programming methods for applications related to interpreting natural language.
Principal Investigator (Project Manager): Peter Schüller
Duration: April 2015 to September 2017
Funding: Scientific and Technological Research Council of Turkey (TÜBİTAK) Program 3501
English Title: Interpreting Natural Language using Answer Set Programming, Inconsistency Management, and Relevance Theory
Turkish Title: Çözüm Kümesi Programlaması, Tutarsızlık Yönetimi, ve Bağıntı Kuramıyla Doğal Dilin Yorumlaması
Project Overview:
Natural language is a very efficient form of communication: humans leave out many details when they use language because other humans can easily fill in these details. As an example, ‘morning coffee’ means ‘coffee drunk in the morning’ while ‘morning newspaper’ means ‘newspaper read in the morning’, and humans understand this without effort although neither ‘drunk’ nor ‘read’ appears in the text. This underspecification, often combined with the high ambiguity of natural language, is a major challenge for natural language understanding (NLU) systems. The Inspire project aims to advance scientific methods that allow computers to interpret natural language text with the goal of recovering its intended meaning.
An existing approach in that direction is the use of background knowledge bases (WordNet, FrameNet) together with abductive reasoning. The idea of abduction is to find the best explanation for a given observed text input. In this project we want to build an improved NLU formalism based on Answer Set Programming (ASP).
ASP is a general purpose logic programming formalism that supports comfortable representation of knowledge and nonmonotonic reasoning (such as abduction). In an ASP program we describe a set of potential solutions and relationships/constraints between concepts. Based on such a representation, an ASP solver (a software tool) computes solutions that respect all specified relationships and constraints. ASP can handle circular knowledge without problems and allows for an efficient integration of external knowledge such as WordNet and FrameNet.
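The idea of describing potential solutions and constraints, and letting a solver compute the solutions, can be illustrated without an ASP solver. The following Python sketch is only an analogy: it assumes a toy map-coloring problem (not part of the project), enumerates all candidate solutions, and keeps those that violate no constraint. An ASP solver would take a declarative program instead and search far more efficiently.

```python
from itertools import product

# Hypothetical toy problem: color three regions so that neighbors differ.
REGIONS = ["a", "b", "c"]
COLORS = ["red", "green"]
NEIGHBORS = [("a", "b"), ("b", "c")]


def candidates():
    """Describe the space of potential solutions: every possible coloring."""
    for colors in product(COLORS, repeat=len(REGIONS)):
        yield dict(zip(REGIONS, colors))


def satisfies_constraints(coloring):
    """Constraint: neighboring regions must not share a color."""
    return all(coloring[x] != coloring[y] for x, y in NEIGHBORS)


def solve():
    """Return all candidates that respect every constraint."""
    return [c for c in candidates() if satisfies_constraints(c)]


solutions = solve()
for s in solutions:
    print(s)
```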
The project was concluded successfully in 2017.
Results:
- Mishal Benz (Kazmi) was funded by the project and received her PhD degree in 2017.
- The First-Order Horn Abduction Framework was developed and published.
- The CaspR Coreference Resolution Adjudication Tool was created and published.
- Several publications in collaboration with DEMACS at University of Calabria on lazy instantiation of constraints in Answer Set Programming.
OmSieve: Open-Minded Coreference Resolution Sieve Based on Answer Set Programming
In the OmSieve project we applied methods of Answer Set Programming to Coreference Resolution.
Principal Investigator (Project Manager): Peter Schüller
Duration: January 2015 to December 2016
Funding: Scientific and Technological Research Council of Turkey (TÜBİTAK) Program 3001
English Title: Open-Minded Coreference Resolution Sieve Based on Answer Set Programming
Turkish Title: Çözüm Kümesi Programlama Tabanlı Muhakeme Edilen Eşgönderge Sieve Çözümlenmesi
Project Overview:
Answer Set Programming (ASP) is a general purpose logic programming formalism that supports comfortable representation of knowledge, non-monotonic reasoning processes, and reasoning with hybrid knowledge bases. In an ASP logic program we describe (i) a set of potential solutions, (ii) relationships between concepts in the solution, and (iii) constraints on solutions. Given such a representation, an ASP solver (a software tool) computes those solutions that adhere to the specified relationships and constraints. ASP solvers can find all solutions to such problems and they are engineered to find these solutions efficiently. Moreover, ASP supports hybrid reasoning, which means that some relationships between concepts can be described outside the ASP logic, for example in a Python program.
Coreference Resolution is the Computational Linguistics task of finding out which phrases of a natural language discourse refer to the same entity in the world. For example, in the sentence “He said to the people: 'I need your help'” the task is to find out that “he” and “I” refer to the same entity (the speaker) and that “the people” and “your” refer to the same entity (the listeners). Coreference Resolution is challenging because noun phrases can refer to the same entity for various reasons: they can be synonyms, hypernyms, or hyponyms, or they can be coreferent because of background knowledge and discourse information (e.g., “my brother”, “John”, and “the king” can be coreferent due to contextual information).
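The output of coreference resolution is often represented as a partition of the mentions into chains. The sketch below is not the OmSieve method. It assumes hand-written pairwise links for the example sentence and merges them into chains with a small union-find structure. The function name build_chains is chosen for illustration only.

```python
def build_chains(mentions, links):
    """Group mentions into coreference chains from pairwise links."""
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative, shortening the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        # Merge the chains that contain a and b.
        parent[find(a)] = find(b)

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    # Sort mentions and chains so the result is deterministic.
    return sorted(sorted(c) for c in chains.values())


# Mentions and links for: He said to the people: 'I need your help'
mentions = ["He", "the people", "I", "your"]
links = [("He", "I"), ("the people", "your")]
chains = build_chains(mentions, links)
print(chains)
```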
The OmSieve project was successfully concluded in early 2017.
Results:
- Kenda Alakraa (MSc student) obtained a scholarship from the project and graduated within the scope of the project.
- Kübra Cıngıllı (MSc student) obtained a partial scholarship from the project and assisted in the creation of the Marmara Turkish Coreference Corpus.
- The CaspR - Semi-Automatic Coreference Resolution Adjudication Tool based on Answer Set Programming was created.
- The Marmara Turkish Coreference Corpus and Coreference Resolution Baseline were created and published online and as a technical report.
Marmara Turkish Coreference Corpus
The Marmara Turkish Coreference Corpus was created over the course of two years at Marmara University in Istanbul, Turkey.
The corpus is a layer on top of the METU-Sabanci Turkish Treebank.
The corpus is available at bitbucket.org/knowlp/marmara-turkish-coreference-corpus.
The coreference annotation manual used for preparing the corpus is available at bitbucket.org/knowlp/turkish-coreference-annotation-guide.
The CaspR Coreference Resolution Adjudication Tool was used to create this Corpus.
CaspR - Coreference Resolution Adjudication Tool
This tool was created to enable building the Marmara Turkish Coreference Corpus, where we had to adjudicate up to ten independent annotations of the same document into one gold standard.
The tool permits a fully automatic adjudication mode with four possible objective functions (which are described in the accompanying journal article). Moreover, part of the output can be specified by the human adjudicator, and the tool will create the best possible solution that fits these specifications, making the tool a semi-automatic support for human adjudication. CaspR runs on the command line, uses the popular CoNLL format for Coreference Annotations, and is based on Answer Set Programming and Python.
The tool is publicly available at github.com/knowlp/caspr-coreference-tool.
The accompanying Coreference Adjudication Benchmark Dataset is available at bitbucket.org/knowlp/asp-coreference-benchmark.
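To make the adjudication idea concrete, here is a minimal Python sketch under strong assumptions. It is not CaspR's implementation and does not use ASP: it enumerates all partitions of a small set of mentions by brute force. The objective (fewest pairwise disagreements with the annotators) is a hypothetical one chosen for illustration and is not claimed to be one of the four objective functions of the tool. The must_link parameter is likewise a made-up stand-in for the part of the output that a human adjudicator fixes.

```python
from itertools import combinations


def partitions(items):
    """Yield every partition of a list of items into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        # ... or into a new block of its own.
        yield part + [{first}]


def linked(partition, a, b):
    """True if mentions a and b are in the same chain."""
    return any(a in block and b in block for block in partition)


def adjudicate(mentions, annotations, must_link=()):
    """Pick a partition with the fewest pairwise disagreements.

    must_link holds pairs fixed by the human adjudicator (hypothetical
    interface, not CaspR's actual options).
    """
    pairs = list(combinations(mentions, 2))

    def cost(candidate):
        # Count (annotator, pair) combinations where the verdicts differ.
        return sum(
            linked(candidate, a, b) != linked(ann, a, b)
            for ann in annotations
            for a, b in pairs
        )

    feasible = (
        p for p in partitions(list(mentions))
        if all(linked(p, a, b) for a, b in must_link)
    )
    return min(feasible, key=cost)


mentions = ["John", "he", "king", "Mary"]
annotations = [
    [{"John", "he", "king"}, {"Mary"}],
    [{"John", "he"}, {"king"}, {"Mary"}],
    [{"John", "he", "king"}, {"Mary"}],
]
automatic = adjudicate(mentions, annotations)
guided = adjudicate(mentions, annotations, must_link=[("king", "Mary")])
print(sorted(sorted(block) for block in automatic))
```

The first call corresponds to a fully automatic mode, the second to a semi-automatic mode in which the human has fixed one link and the remaining decisions are optimized around it. Brute-force enumeration only works for a handful of mentions, which is one reason to delegate the search to a solver.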
Hexlite - A Lightweight HEX Solver
The Hexlite solver is a lightweight alternative to dlvhex for logic programs with a restricted set of external computations.
The solver was created with a lightweight design as its guiding principle, using Python as the only programming language and delegating as much as possible to the backend solver, which is currently Clingo.
The software is publicly available at github.com/hexhex/hexlite and includes a lightweight Acthex implementation.
The software is also available as a Docker container at https://www.ai4eu.eu/resource/hexlite.
Winograd Schema Processor
This tool was created as part of a publication at the International Conference on Principles of Knowledge Representation and Reasoning (KR 2014).
The tool is based on Clingo and features Graphviz output.
The detailed description of the tool and the approach can be found on the project homepage: Tackling Winograd Schemas by Formalizing Relevance Theory in Knowledge Graphs.
First-Order Horn Abduction Framework
This project, which led to several publications, implements abductive reasoning with costs in First Order Horn logic using Answer Set Programming.
The specific knowledge base format and reasoning task that are supported by the framework are those in the ACCEL benchmark (Ng & Mooney, 1992), plus two further abductive cost objective functions from other papers.
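The following Python sketch illustrates abduction with costs in a much simplified setting. It assumes propositional Horn rules instead of first-order ones, does not use the ACCEL knowledge base format, and searches by brute force instead of using ASP. The rules, atoms, and cost values are made up, and the objective (minimum total cost of the assumed atoms) is only one simple possibility.

```python
from itertools import combinations

# Hypothetical propositional Horn rules: (body atoms, head atom).
RULES = [
    ({"rain"}, "wet_grass"),
    ({"sprinkler"}, "wet_grass"),
    ({"rain"}, "wet_street"),
]
# Atoms that may be assumed, each with a made-up assumption cost.
ABDUCIBLES = {"rain": 2, "sprinkler": 1}


def consequences(facts):
    """Forward-chain the Horn rules until no new atom is derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def abduce(observations):
    """Return (cost, assumptions) of a cheapest explanation, or None."""
    best = None
    atoms = sorted(ABDUCIBLES)
    for size in range(len(atoms) + 1):
        for subset in combinations(atoms, size):
            # A set of assumptions explains the input if it entails it.
            if set(observations) <= consequences(subset):
                cost = sum(ABDUCIBLES[a] for a in subset)
                if best is None or cost < best[0]:
                    best = (cost, set(subset))
    return best


print(abduce({"wet_grass"}))
print(abduce({"wet_grass", "wet_street"}))
```

In the first call the cheaper assumption explains the observation. In the second call only the more expensive assumption explains both observations, so it becomes the best explanation.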
Work on this project pioneered an approach for on-demand constraints in Answer Set Solving to manage reasoning in large theories. Follow-up work led to several further projects (in particular the Hexlite Solver can be counted as follow-up work) and collaborations (in particular with Francesco Ricca, Carmine Dodaro, and Bernardo Cuteri).
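The general idea behind on-demand constraints can be sketched as a loop: solve without the large family of constraints, check the candidate solution, add only the constraint instances it violates, and solve again. The Python sketch below is an assumption-laden illustration of that loop with a brute-force toy solver. It is not the project's ASP implementation, and all names and data in it are hypothetical.

```python
from itertools import product

VARIABLES = ["x0", "x1", "x2", "x3", "x4"]
# Hypothetical pool of conflicts; in a large theory this pool would be huge.
CONFLICTS = [("x0", "x1"), ("x2", "x4"), ("x3", "x4")]


def at_least_two(assignment):
    """Base constraint that is always active: choose at least two variables."""
    return sum(assignment.values()) >= 2


def no_conflict(a, b):
    """Build one constraint instance: a and b must not both be chosen."""
    return lambda assignment: not (assignment[a] and assignment[b])


def find_violations(assignment):
    """Instantiate only those conflict constraints the candidate violates."""
    return [no_conflict(a, b) for a, b in CONFLICTS
            if assignment[a] and assignment[b]]


def solve(constraints):
    """Toy solver: first 0/1 assignment satisfying the given constraints."""
    for values in product([0, 1], repeat=len(VARIABLES)):
        assignment = dict(zip(VARIABLES, values))
        if all(check(assignment) for check in constraints):
            return assignment
    return None


def solve_lazily():
    """Alternate between solving and adding violated constraint instances."""
    active = [at_least_two]
    added = 0
    while True:
        candidate = solve(active)
        if candidate is None:
            return None, added
        violated = find_violations(candidate)
        if not violated:
            return candidate, added
        active.extend(violated)
        added += len(violated)


solution, added = solve_lazily()
print(solution, added)
```

In this toy run only two of the three conflict constraints are ever instantiated, because no candidate solution violates the third one. With a very large pool of potential constraints, this is where the savings come from.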
The tool is available at Bitbucket: bitbucket.org/knowlp/asp-fo-abduction.
The dlvhex Solver
The dlvhex Solver is a C++ software system for computing answer sets of logic programs with external sources. During my PhD, I rewrote the software to make it more efficient. After my PhD, Christoph Redl continued to improve the software in his PhD.
The core of dlvhex is available at github.com/hexhex/core.
Several plugins are available at github.com/hexhex/.
For details, please see the homepage of dlvhex.
AI4EU - A European AI On Demand Platform and Ecosystem
In the AI4EU project we are part of a large consortium that is creating a European AI one-stop shop and ecosystem, with the aims of helping industry use AI research from the EU, helping citizens understand it, and facilitating collaborations.
Project main homepage: https://ai4eu.eu/
Duration: January 2019 to December 2021
Funding: European Union's Horizon 2020 research and innovation programme under grant agreement 825619