Welcome note by the organizers
Vasilis Efthymiou, IBM Research - Almaden
The best time to talk with speakers and attendees
Wrap-up + Open Discussion (Organisers + Invited members)
Many business users and line-of-business owners rely on technical people to query and gain insights from their business data. These technical people are experts in complex query languages such as SQL or SPARQL. Today, it is vital for non-technical business owners to derive insights from their data as quickly as possible to make effective business decisions. Natural language interfaces enable such non-technical users to explore their business data in a more natural way, without relying on technical users' help. In this talk, we will cover some basics of natural language interfaces to data, give an overview of the main components of our ontology-based Natural Language Query (NLQ) stack, and explore how this NLQ system can be extended to cover the needs of a conversational system. Specifically, we will see how we enabled a conversational system over IBM Watson Assistant for healthcare data about medication, adverse effects, contra-indications, etc. Finally, we will review our latest work on extending the query answers that we provide by leveraging external ontologies, like SNOMED CT.
Dr. Vasilis Efthymiou, IBM Research - Almaden
Vasilis Efthymiou is a Postdoctoral Researcher at IBM Research - Almaden, working in the areas of data integration, knowledge management, and query answering. Before joining IBM, he was a collaborating researcher at the Information Systems Laboratory of ICS-FORTH, Greece. He received his Ph.D. from the Computer Science Department of the University of Crete, Greece, for research on entity resolution. After an internship at the IBM T.J. Watson Research Center, NY, on matching Web tables to Knowledge Graphs, he joined forces with researchers from IBM Research, Oxford University, and City, University of London to co-organize the SemTab challenges: an effort to benchmark systems that match tabular data to knowledge graphs (KGs), so that they can be compared on the same basis and their results reproduced. He has also given tutorials at top-tier data management conferences (WWW, ICDE, CIKM, ESWC) and has co-authored the book 'Entity Resolution in the Web of Data'.
Large knowledge graphs capture information about a large number of entities and their relations. Among the many relations they capture, class subsumption assertions are usually present and expressed using the rdfs:subClassOf construct. Our examination shows that publicly available knowledge graphs contain many potentially erroneous cyclic subclass relations, a problem that can be exacerbated when different knowledge graphs are integrated as Linked Open Data. In this paper, we present an automatic approach for resolving such cycles at scale using automated reasoning, by encoding the cycle-resolution problem for a MAXSAT solver. The approach is tested on the LOD-a-lot dataset and compared against a semi-automatic version of our algorithm. We show how the number of removed triples is a trade-off against the efficiency of the algorithm. The code and the resulting cycle-free class hierarchy of LOD-a-lot are published at www.submassive.cc.
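To make the cycle-resolution objective concrete, here is a minimal sketch on a toy hierarchy. It is not the authors' implementation: instead of a MAXSAT solver it uses brute-force search, which shares the same goal (keep as many rdfs:subClassOf assertions as possible while leaving the hierarchy acyclic) but only works for tiny inputs. The class names and the function names `is_acyclic` and `minimal_removal` are hypothetical.

```python
from itertools import combinations


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for sub, sup in edges:
        successors[sub].append(sup)
        indegree[sup] += 1
    queue = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Nodes on a cycle never reach indegree 0, so they are never visited.
    return visited == len(nodes)


def minimal_removal(edges):
    """Smallest set of edges whose removal leaves the graph acyclic.

    Brute-force stand-in for a MAXSAT encoding: candidate removal sets
    are tried in order of increasing size, so the first hit is minimal.
    """
    nodes = sorted({n for edge in edges for n in edge})
    for k in range(len(edges) + 1):
        for removed in combinations(edges, k):
            kept = [e for e in edges if e not in removed]
            if is_acyclic(nodes, kept):
                return list(removed)
    return list(edges)


# Toy (subclass, superclass) pairs; the class names are hypothetical.
toy_edges = [
    ("Cat", "Mammal"),
    ("Mammal", "Animal"),
    ("Animal", "Mammal"),  # together with the previous pair, forms a cycle
    ("Dog", "Mammal"),
]

removed = minimal_removal(toy_edges)
kept = [e for e in toy_edges if e not in removed]
print("removed:", removed)
print("kept:", kept)
```

In this unweighted sketch, the choice between the two assertions on the cycle is arbitrary. Deciding which assertion is the erroneous one would need additional evidence, which this toy example does not model.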
Automatic subject indexing has been a longstanding goal of digital curators to facilitate effective retrieval access to large collections of both online and offline information resources. Controlled vocabularies are often used for this purpose, as they standardise annotation practices and help users navigate online resources by following interlinked topical concepts. However, to date, the assignment of suitable text annotations from a controlled vocabulary is still largely done manually, or at most (semi-)automatically, even though effective machine learning tools are already in place. This is because existing procedures require a sufficient amount of training data and have to be adapted anew to each vocabulary, language and application domain. In this paper, we argue that there is a third solution to subject indexing, which harnesses cross-domain knowledge graphs. Our KINDEX approach fuses distributed knowledge graph information from different sources. Experimental evaluation shows that the approach achieves good accuracy scores by exploiting correspondence links of publicly available knowledge graphs.
Monitoring the trajectories of icebergs is a crucial task for the safety of ship traffic in the Arctic Circle. In this work, we present a system that addresses this problem by combining Earth observation techniques with large-scale RDF analytics. To the best of our knowledge, this is the first Semantic Web application in this field.