Data Exploration by Example

PhD Course

PhD Course on Data Exploration using Example-based Methods

Davide Mottin, Matteo Lissandrini

Exploration is one of the primordial ways to accrue knowledge about the world and its nature. As we accumulate data, mostly automatically, at unprecedented volume and speed, our datasets have become complex and hard to understand. In this context, exploratory search provides a handy tool for progressively gathering the necessary knowledge: the user starts from a tentative query that hopefully leads to answers that are at least partially relevant and that provide cues about the next queries to issue. An exploratory query should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, while retaining the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of so-called example-based methods, in which the user or analyst circumvents query languages by using examples as input.

This shift in semantics has led to a number of methods that take as a query a set of example members of the answer set. The search system then infers the entire answer set based on the given examples and any additional information provided by the underlying database. In this tutorial, we present an overview of the main example-based methods for exploratory analysis. We show how different data types require different techniques, and present algorithms that are specifically designed for relational, textual, and graph data. We conclude by providing a unifying view of this query paradigm and identifying exciting new research directions.
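
To make the idea of inferring an answer set from examples concrete, here is a minimal sketch for relational data. It is not one of the algorithms covered in the course: it assumes a deliberately naive strategy (keep the attribute values that all examples share and use them as a conjunctive filter), and the names infer_filter, query_by_example, MOVIES, and EXAMPLES, as well as the toy table, are hypothetical and introduced only for illustration.

```python
from typing import Any

Row = dict[str, Any]


def infer_filter(examples: list[Row]) -> Row:
    """Return the attribute/value pairs on which every example agrees."""
    if not examples:
        return {}
    first, rest = examples[0], examples[1:]
    return {
        attr: value
        for attr, value in first.items()
        if all(attr in row and row[attr] == value for row in rest)
    }


def query_by_example(table: list[Row], examples: list[Row]) -> list[Row]:
    """Return every row of the table that matches the inferred filter.

    If the examples share no attribute value, the filter is empty and the
    whole table is returned (nothing could be inferred).
    """
    shared = infer_filter(examples)
    return [
        row
        for row in table
        if all(attr in row and row[attr] == value for attr, value in shared.items())
    ]


# Toy table, assumed purely for illustration.
MOVIES: list[Row] = [
    {"title": "Alien", "director": "Ridley Scott", "genre": "sci-fi"},
    {"title": "Blade Runner", "director": "Ridley Scott", "genre": "sci-fi"},
    {"title": "Gladiator", "director": "Ridley Scott", "genre": "drama"},
    {"title": "Arrival", "director": "Denis Villeneuve", "genre": "sci-fi"},
    {"title": "The Martian", "director": "Ridley Scott", "genre": "sci-fi"},
]

# The user supplies two members of the desired answer set instead of a query.
EXAMPLES: list[Row] = [MOVIES[0], MOVIES[1]]

if __name__ == "__main__":
    print(infer_filter(EXAMPLES))
    for row in query_by_example(MOVIES, EXAMPLES):
        print(row["title"])
```

With the two examples above, the shared values are the director and the genre, so the inferred answer set contains the two examples plus "The Martian". Realistic example-based methods typically also have to rank candidate queries and cope with noisy or ambiguous examples, which this sketch ignores.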

PhD Course Material

Content of the course

This is a PhD Course that was given at Aalborg University in May 2019.

Format: Readings, lectures, and hands-on exercises.

A general background in computer science and general familiarity with database management, as can be achieved through an undergraduate database course, is expected. Participants who have taken a graduate database course will benefit from this additional background.

Learning objectives:
The goal of this course is to enable the students to understand ongoing trends in exploratory analysis and example-based methods. In particular, the course will cover techniques designed for relational, textual, and graph data as well as highlight challenges and new frontiers of machine learning in online settings.

  1. Example methods in relational databases (PDF)
  2. Example methods in textual data (PDF)
  3. Example methods in graphs (PDF)
  4. Learning methods based on examples (PDF)