Data Exploration by Example

Exploring the Data Wilderness through Examples

Data Exploration using Example-based Methods

Matteo Lissandrini, Davide Mottin, Themis Palpanas, Yannis Velegrakis

Exploration is one of the primordial ways to accrue knowledge about the world and its nature. As we accumulate data, mostly automatically, at unprecedented volume and speed, our datasets have become complex and hard to understand. In this context, exploratory search provides a handy tool for progressively gathering the necessary knowledge: it starts from a tentative query that hopefully leads to answers that are at least partially relevant and that provide cues about the next queries to issue. An exploratory query should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, while retaining the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of so-called example-based methods, in which the user or analyst circumvents query languages by using examples as input.

This shift in semantics has led to a number of methods that receive as a query a set of example members of the answer set. The search system then infers the entire answer set based on the given examples and any additional information provided by the underlying database. In this tutorial, we present an excursus over the main example-based methods for exploratory analysis. We show how different data types require different techniques, and present algorithms that are specifically designed for relational, textual, and graph data. We conclude by providing a unifying view of this query paradigm and identifying new, exciting research directions.

Data Exploration using Example-based Methods

Read the book

Preface:

Cover of book: Data Exploration using Example-based Methods
Synthesis Lectures on Data Management, Morgan & Claypool Publishers

Exploration is one of the primordial ways to accrue knowledge about the world and its nature. It describes the act of becoming familiar with something by testing or experimenting, and at the same time it evokes the image of a traveler traversing a new territory. As we accumulate data, mostly automatically, at unprecedented volume and speed, our datasets have become less and less familiar to us. In this context, we speak of exploratory search as the process of gradual discovery and understanding of the portion of the data that is pertinent to a user's often vague information need. Contrary to traditional search, where the desired result is well defined and the focus is on precision and performance, exploratory search usually starts from a tentative query that hopefully leads to answers that are at least partially relevant and that provide cues about the next query. By understanding the distinction between a traditional query and an exploratory query, we can change the semantics of the user input: instead of a strict prescription of the contents of the result set, we provide a hint of what is relevant. This shift in semantics has led to a number of methods that share the very specific paradigm of search-by-example. Search-by-example receives as a query a set of example members of the answer set. The search system then infers the entire answer set based on the given examples and any additional information provided by the underlying database.

With this book we have surveyed more than two hundred research sources to highlight the main example-based techniques for relational, graph, and textual data. The book provides insights on how these example-based search systems can be employed by expert and non-expert users to retrieve the portion of the data that is relevant to their interest, while avoiding the use of complex query languages. We hope this book answers the questions and builds the necessary knowledge for those interested in constructing new data exploration systems.

Data Exploration in the middleware between the user and the data management system. Covers Relational, Graph, and Textual data models.
This book covers example-based techniques for Relational, Graph, and Textual data models.

Graduate students will hopefully deepen their interest in the subject and become involved in the new challenges and opportunities opened up by the powerful exploration method of search-by-example. Researchers and practitioners working in the area will likely find new insights for further improving their approaches and systems.

More info

Read more (PDF), Check the sample chapter (PDF), or

Buy the book

Cite:

Matteo Lissandrini, Davide Mottin, Themis Palpanas, and Yannis Velegrakis.
“Data Exploration Using Example-Based Methods.”
Synthesis Lectures on Data Management, 10 (4), 164 pages. Morgan & Claypool Publishers.
ISBN: 9781681734552.

Tutorials at International Conferences

Example-driven Search: a New Frontier for Exploratory Search

Tutorial at SIGIR'19

Get the slides: PDF    PPT

Participant form for questions/comments: j.mp/ExploreSIGIR


Exploratory search includes methods to efficiently extract knowledge from data repositories, even if we do not know exactly what we are looking for, nor how to precisely describe our needs. The need for new and effective exploratory search methods is particularly pressing given the abundance and richness of today's large datasets. In common exploratory settings, the user progressively acquires knowledge by issuing a sequence of generic queries to gather intelligence about the data. However, the existing body of work in data analysis, data visualization, and predictive models assumes the user is willing to pose several well-defined or structured queries to the underlying database in order to progressively gather the required information. This assumption stems from the intuition that the user is accustomed to data analysis techniques. Yet, very often, this assumption does not hold.

SIGIR'19 in Paris
A special version of our tutorial was presented on Sunday, July 21st at SIGIR'19 in Paris.

We note that the flexibility that examples provide does not compromise the richness of the results, yet it can overcome the ambiguity of generic keyword searches, which are frequently found in information retrieval. On the other hand, while data exploration techniques assume the user is willing to pose several exploratory queries, the use of examples allows the searcher to provide more information with less effort, making example-based methods a more palatable choice for novice users as well as for practitioners. This new functionality can empower existing information retrieval systems with a complementary tool: whenever a query is too complex to be expressed with a detailed set of conditions, examples represent a natural alternative. In this respect, example-based exploration is a middle ground between the user interface and the data-management layer, enabling new functionalities for the former and allowing more natural exploitation of the latter. Moreover, the use of examples has been demonstrated to be very effective in visual query interfaces. Here we demonstrate how example-based methods can be employed as an expressive and powerful method for exploratory search systems.

The tutorial covers

A background on example-based methods for exploratory search for relational data

It then focuses on example-based methods for exploratory search on:

  1. Documents and Text
  2. Networks and Knowledge Graphs

It also covers machine-learning methods that can learn from user interaction for intelligent exploration.


Exploring the Data Wilderness through Examples

Tutorial at SIGMOD'19

Get the slides: PDF    PPT

Recently, the research community has resorted to the use of examples as a proxy for exploratory analysis. One of the earliest attempts to use examples as a query method is query-by-example (QBE). The main idea was to help the user in the query formulation, allowing her to specify the shape of the results in terms of templates for tuples, i.e., examples. Query-by-example has lately been revisited, and the use of examples has found application in several areas across various data types. The definition of example has changed from a mere template to a representative of the results the user would like to obtain. These example-based approaches are fundamentally different from the initial query-by-example idea, and have been successfully applied to relational, textual, and graph data.
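To make the original template idea concrete, the following Python snippet is a minimal sketch of template matching in the spirit of query-by-example. It is a deliberate simplification, not the actual QBE language: a template is a partial tuple in which `None` stands for "any value". The table, the attribute names, and the function `match_template` are hypothetical and serve illustration only.

```python
from typing import Any, Optional

Row = dict[str, Any]

def match_template(table: list[Row], template: dict[str, Optional[Any]]) -> list[Row]:
    """Return the rows that agree with the template on every attribute
    for which the template fixes a value (None means 'any value')."""
    return [
        row
        for row in table
        if all(value is None or row.get(attr) == value
               for attr, value in template.items())
    ]

# Hypothetical table, for illustration only.
employees = [
    {"name": "Ada", "dept": "R&D", "city": "Trento"},
    {"name": "Bob", "dept": "Sales", "city": "Paris"},
    {"name": "Cleo", "dept": "R&D", "city": "Paris"},
]

# The template fixes the department and leaves the other attributes open.
template = {"name": None, "dept": "R&D", "city": None}
matches = match_template(employees, template)
print([row["name"] for row in matches])
```

In this sketch the user still has to know which attribute to constrain and with which value. The example-based approaches described above drop that requirement: the user supplies items from the desired result and the system works out what they have in common.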

In this tutorial, we aim at describing the main developments of examples as an expressive and powerful method for exploratory data analysis.

SIGMOD'19 in Amsterdam
Our tutorial was presented on Sunday, June 30th at SIGMOD'19 in Amsterdam.

The tutorial covers

Example-based methods for exploratory search on:

  1. Relational data
  2. Textual data
  3. Graphs

It also covers approaches to exploit machine-learning methods based on examples for intelligent exploration.


New Trends on Exploratory Methods for Data Analytics

Tutorial at VLDB'17

Get the slides: PDF    PPT

Data usually comes in a plethora of formats and dimensions, rendering the exploration and information extraction processes cumbersome. Thus, being able to cast exploratory queries on the data, with the intent of getting an immediate glimpse of some of its properties, is becoming crucial. An exploratory query should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, while retaining the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of so-called example-based methods, in which the user or analyst circumvents query languages by using examples as input.

An example is a representative of the intended results, or in other words, an item from the result set. Example-based methods exploit inherent characteristics of the data to infer the results that the user has in mind but may not be able to (easily) express.
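As a toy illustration of this idea on relational data, the Python snippet below takes a few example rows, keeps the attribute values that all examples share, and returns every row that matches them. This is a minimal sketch under simplifying assumptions, not one of the algorithms covered in the tutorial. The table, the attribute names, and the function `infer_by_example` are hypothetical.

```python
from typing import Any

Row = dict[str, Any]

def infer_by_example(table: list[Row], examples: list[Row]) -> list[Row]:
    """Return all rows that agree with the examples on every attribute
    whose value is identical across all examples."""
    if not examples:
        return []
    first = examples[0]
    # Attribute/value pairs shared by every example act as the inferred filter.
    shared = {
        attr: value
        for attr, value in first.items()
        if all(ex.get(attr) == value for ex in examples)
    }
    return [
        row for row in table
        if all(row.get(attr) == value for attr, value in shared.items())
    ]

# Hypothetical table, for illustration only.
movies = [
    {"title": "Alien", "genre": "sci-fi", "decade": 1970},
    {"title": "Blade Runner", "genre": "sci-fi", "decade": 1980},
    {"title": "Aliens", "genre": "sci-fi", "decade": 1980},
    {"title": "Amadeus", "genre": "drama", "decade": 1980},
]

# The user points at two items instead of writing a query.
examples = [movies[0], movies[1]]
results = infer_by_example(movies, examples)
print([row["title"] for row in results])
```

Here the only value the two examples share is the genre, so the sketch returns every row with that genre, including one the user did not mention. If the examples share nothing, this naive filter returns the whole table, which is one reason practical methods also rank or score candidate answers.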

They can be useful both when a user is looking for information in an unfamiliar dataset and when she is simply exploring the data without knowing what she will find there. In this tutorial, we present an excursus over the main methods for exploratory analysis, with a particular focus on example-based methods. We show how different data types require different techniques, and present algorithms that are specifically designed for relational, textual, and graph data.

VLDB'17 in Munich
Our tutorial was presented at VLDB'17 in Munich.

The tutorial covers

  1. Example methods in relational databases
  2. Example methods in textual data
  3. Example methods in graphs
  4. Learning methods based on examples