Room 445 - Resolving Entities in the Web of Data
Vassilis Christophides

05 May 2017, 16:30
Room/Building: 445/PCRI-N
Contact:

Research activities: Integration of Data and Knowledge

Abstract:
Over the past decade, numerous knowledge bases (KBs) have
been built to power a new generation of Web applications that provide
entity-centric search and recommendation services. These KBs offer
comprehensive, machine-readable descriptions of a large variety of
real-world entities (e.g., persons, places, products, events) published
on the Web as Linked Data (LD). Even when derived from the same data
source (e.g., a Wikipedia entry), KBs such as DBpedia, YAGO2, or
Freebase may provide multiple, non-identical descriptions for the same
real-world entities. This is due to the different information extraction
tools and curation policies employed by KBs, resulting in complementary
and sometimes conflicting entity descriptions. Entity resolution (ER)
aims to identify different descriptions that refer to the same
real-world entity, and emerges as a central data-processing task for an
entity-centric organization of Web data. ER is needed to enrich
interlinking of data elements describing entities, even by
third parties, so that the Web of Data can be accessed by machines as a
global data space using standard languages, such as SPARQL. ER can also
facilitate automated KB construction by integrating entity
descriptions from legacy KBs with Web content published as HTML documents.
ER has attracted significant attention from many researchers in the
information systems, database and machine-learning communities. The
objective of this lecture is to present the new ER challenges stemming
from the Web openness in describing, by an unbounded number of KBs, a
multitude of entity types across domains, as well as the high
heterogeneity (semantic and structural) of descriptions, even for the
same types of entities. The scale, diversity and graph structuring of
entity descriptions published according to the LD paradigm challenge the
core ER tasks, namely, (i) how descriptions can be effectively compared
for similarity and (ii) how resolution algorithms can efficiently filter
the candidate pairs of descriptions that need to be compared.
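
As a rough illustration of task (i), the sketch below compares two descriptions without relying on their attribute names. It is only a minimal example under stated assumptions: the token-set Jaccard measure, the helper names `tokens` and `jaccard`, and the two sample descriptions are all hypothetical choices made for this illustration, not the method presented in the lecture.

```python
import re


def tokens(description):
    """Collect lowercase word tokens from all attribute values.

    Attribute names are ignored, so descriptions using different
    vocabularies can still be compared (an assumption of this sketch).
    """
    bag = set()
    for value in description.values():
        bag.update(re.findall(r"\w+", str(value).lower()))
    return bag


def jaccard(desc_a, desc_b):
    """Jaccard similarity of the two token sets, in [0, 1]."""
    a, b = tokens(desc_a), tokens(desc_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical descriptions of the same person from two KBs (made-up data).
desc_a = {"rdfs:label": "Stanley Kubrick", "dbo:birthPlace": "New York"}
desc_b = {"skos:prefLabel": "Kubrick, Stanley", "wasBornIn": "New York City"}

print(jaccard(desc_a, desc_b))
```

Here the two descriptions share four of five distinct tokens although none of their attribute names match, which is the kind of schema-agnostic comparison the abstract alludes to.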
In multi-type, large-scale entity resolution, we need to examine
whether two entity descriptions are similar (or nearly so) without
resorting to domain-specific similarity functions and/or mapping rules.
Furthermore, the resolution of some entity descriptions might influence
the resolution of other neighbouring descriptions. This setting clearly
goes beyond deduplication (or record linkage) of collections of
descriptions usually referring to a single entity type that slightly
differ only in their attribute values. It essentially requires
leveraging the similarity of descriptions in both their content and
structure. It also forces us to revisit traditional ER workflows
consisting of separate indexing (for pruning the number of candidate
pairs) and matching (for resolving entity descriptions) phases.
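
The indexing phase of such a traditional workflow can be sketched as follows. This is a minimal example assuming token blocking, one common schema-agnostic indexing technique; the function name `token_blocking` and the sample descriptions are hypothetical and are not taken from the lecture.

```python
import re
from collections import defaultdict
from itertools import combinations


def token_blocking(descriptions):
    """Return candidate pairs of description ids that share at least one token.

    Each distinct token found in any attribute value defines a block;
    only descriptions that fall in a common block are later compared.
    """
    blocks = defaultdict(set)
    for entity_id, description in descriptions.items():
        for value in description.values():
            for token in re.findall(r"\w+", str(value).lower()):
                blocks[token].add(entity_id)

    candidates = set()
    for ids in blocks.values():
        # Sorting makes each unordered pair appear in one canonical form.
        candidates.update(combinations(sorted(ids), 2))
    return candidates


# Hypothetical descriptions from two KBs (made-up data).
descriptions = {
    "kb1:Kubrick": {"label": "Stanley Kubrick", "born": "New York"},
    "kb2:S_Kubrick": {"name": "Kubrick, Stanley"},
    "kb2:Athens": {"name": "Athens", "country": "Greece"},
}

candidates = token_blocking(descriptions)
print(candidates)
```

With three descriptions there are three possible pairs, but only the pair sharing tokens is kept for the matching phase. In this sketch indexing and matching remain separate steps, which is the design the abstract suggests revisiting.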
