The Web of Data has emerged as a way of exposing structured linked data on the Web. It builds on the central building blocks of the Web (URIs, HTTP) and benefits from its simplicity and widespread adoption. It does, however, also inherit unresolved issues such as the broken link problem. Broken links may occur when (Web) resources are removed, moved or updated. Furthermore, links may be temporarily broken when the resources they reference become unavailable for a while, for example due to network problems.
Broken links constitute a major issue for actors consuming Linked Data because these actors must deal with reduced accessibility of data. We believe that the broken link problem is a major threat to the whole Web of Data idea and that both Linked Data consumers and providers will require solutions that deal with this problem.
DSNotify is a generic change detection framework for Linked Data sources that informs data-consuming actors about the various types of events (create, remove, move, update) that can occur in data sources. DSNotify is currently under development by the University of Vienna.
Our approach for fixing broken links is based on an indexing infrastructure. A monitor (e.g., a Web crawler) accesses the considered Linked Data resources, extracts feature vectors and stores them in an index. The monitor detects which resources were created, removed or modified by consulting this index. Detected events are written to a central event log and result in notifications sent to registered applications.
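The index comparison described above can be illustrated with a minimal sketch. It is not DSNotify's actual code: the class `MonitorSketch`, the method `scan`, the string-based event format and the simplification of a feature vector to a property/value map are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch; class and method names are illustrative, not DSNotify's API.
class MonitorSketch {

    // Index: resource URI -> last seen feature vector (simplified to property -> value).
    private final Map<String, Map<String, String>> index = new HashMap<>();

    // Stand-in for the central event log.
    private final List<String> eventLog = new ArrayList<>();

    // Compares one snapshot of the monitored resources with the index,
    // logs the detected events and then replaces the index content.
    List<String> scan(Map<String, Map<String, String>> snapshot) {
        List<String> events = new ArrayList<>();
        // Sorted iteration keeps the event order deterministic.
        for (String uri : new TreeSet<>(snapshot.keySet())) {
            Map<String, String> known = index.get(uri);
            if (known == null) {
                events.add("CREATE " + uri);
            } else if (!known.equals(snapshot.get(uri))) {
                events.add("UPDATE " + uri);
            }
        }
        for (String uri : new TreeSet<>(index.keySet())) {
            if (!snapshot.containsKey(uri)) {
                events.add("REMOVE " + uri);
            }
        }
        index.clear();
        for (Map.Entry<String, Map<String, String>> e : snapshot.entrySet()) {
            index.put(e.getKey(), new HashMap<>(e.getValue()));
        }
        eventLog.addAll(events);
        return events;
    }

    List<String> getEventLog() {
        return eventLog;
    }

    public static void main(String[] args) {
        MonitorSketch monitor = new MonitorSketch();
        Map<String, Map<String, String>> first = new HashMap<>();
        first.put("http://example.org/a", Map.of("label", "A"));
        System.out.println(monitor.scan(first));
        System.out.println(monitor.scan(new HashMap<>()));
    }
}
```

In this sketch a notification would simply be each returned event string; a real deployment would forward these to the registered applications.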
A so-called housekeeper accesses the index periodically and heuristically detects move events based on feature vector comparisons. Such detected move events are also logged and subscribers are notified by DSNotify.
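To make the idea of move detection by feature vector comparison concrete, here is a minimal sketch. The Jaccard similarity over property/value pairs, the threshold parameter and the names `HousekeeperSketch`, `similarity` and `detectMoves` are assumptions chosen for illustration; the heuristics actually used by DSNotify are pluggable and may differ.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not DSNotify's actual housekeeper implementation.
class HousekeeperSketch {

    // Jaccard similarity over property=value pairs (an assumed stand-in comparator).
    static double similarity(Map<String, String> a, Map<String, String> b) {
        Set<Map.Entry<String, String>> union = new HashSet<>(a.entrySet());
        union.addAll(b.entrySet());
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<Map.Entry<String, String>> shared = new HashSet<>(a.entrySet());
        shared.retainAll(b.entrySet());
        return (double) shared.size() / union.size();
    }

    // Pairs each removed URI with the most similar created URI
    // whose similarity is at or above the threshold.
    static Map<String, String> detectMoves(
            Map<String, Map<String, String>> removed,
            Map<String, Map<String, String>> created,
            double threshold) {
        Map<String, String> moves = new LinkedHashMap<>();
        Set<String> claimed = new HashSet<>();
        for (Map.Entry<String, Map<String, String>> old : removed.entrySet()) {
            String best = null;
            double bestScore = threshold;
            for (Map.Entry<String, Map<String, String>> cand : created.entrySet()) {
                if (claimed.contains(cand.getKey())) {
                    continue; // each created resource can be the target of one move only
                }
                double score = similarity(old.getValue(), cand.getValue());
                if (score >= bestScore) {
                    best = cand.getKey();
                    bestScore = score;
                }
            }
            if (best != null) {
                moves.put(old.getKey(), best);
                claimed.add(best);
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> removed = Map.of(
                "http://example.org/old", Map.of("label", "Vienna", "type", "City"));
        Map<String, Map<String, String>> created = Map.of(
                "http://example.org/new", Map.of("label", "Vienna", "type", "City"));
        System.out.println(detectMoves(removed, created, 0.8));
    }
}
```

The threshold trades missed moves against false positives: a higher value reports fewer moves but with more confidence that the removed and created resources are the same item.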
For further details, please consult our publications.
DSNotify is implemented in Java and makes use of several open-source frameworks such as Apache Lucene, Jena, Quartz and Jetty. DSNotify is highly configurable, which allows it to be tailored to particular application domains. It is possible to provide plug-in implementations for the following components:
- Monitors, data spaces and data space regions
- Move detector heuristics
- Feature extractors and comparators
DSNotify is not meant to be a service that monitors the whole Linked Data space. It is rather a lightweight component that can be tailored to application-specific needs and that detects modifications in selected Linked Data sources.
A typical usage scenario is depicted in the figure above: A Web portal hosts Linked Data that is linked to other, external data sources. As these links can break, the portal uses DSNotify to monitor the remote data source and receives notifications about detected events such as resource moves or deletions. The portal then updates the links in its hosted data, thereby preserving link integrity.
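The portal side of this scenario could look roughly like the following sketch. It assumes that the portal keeps its outgoing links in memory and that notifications arrive as calls to `onMove` and `onRemove`; these names, the `Link` class and the whole `PortalLinkUpdater` class are hypothetical and do not describe how DSNotify delivers notifications.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a notification consumer; not part of DSNotify.
class PortalLinkUpdater {

    // A link from a locally hosted resource to an external resource.
    static final class Link {
        final String source;
        final String target;

        Link(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    private final List<Link> links = new ArrayList<>();

    void addLink(String source, String target) {
        links.add(new Link(source, target));
    }

    // Move notification: retarget every link that points to the old URI.
    // Returns the number of rewritten links.
    int onMove(String oldUri, String newUri) {
        int rewritten = 0;
        for (int i = 0; i < links.size(); i++) {
            Link link = links.get(i);
            if (link.target.equals(oldUri)) {
                links.set(i, new Link(link.source, newUri));
                rewritten++;
            }
        }
        return rewritten;
    }

    // Remove notification: drop every link that points to the deleted URI.
    // Returns the number of dropped links.
    int onRemove(String uri) {
        int before = links.size();
        links.removeIf(link -> link.target.equals(uri));
        return before - links.size();
    }

    // Current link targets, in insertion order.
    List<String> targets() {
        List<String> result = new ArrayList<>();
        for (Link link : links) {
            result.add(link.target);
        }
        return result;
    }

    public static void main(String[] args) {
        PortalLinkUpdater portal = new PortalLinkUpdater();
        portal.addLink("http://portal.example/item/1", "http://remote.example/x");
        portal.onMove("http://remote.example/x", "http://remote.example/x2");
        System.out.println(portal.targets());
    }
}
```

Dropping links on a remove event is only one possible policy; a portal might instead flag such links for manual review.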
For further details about DSNotify, please consult our publications.