LDViz

Linked Data Visualizer

Explore linked data with ease using our project! Build, debug, and test SPARQL queries on your chosen SPARQL endpoint. Visualize query results with a range of integrated tools, including MGExplorer, all within our user-friendly query management interface. Enhance your data exploration and gain valuable insights effortlessly!

See the complete list of available visualization tools here

If you are interested in using it, please contact us (see the contact information below) and we will provide you with authentication credentials.

Coverage Analysis

We evaluated our tool against 419 public SPARQL endpoints to ensure broad compatibility and thorough coverage.

SPARQL result set format

Our application uses the data format proposed by the W3C Recommendation to represent SPARQL select query results using JSON. The data format looks like the following:

{
    "head": {
        "link": [],
        "vars": [ "s", "p", "o", "label", "type", "date" ]
    },
    "results": {
        "distinct": false,
        "ordered": true,
        "bindings": [
            {
                "s": { "type": "literal", "xml:lang": "en", "value": "Maximilian Schell" },
                "p": { "type": "uri", "value": "http://dbpedia.org/resource/A_Bridge_Too_Far_(film)" },
                "o": { "type": "literal", "xml:lang": "en", "value": "Dirk Bogarde" },
                "label": { "type": "literal", "xml:lang": "en", "value": "A Bridge Too Far (film)" },
                "type": { "type": "literal", "xml:lang": "en", "value": "non-fiction" },
                "date": { "type": "typed-literal", "datatype": "http://www.w3.org/2001/XMLSchema#date", "value": "1977-06-15" }
            }
        ]
    }
}

You can find more about the format at https://www.w3.org/TR/2013/REC-sparql11-results-json-20130321/.
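As an illustration of the structure described above, the following is a minimal sketch of a structural check for such a result set. The function name `is_sparql_json_result` and the set `ALLOWED_TYPES` are assumptions made for this example and are not part of LDViz; the check only looks at the `head.vars` and `results.bindings` members and does not validate the full Recommendation.

```python
from typing import Any

# Term types defined by the W3C format ("uri", "literal", "bnode").
# "typed-literal" is also accepted here as an assumption, because the
# example above shows an endpoint emitting it.
ALLOWED_TYPES = {"uri", "literal", "bnode", "typed-literal"}


def is_sparql_json_result(doc: Any) -> bool:
    """Return True if `doc` looks like a SPARQL SELECT result set in JSON."""
    if not isinstance(doc, dict):
        return False
    head = doc.get("head")
    results = doc.get("results")
    if not isinstance(head, dict) or not isinstance(results, dict):
        return False
    variables = head.get("vars")
    bindings = results.get("bindings")
    if not isinstance(variables, list) or not isinstance(bindings, list):
        return False
    for row in bindings:
        if not isinstance(row, dict):
            return False
        for name, term in row.items():
            # Every bound variable must be declared in head.vars and
            # carry a known "type" and a string "value".
            if name not in variables or not isinstance(term, dict):
                return False
            if term.get("type") not in ALLOWED_TYPES:
                return False
            if not isinstance(term.get("value"), str):
                return False
    return True
```

A response parsed with `json.loads` can be passed directly to this function; an empty `bindings` array is still considered structurally valid.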

Tested SPARQL endpoints

IndeGx is a framework designed to index public KGs that are available online through a SPARQL endpoint. The indexing process uses SPARQL queries either to extract the available metadata from a KG or to generate as much metadata as the endpoint allows. The results of this indexing process are publicly available through a SPARQL endpoint at http://prod-dekalog.inria.fr/sparql, from which we retrieved the list of endpoints using the query below:


prefix index: <http://ns.inria.fr/kg/index#>
prefix desc: <http://www.w3.org/ns/sparql-service-description#>

SELECT DISTINCT ?endpointUrl where {
    GRAPH ?g {
        ?metadata index:curated ?dataset .
        ?dataset desc:endpoint ?endpointUrl .
    }
}
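One way such a query could be submitted is sketched below using only the Python standard library. The helper names `build_sparql_request` and `endpoint_urls` are hypothetical, and sending the query as a GET request with an `Accept: application/sparql-results+json` header is an assumption of this sketch, not a description of how the list was actually retrieved.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and query taken from the text above.
INDEX_ENDPOINT = "http://prod-dekalog.inria.fr/sparql"

ENDPOINT_QUERY = """
prefix index: <http://ns.inria.fr/kg/index#>
prefix desc: <http://www.w3.org/ns/sparql-service-description#>
SELECT DISTINCT ?endpointUrl where {
  GRAPH ?g {
    ?metadata index:curated ?dataset .
    ?dataset desc:endpoint ?endpointUrl .
  }
}
"""


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the W3C JSON result format described earlier.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def endpoint_urls(result: dict) -> List[str]:
    """Extract the ?endpointUrl values from a SPARQL JSON result set."""
    return [
        row["endpointUrl"]["value"]
        for row in result["results"]["bindings"]
        if "endpointUrl" in row
    ]
```

The request object can then be passed to `urllib.request.urlopen`, and the decoded JSON response to `endpoint_urls`.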

Procedure

We used a set of queries that support the inspection of RDF graphs and vocabularies and the exploration of RDF summarizations, defined as follows.

RDF graph and vocabulary

RDF graph

select * where { ?s ?p ?o }

Hierarchy of classes

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> select * where { ?s ?p ?o filter(?p = rdfs:subClassOf) }

Hierarchy of properties

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> select * where { ?s ?p ?o filter(?p = rdfs:subPropertyOf) }

Signature of properties

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> select * where { ?s ?p ?o filter(?p = rdfs:domain || ?p = rdfs:range) }

RDF summarizations

Class paths

select distinct ?s ?p ?o where { ?a ?p ?b . ?a a ?s . ?b a ?o }

Property paths

prefix ldv: <http://ldv.fr/path/> select distinct ?s (ldv: as ?p) ?o where { ?x ?s ?y . ?y ?o ?z . filter (?s != ?o) }

Paths of form "class → property → class"

prefix ldv: <http://ldv.fr/path/> select distinct ?s (ldv: as ?p) ?o where { { ?a ?b ?c. ?a a ?s . bind (?b as ?o) } UNION { ?a ?b ?c. ?c a ?s . bind (?b as ?o) } }

These queries are rather generic and can be applied to any SPARQL endpoint. The only specific vocabulary used by these queries is RDF Schema, which provides a data modeling vocabulary for RDF data and is therefore expected to appear in most RDF graphs. We limited each query to 10 solutions to speed up the process, as our goal was to inspect the resulting data format to check whether we could visualize it using LDViz; the actual data was not important for this analysis. A request has two main possible outcomes:

In case of a successful request, we inspect the resulting data format to verify whether it matches the SPARQL JSON result set defined by the W3C Recommendation. If the data does not match the expected format, we inspect it further to identify its format, which may sometimes be HTML or CSV, for instance.
In case of a failed request, we inspect the error thrown to understand why we were unable to retrieve data from that particular endpoint.
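The inspection of a successful response described above can be sketched as follows. This is a simplified illustration, not the code used in the analysis: the helper names `with_limit` and `classify_body`, the returned labels, and the heuristics used to recognize HTML and CSV are all assumptions.

```python
import json


def with_limit(query: str, limit: int = 10) -> str:
    """Append a LIMIT clause so that only a few solutions are requested."""
    return f"{query.rstrip()} limit {limit}"


def classify_body(body: str) -> str:
    """Guess the format of a successful response body (heuristic)."""
    text = body.lstrip()
    try:
        doc = json.loads(text)
    except ValueError:
        pass  # Not JSON: fall through to the HTML and CSV heuristics.
    else:
        if not isinstance(doc, dict):
            return "Not W3C compliant"
        results = doc.get("results")
        bindings = results.get("bindings") if isinstance(results, dict) else None
        if isinstance(doc.get("head"), dict) and isinstance(bindings, list):
            return "SPARQL JSON" if bindings else "No results"
        return "Not W3C compliant"
    lowered = text[:200].lower()
    if lowered.startswith("<!doctype html") or "<html" in lowered:
        return "HTML"
    # Very rough CSV test: the first line contains a comma separator.
    first_line = text.splitlines()[0] if text else ""
    if "," in first_line:
        return "CSV"
    return "Unknown"
```

For example, `with_limit("select * where { ?s ?p ?o }")` yields the RDF graph query capped at 10 solutions, and the body returned for it can then be passed to `classify_body`.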

Results

About 41.77% of the SPARQL endpoints returned a valid result set that could be explored using LDViz. The queries retrieving the class hierarchy, the property hierarchy, and the signature of properties were slightly less successful than the remaining queries: only about 38.19% of SPARQL endpoints returned a valid result set for them. Regarding the issues found while querying the SPARQL endpoints, we identified 11 different reasons why an endpoint cannot be explored using LDViz, described as follows:

HTML: About 16.06% of the requests returned an HTML object, which may contain valid results from the SPARQL endpoint, but cannot be processed by the LDViz transformation engine.
Service Not Found: The SPARQL endpoint could not be found (request statuses 404 and 410). We encountered this issue in about 14.18% of requests.
Service Unreachable: This issue is identified when the connection is refused by the server (throwing the ECONNREFUSED error), or the protocol encountered an unrecoverable error for that endpoint (throwing the EPROTO error). On average, 7.06% of requests encountered this issue. It appeared slightly more often for the SPARQL query retrieving the RDF summarization through class paths than for the remaining queries: we observed the issue in 8.59% of requests for that query.
Timeout: About 6.27% of the requests encountered a timeout issue. This happens when the response is not received within the default timeout of the fetch request, which is about 300 seconds (request statuses 408 and 504), or when the Virtuoso server estimates the query processing time to exceed its configured timeout of 400 seconds.
No results: The request returned a valid JSON object, but the bindings array was empty. On average, 4.30% of SPARQL endpoints did not provide results to our queries. Once again, this number is higher for the queries retrieving the signature of properties and the class and property hierarchies: about 8.35% of endpoints returned no results for them, against an average of only 1.25% for the remaining queries.
Invalid Certificate: The request could not be completed due to an invalid certificate on the SPARQL endpoint side. This issue was observed in about 3.10% of requests, which corresponds to 13 SPARQL endpoints.
CSV: On average, 2.18% of the requests returned a string object whose content follows a CSV format. The result set may contain valid data but cannot be processed by the LDViz transformation engine.
Bad Request: The request could not be fulfilled due to bad syntax (request status 400). This error was thrown by 2.18% of requests, which corresponds to 9 to 11 SPARQL endpoints. The SPARQL queries retrieving the RDF graph and the RDF graph summarization through class paths were slightly less affected than the remaining queries.
Format Not Supported: Requests to 6 different SPARQL endpoints received this error (1.43% of requests), which means that the server can only generate a response that is not accepted by the client (status 406).
Access Unauthorized: This issue encompasses the following request responses: the server refuses to respond (statuses 401 and 403), and authentication is required (statuses 407 and 511). We observed that 4 SPARQL endpoints (0.95% of requests) required authentication, which we could not provide.
Not W3C compliant: The request responded with a JSON object that does not follow the JSON format specified by the W3C Recommendation. This issue was observed in 2 SPARQL endpoints (0.48%).
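The status codes and error codes named in the list above can be collected into a small lookup, sketched below. The function name `classify_failure` and the fallback label "Other" are assumptions for this example; the Invalid Certificate category is left out because the text does not name the error codes that identify it.

```python
from typing import Optional

# HTTP status codes mapped to the issue categories described above.
STATUS_CATEGORIES = {
    400: "Bad Request",
    401: "Access Unauthorized",
    403: "Access Unauthorized",
    404: "Service Not Found",
    406: "Format Not Supported",
    407: "Access Unauthorized",
    408: "Timeout",
    410: "Service Not Found",
    504: "Timeout",
    511: "Access Unauthorized",
}

# Connection-level error codes mapped to the same set of categories.
ERROR_CATEGORIES = {
    "ECONNREFUSED": "Service Unreachable",
    "EPROTO": "Service Unreachable",
}


def classify_failure(status: Optional[int] = None,
                     error_code: Optional[str] = None) -> str:
    """Map a failed request to one of the issue categories."""
    # A connection-level error means no HTTP status was received at all.
    if error_code in ERROR_CATEGORIES:
        return ERROR_CATEGORIES[error_code]
    if status in STATUS_CATEGORIES:
        return STATUS_CATEGORIES[status]
    return "Other"
```

For instance, `classify_failure(status=410)` returns "Service Not Found" and `classify_failure(error_code="EPROTO")` returns "Service Unreachable".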

The TreeMap graph below shows the distribution of the different responses obtained while querying 419 SPARQL endpoints. The top level of the TreeMap graph contains seven rectangles, each of which covers a specific query type (i.e., Paths → Properties → Class, Class paths, Class hierarchy, RDF graph, Property hierarchy, Signature of properties, and Property paths). These rectangles are further divided into smaller colored rectangles that summarize the results obtained per query type, including SPARQL endpoints supported by LDViz and issues encountered while accessing the endpoints (e.g., Access Unauthorized, Service Not Found, etc.). The size of each rectangle encodes the number of results obtained for each query.
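The two-level hierarchy behind such a TreeMap (query type, then outcome, sized by count) can be sketched as a simple aggregation. The function names `treemap_counts` and `outcome_share` are hypothetical, and the sample records are made up purely to show the shape of the data; they are not results from the analysis.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def treemap_counts(records: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Group (query type, outcome) pairs into a two-level count hierarchy."""
    tree: Dict[str, Counter] = defaultdict(Counter)
    for query_type, outcome in records:
        tree[query_type][outcome] += 1
    return {query_type: dict(counts) for query_type, counts in tree.items()}


def outcome_share(counts: Dict[str, int], outcome: str) -> float:
    """Percentage of requests for one query type that ended in `outcome`."""
    total = sum(counts.values())
    return 100.0 * counts.get(outcome, 0) / total if total else 0.0


# Made-up records, one per (query type, endpoint) request.
sample_records = [
    ("RDF graph", "Supported"),
    ("RDF graph", "HTML"),
    ("RDF graph", "Supported"),
    ("Class paths", "Timeout"),
]
sample_tree = treemap_counts(sample_records)
```

Each inner count corresponds to the area of one colored rectangle, and `outcome_share` gives the matching percentage within a query type.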

Related Publications

Aline Menin, Pierre Maillot, Catherine Faron, Olivier Corby, Carla Maria Dal Sasso Freitas, et al. LDViz: a tool to assist the multidimensional exploration of SPARQL endpoints. Web Information Systems and Technologies: 16th International Conference, WEBIST 2020, November 3-5, 2020, and 17th International Conference, WEBIST 2021, October 26-28, 2021, Virtual Events, Revised Selected Papers, LNBIP - 469, Springer, pp.149-173, 2023, LNBIP - Lecture Notes in Business Information Processing, 978-3-031-24196-3. (10.1007/978-3-031-24197-0). (hal-03929913)
Aline Menin, Minh Nhat Do, Carla Dal Sasso Freitas, Olivier Corby, Catherine Faron Zucker, et al. Using Chained Views and Follow-up Queries to Assist the Visual Exploration of the Web of Big Linked Data. International Journal of Human-Computer Interaction, 2022. (hal-03518845)
Aline Menin, Catherine Faron Zucker, Olivier Corby, Carla Dal Sasso Freitas, Fabien Gandon, et al. From Linked Data Querying to Visual Search: Towards a Visualization Pipeline for LOD Exploration. International Conference on Web Information Systems and Technologies (WEBIST), Oct 2021, Online Streaming, France. (10.5220/0010654600003058). (hal-03404572)
Aline Menin, Ricardo Cava, Carla Dal Sasso Freitas, Olivier Corby, Marco Winckler. Towards a Visual Approach for Representing Analytical Provenance in Exploration Processes. IV 2021 - 25th International Conference Information Visualisation, Jul 2021, Melbourne / Virtual, Australia. (10.1109/IV53921.2021.00014). (hal-03292172)
Maroua Tikat, Aline Menin, Michel Buffa, Marco Winckler. Engineering Annotations to Support Analytical Provenance in Visual Exploration Processes. ICWE 2022 - 22nd International Conference of Web Engineering, Jul 2022, Bari, Italy. pp.1-16. (10.1007/978-3-031-09917-5_14). (hal-03779349)
Anne Toulet, Franck Michel, Anna Bobasheva, Aline Menin, Sébastien Dupré, et al. ISSA: Generic Pipeline, Knowledge Model and Visualization tools to Help Scientists Search and Make Sense of a Scientific Archive. ISWC 2022 - 21st International Semantic Web Conference, Oct 2022, Hangzhou, China. (10.1007/978-3-031-19433-7_38). (hal-03807744)

Contact

Aline Menin, Associate Professor at Université Côte d'Azur (E-mail: aline.menin@inria.fr)
Marco Winckler, Full Professor at Université Côte d'Azur (E-mail: marco.winckler@inria.fr)

*This work is developed by the Wimmics team at the Centre Inria, Université Côte d'Azur. It results from collaborative efforts involving several researchers and students.
