About Me

I am a master's student at the Eindhoven University of Technology. My focus is on databases, mostly relational ones, but more recently also graph-based databases like Wikidata. I am currently writing my thesis on improving the UI of ShEx validators, specifically the ShEx Simple Online Validator used on Wikidata. The goal of the project is to make it easier to use the validator when changing data to conform to existing schemas. This fits within Wikidata's existing vision of encouraging high data quality by using schemas to promote standardization.

I live in the Netherlands, and my timezone is Central European Time (CET). If you want to contact me outside Wikidata, send an email to m.v.alten@student.tue.nl

The Tabular Validator

I have recently released a new version of the validator used for schema validation in Wikidata. The main change is a new look for the validation reports that the validator generates. Some features are still a work in progress, and some bugs and issues remain, but the validator can now be used and is hosted on Toolforge.

Validator Evaluation

I invite all Wikidata Users to help me evaluate the new validator in an interview with me. During this interview you will be asked to fix some nonconformances in a Wikibase Cloud instance created for this evaluation while I observe. After this we will talk about your experiences and you will fill out a questionnaire about your experience with the validator tools you used.

Evaluation signup is now closed. I want to thank all participants for their time and effort. The evaluation has given me some further things to implement for the validator.

Using the Validator

The new validator expects the same input as the old validator. However, it is not built on the exact version of the validator that Wikidata uses, and it has some new fields. When you open the validator, there are three main text fields: the blue schema input on the left, the green data input on the right, and the query input further down.

If you are using Wikidata, enter "Endpoint: https://query.wikidata.org/sparql" into the data field. To choose a schema, copy the text of the entity schema you want to validate against and paste it into the blue field on the left. Then, to add your data, write a SPARQL query in the text field labelled 'Query Map'. Put the following text before the query:

SPARQL'''

and the following text after it:

'''@START

To validate your data against this schema, press the validate button. This will generate a table and an error that says "error validating: elt.attr is not a function". The error can be ignored. The table shows a different possible nonconformance on each row.
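As a rough illustration of the input format described above, the following Python sketch assembles the text for the data field and the 'Query Map' field. The function names, the example query and the choice to strip surrounding whitespace from the query are assumptions made for this sketch and are not part of the validator. Only the "Endpoint:" line and the SPARQL''' and '''@START wrappers come from the instructions above.

```python
# Sketch only: helper names are hypothetical, not part of the validator.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def data_field(endpoint: str = WIKIDATA_ENDPOINT) -> str:
    """Return the text to enter into the green data field."""
    return f"Endpoint: {endpoint}"


def query_map(sparql_query: str) -> str:
    """Wrap a SPARQL query in the text the 'Query Map' field expects."""
    query = sparql_query.strip()  # assumption: surrounding whitespace is not needed
    if "'''" in query:
        # A triple quote inside the query would end the wrapper early.
        raise ValueError("query must not contain three single quotes in a row")
    return f"SPARQL'''{query}'''@START"


# Illustrative query (an assumption): ten items that are instances of human.
EXAMPLE_QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q5 .
}
LIMIT 10
"""

if __name__ == "__main__":
    print(data_field())
    print(query_map(EXAMPLE_QUERY))
```

The printed lines can be pasted into the data field and the 'Query Map' field respectively. The schema still has to be pasted into the blue field by hand.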

If you find a nonconformance with the error type "ClosedShapeViolation", "SemActFailure" or "TypeMismatch", you will notice that the output is not very useful. This is because these error types are not properly implemented yet. Please send me an email with the schema and query you used and a screenshot of the table, and do not fix the nonconformance. This will help me develop the code to handle these types of nonconformances.

Exporting the table

If you are part of a group working on making many items conform to a specific schema, you can export the table to Wikidata using the "Copy Table Wikidata Code" button, which appears once a table has been generated. This copies the table to the clipboard, and you can then paste it into a Wikidata page you are editing (or anywhere else that supports HTML tables).

Using a different Wikibase Instance

If you are using a different Wikibase instance, such as a Wikibase Cloud instance, you need to do the following to make the validator query your data rather than Wikidata:

  • In the schema field, add any prefixes used in your schema, for example "PREFIX p: <https://www.validatortest.wikibase.cloud/prop/>".
  • In the data field, replace the endpoint with the SPARQL query endpoint for your Wikibase.
  • In the Query Endpoint and Wikibase Prefix fields, add the query endpoint and the start of the link to your Wikibase.

Note that, unlike the preview, the Wikibase Prefix should not end with a slash (/). Also note that Wikibase Cloud prefix links start with https, while Wikidata links start with http.
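The two notes about the Wikibase Prefix can be checked mechanically before pasting a value into the field. The Python sketch below is only an illustration: the function names are hypothetical, and the trailing-slash input is an example value rather than a documented setting. It strips trailing slashes, rejects a Wikibase Cloud prefix that does not start with https, and builds a PREFIX line like the one shown above.

```python
from urllib.parse import urlsplit

# Sketch only: helper names are hypothetical, not part of the validator.


def wikibase_prefix(url: str) -> str:
    """Normalise a value for the Wikibase Prefix field."""
    prefix = url.strip().rstrip("/")  # the field should not end with a slash
    host = urlsplit(prefix).hostname or ""
    if host.endswith(".wikibase.cloud") and not prefix.startswith("https://"):
        # Wikibase Cloud links start with https, unlike Wikidata links.
        raise ValueError("Wikibase Cloud prefixes should start with https://")
    return prefix


def prefix_line(name: str, iri: str) -> str:
    """Build a PREFIX declaration to add to the schema field."""
    return f"PREFIX {name}: <{iri}>"


if __name__ == "__main__":
    # Example input with a trailing slash (an assumption for illustration).
    print(wikibase_prefix("https://www.validatortest.wikibase.cloud/"))
    print(prefix_line("p", "https://www.validatortest.wikibase.cloud/prop/"))
```

The check is deliberately narrow: it only covers the two points named in the notes and does not verify that the endpoint or prefix actually exists.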

Future Development

There are a few main features I still want to implement, and I would like to get rid of some bugs.

Future Features:

  • Implement the features of the Simple Online Validator that make the text around each SPARQL query unnecessary.
  • Allow the values in the new text fields to be encoded in the website URL, like the schema and data already are.

Known bugs:

Main Version History

2024-06-26

  • Fixed a major issue causing some Wikidata items to not be rendered in the output
  • Changed the wording on a bunch of things based on feedback by Lydia Pintscher (WMDE)
  • Default behaviour is now to show conforming items

2024-06-20

  • Threading now works correctly!

2024-06-18

  • Exported tables now come with an "inspection done" column. This can be used to check off whether you have fixed issues (or confirmed something as not an issue), which is especially useful when working with one validation result as a group.
  • Some code changes in an attempt to make the program threaded (as the old version of the validator is)
  • Links to items or properties now come with both the id and the label, e.g. instance of (P31)
  • Links now open in new tabs
  • The validator page now has text with instructions on how to use the validator, and new text about how to use the table appears when the table is generated.

2024-05-21

  • It is now possible to export a table. Large tables can be copied over to another page by copying them to the clipboard with the new button. This can be used to add a list of things to check to a WikiProject, so that people who are not used to entering information into the validator can still benefit from its results.
  • The bug with a table row being repeated behind the first instance of that row should be fixed.
  • Some fields that shouldn't have links (e.g. a string giving a person's name) had links, but now don't

2024-05-07

  • Fixed the issue showing the elt.attr error after successful rendering
  • Inheritance added to items: If there is a non-conformance with the problem in a different item, both items are now shown in the item field, with an arrow between them.

2024-05-02

  • Added an option to show conformant items (in green)
  • Fixed a bug that prevented using endpoints other than Wikidata
  • Removed some old console logging

2024-04-29: First release

Hosted initial version of validator with table output mode.

User Subpages:

Special:Prefixindex/User:M.alten.tue