Thadguidry (Thad Guidry Sr.)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 8 2019, 2:27 PM (274 w, 1 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
Thadguidry

Recent Activity

May 27 2024

Thadguidry added a comment to T362149: Alternative, affordable, lower-barrier approach(es) to reconciliation.

One of our plans with DB2Rest is to provide a simple instant Recon API for database tables.
It will be a web app, like OpenRefine is, and allow a user to instantly create a Recon API from local files or existing database.

May 27 2024, 2:43 AM · Technical-Tool-Request, artificial-intelligence, Reconciliation

Jan 12 2024

Thadguidry created T354922: Type field does not always select from the drop down, if backspaced and retyped.
Jan 12 2024, 5:28 AM · Abstract Wikipedia team (25Q1 (Jul–Sep)), WikiLambda Front-end

Jun 14 2023

Thadguidry added a comment to T257719: Add support for AVIF: allow uploading AVIF files to Wikimedia servers.

I wanted to give my support to adding AVIF format especially to allow upload AVIF images to Commons. In fact, that should be a primary use case more than any other.
AVIF is supported now in all the major operating systems and image software.
Agree with @Trougnouf that we should think of the other side of things, storage and convenience for users. HDR 8K + ranges are supported by AVIF.
Our wiki article https://en.wikipedia.org/wiki/AVIF is excellent to gain a sense of support now in 2023, and a few other highlights are mentioned on a Mozilla dev page https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Image_types#avif_image

Jun 14 2023, 1:03 AM · Thumbor, Wikimedia-Site-requests

Aug 22 2022

Thadguidry added a comment to T315569: Guess Types query failed error : java.io.IOException: HTTP error 400 : BAD REQUEST for URL /en/api.

@Spinster @Vojtech.dostal I would say that non-printable (hidden) characters should be non-reconcilable. The reasoning is that hidden characters are reserved for machine use rather than human use. Regardless, in OpenRefine we hope to finish this longstanding issue https://github.com/OpenRefine/OpenRefine/issues/1286 so that folks can have a visual sense of when data quality might suffer for reconciliation. But we also need a hidden characters facet https://github.com/OpenRefine/OpenRefine/issues/5207 added to OpenRefine, which could be done in a few hours and submitted as a PR. Then, as a best practice, the hidden characters facet would be a pre-step to any reconciliation workflow.
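
As an illustration of the pre-step described in this comment, here is a minimal Python sketch that flags hidden characters in a cell value before reconciliation. It is not OpenRefine's implementation; the function names and the choice of Unicode categories counted as "hidden" are assumptions made for the example.

```python
import unicodedata

# Unicode general categories treated as "hidden" in this sketch (an assumption):
# Cc = control, Cf = format (e.g. zero-width space), Zl/Zp = line/paragraph separators.
HIDDEN_CATEGORIES = {"Cc", "Cf", "Zl", "Zp"}


def find_hidden_characters(value):
    """Return (index, 'U+XXXX') pairs for every hidden character in value."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(value)
        if unicodedata.category(ch) in HIDDEN_CATEGORIES
    ]


def has_hidden_characters(value):
    """True when value contains at least one hidden character."""
    return bool(find_hidden_characters(value))
```

A facet built on a check like this would split rows into "clean" and "contains hidden characters", so the second group can be repaired before it is sent to a reconciliation service.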

Aug 22 2022, 7:34 PM · Reconciliation

Apr 11 2022

Thadguidry added a comment to T224214: Allow structured data to be added via API:Upload.

@Dominicbm you bring up a good point that I also had concerns about: if the upload process is not synchronous/simultaneous with attaching structured data, we could end up with orphaned/incomplete uploaded files. Even though we would want to be efficient with batch uploading files, I think the reality is that the "batch" would be considered "a series of individual file uploads with attached structured data". The whole batch could fail, or only one of the individual files could fail, in which case the batch would be considered partially completed.
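
The "series of individual uploads" view of a batch can be sketched roughly as follows. This is a hypothetical outline, not the API:Upload interface: `upload_one` stands in for whatever call uploads one file together with its structured data, and the status names are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class UploadResult:
    filename: str
    ok: bool
    error: Optional[str] = None


def upload_batch(filenames: Iterable[str],
                 upload_one: Callable[[str], None]) -> List[UploadResult]:
    """Upload each file independently and record a per-file outcome.

    upload_one is a hypothetical callable that uploads a single file with
    its structured data and raises an exception on failure.
    """
    results = []
    for name in filenames:
        try:
            upload_one(name)
            results.append(UploadResult(name, True))
        except Exception as exc:  # one failure must not abort the rest
            results.append(UploadResult(name, False, str(exc)))
    return results


def batch_status(results: List[UploadResult]) -> str:
    """Summarise a batch as 'complete', 'partial', or 'failed'."""
    succeeded = sum(1 for r in results if r.ok)
    if succeeded == len(results):
        return "complete"
    return "partial" if succeeded else "failed"
```

Keeping a result per file makes it possible to retry only the failed items instead of re-running the whole batch.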

Apr 11 2022, 7:28 PM · Structured-Data-Backlog, Structured Data Engineering

Feb 23 2022

Thadguidry added a comment to T302414: Only display Caption option in data extension dialog window when user types actual (existing, valid) language codes.

Ah, ok

Feb 23 2022, 5:27 PM · StructuredDataOnCommons, Reconciliation
Thadguidry added a comment to T302414: Only display Caption option in data extension dialog window when user types actual (existing, valid) language codes.

Oh you likely mean only the Wikimedia language code https://www.wikidata.org/wiki/Property:P424 ?

Feb 23 2022, 4:10 PM · StructuredDataOnCommons, Reconciliation
Thadguidry added a comment to T302414: Only display Caption option in data extension dialog window when user types actual (existing, valid) language codes.

@Spinster Wouldn't that be a bit too limiting, potentially for the future, if any new properties are minted? I'm not sure myself so just asking about the full context here. My thoughts are that maybe allowing options for further constraints in this dialog like properties that are subclassed under https://www.wikidata.org/wiki/Q18616084 or https://www.wikidata.org/wiki/Q20824104 ? Maybe there could be an extra toggle filter in this dialog that says "Only languages" and it filters to only those properties under those 2 subclasses?

Feb 23 2022, 4:01 PM · StructuredDataOnCommons, Reconciliation

Feb 6 2022

Thadguidry added a comment to T299460: Evaluate the Apache Jena Framework.

Hi @AndySeaborne What are the latest benchmarks for loading Wikidata all and truthy with the Jena 4.4.0 release and the new TDB2 xloader with the "--threads" argument? I noticed the release notes said this:

Feb 6 2022, 2:28 PM · MediaWiki-Stakeholders-Group, Wikidata, Epic, Wikidata-Query-Service

Nov 19 2021

Thadguidry added a comment to T289760: Evaluate Oxigraph as alternative to Blazegraph.

@BenAtOlive I think for bikeshedding or hand-waving discussions, you can just start a new discussion thread in Oxigraph's GitHub Discussions (not Issues). Here: https://github.com/oxigraph/oxigraph/discussions

Nov 19 2021, 8:05 PM · Wikidata, Wikidata-Query-Service
Thadguidry added a comment to T289760: Evaluate Oxigraph as alternative to Blazegraph.

As someone who has "been there, done that" (even with Apache Geode)... I can tell you that data locality is very important when you want to maximize performance. But if the data is maintained as distributed, then the only way to squeeze out improved performance is if you can temporarily have that data locality and that sometimes means temporary or ad hoc data replication...which has a cost itself but isn't insurmountable.

Nov 19 2021, 7:54 PM · Wikidata, Wikidata-Query-Service

Sep 15 2021

Thadguidry added a comment to T220823: Use ElasticSearch for bulk Wikidata entity term lookup.

@Addshore That's what I figured. :-) This issue did feel old and sort of in a dustbin. Agree it should be closed.

Sep 15 2021, 3:16 PM · Discovery-Search, User-Addshore, [DEPRECATED] wdwb-tech, Wikidata

Sep 10 2021

Thadguidry added a watcher for Reconciliation: Thadguidry.
Sep 10 2021, 3:31 PM

Aug 31 2021

Thadguidry added a comment to T289760: Evaluate Oxigraph as alternative to Blazegraph.

@Tpt Looks great! The ROADMAP file was a suggested alternative to the Milestones, sorry I didn't make that clear. I much prefer grouping or tagging issues against Milestones as you have done! You have the right idea regarding a single source of truth and exactly the best practices! You're a natural.

Aug 31 2021, 1:03 PM · Wikidata, Wikidata-Query-Service

Aug 26 2021

Thadguidry added a comment to T289760: Evaluate Oxigraph as alternative to Blazegraph.

Hi @Tpt Can you elaborate more in your Milestones and create more Milestones as necessary for your future vision? For example, what do you mean by "no storage format stability for now", what does that really mean for users, and what are you thinking about in the long term towards solving it?
Maybe a ROADMAP.md file in the repo might be good to add as a quick high-level overview, which then has links to Milestones (and perhaps make more future vision Milestone links, even if 2 years away, or just a dream but wrapped with practicality).
https://github.com/oxigraph/oxigraph/milestones

Aug 26 2021, 5:19 PM · Wikidata, Wikidata-Query-Service

Aug 22 2021

Thadguidry updated the task description for T289428: U+002C comma is not being excluded by default in simple search input box for CirrusSearch.
Aug 22 2021, 3:40 PM · CirrusSearch, Discovery-Search, Wikidata
Thadguidry updated the task description for T289428: U+002C comma is not being excluded by default in simple search input box for CirrusSearch.
Aug 22 2021, 3:38 PM · CirrusSearch, Discovery-Search, Wikidata
Thadguidry created T289428: U+002C comma is not being excluded by default in simple search input box for CirrusSearch.
Aug 22 2021, 3:37 PM · CirrusSearch, Discovery-Search, Wikidata
Thadguidry added a comment to T220823: Use ElasticSearch for bulk Wikidata entity term lookup.

I'd suggest adding replica shards (copies of primary shards). They provide redundancy to protect against failure, and they also vastly increase the capacity for read requests such as searching, like Adam's entity term lookup use case. You can change the number of replica shards at any time without affecting indexing or query operations. https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html
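
For reference, the replica count is changed through Elasticsearch's update index settings endpoint (`PUT /<index>/_settings` with `index.number_of_replicas`). The sketch below only builds that request and does not send it; the index name `wikidata_terms` and the helper name are assumptions for illustration.

```python
import json


def replica_settings_request(index, replicas):
    """Build (method, path, body) for changing an index's replica count.

    Only constructs the request; sending it to a cluster is left to the
    caller's HTTP client of choice.
    """
    if replicas < 0:
        raise ValueError("replicas must be zero or greater")
    path = f"/{index}/_settings"
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", path, body


# Hypothetical index name, used only for the example.
method, path, body = replica_settings_request("wikidata_terms", 2)
```

Because `number_of_replicas` is a dynamic setting, the request can be issued against a live index without reindexing.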

Aug 22 2021, 3:04 PM · Discovery-Search, User-Addshore, [DEPRECATED] wdwb-tech, Wikidata

Aug 19 2021

Thadguidry updated subscribers of T206560: [Epic] Evaluate alternatives to Blazegraph.

+1 for Oxigraph. @Tpt has been putting in a ton of good effort, research, features, and stability. Sponsoring him now in GitHub as well for his effort.
As it's being developed in Rust, it automatically takes advantage of data streaming in places that utilize intrinsic functions (forwarded through LLVM compiler IR) in CPUs. Java 17 is just now getting into a better position with its new Vector API. On top of that, the RIO Parser is one of the fastest RDF parsers I've seen run on my system, which he also graciously maintains in Rust.

Aug 19 2021, 12:32 AM · Wikidata, Epic, Wikidata-Query-Service

Aug 16 2021

Thadguidry added a comment to T210961: Add a rank for outdated but correct data.

We'll also want to improve the Help:Ranking page once this proposal task is implemented.

Aug 16 2021, 5:14 PM · Wikidata
Thadguidry added a comment to T210961: Add a rank for outdated but correct data.

Agree generally with this proposal's assertions. It also makes sense from a data quality perspective: since we are actively adding new tools to improve our data quality, having a new "outdated" rank to represent "once upon a time this was factual" would be very convenient and would make community consensus easier to reach. In fact, some blockchains form consensus in the same fashion, e.g. the Solana blockchain sorta does the same thing to gain speed and efficiency (otherwise getting consensus on details slows it down) ... it cares about the fact that something occurred or has changed or is outdated...but the details of when, where, how, can be deduced or ascertained later.

Aug 16 2021, 5:09 PM · Wikidata
Thadguidry awarded T210961: Add a rank for outdated but correct data a Like token.
Aug 16 2021, 5:02 PM · Wikidata

Aug 15 2021

Thadguidry updated the name of F34597027: Open_Tasks_does_not_allow_any_way_for_Advanced_Search_of_it.png from "Open_Tasks_does_not_allow_Advanced_Search_for_it.png" to "Open_Tasks_does_not_allow_any_way_for_Advanced_Search_of_it.png".
Aug 15 2021, 8:19 PM
Thadguidry updated the name of F34597027: Open_Tasks_does_not_allow_any_way_for_Advanced_Search_of_it.png from "image.png" to "Open_Tasks_does_not_allow_Advanced_Search_for_it.png".
Aug 15 2021, 8:18 PM
Thadguidry added a comment to T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.

Yes, those are the steps to reproduce. The general UX is that there's a context shift that the user didn't ask for yet, and the users are not given any clue about it happening. It's a bad user experience and something that all the users in T10640 are experiencing and asking about. The flow of how to get to an Advanced Search for/from Manifests specifically is the core of the problem. Using the search field in the upper right corner is quite expected, and then users see the dropdown with the Advanced Search option...ultimately clicking it and being led down the wrong hole.

Aug 15 2021, 8:17 PM · Phabricator (Search)
Thadguidry edited projects for T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search, added: Phabricator (Search), Tool-global-search, Advanced-Search; removed Phabricator.
Aug 15 2021, 5:04 PM · Phabricator (Search)
Thadguidry updated the task description for T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.
Aug 15 2021, 4:59 PM · Phabricator (Search)
Thadguidry updated the task description for T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.
Aug 15 2021, 4:58 PM · Phabricator (Search)
Thadguidry updated the task description for T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.
Aug 15 2021, 4:56 PM · Phabricator (Search)
Thadguidry updated the task description for T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.
Aug 15 2021, 4:55 PM · Phabricator (Search)
Thadguidry reopened T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search as "Open".
Aug 15 2021, 4:49 PM · Phabricator (Search)
Thadguidry renamed T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search from Allow Phabricator Advanced Search to filter by greater/less than a Year (or ideally a Date) to Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.
Aug 15 2021, 4:49 PM · Phabricator (Search)
Thadguidry added a comment to T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.

To help the team, I've mocked up perhaps a better menu representation that would always display a Global Advanced Search underneath the context of Current Application - Advanced Search? (Diffusion, Manifest, etc. as Current Application context, but always also displaying the Global Advanced Search)

Aug 15 2021, 4:46 PM · Phabricator (Search)
Thadguidry added a comment to T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.

I see the documentation, but it doesn't match the interface. I do not see a Date field as mentioned in the documentation. What else am I missing?

Aug 15 2021, 4:20 PM · Phabricator (Search)
Thadguidry renamed T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search from Allow Phabricator to filter by greater/less than a Year (or ideally a Date) to Allow Phabricator Advanced Search to filter by greater/less than a Year (or ideally a Date).
Aug 15 2021, 12:07 PM · Phabricator (Search)
Thadguidry created T288905: Allow Phabricator Current Application context of Advanced Search & always display a Global Advanced Search.
Aug 15 2021, 12:06 PM · Phabricator (Search)

Aug 10 2021

Thadguidry added a comment to T287164: Improve bulk import via API.

Hi @aidhog Aidan, in my opinion I would say "NO, not a good test-case for this need". The only reason is this: it's ASCII only (chars < 128) and doesn't let us ensure proper load handling for all data in all languages, i.e. multilingual data (chars ≥ 128) such as UTF-8 encoded text.
DBLP.xml is, however, a great test-case for any SAX parser, as I can see in its PDF https://dblp.uni-trier.de/xml/docu/dblpxml.pdf
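
A quick way to check whether a candidate test file is ASCII-only, as described above, is to look for any byte with a value of 128 or more. This is a minimal sketch; the function names are chosen for the example.

```python
def is_ascii_only(data):
    """True when every byte is in the ASCII range (0-127)."""
    return all(byte < 128 for byte in data)


def first_non_ascii_offset(data):
    """Offset of the first byte >= 128, or -1 if the data is pure ASCII."""
    for offset, byte in enumerate(data):
        if byte >= 128:
            return offset
    return -1
```

A file for which `is_ascii_only` returns True exercises none of the multi-byte UTF-8 paths of an importer, which is the limitation raised in the comment.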

Aug 10 2021, 8:14 PM · NFDI-Germany, Product-Feature, Wikibase Suite Team, Wikidata, [DEPRECATED] wdwb-tech, Wikibase (3rd party installations)

Aug 2 2021

Thadguidry added a comment to T285795: Limit languages on EntityStub rdf builders.

Someone needs to add a Documentation task to this.
I assume all the new options available and perhaps a reference link to this ticket would go somewhere in here? https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format

Aug 2 2021, 1:14 PM · Patch-For-Review, User-Ladsgroup, User-Addshore, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata, [DEPRECATED] wdwb-tech

Jul 30 2021

Thadguidry added a watcher for GraphQL: Thadguidry.
Jul 30 2021, 8:52 PM

Jul 22 2021

Thadguidry added a comment to T287164: Improve bulk import via API.

Hmm, this is missing a detail: how your entity data sets, or the community's, are likely formatted (either exported from some other system or program, or created manually through database exports or software tools).

  1. Which formats are people likely to want to import in bulk into Wikibase? Simple CSV tables? JSON? RDF/XML? Or directly any of the formats that Rio https://github.com/oxigraph/rio currently provides (RDF-star is one of the newest it now supports)?
Jul 22 2021, 3:04 PM · NFDI-Germany, Product-Feature, Wikibase Suite Team, Wikidata, [DEPRECATED] wdwb-tech, Wikibase (3rd party installations)

Jul 15 2021

Thadguidry awarded T280656: Include the EDTF Datatype extension in the Fall 2021 Wikibase Docker release a Like token.
Jul 15 2021, 2:07 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), [DEPRECATED] wdwb-tech, Wikidata, Wikibase Release Strategy
Thadguidry added a comment to T280656: Include the EDTF Datatype extension in the Fall 2021 Wikibase Docker release.

This reaches beyond just GLAM and cultural heritage and impacts scientific organizations as well. Please add my support for this.

Jul 15 2021, 2:04 PM · Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), [DEPRECATED] wdwb-tech, Wikidata, Wikibase Release Strategy

Jul 10 2021

Thadguidry added a comment to T219037: Display constraint clarifications in violation messages.

I'd like to see this made a bit higher priority? It seems it would be fairly trivial to implement with a good impact. One that I have seen often repeated over and over by the Lexicographical community is this particular constraint and the explanation given over and over when folks hit the constraint but are left wondering what it really means. Here's an example where I've given "usage example" a constraint clarification text of that explanation we repeat so often to folks in Telegram chat.
https://www.wikidata.org/wiki/Property:P5831

Jul 10 2021, 7:42 PM · MW-1.40-notes (1.40.0-wmf.25; 2023-02-27), Wikidata Dev Team (Sprint-∞), Wikibase-Quality-Constraints, Wikidata

May 19 2021

Thadguidry added a comment to T282796: Design a file format to represent Wikibase edits.

I thought there was already a standard around some "diff" format like DoubleCheck uses between Mediawiki revision table rev_ids? I recall using Wikiloop DoubleCheck which has an interesting interface to expose a portion of an edit for judgement and rollback.
It probably makes sense to pull someone from their team or others into this conversation as well to explore ideas on Merge conflict resolution displaying and what formats lend themselves well to that?

May 19 2021, 8:04 PM · OpenRefine

Mar 1 2021

Thadguidry awarded T274569: Uncaught TypeError: activeInstances[i].update is not a function a Love token.
Mar 1 2021, 4:06 PM · [DEPRECATED] wdwb-tech (legacy-backlog), User-Ladsgroup, MW-1.36-notes (1.36.0-wmf.33; 2021-03-02), JavaScript, Wikidata, Wikidata-Campsite, Wikibase (3rd party installations), Wikimedia-production-error

Jan 9 2021

Thadguidry added a comment to T236493: Adding a new lexeme should constraint languages form to languages.

To Reproduce:

  1. Create a new Lexeme
  2. Lemma: type chevrette
  3. Language of Lemma: type cajun and look at dropdown listing
  4. Notice that Louisiana French Q3083213 is at the bottom of dropdown list instead of top of list.
Jan 9 2021, 11:22 PM · Wikidata Lexicographical data, Wikidata
Thadguidry awarded T236493: Adding a new lexeme should constraint languages form to languages a Like token.
Jan 9 2021, 11:16 PM · Wikidata Lexicographical data, Wikidata

Jan 8 2021

Thadguidry awarded T271500: Not possible to search for a sense when adding a statement a Like token.
Jan 8 2021, 11:55 AM · Wikidata Lexicographical data, Wikidata

Oct 22 2020

Thadguidry added a comment to T266212: Wikidata autocomplete service should do token search and not prefix search.

In Freebase, we offered word, phrase, and full (exact match). I think the wbsearchentities API could offer something similar, although with a slight cost of indexing.
Besides name we also supported alias{full}. Using alias: matched both name and aliases, using name: matched only on name.
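
To make the difference between prefix search and the word, phrase, and full modes concrete, here is a small Python sketch. It is an illustrative interpretation only, not Freebase's or wbsearchentities' actual behaviour: the tokenisation (lowercase, split on whitespace) and the exact semantics of each mode are assumptions.

```python
def tokens(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def match_prefix(query, label):
    """Prefix search: the label must start with the query string."""
    return label.lower().startswith(query.lower())


def match_word(query, label):
    """Word mode: every query token occurs somewhere in the label."""
    q, label_tokens = tokens(query), set(tokens(label))
    return bool(q) and all(t in label_tokens for t in q)


def match_phrase(query, label):
    """Phrase mode: the query tokens occur contiguously and in order."""
    q, l = tokens(query), tokens(label)
    n = len(q)
    return n > 0 and any(l[i:i + n] == q for i in range(len(l) - n + 1))


def match_full(query, label):
    """Full mode: the whole label matches the query exactly."""
    return tokens(query) == tokens(label)
```

With the label "New York City", the query "york" fails a prefix match but succeeds in word mode, which is the gap the task title points at.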

Oct 22 2020, 10:55 AM · Wikidata

Oct 17 2020

Thadguidry added a comment to T265734: List inherited parameters in MediaWiki API help pages.

In that case, just on that wbsearchentities page, I think having a single sentence about "userlang" and providing the link to the other page would help like hell for all those users that have asked in the last year alone. As an institution holding the world's knowledge in many languages, "userlang" is probably the most important parameter to let users know about, especially given the frame of reference here: we are talking about THE Search API that so many eventually land upon, use, and build around.

Oct 17 2020, 3:58 AM · MW-1.39-notes, MW-1.40-notes, MW-1.41-notes (1.41.0-wmf.27; 2023-09-19), MW-1.37-notes (1.37.0-wmf.7; 2021-05-25), MediaWiki-Action-API, Platform Engineering

Oct 11 2020

Thadguidry added a comment to T238362: Blazegraph write performance tuning.

@Gehel Hi Guillaume, isn't the streaming updater work done now by @dcausse? Is it time for your tuning engineers to revisit some of this, or not really?

Oct 11 2020, 9:55 PM · Wikidata-Query-Service, Wikidata

Sep 22 2020

Thadguidry updated the task description for T263585: action=upload does not have correct parameters marked as "required".
Sep 22 2020, 7:12 PM · MediaWiki-Uploading, MediaWiki-Documentation, Platform Engineering, MediaWiki-Action-API
Thadguidry created T263585: action=upload does not have correct parameters marked as "required".
Sep 22 2020, 7:11 PM · MediaWiki-Uploading, MediaWiki-Documentation, Platform Engineering, MediaWiki-Action-API

Sep 2 2020

Thadguidry added a comment to T258687: The streaming updater should read its events from multiple DC streams.

@dcausse Dunno if this might help, but could a simple window help, or using a KeyedProcessFunction on a KeyedStream? If the stream is unkeyed (or initially so), then the other option might be just finding the patterns in the stream, and CEP would help with that.

Sep 2 2020, 11:23 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service
Thadguidry added a comment to T244590: [Epic] Rework the WDQS updater as an event driven application.
  • the output of this is a simple event without any data saying: do a diff between rev X and Y, fully delete entity QXYZ, ...
Sep 2 2020, 10:58 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic
Thadguidry updated the task description for T244590: [Epic] Rework the WDQS updater as an event driven application.
Sep 2 2020, 10:46 AM · Discovery-Search (Current work), Wikidata, Wikidata-Query-Service, Epic

Aug 22 2020

Thadguidry added a parent task for T203643: Sometimes Special:MergeLexemes gives summary on target lexeme, and sometimes not: T261049: Propagate the error to UX for merge failure when Lemma's do not exactly match. .
Aug 22 2020, 4:02 PM · Wikidata Lexicographical data, Wikidata
Thadguidry added a subtask for T261049: Propagate the error to UX for merge failure when Lemma's do not exactly match. : T203643: Sometimes Special:MergeLexemes gives summary on target lexeme, and sometimes not.
Aug 22 2020, 4:02 PM · Wikidata Lexicographical data, Wikidata
Thadguidry created T261049: Propagate the error to UX for merge failure when Lemma's do not exactly match. .
Aug 22 2020, 4:00 PM · Wikidata Lexicographical data, Wikidata

May 12 2020

Thadguidry added a comment to T250919: Add row/cell annotations to tabular data.

Hi all - My personal opinion and those of a few other experts would be to embrace DRY (Don't Repeat Yourself - or others) and simply allow introduction of W3C standards for Tabular Data:

May 12 2020, 1:33 AM · Commons-Datasets, JsonConfig, covid-19

May 7 2020

Thadguidry added a comment to T249868: take into account additional Properties for mapping to schema.org.

If it helps or is needed, the query that you can use is here:

May 7 2020, 1:46 PM · Wikidata, Wikidata - Reference Treasure Hunt

Apr 9 2020

Thadguidry added a comment to T249868: take into account additional Properties for mapping to schema.org.

@Lydia_Pintscher Oops! You forgot to include the main one also !!! .... Equivalent Property P1628 :-)

Apr 9 2020, 7:57 PM · Wikidata, Wikidata - Reference Treasure Hunt

Feb 10 2020

Thadguidry added a comment to T214884: [ES-M2]: [EPIC] Linking EntitySchemas in statements.

Is there anything inherently wrong or technically infeasible or undesirable, if an id used 2 letters? ES45 versus E45 ?

Feb 10 2020, 4:26 PM · Wikidata-Campsite, EntitySchema (M2: Linking to EntitySchemas in statements), Wikidata Dev Team, MW-1.34-notes (1.34.0-wmf.17; 2019-08-06), User-Ladsgroup, Wikidata

Nov 12 2019

Thadguidry added a comment to T237645: Reconsider how apostrophes are handled in completion search for wikidata.

Thanks, updated ticket.

Nov 12 2019, 3:42 PM · Discovery-Search, Wikidata
Thadguidry updated the task description for T237645: Reconsider how apostrophes are handled in completion search for wikidata.
Nov 12 2019, 3:42 PM · Discovery-Search, Wikidata

Nov 7 2019

Thadguidry updated the task description for T237645: Reconsider how apostrophes are handled in completion search for wikidata.
Nov 7 2019, 7:19 PM · Discovery-Search, Wikidata
Thadguidry added a comment to T237645: Reconsider how apostrophes are handled in completion search for wikidata.

@dcausse Yes, I mean running a full text search. "simple search" is a term used by Blazegraph sometimes. Fulltext searches are cheap when you index terms in multiple ways. Why would you not want to index terms in multiple ways? Freebase was able to leverage this quite easily with Lucene/Solr indexes and provided great results on its search box on each character typed. Are you hurting for RAM to store the cached inverted indexes or something else with the infra? My quick calculations on 1 simple index in memory for all the terms (not just label/alias) in Wikidata, currently stats say 78 billion x 10 bytes per term = 78 gigs. Does Wikidata not have hardware to support multiple indexes? 1TB RAM (16x 64GB)
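
The back-of-the-envelope estimate in this comment can be re-run as below. This sketch takes the figures exactly as quoted and uses decimal gigabytes; note that with 78 billion terms at 10 bytes each the product is 780 GB, while 78 GB would correspond to 7.8 billion terms. Either figure is below the 1 TB (16 × 64 GB) of RAM mentioned.

```python
def index_size_gb(term_count, bytes_per_term):
    """Rough in-memory size of a simple term index, in decimal gigabytes."""
    return term_count * bytes_per_term / 1e9


quoted = index_size_gb(78_000_000_000, 10)     # figures as quoted in the comment
smaller = index_size_gb(7_800_000_000, 10)     # term count that would yield 78 GB
ram_gb = 16 * 64                               # 16 modules of 64 GB each
```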

Nov 7 2019, 6:28 PM · Discovery-Search, Wikidata
Thadguidry updated the task description for T237645: Reconsider how apostrophes are handled in completion search for wikidata.
Nov 7 2019, 2:40 PM · Discovery-Search, Wikidata
Thadguidry created T237645: Reconsider how apostrophes are handled in completion search for wikidata.
Nov 7 2019, 2:35 PM · Discovery-Search, Wikidata

Nov 6 2019

Thadguidry added a comment to T237490: Collect feedback from module and gadget authors for Developer Productivity & onwiki tooling techconf session.

No problem @Tgr

Nov 6 2019, 4:16 AM · Wikimedia-Technical-Conference-2019
Thadguidry added a comment to T234661: Wikimedia Technical Conference 2019 Session: Developer Productivity & onwiki tooling.

What slows me down is not having an efficient quick way to filter properties that don't include " ID" (non-authority properties), in various Property explorer tools such as the excellent https://tools.wmflabs.org/prop-explorer/
Perhaps other Property explorer tools have a quick filter mechanism for this? I tried a simple regex in the Label column filter, but it doesn't work. https://github.com/stevenliuyi/wikidata-prop-explorer/issues/1

Nov 6 2019, 12:37 AM · International-Developer-Events, Wikimedia-Technical-Conference-2019

Oct 29 2019

Thadguidry added a comment to T214884: [ES-M2]: [EPIC] Linking EntitySchemas in statements.

TODO: Just wanted to highlight that once decisions are made... please ensure to update the Glossary item ! Currently it reads:

Oct 29 2019, 2:31 PM · Wikidata-Campsite, EntitySchema (M2: Linking to EntitySchemas in statements), Wikidata Dev Team, MW-1.34-notes (1.34.0-wmf.17; 2019-08-06), User-Ladsgroup, Wikidata

Jul 8 2019

Thadguidry added a comment to T207168: Provide JSON-LD support for Wikidata.

@dbarratt in the Wikibase ontology I could not find those properties in the OWL document returned. Sorry, I'm getting caught up with your schema layouts as fast as I can :-) I expected my parser to retrieve information about their description, range, domain. I do see the class "Statement" however.

Jul 8 2019, 4:33 PM · Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata, MediaWiki-extensions-WikibaseRepository
Thadguidry added a comment to T207168: Provide JSON-LD support for Wikidata.

Something is amiss with these...not found.

Jul 8 2019, 2:34 PM · Patch-For-Review, Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)), Wikidata, MediaWiki-extensions-WikibaseRepository