grhoten (Member) commented on Mar 5, 2025

Resolves #49

This PR adds support for Spanish. It also includes changes to make the dictionary-parser tests runnable. The tests don't pass yet, but I want to include what I have so far. Running them did uncover one bug that needed to be fixed.

grhoten requested a review from nciric on March 5, 2025, 19:00
Contributor

Is this related to this PR? This looks more like the exclusion (iPhone, Apple...) dictionary we talked about in the meeting (but I could be wrong).

Member Author

I haven’t created those customizations yet. I’ll probably include them for French or Turkish. This is synthetic test data for the tool.

Contributor

nciric left a comment

Looks good in general, but I am somewhat confused by the two new lexicon files.

Contributor

I am confused about this file. Why are there Russian-looking words in it? Do they come from Wikidata when you parse Spanish, as a product of mislabeled lexemes?

Member Author

These test files ensure that specific options work as expected. The data is synthetic test data only and is not part of the library. I haven’t finished adding the necessary data yet.

Member Author

Files in the test directory are used only for tests.

grhoten merged commit b011ee4 into unicode-org:main on Mar 5, 2025 (2 checks passed)
Development

Successfully merging this pull request may close these issues.

Integrate es Wikidata into Unicode Inflection