Fix search case-sensitivity by adding keyword subfields with lowercase normalizer to ElasticSearch mappings #1178
Fixes #1173
Problem
The search functionality was case-sensitive, so queries such as "Symfony" and "symfony" returned different results. ElasticSearch term queries are case-sensitive exact matches, and the indexed content was not normalized for case-insensitive matching.
Example of the issue:
https://www.yiiframework.com/search?type=news&q=Symfony
https://www.yiiframework.com/search?type=news&q=symfony
These URLs would return different search results, which is unexpected behavior for users.
Root Cause
The original implementation had two issues:
- term queries on text fields in models/search/SearchActiveRecord.php
- Even with query-side lowercasing, the indexed content remained in its original case, so term queries failed when lowercased terms were matched against mixed-case indexed data.
Solution
Implemented a comprehensive fix using ElasticSearch's built-in normalization capabilities:
1. Added Lowercase Normalizer
Added custom lowercase normalizer to index settings across all search models:
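A minimal sketch of what such index settings might look like as a PHP array. The function name is hypothetical, and the normalizer key lowercase is assumed from the description above; the actual code in the search models may be organized differently.

```php
<?php
// Hypothetical helper (not taken from the PR): returns the "settings"
// part of an index definition with a custom normalizer that lowercases
// keyword values.
function lowercaseNormalizerSettings(): array
{
    return [
        'analysis' => [
            'normalizer' => [
                // Assumed name; any key works as long as the field
                // mappings refer to the same one.
                'lowercase' => [
                    'type' => 'custom',
                    'filter' => ['lowercase'],
                ],
            ],
        ],
    ];
}

// Print the JSON body that would be sent as index settings.
echo json_encode(lowercaseNormalizerSettings(), JSON_PRETTY_PRINT), PHP_EOL;
```

Index settings of this kind generally take effect only for a newly created index, so existing indices would likely need to be recreated and repopulated.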
2. Added Keyword Subfields
Enhanced field mappings to include keyword subfields with lowercase normalizer:
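As an illustration, a text field with a normalized keyword subfield could be mapped roughly as follows. The helper name is hypothetical, the name and title fields are the ones this PR mentions, and the normalizer is assumed to be registered under the key lowercase in the index settings.

```php
<?php
// Hypothetical helper (not taken from the PR): a full-text field that
// also indexes its whole value as a lowercased keyword subfield.
function textWithKeywordSubfield(): array
{
    return [
        'type' => 'text',
        'fields' => [
            'keyword' => [
                'type' => 'keyword',
                // Must match a normalizer defined in the index settings.
                'normalizer' => 'lowercase',
            ],
        ],
    ];
}

// Apply the same mapping to the fields used for exact matching.
$properties = [
    'name' => textWithKeywordSubfield(),
    'title' => textWithKeywordSubfield(),
];

echo json_encode($properties, JSON_PRETTY_PRINT), PHP_EOL;
```

The text field itself stays analyzed for full-text search, while name.keyword and title.keyword hold the lowercased value for exact matching.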
3. Updated Term Queries
Modified exact match queries to use new keyword subfields:
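A sketch of how such a query body might be built, assuming the subfield is named keyword as described above. The helper exactMatchQuery is hypothetical and does not appear in SearchActiveRecord.php.

```php
<?php
// Hypothetical helper (not taken from the PR): builds an exact-match
// term query against the normalized keyword subfield instead of the
// analyzed text field.
function exactMatchQuery(string $field, string $value): array
{
    return ['term' => [$field . '.keyword' => $value]];
}

// The value is passed through unchanged; lowercasing is left to the
// normalizer configured on the keyword subfield.
echo json_encode(exactMatchQuery('title', 'Symfony')), PHP_EOL;
// prints {"term":{"title.keyword":"Symfony"}}
```

In Elasticsearch versions that apply a keyword field's normalizer to term-level queries at search time, "Symfony" and "symfony" would then match the same indexed value without any lowercasing in PHP.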
Changes Made
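SearchActiveRecord.php:
- Term queries now use the .keyword subfields
- Removed mb_strtolower() calls, since normalization happens at index level
All Search Models (SearchApiType, SearchExtension, SearchGuideSection, SearchNews, SearchWiki):
- Added the lowercase normalizer to the index settings
- Added keyword subfields to the name and title field mappings
Benefits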
Testing
The fix handles various cases correctly:
- "Symfony" and "symfony" now produce identical ElasticSearch queries
- "ÁÉÍÓÚ" → "áéíóú"
- "ArrayHelper" → "arrayhelper"