MAINT: Ensure correct handling for very large unicode strings #27905
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backport of #27875.
In the future, we can handle these strings (in parts we already can maybe), but for now have to stick to int length because more of the code needs cleanup to actually use it safely. (For user dtypes this is less of a problem, although corner cases probably exist.)
This adds necessary checks to avoid large unicode dtypes.