[DOC] Tweak the regexp documentation #12511

Open
wants to merge 1 commit into
base: master
@koic koic commented Jan 5, 2025

@koic changed the title from "[Doc] Tweak the regexp documentation" to "[DOC] Tweak the regexp documentation" on Jan 5, 2025

This PR tweaks the regexp documentation. The original text is as follows:

> `/[[:blank:]]/`: Matches `/[[:space:]]/` or tab character:

Contrary to this description, `[[:blank:]]` does not match everything that `[[:space:]]` matches.
For example, a newline character matches `[[:space:]]` but not `[[:blank:]]`, as shown below.

```console
$ ruby -ve 'p "\n".match?(/[[:space:]]/)'
ruby 3.5.0dev (2025-01-05T12:41:53Z master e45fca1) +PRISM [x86_64-darwin23]
true

$ ruby -ve 'p "\n".match?(/[[:blank:]]/)'
ruby 3.5.0dev (2025-01-05T12:41:53Z master e45fca1) +PRISM [x86_64-darwin23]
false
```

Actually, `[[:blank:]]` only matches spaces and the tab character, excluding No-Break Space (`U+00A0`).

- `[[:blank:]]` ... https://github.com/k-takata/Onigmo/blob/Onigmo-6.2.0/enc/unicode/name2ctype.h#L725-L736
- `[[:space:]]` ... https://github.com/k-takata/Onigmo/blob/Onigmo-6.2.0/enc/unicode/name2ctype.h#L2968-L2981
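
For anyone who wants to check this locally, here is a minimal sketch that compares the two classes on a few characters. It deliberately sticks to ASCII whitespace; the `samples` and `results` names are only for illustration and are not part of the PR.

```ruby
# Illustrative only: compare [[:space:]] and [[:blank:]] on ASCII whitespace.
samples = {
  "space"           => " ",
  "tab"             => "\t",
  "newline"         => "\n",
  "carriage return" => "\r",
}

# For each sample, record whether each POSIX bracket class matches it.
results = samples.to_h do |name, char|
  [name, { space: char.match?(/[[:space:]]/), blank: char.match?(/[[:blank:]]/) }]
end

results.each do |name, r|
  puts format("%-16s space=%-5s blank=%s", name, r[:space], r[:blank])
end
```

Space and tab should match both classes, while newline and carriage return should match only `[[:space:]]`.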

This PR updates the documentation regarding `[[:space:]]` and `[[:blank:]]` in regular expressions.

@koic force-pushed the tweak_the_regexp_rdoc branch from 2b28705 to f71c7ec on January 5, 2025 15:28

@hsbt added the Documentation label (Improvements to documentation.) on Jun 10, 2025