gh-74598: add `fnmatch.filterfalse` for excluding names #121185

picnixz · 2024-06-30T15:05:22Z

In this implementation, I did not use a lambda function as it was proposed in the original patch nor did I use auxiliary functions.

EDIT 07/04/2025: I kept the use of filterfalse as on a PGO build the loop variant was much slower when processing past a certain number of files. For < 10 files, the numbers are roughly the same and in the range of nanoseconds so we don't really care.

# 6 files to test

small_match_none_loop: Mean +- std dev: 628 ns +- 9 ns
small_match_none_iter: Mean +- std dev: 622 ns +- 14 ns

small_match_some_loop: Mean +- std dev: 675 ns +- 12 ns
small_match_some_iter: Mean +- std dev: 688 ns +- 12 ns

small_match_digits_loop: Mean +- std dev: 962 ns +- 15 ns
small_match_digits_iter: Mean +- std dev: 987 ns +- 12 ns

small_match_all_loop: Mean +- std dev: 700 ns +- 7 ns
small_match_all_iter: Mean +- std dev: 765 ns +- 19 ns

# 60 files to test

medium_match_none_loop: Mean +- std dev: 4.75 us +- 0.10 us
medium_match_none_iter: Mean +- std dev: 3.85 us +- 0.07 us

medium_match_some_loop: Mean +- std dev: 5.08 us +- 0.07 us
medium_match_some_iter: Mean +- std dev: 4.23 us +- 0.07 us

medium_match_digits_loop: Mean +- std dev: 7.93 us +- 0.15 us
medium_match_digits_iter: Mean +- std dev: 7.12 us +- 0.14 us

medium_match_all_loop: Mean +- std dev: 5.35 us +- 0.12 us
medium_match_all_iter: Mean +- std dev: 4.66 us +- 0.10 us

# 600 files to test

large_match_none_loop: Mean +- std dev: 43.9 us +- 0.9 us
large_match_none_iter: Mean +- std dev: 33.9 us +- 0.7 us

large_match_some_loop: Mean +- std dev: 47.4 us +- 1.0 us
large_match_some_iter: Mean +- std dev: 36.4 us +- 0.7 us

large_match_digits_loop: Mean +- std dev: 75.4 us +- 1.5 us
large_match_digits_iter: Mean +- std dev: 64.6 us +- 1.2 us

large_match_all_loop: Mean +- std dev: 48.2 us +- 0.9 us
large_match_all_iter: Mean +- std dev: 39.4 us +- 0.7 us

Benchmark script

import itertools
import os
import posixpath
from fnmatch import _compile_pattern

import pyperf

def filterfalse_loop(names, pat):
    result = []
    pat = os.path.normcase(pat)
    match = _compile_pattern(pat)
    if os.path is posixpath:
        # normcase on posix is NOP. Optimize it away from the loop.
        for name in names:
            if match(name) is None:
                result.append(name)
    else:
        for name in names:
            if match(os.path.normcase(name)) is None:
                result.append(name)
    return result

def filterfalse_iter(names, pat):
    pat = os.path.normcase(pat)
    match = _compile_pattern(pat)
    if os.path is posixpath:
        # normcase on posix is NOP. Optimize it away from the loop.
        return list(itertools.filterfalse(match, names))

    result = []
    for name in names:
        if match(os.path.normcase(name)) is None:
            result.append(name)
    return result

runner = pyperf.Runner()
runner.parse_args()

base_data = ['Python', 'Ruby', 'Perl', 'Tcl',
             '123456789_path_with_digits.json',
             '_path_with_digits_123456789.json']
for data_label, data_size in [('small', 1), ('medium', 10), ('large', 100)]:
    for pat_label, pattern in [('none', 'A*'), ('some', 'P*'), ('digits', '*[1-9]*'), ('all', '*')]:
        data = base_data * data_size
        runner.bench_func(f'{data_label}_match_{pat_label}_loop', filterfalse_loop, data, pattern)
        runner.bench_func(f'{data_label}_match_{pat_label}_iter', filterfalse_iter, data, pattern)

Issue: Add fnmatch.filterfalse function #74598

📚 Documentation preview 📚: https://cpython-previews--121185.org.readthedocs.build/

Lib/fnmatch.py

picnixz · 2025-03-01T16:01:38Z

FT failure is known so I'll wait until the fix (#130724) is merged.

… feat/fnmatch/filterfalse-74598

…h/filterfalse-74598

picnixz · 2025-04-07T07:49:58Z

I'm going to run some benchmarks again and then I'll merge this one. The feature can still be useful and maintenance is not that hard. If, however, benchmarks are not good, I'll write a recipe instead.

serhiy-storchaka

I did have other plans for fnmatch, for supporting complex filters with multiple positive and negative patterns, this is why I was not particularly interested in this feature. But I don't think that adding filterfalse() is wrong. It will not conflict with future features, and they will not make it obsolete.

LGTM. But please add more tests for filterfalse() similar to other filter() tests in FilterTestCase. The code is completely separate, so we need to test all this.

picnixz · 2025-04-07T11:59:08Z

But please add more tests for filterfalse() similar to other filter() tests in FilterTestCase

Will do!

…false-74598

…g a patern (python#121185)

picnixz added 5 commits June 30, 2024 17:00

add fnmatch.filterfalse

538ab0a

add tests

ce7720c

add doc

11ca0f3

add What's New?

c1957f2

blurb

cbb7813

bedevere-app bot mentioned this pull request Jun 30, 2024

Add fnmatch.filterfalse function #74598

Closed

bedevere-app bot added the awaiting review label Jun 30, 2024

picnixz changed the title ~~gh-74598: add fnmatch.filterfalse for excluding patterns~~ gh-74598: add fnmatch.filterfalse for excluding names Jun 30, 2024

use 'is None' instead of 'not'

56e6a0e

dg-pb reviewed Jun 30, 2024

View reviewed changes

Lib/fnmatch.py Show resolved Hide resolved

dg-pb reviewed Jun 30, 2024

View reviewed changes

Lib/fnmatch.py Show resolved Hide resolved

picnixz added 3 commits July 1, 2024 11:00

use itertools.filterfalse with POSIX paths

f36b2a8

re-order imports

0030a5e

micro-optimization of variable

ddeea64

picnixz added the stdlib Python modules in the Lib dir label Aug 1, 2024

picnixz added 2 commits November 15, 2024 13:22

Merge branch 'main' into fnmatch-filterfalse

c132a60

Merge branch 'main' into feat/fnmatch/filterfalse-74598

a10b2be

picnixz requested a review from barneygale December 15, 2024 14:17

picnixz added 3 commits December 15, 2024 18:20

Update Doc/whatsnew/3.14.rst

02b422c

Update Doc/library/fnmatch.rst

bc1a3c7

Merge branch 'main' into fnmatch-filterfalse

50b5e06

picnixz requested a review from serhiy-storchaka March 1, 2025 11:58

Merge branch 'main' into fnmatch-filterfalse

225a6b4

picnixz added the stale Stale PR or inactive for long period of time. label Mar 18, 2025

Merge branch 'fnmatch-filterfalse' of github.com:picnixz/cpython into…

fb14b91

… feat/fnmatch/filterfalse-74598

picnixz marked this pull request as draft April 7, 2025 07:35

bedevere-app bot removed the awaiting review label Apr 7, 2025

picnixz marked this pull request as ready for review April 7, 2025 07:36

bedevere-app bot added the awaiting core review label Apr 7, 2025

Merge remote-tracking branch 'upstream/main' into x-stale/feat/fnmatc…

838a106

…h/filterfalse-74598

picnixz removed the stale Stale PR or inactive for long period of time. label Apr 7, 2025

picnixz added 2 commits April 7, 2025 10:33

Merge branch 'main' into fnmatch-filterfalse

8807844

a bit of PEP-8

4e3ec14

serhiy-storchaka reviewed Apr 7, 2025

View reviewed changes

picnixz added 2 commits April 8, 2025 11:17

Merge remote-tracking branch 'upstream/main' into feat/fnmatch/filter…

d63fa66

…false-74598

increase test coverage

ac179c1

picnixz enabled auto-merge (squash) April 8, 2025 09:48

picnixz merged commit 3eda146 into python:main Apr 8, 2025
39 checks passed

bedevere-app bot removed the awaiting core review label Apr 8, 2025

picnixz deleted the fnmatch-filterfalse branch April 8, 2025 10:30

seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025

pythongh-74598: add fnmatch.filterfalse for excluding names matchin…

d8bee58

…g a patern (python#121185)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-74598: add `fnmatch.filterfalse` for excluding names #121185

gh-74598: add `fnmatch.filterfalse` for excluding names #121185

picnixz commented Jun 30, 2024 •

edited

Loading

picnixz commented Mar 1, 2025 •

edited

Loading

picnixz commented Apr 7, 2025

serhiy-storchaka left a comment

picnixz commented Apr 7, 2025

gh-74598: add fnmatch.filterfalse for excluding names #121185

gh-74598: add fnmatch.filterfalse for excluding names #121185

Conversation

picnixz commented Jun 30, 2024 • edited Loading

picnixz commented Mar 1, 2025 • edited Loading

picnixz commented Apr 7, 2025

serhiy-storchaka left a comment

Choose a reason for hiding this comment

picnixz commented Apr 7, 2025

gh-74598: add `fnmatch.filterfalse` for excluding names #121185

gh-74598: add `fnmatch.filterfalse` for excluding names #121185

picnixz commented Jun 30, 2024 •

edited

Loading

picnixz commented Mar 1, 2025 •

edited

Loading