
gh-120144: Make it possible to use sys.monitoring for bdb and make it default for pdb #124533


Merged · 22 commits · Mar 17, 2025

Conversation

@gaogaotiantian (Member) commented Sep 25, 2024

This is the most conservative attempt ever to utilize sys.monitoring in bdb and pdb.

Highlights:

  • Full backward compatibility: no changes to test_pdb or test_bdb at all with the new backend.
  • bdb still defaults to sys.settrace, which keeps all the old behavior. Users can opt in to the new sys.monitoring backend, and the interface stays the same, even for trace_dispatch (that's how test_bdb passes).
  • New, optional interfaces in bdb let users disable certain events to improve performance.
  • pdb.Pdb also uses sys.settrace by default, and is configurable with pdb.Pdb.DEFAULT_BACKEND.
  • The pdb CLI and breakpoint() use the monitoring backend, with no noticeable difference that I can observe at this point.

Solution:

Basically, I mimicked the behavior of sys.settrace with sys.monitoring to keep the old behavior as much as possible, while taking the chance to use the new sys.monitoring API to disable certain events.

Performance:

It's not as impressive as the original proposal, but for the following code:

import time
def f(x):
    # Set breakpoint here
    x *= 2
    return x + 1

def test():
    start_time = time.time()
    for i in range(1000):
        for j in range(1000):
            j + i
    cost = time.time() - start_time
    print(cost)
    f(0)

test()

On my laptop, without debugger, it takes 0.025s. With the new pdb attached (b f then c), it takes the same amount of time, and with Python 3.12 pdb, it takes 1.04s (4100%+ overhead). The performance improvement is significant, to say the least.

And as you can tell from the diff, the actual changes to pdb are minimal - just change sys.settrace(tracefunc) to self.start_trace(tracefunc) and sys.settrace(None) to self.stop_trace(). That's all debugger developers need to do to onboard.

@gaogaotiantian (Member, Author)

And of course we need documentation updates; I will do that later once the feature is accepted and the interface is decided.

@terryjreedy (Member)

It's after midnight, so I will test much later today. If all is OK using the default, I will patch IDLE to pass 'monitoring'.

@terryjreedy (Member) commented Sep 29, 2024

Ran fine with default backend.

Not fine with backend='monitoring'.
Usually, when I start the debugger, the stack shows one line with bdb.run(). Running a file should put its top line under that. I think showing bdb.run is an error, but this is the baseline to compare against. With monitoring, there is initially nothing in the stack window. Running a file results in 17 lines from threading, idlelib, and bdb. I have to hit 'go' to get to bdb.run + the first line. After that, over and step seem to work, but 'go' freezes the debugger.

EDIT: The remote execution process crashes because of an unrecoverable exception in the rpc code. Monitoring does not seem to work across the socket connection. Some of the debugger code likely needs a change (as pdb does). (But IDLE does not have to use 'monitoring'.)

@gaogaotiantian (Member, Author)

Right - the most important thing is that IDLE can simply keep working as it is, but it's also a very important example for testing the new mechanism.

I think at least part of the issue is multi-threading. sys.settrace only sets the trace function on the current thread, but sys.monitoring sets events on all threads. That should answer some of the questions. I can patch this PR with a thread check, and maybe you can take a look after the fix.
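The per-thread behavior of sys.settrace is easy to demonstrate with a small stdlib-only sketch: a trace function installed on the main thread never fires in a worker thread (that would require threading.settrace as well):

```python
import sys
import threading

traced_threads = set()

def tracer(frame, event, arg):
    # Record which thread each trace event arrives on.
    traced_threads.add(threading.current_thread().name)
    return None   # no local trace function needed for this demo

def work():
    return 1 + 1

sys.settrace(tracer)              # installs the tracer on *this* thread only
t = threading.Thread(target=work, name="worker")
t.start()
t.join()
work()                            # this call runs on the traced thread
sys.settrace(None)

print("worker" in traced_threads)     # the worker thread was never traced
```

sys.monitoring, by contrast, is process-wide, which is exactly the behavioral difference being discussed here.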

@gaogaotiantian (Member, Author)

Hi @terryjreedy, I "fixed" the multi-threading issue. Well, by "fixed" I mean making it behave the same as before. Let me know if you have some time to test it out :)

@gaogaotiantian (Member, Author) commented Oct 17, 2024

With clear_tool_id, I can easily change events like LINE to local, and that gives me full speed! The code example now runs with basically zero overhead.

@pyscripter commented Jan 17, 2025

> On my laptop, without debugger, it takes 0.025s. With the new pdb attached (b f then c), it takes the same amount of time, and with Python 3.12 pdb, it takes 1.04s (4100%+ overhead).

The improvement you see here is entirely due to the changes in break_anywhere(self, frame) introduced in Python 3.14. If you try the released alpha versions of 3.14, you get results similar to yours.

I have tried your new bdb code, based on sys.monitoring, in different cases and, unfortunately, I did not see any improvements. Quite the opposite!

@gaogaotiantian (Member, Author)

> The improvement you see here is entirely due to the changes in break_anywhere(self, frame): introduced in python 3.14.

#124553 was merged after this PR so that's not possible (I did it btw).

But yes, for this specific case, the changes to break_anywhere will give the same result, and that's good.

What case did you try for this implementation? Like I mentioned, this is not the perfect solution; the major problem it solves is when a line event is triggered multiple times. So, for the same code, if you put a breakpoint at print(cost), it should show a similar diff, and the break_anywhere change won't affect it.

When you say quite the opposite, do you mean it's actually significantly slower than the original solution? Do you have any examples?

@pyscripter commented Jan 17, 2025

> So, for the same code, if you put a breakpoint at print(cost), it should show a similar diff, and the break_anywhere change won't affect it.

Yes it does! Massive improvement in this case!

> the major problem it solves is when a line event is triggered multiple times

Could you please elaborate on how this is achieved?

> When you say quite the opposite, do you mean it's actually significantly slower than the original solution? Do you have any examples?

One script where I saw a performance degradation was the following:

from timeit import timeit

class A(object):
    def __init__(self):
        self.value = 1

class B(A):
    @staticmethod
    def create_property(name):
        return property(
            lambda self: getattr(self.srg, name),
            lambda self, v: setattr(self.srg, name, v),
        )
    value = create_property.__func__("value")

    def __init__(self):
        self.srg = __import__("threading").local()
        self.value = 2
        super().__init__()

b1 = B()
b2 = B()

print(timeit("b1.value = 4; c = b1.value", number=100000, globals=globals()))
print(timeit("b1.v = 4; c = b1.v", number=100000, globals=globals()))

With a breakpoint on the last line, the monitoring-based Bdb takes 4 times as long to reach it.

But let me add some more details about how I am testing. I am using a Bdb-based debugger in PyScripter, which is adapted for multi-threaded debugging. I am not using your modified bdb.py directly. Instead, I created a subclass of Bdb integrating your code (see fastbdb.zip). Since I wanted to backport your code to Python 3.12 and 3.13, I replaced clear_tool_id with a custom function. Finally, since I am using the code for multi-threaded debugging, I have removed the changes in 99ea70c.
fastbdb.zip

@gaogaotiantian (Member, Author)

Okay, that's a completely different story. Unfortunately, this is the CPython PR, so I can't fully solve issues for your debugger, but I have some potential theories.

sys.monitoring, unlike sys.settrace, is triggered on all threads. Since you mention multi-threading, that could be the problem. Are you using 4 threads when you say it's 4x slower?

clear_tool_id is also critical in this case, because local events will not be disabled automatically when you free the tool id. I don't know if your implementation of that is correct.

So, if you can reproduce the performance issue with pdb from this branch, I can investigate further. If you are experiencing issues on a debugger that's based on a customized version of this bdb, I'm afraid you are on your own, because it's quite possible that the issue is caused by your customization (otherwise you should be able to repro it with pure pdb).

@pyscripter commented Jan 17, 2025

I am testing on Python 3.14 and I am using the following clear_tool_id:

        def clear_tool_id(tool_id):
            import sys
            if sys.version_info >= (3,14):
                sys.monitoring.clear_tool_id(tool_id)
            else:
                for event in sys.monitoring.events.__dict__.values():
                    if isinstance(event, int) and event:  # Ensure it's an event constant
                        sys.monitoring.register_callback(tool_id, event, None)

So, no difference here.

> sys.monitoring, unlike sys.settrace, is triggered on all threads. Since you mention multi-threading, that could be the problem. Are you using 4 threads when you say it's 4x slower?

No. There is just one thread and it works exactly like your code. PyScripter allows you to debug multi-threaded python code, but for single-threaded scripts it just uses Bdb.

And I am not asking you to solve my problems. I reported one script where monitoring degrades debugging performance. Why don't you try it on your side?

@gaogaotiantian (Member, Author)

gaogaotiantian@DESKTOP-I8L3RCK:~/programs/mycpython$ ./python -m pdb example.py 
> /home/gaogaotiantian/programs/mycpython/example.py(1)<module>()
-> from timeit import timeit
(Pdb) b 26
Breakpoint 1 at /home/gaogaotiantian/programs/mycpython/example.py:26
(Pdb) c
0.6028096990003178
0.0015645000003132736
> /home/gaogaotiantian/programs/mycpython/example.py(26)<module>()
-> pass
(Pdb) 
gaogaotiantian@DESKTOP-I8L3RCK:~/programs/mycpython$ ./python -m pdb example.py 
> /home/gaogaotiantian/programs/mycpython/example.py(1)<module>()
-> from timeit import timeit
(Pdb) b 26
Breakpoint 1 at /home/gaogaotiantian/programs/mycpython/example.py:26
(Pdb) c
0.4884036080002261
0.004922300000544055
> /home/gaogaotiantian/programs/mycpython/example.py(26)<module>()
-> pass

The former is monitoring and the latter is settrace. Let me know if you get different results. Again, not results from PyScripter; please only send results of pdb monitoring vs pdb settrace. (And you can use the latest PR, which merged in the 3.14 changes.)

@pyscripter

Using the attached script I get

using settrace
time1=  0.318285699991975
+++ <test> 26 <module> : print("time2= ", timeit("b1.v = 4; c = b1.v", number=100000, globals=globals()))
time2=  0.003290199994808063

using monitoring
time1=  0.3893818999931682
+++ <test> 26 <module> : print("time2= ", timeit("b1.v = 4; c = b1.v", number=100000, globals=globals()))
time2=  0.0014353000151459128

So monitoring is about 25% slower in this example. Consistent with your results.

testfastdbd.zip

@markshannon (Member)

> I can change the DEFAULT_BACKEND thing. Do you think it should be a class-level variable or a module-level one like multiprocessing? For either one I can add utility functions for users to change it.

I think a module-level one. The interface to pdb is the module. The Pdb class is mostly for extensions.

@gaogaotiantian (Member, Author)

I added the module-level utility for default backends. I also added the docs (with my best effort) and the whatsnew entry. This is a full package that needs review now :)
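If I read the final interface right, the module-level helpers look roughly like this (pdb.set_default_backend/get_default_backend per the 3.14 branch; guarded so the sketch degrades gracefully on older versions):

```python
import pdb

# The module-level backend helpers were added for 3.14 as part of this work;
# on older versions the attributes simply don't exist.
if hasattr(pdb, "set_default_backend"):
    pdb.set_default_backend("monitoring")   # or "settrace"
    backend = pdb.get_default_backend()
else:
    backend = "settrace"                    # pre-3.14 pdb always uses sys.settrace

print(backend)
```

This follows the multiprocessing precedent discussed above: the module, not the Pdb class, is the public configuration surface.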

@markshannon markshannon self-requested a review February 21, 2025 17:08
@markshannon (Member) left a comment
@terryjreedy are you OK with this being merged?

@gaogaotiantian (Member, Author)

@terryjreedy friendly ping. Do you have other comments on this change? IDLE can use the settrace version of bdb now, and we can work together to see if we can move it to monitoring in the future. We also have a few months before the 3.14 release, so time is on our side.

@pyscripter commented Mar 6, 2025

I found another bug. Consider the following code:

for i in range(10):
    j = i ** 2
    print(j)
  • Start with a breakpoint on line 3.
  • When the breakpoint is hit, delete it and add a breakpoint at line 2.
  • Continue execution. The second breakpoint is not honored, and execution continues to the end.

Suggested fix

Call restart_events() whenever a breakpoint is added. I cannot think of a better solution.

    def set_break(self, filename, lineno, temporary=False, cond=None,
                  funcname=None):
        """Set a new breakpoint for filename:lineno."""
        ...
        self.restart_events()  # added
        return None

@gaogaotiantian (Member, Author)

Thanks for filing the bug. The events should be restarted after every user interaction. I forgot the one after user_line. Fixed and added a regression test.

@gaogaotiantian (Member, Author)

Another friendly ping to @terryjreedy :) Sorry I can't reach you on discord. Any objections/suggestions to this PR? Thanks!

@gaogaotiantian (Member, Author)

As Terry is busy, I will merge this soon if no one objects. I don't expect this change to break anything (except for pdb maybe).

@terryjreedy (Member)

I ran the IDLE debugger with updated main, with a PR branch, and after adding backend='monitoring' to the `super().__init__()` call on line 34 of idlelib/debugger.py. (The IDLE debugger never calls sys.settrace.) All seemed to run the same. I saw none of the problems I reported in #124533 (comment) last September. So it is fine with me to merge. When you do, I will make a PR with a separate blurb to modify idlelib.debugger on main. Can the backend be changed on the fly?

@gaogaotiantian (Member, Author)

Thank you @terryjreedy for testing it out. I don't think you can change the backend while debugging is in progress. It should be fine to change it before tracing starts or after it's done. I'm glad the current version just works for IDLE - that's the original goal anyway.

@gaogaotiantian gaogaotiantian merged commit a936af9 into python:main Mar 17, 2025
39 checks passed
@gaogaotiantian gaogaotiantian deleted the pdb-use-monitoring branch March 17, 2025 22:34
colesbury pushed a commit to colesbury/cpython that referenced this pull request Mar 20, 2025
seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025