Graph split event tracker #159795

Open: wants to merge 1 commit into base: main

@haowu14 haowu14 commented Aug 4, 2025

Summary:
A tool to track events during graph splitting, specifically how nodes end up in acc or cpu subgraphs.

Usage: set environment variables to select a mode and supply any required arguments.

FX_NET_ACC_SPLITTER_TRACKER_MODE: Tracker mode.

Modes of the event tracker:
"0": Tracker not enabled (default).
"1": Tracker enabled but no dumps. Information is available by setting breakpoints and inspecting it in pdb.
"2": Tracker enabled; dumps all events to DUMP_PREFIX_all.txt.
"3": In addition to the events dump, tracks the nodes specified by ENV_FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES recursively and dumps them to DUMP_PREFIX_nodes.txt.
"4": In addition to the events dump, tracks all nodes with more than 1 event recursively and dumps them to DUMP_PREFIX_nodes.txt.

FX_NET_ACC_SPLITTER_TRACKER_DUMP_PATH: Overrides the dump path. Leave empty to use ~.
FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES: Nodes to track in mode "3".
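
As a rough illustration of how these three variables could be read together, here is a minimal sketch. The `TrackerConfig` container and `load_tracker_config` helper are hypothetical names, and the comma-separated format for the tracked nodes is an assumption, since the description does not state how multiple nodes are listed.

```python
import os
from dataclasses import dataclass, field

# Environment variable names as given in the PR description.
ENV_MODE = "FX_NET_ACC_SPLITTER_TRACKER_MODE"
ENV_DUMP_PATH = "FX_NET_ACC_SPLITTER_TRACKER_DUMP_PATH"
ENV_TRACKED_NODES = "FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES"

VALID_MODES = ("0", "1", "2", "3", "4")


@dataclass
class TrackerConfig:  # hypothetical container, not part of the PR
    mode: str = "0"
    dump_dir: str = "~"
    tracked_nodes: list = field(default_factory=list)


def load_tracker_config(env=None):
    """Read the tracker settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    mode = env.get(ENV_MODE, "0")
    if mode not in VALID_MODES:
        raise ValueError(f"{ENV_MODE} must be one of {VALID_MODES}, got {mode!r}")
    # An empty or missing dump path falls back to the home directory.
    dump_dir = env.get(ENV_DUMP_PATH) or "~"
    # ASSUMPTION: node names are comma-separated; the PR does not say.
    raw = env.get(ENV_TRACKED_NODES, "")
    tracked = [name.strip() for name in raw.split(",") if name.strip()]
    return TrackerConfig(mode, os.path.expanduser(dump_dir), tracked)
```

Passing a plain dict instead of relying on `os.environ` keeps the parsing easy to exercise in a unit test.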

Test Plan: New unit test

Reviewed By: georgiaphillips

Differential Revision: D79203595

cc @ezyang @SherlockNoMad @EikanWang @jgong5 @wenzhe-nrv

pytorch-bot bot commented Aug 4, 2025

This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @haowu14, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team.

linux-foundation-easycla bot commented Aug 4, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@pytorch-bot pytorch-bot bot added the release notes: fx release notes category label Aug 4, 2025
pytorch-bot bot commented Aug 4, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/159795

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 10f081f with merge base 9a0f7a3 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D79203595

haowu14 added a commit to haowu14/pytorch that referenced this pull request Aug 7, 2025
Summary:

A tool to track events during graph splitting, specifically how nodes end up in acc or cpu subgraphs.

Usage: set environment variables to select a mode and supply any required arguments.

FX_NET_ACC_SPLITTER_TRACKER_MODE: Tracker mode.
```
Modes of the event tracker:
"0": Tracker not enabled (default).
"1": Tracker enabled but no dumps. Information is available by setting breakpoints and inspecting it in pdb.
"2": Tracker enabled; dumps all events to DUMP_PREFIX_all.txt.
"3": In addition to the events dump, tracks the nodes specified by ENV_FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES recursively and dumps them to DUMP_PREFIX_nodes.txt.
"4": In addition to the events dump, tracks all nodes with more than 1 event recursively and dumps them to DUMP_PREFIX_nodes.txt.
```
FX_NET_ACC_SPLITTER_TRACKER_DUMP_PATH: Overrides the dump path. Leave empty to use `~`.
FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES: Nodes to track in mode "3".

Test Plan:
New unit test

```
buck test caffe2/test:fx -- test_fx_split_node_finder
```


---- 


```
FX_NET_ACC_SPLITTER_TRACKER_MODE=4 ../buck-out/v2/gen/fbcode/6f6fe98d41631b2e/inference_enablement/model_processing/infra/components/lowering/re/__re_cinder__/re_cinder.par -r '{"aot_inductor":{"serialized_inference_model_input_path":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/895540436/4/gpu_lowering/input.merge.61759375","serialized_inference_model_output_path":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/895540436/4/gpu_lowering/inductor_output.merge.61759375","submodule_names_to_lower":["merge"],"inductor_lowering_context":{"aot_inductor_lowering_settings":{"max_batch_size":2048,"min_acc_module_size":10,"workdir":"/tmp/local","name":"merge","dll_name":"inductor_engine.so","use_scripting":true,"preset_lowerer":"gr;disable_new_lowering_weights;disable_dper_passes:passes=fuse_parallel_linear_no_weight_change|fuse_parallel_linear","precision":4,"output_precision":4,"remote_cache_file_path_folder":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/895540436/","save_remote_cache":true,"aot_inductor_config":"{\"max_autotune\":True,\"comprehensive_padding\":False}","disable_dynamic_shapes":false,"remove_unexpected_type_cast":false,"disable_constraint_solver":false,"sample_input_tile_factor":32,"disable_acc_tracer":true,"generate_sample_inputs":true,"tile_sample_input_by_dynamic_shape":false,"node_replacement_dict":"","dynamic_shapes_strategy":73728,"auto_dynamic_shapes":false,"auto_dynamic_shapes_min_size":1,"auto_dynamic_shapes_max_size":1048576,"max_acc_splits":-1,"dynamic_size":-1,"pre_dispatch_export":true,"merge_split_optimization":false,"use_dynamic_dim_hints":false,"allow_refine_dynamic_shapes_on_constants":false,"use_sigmoid":false}},"model_entity_id":895540436,"model_snapshot_id":4,"add_sample_inputs":false,"platform_arch":0,"lowering_lib_pkg":"ien.lower:prod","dense_in_place_format":2}}'
```

Events dump: P1896093119
Node tracking dump: P1896110514
The files above were generated locally:
```
? _fx_net_tracker_all.txt
? _fx_net_tracker_nodes.txt
```

Rollback Plan:

Reviewed By: georgiaphillips

Differential Revision: D79203595
"""
TRACKER_MODE: Literal["0", "1", "2", "3", "4"] = os.environ.get(
    ENV_FX_NET_ACC_SPLITTER_TRACKER_MODE, "0"
)  # type: ignore[assignment]
Contributor

Instead of doing it by an envvar, can we just dump it by default to tlparse using trace_structured?

Author

I think I can make a default flat dump of all events via trace_structured, while preserving the local dump for a quick debug process.
Later we can follow up to visualize it in fancier ways via tlparse.

Contributor

Even when you have a wordy debug mode, it can be helpful to use dtrace to put it in a structured log, as you can then conveniently get the logs out of a MAST job.

Author

Added the dump via trace_structured and removed the "off" mode. Please see the test plan in D79203595 for an example dump.
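
For readers unfamiliar with the structured trace log, the following is a minimal sketch of what an always-on flat dump could look like. It is not the code from this PR: `emit_events`, the artifact name, and the event dictionaries are illustrative assumptions. Only `torch._logging.trace_structured` with its `metadata_fn` and `payload_fn` arguments is an existing PyTorch call.

```python
import json


def emit_events(events, log_fn=None):
    """Send all tracker events as one JSON artifact to the structured trace log."""
    if log_fn is None:
        # Imported lazily so the sketch can be exercised without PyTorch installed.
        from torch._logging import trace_structured as log_fn
    log_fn(
        "artifact",
        metadata_fn=lambda: {
            "name": "fx_net_acc_splitter_events",  # hypothetical artifact name
            "encoding": "json",
        },
        # The payload is built lazily, only if the log entry is actually written.
        payload_fn=lambda: json.dumps(events),
    )
```

The injectable `log_fn` is a testing convenience. In a real run, setting `TORCH_TRACE` to a directory, as the later test plan command does, is what causes the structured log to be written.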

haowu14 added a commit to haowu14/pytorch that referenced this pull request Aug 8, 2025
haowu14 added a commit to haowu14/pytorch that referenced this pull request Aug 11, 2025
Summary:

A tool to track events during graph splitting, specifically how nodes end up in acc or cpu subgraphs.

Usage: set environment variables to select a mode and supply any required arguments.

FX_NET_ACC_SPLITTER_TRACKER_MODE: Tracker mode.
```
Modes of the event tracker:
"0": Tracker enabled but no local dumps. Information is available by setting breakpoints and inspecting it in pdb.
"1": Tracker enabled; dumps all events to DUMP_PREFIX_all.txt.
"2": In addition to the events dump, tracks the nodes specified by ENV_FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES
     recursively and dumps them to DUMP_PREFIX_nodes.txt.
"3": In addition to the events dump, tracks all nodes with more than 1 event recursively and dumps them to DUMP_PREFIX_nodes.txt.
Regardless of the mode, the tracker is always enabled and dumps via trace_structured.
```
FX_NET_ACC_SPLITTER_TRACKER_DUMP_PATH: Overrides the dump path. Leave empty to use `~`.
FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES: Nodes to track in mode "2".
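
One way to read mode "3" is as a filter followed by a recursive walk: select every node with more than one recorded event, then follow the nodes those events point at. The sketch below illustrates that reading only. The `Event` record, its `source` field, and the event descriptions are hypothetical, since the tracker's actual event schema does not appear in this thread.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Event(NamedTuple):  # hypothetical record, not the PR's actual schema
    node: str
    description: str
    source: Optional[str] = None  # node that triggered this event, if any


def group_by_node(events):
    """Index events by the node they belong to, preserving order."""
    by_node = defaultdict(list)
    for ev in events:
        by_node[ev.node].append(ev)
    return by_node


def nodes_with_multiple_events(by_node):
    """Nodes that mode "3" would select under this reading."""
    return [name for name, evs in by_node.items() if len(evs) > 1]


def trace_node(name, by_node, depth=0, seen=None):
    """Return indented lines for a node's events, following sources recursively."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against cycles between nodes
        return []
    seen.add(name)
    lines = []
    for ev in by_node.get(name, []):
        lines.append("  " * depth + f"{ev.node}: {ev.description}")
        if ev.source is not None:
            lines.extend(trace_node(ev.source, by_node, depth + 1, seen))
    return lines
```

Under the same assumptions, mode "2" would call `trace_node` on the names listed in FX_NET_ACC_SPLITTER_TRACKER_TRACKED_NODES instead of on the filtered set.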

Test Plan:
New unit test

```
buck test caffe2/test:fx -- test_fx_split_node_finder
```


---- 


```
TORCH_TRACE=~/my_trace_log_dir FX_NET_ACC_SPLITTER_TRACKER_MODE=3 ../buck-out/v2/gen/fbcode/6f6fe98d41631b2e/inference_enablement/model_processing/infra/components/lowering/re/__re_cinder__/re_cinder.par -r '{"aot_inductor":{"serialized_inference_model_input_path":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/895540436/4/gpu_lowering/input.merge.61759375","serialized_inference_model_output_path":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/895540436/4/gpu_lowering/inductor_output.merge.61759375","submodule_names_to_lower":["merge"],"inductor_lowering_context":{"aot_inductor_lowering_settings":{"max_batch_size":2048,"min_acc_module_size":10,"workdir":"/tmp/local","name":"merge","dll_name":"inductor_engine.so","use_scripting":true,"preset_lowerer":"gr;disable_new_lowering_weights;disable_dper_passes:passes=fuse_parallel_linear_no_weight_change|fuse_parallel_linear","precision":4,"output_precision":4,"remote_cache_file_path_folder":"ads_storage_fblearner/tree/user/facebook/fblearner/predictor/895540436/","save_remote_cache":true,"aot_inductor_config":"{\"max_autotune\":True,\"comprehensive_padding\":False}","disable_dynamic_shapes":false,"remove_unexpected_type_cast":false,"disable_constraint_solver":false,"sample_input_tile_factor":32,"disable_acc_tracer":true,"generate_sample_inputs":true,"tile_sample_input_by_dynamic_shape":false,"node_replacement_dict":"","dynamic_shapes_strategy":73728,"auto_dynamic_shapes":false,"auto_dynamic_shapes_min_size":1,"auto_dynamic_shapes_max_size":1048576,"max_acc_splits":-1,"dynamic_size":-1,"pre_dispatch_export":true,"merge_split_optimization":false,"use_dynamic_dim_hints":false,"allow_refine_dynamic_shapes_on_constants":false,"use_sigmoid":false}},"model_entity_id":895540436,"model_snapshot_id":4,"add_sample_inputs":false,"platform_arch":0,"lowering_lib_pkg":"ien.lower:prod","dense_in_place_format":2}}'
```

Events dump: P1896093119
Node tracking dump: P1896110514
The files above were generated locally:
```
? _fx_net_tracker_all.txt
? _fx_net_tracker_nodes.txt
```

All events can also be found in the torch_trace output: https://www.internalfb.com/intern/paste/P1897874179/

Rollback Plan:

Reviewed By: georgiaphillips

Differential Revision: D79203595
haowu14 added a commit to haowu14/pytorch that referenced this pull request Aug 12, 2025
4 participants