Closed
Description
There is quite a bit of slowdown in the validation routine for SPDX documents. One potential offender seems to be this function, which rebuilds the entire list of IDs in the document on every call and then does a linear search for the ID each time.
tools-python/src/spdx_tools/spdx/validation/spdx_id_validators.py
Lines 25 to 28 in f15a64f
def is_spdx_id_present_in_document(spdx_id: str, document: Document) -> bool:
    all_spdx_ids_in_document: List[str] = get_list_of_all_spdx_ids(document)
    return spdx_id in all_spdx_ids_in_document
This came up because of a slowdown when running ntia-checker:
133911027 function calls (133748641 primitive calls) in 31.696 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
585/1 0.007 0.000 31.700 31.700 {built-in method builtins.exec}
1 0.000 0.000 31.700 31.700 ntia-checker:1(<module>)
1 0.000 0.000 31.454 31.454 main.py:42(main)
1 0.000 0.000 31.446 31.446 sbom_checker.py:19(__init__)
1 0.000 0.000 30.759 30.759 document_validator.py:19(validate_full_spdx_document)
1 0.007 0.007 25.766 25.766 relationship_validator.py:12(validate_relationships)
3996 0.035 0.000 25.758 0.006 relationship_validator.py:22(validate_relationship)
15428 0.080 0.000 25.724 0.002 spdx_id_validators.py:46(validate_spdx_id)
7992 0.363 0.000 25.589 0.003 spdx_id_validators.py:25(is_spdx_id_present_in_document)
7993 0.067 0.000 25.229 0.003 spdx_id_validators.py:31(get_list_of_all_spdx_ids)
7993 0.032 0.000 24.981 0.003 document_utils.py:11(get_contained_spdx_element_ids)
7993 8.097 0.001 23.264 0.003 document_utils.py:12(<listcomp>)
59639927 9.758 0.000 16.273 0.000 dataclass_with_properties.py:46(get_field)
60112853 6.556 0.000 6.556 0.000 {built-in method builtins.getattr}
1 0.000 0.000 4.771 4.771 package_validator.py:22(validate_packages)
435 0.002 0.000 4.771 0.011 package_validator.py:36(validate_package_within_document)
7871 0.019 0.000 4.748 0.001 license_expression_validator.py:26(validate_license_expression)
148 0.054 0.000 3.700 0.025 __init__.py:812(get_spdx_licensing)
148 0.001 0.000 3.004 0.020 __init__.py:860(build_spdx_licensing)
<TRUNCATED>
Ask:
Could there be a function that builds a dictionary of the IDs once and reuses it across multiple invocations, instead of rebuilding the list on every call?
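To make the ask concrete, here is a minimal sketch of what such a lookup might look like. It is not the library's implementation: `Element`, `Document`, and `get_list_of_all_spdx_ids` below are simplified stand-ins for the real spdx_tools types, and `SpdxIdIndex` is a hypothetical name. It also uses a set rather than a dictionary, since only membership is needed; a dictionary keyed by ID would work the same way.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Element:
    """Hypothetical stand-in for any SPDX element that carries an spdx_id."""
    spdx_id: str


@dataclass
class Document:
    """Hypothetical stand-in for the real Document model."""
    elements: List[Element] = field(default_factory=list)


def get_list_of_all_spdx_ids(document: Document) -> List[str]:
    """Simplified stand-in: one full pass over the document to collect IDs."""
    return [element.spdx_id for element in document.elements]


class SpdxIdIndex:
    """Collects all IDs once, then answers membership queries from a set."""

    def __init__(self, document: Document) -> None:
        # The expensive traversal happens exactly once, here.
        self._ids: FrozenSet[str] = frozenset(get_list_of_all_spdx_ids(document))

    def __contains__(self, spdx_id: str) -> bool:
        # Average-case constant-time hash lookup instead of a linear scan.
        return spdx_id in self._ids


document = Document([Element(f"SPDXRef-Package-{i}") for i in range(1000)])
index = SpdxIdIndex(document)  # build once per validation run
found = "SPDXRef-Package-42" in index
missing = "SPDXRef-Missing" in index
```

The validators would then need to receive the index (or build it once per document) rather than calling the list-building function each time. One caveat with this approach: the index is a snapshot, so it would have to be rebuilt if the document is modified between checks.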