Specify Units for Mutual Information Metrics #18288

Closed
@amrcode

Description

Mutual Information-based score functions do not specify their units

If scikit-learn is used by non-ML practitioners, they may not expect the *_mutual_info_score functions to return values in nats. In other fields, such as communications, bits are more common. The documentation does include a later note that the natural logarithm is used, and converting between the two units is easy. However, the description of the return value does not say which logarithm is used to calculate the MI, which creates the potential for unit confusion.
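To illustrate the size of the discrepancy, here is a minimal standard-library sketch of the conversion. The helpers `mutual_info_nats` and `nats_to_bits` are hypothetical names written for this example and are not part of scikit-learn; the first computes MI with the natural logarithm, which is the convention the documentation note describes.

```python
import math
from collections import Counter


def mutual_info_nats(labels_a, labels_b):
    """Mutual information of two labelings using the natural log (nats).

    Hypothetical helper for illustration only, not scikit-learn code.
    """
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * ln( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def nats_to_bits(value_nats):
    """Convert nats to bits by dividing by ln(2)."""
    return value_nats / math.log(2)


# Two identical, balanced binary labelings share exactly one bit.
labels_a = [0, 0, 1, 1]
labels_b = [0, 0, 1, 1]
mi_nats = mutual_info_nats(labels_a, labels_b)
mi_bits = nats_to_bits(mi_nats)
print(f"{mi_nats:.4f} nats = {mi_bits:.4f} bits")
```

A reader expecting bits would anticipate 1.0 for this input, while a natural-log implementation reports ln 2 (about 0.693). The same division by ln 2 should apply to any value reported in nats.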

Potential Fix: Specify MI Units in Documentation

The current documentation for, e.g., mutual_info_score says that the return value is "Mutual information, a non-negative value." Simply adding "measured in nats" would help immensely. Alternatively, to make the difference in formula explicit, it could read something like "Mutual information calculated using the natural logarithm, giving a non-negative value measured in nats." This would combine the units with the note about the natural log that currently appears below the return value.

Mentioning the use of the natural logarithm earlier in the descriptions of all three functions would help too.
