
[ENH?] EngFormatter: add the possibility to remove the space before the SI prefix #6533


Closed
afvincent opened this issue Jun 4, 2016 · 10 comments

@afvincent
Contributor

While discussing with colleagues who use plotting software other than Matplotlib, I noticed that their formatting of engineering notation differs slightly from the one implemented in ticker.EngFormatter. Some programs append the prefix/unit directly to the tick value (e.g. "1.23µs", vs. "1.23 µs" with Matplotlib). I am not sure this is really OK from the (English) typography point of view, but I guess they do it because tick label space is rather limited.

I wonder how interesting it may be to add such a possibility to the EngFormatter in Matplotlib. As some users may prefer "1.23µs" over "1.23 µs", I would say it's worth adding it to EngFormatter.

If it is added to EngFormatter, I guess the major pitfall would be the default values… IMO, the current behavior of Matplotlib is the best one when EngFormatter is instantiated with a unit. However, when it is instantiated without a unit (unit=""), I wouldn't insist that "1.23 µ" is better than "1.23µ". So I don't really know whether a space separator between the value and the prefix/unit should be the default or not…

I wrote a small demonstration of what could easily be done with the EngFormatter class (keeping the current Matplotlib behavior as the default). It is > 100 lines because I directly copy-pasted the source code of ticker.EngFormatter. I've put the changes between <ENH> and <\ENH> tags. NB: the code includes a bug fix similar to PR #6014.

from __future__ import division, print_function, unicode_literals
import decimal
import math
import numpy as np
from matplotlib.ticker import Formatter


# Proposed "enhancement" of the EngFormatter
class EnhancedEngFormatter(Formatter):
    """
    Formats axis values using engineering prefixes to represent powers of 1000,
    plus a specified unit, e.g., 10 MHz instead of 1e7.
    """

    # the unicode for -6 is the greek letter mu
    # commented here due to bug in pep8
    # (https://github.com/jcrocholl/pep8/issues/271)

    # The SI engineering prefixes
    ENG_PREFIXES = {
        -24: "y",
        -21: "z",
        -18: "a",
        -15: "f",
        -12: "p",
         -9: "n",
         -6: "\u03bc",
         -3: "m",
          0: "",
          3: "k",
          6: "M",
          9: "G",
         12: "T",
         15: "P",
         18: "E",
         21: "Z",
         24: "Y"
    }

    def __init__(self, unit="", places=None, space_sep=True):
        """ Parameters
            ----------
            unit: str (default: "")
                Unit symbol to use.

            places: int (default: None)
                Number of digits after the decimal point.
                If it is None, falls back to the floating point format '%g'.

            space_sep: boolean (default: True)
                If True, a (single) space is used between the value and the
                prefix/unit, else the prefix/unit is directly appended to the
                value.
        """
        self.unit = unit
        self.places = places
        # <ENH>
        if space_sep is True:
            self.sep = " "  # 1 space
        else:
            self.sep = ""  # no space
        # <\ENH>

    def __call__(self, x, pos=None):
        s = "%s%s" % (self.format_eng(x), self.unit)
        return self.fix_minus(s)

    def format_eng(self, num):
        """ Formats a number in engineering notation, appending a letter
        representing the power of 1000 of the original number. Some examples:

        >>> format_eng(0)       # for self.places = 0
        '0'

        >>> format_eng(1000000) # for self.places = 1
        '1.0 M'

        >>> format_eng("-1e-6") # for self.places = 2
        u'-1.00 \u03bc'

        @param num: the value to represent
        @type num: either a numeric value or a string that can be converted to
                   a numeric value (as per decimal.Decimal constructor)

        @return: engineering formatted string
        """

        dnum = decimal.Decimal(str(num))

        sign = 1

        if dnum < 0:
            sign = -1
            dnum = -dnum

        if dnum != 0:
            pow10 = decimal.Decimal(int(math.floor(dnum.log10() / 3) * 3))
        else:
            pow10 = decimal.Decimal(0)

        pow10 = pow10.min(max(self.ENG_PREFIXES.keys()))
        pow10 = pow10.max(min(self.ENG_PREFIXES.keys()))

        prefix = self.ENG_PREFIXES[int(pow10)]

        mant = sign * dnum / (10 ** pow10)

        # <ENH>
        if self.places is None:
            format_str = "%g{sep:s}%s".format(sep=self.sep)
        elif self.places == 0:
            format_str = "%i{sep:s}%s".format(sep=self.sep)
        elif self.places > 0:
            # 'd' is the integer format code; 'i' is not valid in str.format
            format_str = "%.{p:d}f{sep:s}%s".format(p=self.places,
                                                    sep=self.sep)
        # <\ENH>

        formatted = format_str % (mant, prefix)

        formatted = formatted.strip()
        if (self.unit != "") and (prefix == self.ENG_PREFIXES[0]):
            # <ENH>
            formatted = formatted + self.sep
            # <\ENH>

        return formatted


# DEMO
def demo_formatter(**kwargs):
    """ Print the strings produced by the EnhancedEngFormatter for a list of
        arbitrary test values.
    """
    TEST_VALUES = [1.23456789e-6, 0.1, 1, 999.9, 1001]
    unit = kwargs.get('unit', "")
    space_sep = kwargs.get('space_sep', True)
    formatter = EnhancedEngFormatter(**kwargs)

    print("\n[Case: unit='{u:s}' & space_sep={s}]".format(u=unit, s=space_sep))
    print(*["{tst};".format(tst=formatter(value)) for value in TEST_VALUES])

if __name__ == '__main__':
    """ Matplotlib current behavior (w/ space separator) """
    demo_formatter(unit="s", space_sep=True)
    # >> 1.23457 μs; 100 ms; 1 s; 999.9 s; 1.001 ks;
    demo_formatter(unit="", space_sep=True)
    # >> 1.23457 μ; 100 m; 1; 999.9; 1.001 k;

    """ New possibility (w/o space separator) """
    demo_formatter(unit="s", space_sep=False)
    # >> 1.23457μs; 100ms; 1s; 999.9s; 1.001ks;
    demo_formatter(unit="", space_sep=False)
    # >> 1.23457μ; 100m; 1; 999.9; 1.001k;
@tacaswell tacaswell added this to the 2.1 (next point release) milestone Jun 4, 2016
@tacaswell
Member

I would be OK with adding this option to the existing formatter.

@afvincent
Contributor Author

With the defaults unit="" and space_sep=True which keep the current (default) behavior and return strings like "1.23 µ", "4.56", "78.9 k"? I'm asking because I am not sure what defaults the users would expect from such a formatter.
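The defaults in question can be sketched without Matplotlib. The helper below is a hypothetical, stdlib-only stand-in: eng_label and its sep argument are illustrative names, not part of Matplotlib or of the proposal above, and only the '%g' case (places=None) is mimicked.

```python
import math

# SI prefixes keyed by power of ten (same table as in the proposal above).
PREFIXES = {-24: "y", -21: "z", -18: "a", -15: "f", -12: "p", -9: "n",
            -6: "\u03bc", -3: "m", 0: "", 3: "k", 6: "M", 9: "G",
            12: "T", 15: "P", 18: "E", 21: "Z", 24: "Y"}


def eng_label(value, unit="", sep=" "):
    """Hypothetical helper: value with SI prefix, joined by `sep`."""
    if value == 0:
        exponent = 0
    else:
        # Largest multiple of 3 not exceeding log10(|value|).
        exponent = int(math.floor(math.log10(abs(value)) / 3) * 3)
    # Clamp to the range covered by the prefix table.
    exponent = max(min(exponent, 24), -24)
    mantissa = value / 10.0 ** exponent
    suffix = PREFIXES[exponent] + unit
    text = "%g" % mantissa
    # No separator when there is nothing to append (e.g. plain "4.56").
    return text + sep + suffix if suffix else text


# Defaults discussed here: unit="" and a single space as separator.
print(eng_label(1.23e-6), eng_label(4.56), eng_label(78900), sep="; ")
# Same values without the space.
print(eng_label(1.23e-6, sep=""), eng_label(78900, sep=""), sep="; ")
```

With the default separator this gives the "value, space, prefix" form quoted above; passing sep="" gives the compact form.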

@tacaswell
Member

Yes, unless it is actively broken we try to always preserve existing default functionality.

I personally would expect there to be a space there, but see the use of adding the knob to change it.

If you want to make the case that space should default to False, then this can be done in 2.0, but you have a short window left on that.

@QuLogic
Member

QuLogic commented Jun 5, 2016

siunitx uses a space between number and unit. Since it's trying to follow the SI standard, I'd say keeping the space is the better default.

@QuLogic QuLogic closed this as completed Jun 5, 2016
@QuLogic QuLogic reopened this Jun 5, 2016
@QuLogic
Member

QuLogic commented Jun 5, 2016

Oops, sorry for the misclick there.

@afvincent
Contributor Author

I'm totally fine with keeping the space separator as the default, in particular because it is what typography considers correct when there is a unit. Unfortunately, I didn't find any typographic reference about what is correct with only an SI prefix (i.e. without a unit symbol after it). And it is less obvious to me that "1.23 µ" is definitely better (or at least less bad) than "1.23µ" from the typographic viewpoint¹. Since the default EngFormatter is currently “unitless” (unit=""), I was wondering whether to change the default behavior if there were strong support for "1.23µ" over "1.23 µ"…

I guess the most sensible thing to do is to keep the current behavior (corresponding to space_sep=True) and to advertise the new option (space_sep=False) by updating the engineering_formatter example.

If this solution is fine, I will submit a PR with the proposed changes and defaults. By the way, in that case, should I wait for the PR #6014 to be merged to avoid having unnecessary conflicts to resolve²?

¹ : I'm not even sure one of the two solutions is considered correct by typographers…
² : in particular, the line formatted = formatted + " " in PR #6014 would become formatted = formatted + self.sep in the new PR.
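To make footnote ² concrete, here is a minimal sketch of that branch in isolation. The function name join_value_and_unit is hypothetical and the mantissa is passed in as already-formatted text; only the strip-then-re-add logic follows the demonstration code above.

```python
def join_value_and_unit(mantissa_text, prefix, unit, sep=" "):
    """Hypothetical stand-in for the tail of format_eng plus __call__."""
    # With an empty prefix, strip() removes the trailing separator.
    formatted = (mantissa_text + sep + prefix).strip()
    # It has to be put back when a unit follows directly.
    if unit != "" and prefix == "":
        formatted = formatted + sep
    return formatted + unit


print(join_value_and_unit("1", "", "s"))             # space re-added before the unit
print(join_value_and_unit("1", "", ""))              # nothing to separate
print(join_value_and_unit("1.001", "k", "s", sep=""))  # compact form
```

Using self.sep instead of a hard-coded " " in that branch is what keeps "1 s" and "1s" consistent with the prefixed cases.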

@tacaswell
Member

Just merged #6014

@QuLogic
Member

QuLogic commented Jun 6, 2016

All I can find is the SI guide 6.2.6:

Prefix symbols cannot stand alone and thus cannot be attached to the number 1, the symbol for the unit one.

which gives no indication of what to do without a unit, but of course it also insists that should never happen...

@afvincent
Contributor Author

I've just submitted a PR, with what has been discussed here as default.

@tacaswell tacaswell modified the milestones: 2.1 (next point release), 2.2 (next next feature release) Oct 3, 2017
@QuLogic
Member

QuLogic commented Aug 12, 2021

Implemented in #6542.

@QuLogic QuLogic closed this as completed Aug 12, 2021
@QuLogic QuLogic modified the milestones: needs sorting, v2.1 Aug 12, 2021