ENH log1pexp for binomial loss in loss module #21814
Conversation
 cdef inline double log1pexp(double x) nogil:
     if x <= -37:
         return exp(x)
-    elif x <= 18:
+    elif x <= -2:
How was -2 chosen here?
I'll add a shorter note.
The longer story is at the end of section 2 in https://cran.r-project.org/web/packages/Rmpfr/vignettes/log1mexp-note.pdf, which argues for log(2) as the cutoff for the function log(1 - exp(-x)).
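For reference, a minimal NumPy sketch of the note's recipe (the helper name log1mexp and the vectorized form are mine; the log(2) cutoff and the two branches come from the note):

import numpy as np

def log1mexp(x):
    # log(1 - exp(-x)) for x > 0, following the Rmpfr note:
    # up to the cutoff a0 = log(2), -expm1(-x) keeps full precision;
    # beyond it, log1p(-exp(-x)) does.
    x = np.asarray(x, dtype=float)
    return np.where(
        x <= np.log(2.0),
        np.log(-np.expm1(-x)),
        np.log1p(-np.exp(-x)),
    )

print(log1mexp(np.array([1e-3, 0.5, 1.0, 10.0])))

(np.where evaluates both branches, so extremely small x can trigger a harmless warning in the branch that is discarded.)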
I tested this -2 on our function and it gives a difference of 1e-16.
import numpy as np

def diff(x):
    """Return abs diff and rel diff."""
    use_log = np.log(1 + np.exp(x))
    use_log1p = np.log1p(np.exp(x))
    return use_log - use_log1p, use_log / use_log1p - 1

for x in [0, -1, -2, -3]:
    print(f"x={x}: {diff(x)}")
results in
x=0: (0.0, 0.0)
x=-1: (0.0, 0.0)
x=-2: (1.1102230246251565e-16, 8.881784197001252e-16)
x=-3: (-1.0408340855860843e-16, -2.1094237467877974e-15)
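To see why the log1p branch is needed at all, here is a quick check at a very negative x (this example is mine, not from the thread):

import numpy as np

x = -40.0
print(np.log(1 + np.exp(x)))  # 0.0: exp(-40) ~ 4.2e-18 is absorbed by the 1 in 1 + exp(x)
print(np.log1p(np.exp(x)))    # ~4.2e-18, correct
print(np.logaddexp(0.0, x))   # ~4.2e-18, reference value of log(1 + exp(x))

So the plain log is only safe where exp(x) is not negligible next to 1, which is what the -2 cutoff exploits.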
Some timings give

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.5, size=100_000).astype(np.float64)
raw = rng.standard_normal(100_000, dtype=np.float64)

%%timeit -r10 -n100
cy_logloss_stable(y_true, raw)
2.22 ms ± 93.8 µs

%%timeit -r10 -n100
cy_logloss_stable_fast(y_true, raw)
1.72 ms ± 81.9 µs

The reason is, of course, that log1pexp_fast calls the plain log instead of log1p on the range -2 < x <= 18:

%%cython -3
# distutils: extra_compile_args = -O3
# cython: cdivision=True
# cython: boundscheck=False
# cython: wraparound=False
import cython
from cython.parallel import prange
import numpy as np
from libc.math cimport exp, log, log1p
cimport numpy as np
np.import_array()
# Numerically stable log(1 + exp(x))
cdef inline double log1pexp(double x) nogil:
    if x <= -37:
        return exp(x)  # log1p(exp(x)) rounds to exp(x) in double precision
    elif x <= 18:
        return log1p(exp(x))
    elif x <= 33.3:
        return x + exp(-x)  # log(1 + exp(x)) = x + log(1 + exp(-x)) ~ x + exp(-x)
    else:
        return x  # exp(-x) is below double precision

# Faster version of numerically stable log(1 + exp(x))
cdef inline double log1pexp_fast(double x) nogil:
    if x <= -37:
        return exp(x)
    elif x <= -2:
        return log1p(exp(x))  # log1p needed: exp(x) << 1
    elif x <= 18:
        return log(1 + exp(x))  # plain log is safe and faster in this range
    elif x <= 33.3:
        return x + exp(-x)
    else:
        return x

cdef inline double c_logloss(double y_true, double raw) nogil:
    return log1pexp(raw) - y_true * raw

cdef inline double c_logloss_fast(double y_true, double raw) nogil:
    return log1pexp_fast(raw) - y_true * raw

def cy_logloss_stable(double[::1] y_true, double[::1] raw):
    cdef:
        int n_samples
        int i
        double[::1] out = np.empty_like(y_true)
    n_samples = y_true.shape[0]
    for i in range(n_samples):
        out[i] = c_logloss(y_true[i], raw[i])
    return np.asarray(out)

def cy_logloss_stable_fast(double[::1] y_true, double[::1] raw):
    cdef:
        int n_samples
        int i
        double[::1] out = np.empty_like(y_true)
    n_samples = y_true.shape[0]
    for i in range(n_samples):
        out[i] = c_logloss_fast(y_true[i], raw[i])
    return np.asarray(out)
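For anyone without a Cython setup, here is a plain NumPy port of log1pexp_fast with the same cutoffs, checked against np.logaddexp (the vectorized port is my sketch; only the cutoffs come from the snippet above):

import numpy as np

def np_log1pexp_fast(x):
    # NumPy port of log1pexp_fast above, same branch cutoffs.
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    m = x <= -37
    out[m] = np.exp(x[m])              # log(1 + exp(x)) ~ exp(x)
    m = (x > -37) & (x <= -2)
    out[m] = np.log1p(np.exp(x[m]))    # log1p needed: exp(x) << 1
    m = (x > -2) & (x <= 18)
    out[m] = np.log(1 + np.exp(x[m]))  # plain log is safe and faster here
    m = (x > 18) & (x <= 33.3)
    out[m] = x[m] + np.exp(-x[m])      # log(1 + exp(x)) = x + log(1 + exp(-x))
    m = x > 33.3
    out[m] = x[m]                      # exp(-x) is below double precision
    return out

x = np.linspace(-50, 50, 100_001)
ref = np.logaddexp(0.0, x)  # log(exp(0) + exp(x)) = log(1 + exp(x))
print(np.max(np.abs(np_log1pexp_fast(x) - ref)))  # ~1e-15, machine precision at these magnitudes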
LGTM
@jjerphan @thomasjpfan This is a tiny change and would help me to move forward.
LGTM
Thanks a lot!
Reference Issues/PRs
Follow-up of #20567
What does this implement/fix? Explain your changes.
This PR improves the helper function log1pexp a bit, which speeds up HalfBinomialLoss.loss(...).
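For context: with p = sigmoid(raw), the per-sample binomial log loss -y*log(p) - (1-y)*log(1-p) simplifies to log1pexp(raw) - y*raw, which is exactly the c_logloss form in the snippet above. A quick NumPy check of that identity (my example, not the loss module's API):

import numpy as np
from scipy.special import expit  # the sigmoid

rng = np.random.default_rng(0)
y = rng.binomial(1, 0.5, size=1000).astype(float)
raw = rng.standard_normal(1000)

# Naive form: -y*log(p) - (1-y)*log(1-p) with p = sigmoid(raw)
p = expit(raw)
naive = -y * np.log(p) - (1 - y) * np.log1p(-p)

# Stable form: log1pexp(raw) - y*raw, with log1pexp via np.logaddexp(0, raw)
stable = np.logaddexp(0.0, raw) - y * raw

print(np.max(np.abs(naive - stable)))  # ~1e-16 for moderate raw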
Any other comments?
Ideally, this PR is merged before #21808 and #20811. It will improve their benchmarks.