
Diabetes data set description. Possibly inaccurate? #18940

Closed
@reubengann

I think the feature descriptions for the diabetes dataset are incorrect. I am referring to https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/datasets/descr/diabetes.rst, which currently lists:

s1 tc, T-Cells (a type of white blood cells)
s2 ldl, low-density lipoproteins
s3 hdl, high-density lipoproteins
s4 tch, thyroid stimulating hormone
s5 ltg, lamotrigine
s6 glu, blood sugar level

I think tc is probably total cholesterol.

ltg is described as "lamotrigine", which I think is a drug for treating epilepsy. My guess is that it actually stands for triglyceride level.

The Efron et al. paper does not describe the features in any detail, nor does it say where the data came from.
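
For anyone who wants to check the description text programmatically, here is a minimal sketch that pulls the s1 to s6 lines out of a description string and prints them side by side. The names `QUOTED` and `serum_features` are hypothetical and used only for illustration, and the regular expression assumes each line has the "column abbreviation, description" layout quoted above.

```python
import re

# The six lines quoted above from sklearn/datasets/descr/diabetes.rst.
QUOTED = """\
s1 tc, T-Cells (a type of white blood cells)
s2 ldl, low-density lipoproteins
s3 hdl, high-density lipoproteins
s4 tch, thyroid stimulating hormone
s5 ltg, lamotrigine
s6 glu, blood sugar level
"""


def serum_features(descr):
    """Return {column: (abbreviation, description)} for the s1..s6 lines.

    Hypothetical helper: assumes lines of the form "sN abbr, description",
    optionally preceded by list markers or indentation.
    """
    pattern = re.compile(r"^\W*(s[1-6])\s+(\w+),\s*(.+?)\s*$", re.MULTILINE)
    return {col: (abbr, text) for col, abbr, text in pattern.findall(descr)}


features = serum_features(QUOTED)
for col, (abbr, text) in features.items():
    print(f"{col}: {abbr:<4} {text}")

# With scikit-learn installed, the same helper could be pointed at the
# text that ships with the library (assuming the layout still matches):
#   from sklearn.datasets import load_diabetes
#   serum_features(load_diabetes().DESCR)
```

This only extracts what the description says. It does not tell us which meaning of tc or ltg is correct, which still needs a source for the original data.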
