
Decision Stump Algorithm: Time and Space Complexity Analysis

Pardeep Singh

MSc Data Science, University of Europe for Applied Sciences

AEBS_DC_MAVe_P2ML01: Machine Learning

Prof. Dr. Eman Abukhousa


Decision Stump Algorithm: Time and Space Complexity Analysis

1. Introduction

A Decision Stump is a simple, one-level decision tree used as a weak learner in machine
learning, particularly for binary classification. It splits the data on a single feature at a
single threshold, predicting one class on each side of the split.
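
As a quick illustration, scikit-learn has no dedicated stump class, but a
DecisionTreeClassifier capped at depth 1 behaves as one. The snippet below is a minimal
sketch; the synthetic dataset is an assumption for demonstration only.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic data: n = 10 points, d = 3 features (matching Section 5).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 3))
    y = (X[:, 0] > 0).astype(int)   # the label depends on a single feature

    # A decision tree limited to one level is exactly a decision stump.
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X, y)
    print(stump.predict(X))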

2. Time Complexity Analysis

To understand the Decision Stump's efficiency, we look at its main steps: sorting feature
values, evaluating split thresholds, and calculating classification error. Each step depends on
the number of data points (n) and the number of features (d).

2.1 Sorting Feature Values

For each of the d features, the algorithm sorts its n values, which takes O(n log n) time.

Total time for sorting all features:

O(d · n log n)

2.2 Threshold Evaluation

After sorting, the algorithm checks the candidate thresholds between consecutive sorted
values. Since there are at most n − 1 such midpoints, this requires O(n) time per feature.
Total time for evaluating thresholds:
O(d · n)
2.3 Classification Error Calculation

 The algorithm calculates how well each threshold classifies the data, which again
takes O(n) per feature: keeping running class counts during the sorted scan makes each
threshold's error an O(1) update (recounting from scratch for every threshold would cost
O(n²) instead).
 Total time for error calculation:

O(d · n)

2.4 Overall Time Complexity

Combining all steps, the sorting term dominates, since
O(d · n log n) + O(d · n) + O(d · n) = O(d · n log n). The total time complexity is therefore:

O(d · n log n)
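
To make this accounting concrete, the following from-scratch sketch marks each step from
Sections 2.1 to 2.3 with its cost. The function name find_best_stump and the use of NumPy
are illustrative choices, not part of the analysis above.

    import numpy as np

    def find_best_stump(X, y):
        """Minimal stump training sketch for binary labels y in {0, 1}.

        Returns (feature index, threshold, number of training errors)."""
        n, d = X.shape
        best_feat, best_thr, best_err = None, None, n + 1
        for j in range(d):                        # loop over the d features
            order = np.argsort(X[:, j])           # Section 2.1: O(n log n) sort
            xs, ys = X[order, j], y[order]
            pos_left = np.cumsum(ys)              # running count of positives
            total_pos = pos_left[-1]
            for i in range(1, n):                 # Section 2.2: O(n) candidate
                # thresholds between consecutive sorted values. Section 2.3:
                # the running counts make each error an O(1) update. Rule
                # "predict 0 left of the split, 1 right of it":
                # mistakes = positives on the left + negatives on the right.
                err = pos_left[i - 1] + (n - i) - (total_pos - pos_left[i - 1])
                err = min(err, n - err)           # the flipped rule may win
                if err < best_err:
                    best_feat = j
                    best_thr = (xs[i - 1] + xs[i]) / 2.0
                    best_err = err
        return best_feat, best_thr, best_err

The d sorts cost O(d · n log n) and the two linear sweeps cost O(d · n), matching the total
derived above.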

3. Space Complexity Analysis

The space needed for the algorithm can be summarized as follows:

3.1 Data Storage

 The algorithm needs to store n data points with d features, requiring:

O(n · d)

3.2 Predictions and Thresholds

 Space for storing predictions: O(n)


 Space for storing thresholds: O(d)

3.3 Overall Space Complexity

Combining these, the data matrix dominates the O(n) predictions and O(d) thresholds, so the
overall space complexity is:

O(n · d)
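
As a quick sanity check of this accounting, assuming 64-bit floats and the n = 10, d = 3
dataset from Section 5:

    import numpy as np

    n, d = 10, 3
    X = np.zeros((n, d))        # data matrix: O(n · d) values
    preds = np.zeros(n)         # one prediction per point: O(n)
    thresholds = np.zeros(d)    # one threshold per feature: O(d)
    print(X.nbytes, preds.nbytes, thresholds.nbytes)   # 240 80 24

The data matrix dominates as n and d grow, hence O(n · d) overall.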

4. Complexity Comparison with Other Algorithms

4.1 Decision Trees

 Decision Trees with multiple levels have higher complexity. If a tree has depth k, the
complexity can be:

O(n · d · k)

This makes them much more computationally intensive than Decision Stumps.

4.2 Logistic Regression

 Logistic Regression requires iterative optimization, generally needing O(n · d) time per
iteration, with many iterations until convergence. With T iterations the total is
O(T · n · d), which can exceed the overall cost of a Decision Stump.
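
To show where the O(n · d) per-iteration cost comes from, here is a hedged NumPy sketch of a
single batch gradient step for logistic regression; the function name and learning rate are
illustrative assumptions.

    import numpy as np

    def logistic_step(w, X, y, lr=0.1):
        """One batch gradient-descent step; X is n x d, y in {0, 1}.

        Both matrix products touch every entry of X once, so each call
        costs O(n · d); T iterations cost O(T · n · d) in total."""
        z = X @ w                          # O(n · d)
        p = 1.0 / (1.0 + np.exp(-z))       # sigmoid, O(n)
        grad = X.T @ (p - y) / len(y)      # O(n · d)
        return w - lr * grad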

5. Example Complexity Analysis

Consider a dataset with n = 10 data points and d = 3 features:


1. Sorting each feature takes about n log₂ n = 10 × log₂ 10 ≈ 33 comparisons, totalling
roughly 99 for the 3 features.
2. Threshold evaluation takes about d × n = 30 operations.
3. Classification error calculation also takes about another 30 operations.

Thus, the total work for this small dataset is approximately:

99 + 30 + 30 ≈ 159 basic operations

This shows that Decision Stumps handle small datasets efficiently.
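
The same arithmetic reproduced as a short script (log base 2 and unit-cost operations are
simplifying assumptions, as in the estimate above):

    import math

    n, d = 10, 3
    sort_ops = d * round(n * math.log2(n))   # Section 2.1: 3 * 33 = 99
    threshold_ops = d * n                    # Section 2.2: 30
    error_ops = d * n                        # Section 2.3: 30
    print(sort_ops + threshold_ops + error_ops)   # 159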
