Decision Stump Algorithm 1
Pardeep Singh
1. Introduction
A Decision Stump is a simple, one-level decision tree used as a weak learner in machine
learning, particularly for binary classification.
2. Time Complexity Analysis
To understand the Decision Stump's efficiency, we examine its three main steps: sorting feature values, evaluating split thresholds, and calculating the classification error. The cost of each step depends on the number of data points (n) and the number of features (d).
2.1 Sorting Feature Values
For each of the d features, the algorithm sorts its n values, which takes O(n log n) time per feature.
Total time for sorting:
O(d · n log n)
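As a minimal sketch of this step, each feature column can be sorted independently with NumPy; the toy array X and its 6-by-2 shape are assumptions made only for this illustration.

import numpy as np

# Assumed toy data: n = 6 samples, d = 2 features (illustrative values only).
X = np.array([[2.0, 7.0],
              [5.0, 1.0],
              [3.0, 9.0],
              [8.0, 4.0],
              [1.0, 6.0],
              [4.0, 2.0]])

# Sort every feature column independently: d sorts of n values each,
# i.e. O(d * n log n) in total.
X_sorted = np.sort(X, axis=0)
print(X_sorted)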
2.2 Threshold Evaluation
After sorting, the algorithm checks the possible thresholds between consecutive sorted values. A single linear sweep over the sorted order requires O(n) time per feature.
Total time for evaluating thresholds:
O(d · n)
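A common choice, sketched below, is to take the midpoints between consecutive distinct values as the candidate thresholds; the helper name candidate_thresholds is hypothetical. Note that np.unique sorts internally, so a full implementation would reuse the sort from the previous step; the midpoint scan itself is the O(n) part.

import numpy as np

def candidate_thresholds(column):
    # Midpoints between consecutive distinct values: a linear O(n) scan
    # once the values are in sorted order.
    vals = np.unique(column)
    return (vals[:-1] + vals[1:]) / 2.0

col = np.array([2.0, 5.0, 3.0, 8.0, 1.0, 4.0])
print(candidate_thresholds(col))  # 5 candidate thresholds for 6 distinct values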
2.3 Classification Error Calculation
The algorithm calculates how well each threshold classifies the data, which again takes O(n) per feature when the error count is updated incrementally along the sorted order.
Total time for error calculation:
O(d · n)
Summing the three steps gives O(d · n log n) + O(d · n) + O(d · n); the sorting term dominates, so the overall training complexity of a Decision Stump is O(d · n log n).
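Putting the three steps together, here is a minimal end-to-end sketch, assuming labels y in {-1, +1}; the function name fit_stump and the polarity convention are choices made for this example, not part of the original text. After the O(n log n) sort, every threshold for a feature is scored in one O(n) pass by flipping one label at a time.

import numpy as np

def fit_stump(X, y):
    # Decision-stump fit: O(d * n log n) for sorting plus O(d * n)
    # for scoring all thresholds with incremental error updates.
    n, d = X.shape
    best_err, best = n + 1, None
    for j in range(d):
        order = np.argsort(X[:, j])          # O(n log n) per feature
        xs, ys = X[order, j], y[order]
        # Rule "predict +1 when x > t": with t below all points,
        # the errors are exactly the samples labelled -1.
        err = int(np.sum(ys == -1))
        for i in range(n - 1):
            # Moving t past sample i flips its prediction to -1.
            err += 1 if ys[i] == 1 else -1   # O(1) incremental update
            if xs[i] == xs[i + 1]:
                continue                     # no valid threshold between ties
            for e, polarity in ((err, 1), (n - err, -1)):
                if e < best_err:
                    best_err = e
                    best = (j, (xs[i] + xs[i + 1]) / 2.0, polarity)
    return best, best_err

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([-1, -1, 1, 1])
print(fit_stump(X, y))  # splits between 2.0 and 3.0 with zero errors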
3. Comparison with Other Algorithms
Decision Trees with multiple levels have higher complexity. If a tree has depth k, the training complexity can be:
O(n · d · k)
This makes them much more computationally intensive than Decision Stumps.
Logistic Regression requires iterative optimization, generally needing O(n · d) time per iteration, with multiple iterations to reach convergence. This can result in a higher overall complexity compared to Decision Stumps.
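To see where the per-iteration O(n · d) comes from, here is a minimal sketch of one batch gradient-descent step, assuming labels y in {0, 1}; the function name and learning rate are illustrative only. The two matrix-vector products each touch all n · d entries of X once.

import numpy as np

def logistic_gradient_step(X, y, w, lr=0.1):
    # One batch gradient-descent step for logistic regression.
    z = X @ w                          # O(n * d)
    p = 1.0 / (1.0 + np.exp(-z))       # O(n) sigmoid
    grad = X.T @ (p - y) / len(y)      # O(n * d)
    return w - lr * grad               # O(d) parameter update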
Thus, for this small dataset the total work comes to approximately 159 operations. A fixed count like this is a constant; it is the O(d · n log n) bound above that describes how the cost grows as n and d increase.