
CMT307 Coursework 1 Overall Feedback

Mark distribution
The marks for this coursework have a very good distribution:

average: 68
std: 15.6
median: 70
min: 10
max: 93

Q1
Confusion matrix:

                        PREDICTED CLASS
                        True       False      Total
ACTUAL CLASS    True    TP = 7     FN = 2     P = 9
                False   FP = 4     TN = 7     N = 11

Precision (p) = TP / (TP + FP) = 7 / (7 + 4) = 63.6%

Recall (r) = TP / (TP + FN) = 7 / (7 + 2) = 78%

F1 = (2 × Precision × Recall) / (Precision + Recall) = (2 × 0.636 × 0.78) / (0.636 + 0.78) = 0.70

Accuracy = (TP + TN) / All = (7 + 7) / 20 = 70%
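
These figures can be cross-checked with scikit-learn. A minimal sketch, assuming labels are reconstructed to match the counts in the matrix above (all names here are illustrative):

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Rebuild a label set matching the matrix: TP=7, FN=2, FP=4, TN=7
y_true = np.array([1] * 9 + [0] * 11)   # P = 9 actual positives, N = 11 negatives
y_pred = np.array([1] * 7 + [0] * 2     # 7 TP, 2 FN
                  + [1] * 4 + [0] * 7)  # 4 FP, 7 TN

# Note sklearn's convention: rows = actual class, columns = predicted class,
# ordered [negative, positive]
print(confusion_matrix(y_true, y_pred))
print(f"precision = {precision_score(y_true, y_pred):.3f}")  # 0.636
print(f"recall    = {recall_score(y_true, y_pred):.3f}")     # 0.778
print(f"F1        = {f1_score(y_true, y_pred):.3f}")         # 0.700
print(f"accuracy  = {accuracy_score(y_true, y_pred):.3f}")   # 0.700
```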
Q2
We expected each submission to contain correct code for all elements and a good summary of
the work in the report. In addition to the individual feedback you have received, below is
general feedback for Question 2 to help you identify what you have done well and what you
need to improve in future assessments. You are therefore advised to read your individual
feedback in conjunction with this overall feedback.

i) Data exploration [10%]:


Most of you did well on this part, but some submissions had incomplete exploration. Data
exploration should generally cover these aspects (a minimal sketch follows the list):

• Size of dataset
• Missing-value check (there are no missing values in this data)
• Features:
  o Categorical: categories of each feature & summary
  o Numerical: min, max, mean, std (e.g., data.describe())
• Histogram plot of each feature + summary in report
• Scatter plots of pairs of features + summary in report
• Boxplot + summary
• Correlation/relationship + observation summary in report:
  o between each pair of features
  o between each feature and the target (Revenue)
• Further useful and insightful exploration and analysis
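
A minimal exploration sketch covering most of the list above, assuming the data loads into a pandas DataFrame with the boolean target column Revenue; the filename is a placeholder:

```python
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("online_shoppers_intention.csv")  # hypothetical filename

# Size of dataset and missing-value check
print(data.shape)
print(data.isnull().sum())

# Numerical features: min, max, mean, std, quartiles
print(data.describe())

# Categorical features: categories of each feature and their counts
for col in data.select_dtypes(include=["object", "bool"]).columns:
    print(data[col].value_counts())

# Histogram of each numerical feature
data.hist(bins=30, figsize=(14, 10))
plt.tight_layout()
plt.show()

# Correlation of each numerical feature with the target
corr = data.assign(Revenue=data["Revenue"].astype(int)).corr(numeric_only=True)
print(corr["Revenue"].sort_values())
```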

ii) Data pre-processing [30%]:


Data pre-processing should consider:

• Discussion of the necessary pre-processing based on the data exploration above (e.g.,
  scaling, encoding, and even class imbalance), with justification of the choice of
  pre-processing methods
• Data split with stratified sampling (this may be included in implementation)
• Categorical feature encoding for non-DT-based classifiers
  o Most have encoded ‘Month’, ‘VisitorType’, ‘Weekend’.
  o Better: some also attempted encoding ‘OperatingSystems’, ‘Browser’, ‘Region’,
    ‘TrafficType’ to eliminate the unintended effect of an implied importance/order of
    different values.
• Numerical feature scaling for non-DT-based classifiers
• Further carefully thought-out pre-processing
  o This could include a meaningful/useful investigation of the effects of further
    pre-processing techniques (e.g., dealing with imbalanced data, feature selection,
    feature extraction/dimensionality reduction, etc.) on model performance.
• Data split and pre-processing (see the sketch after this list)
  o Generally, the data split should be done before pre-processing.
  o Pre-processing methods should be fitted on the training set only; they should not
    be fitted using the test set.
  o The same transformers fitted on the training set are then used to transform both
    the training set and the test set.
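
A minimal sketch of this split-then-fit discipline with scikit-learn; the feature lists, split parameters, and choice of transformers are illustrative assumptions, not the required solution:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

data = pd.read_csv("online_shoppers_intention.csv")  # hypothetical filename
X = data.drop(columns="Revenue")
y = data["Revenue"]

# Split BEFORE any fitting; stratify to preserve the class ratio in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

categorical = ["Month", "VisitorType", "Weekend"]  # columns named in the feedback
numerical = [c for c in X.columns if c not in categorical]

preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numerical),
])

# Fit on the training set ONLY, then reuse the same fit on both sets
X_train_t = preprocess.fit_transform(X_train)
X_test_t = preprocess.transform(X_test)
```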
Common issues/mistakes

• Pre-processing should normally be fitted on the training data set only, and the fitted
  transformer then used to transform both the training data and the test data.
  o Bad: fitted the pre-processing methods on the whole dataset.
  o Completely wrong: fitted the pre-processing methods on the training data and the
    test data separately.
• Pre-processing is done, but it isn't actually used for the model implementation.
• Insufficient pre-processing
• Irrelevant pre-processing: some students confirmed that the data has no missing
  values, but still performed imputation.

iii) Model implementation [30%]:


This part should consider:

• Three classification methods carefully selected, with a good justification of the
  selection, that together give a good representation of various classification
  techniques:
  o Simple methods: linear models, logistic regression, kNN
  o Kernel methods: SVM
  o Decision-tree-based methods: DT, random forest, XGBoost, LightGBM, CatBoost, etc.
• Correct implementation of the 3 chosen methods
• Systematic hyperparameter optimisation (see the sketch after this list) considering:
  o Key hyperparameters
  o The range of each hyperparameter
  o A suitable way of optimising: grid search, random search, or your own code using
    a for loop
• Further creativity leading to a better and more reliable implementation
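
A minimal sketch of a systematic grid search, reusing the X_train_t and y_train names from the pre-processing sketch above; the choice of SVM and the parameter ranges are illustrative assumptions:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Key hyperparameters and a plausible range for each
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": ["scale", 0.01, 0.1, 1],
}

# 5-fold cross-validated grid search, scored with F1 (better suited than
# accuracy to this imbalanced data)
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      cv=5, scoring="f1", n_jobs=-1)
search.fit(X_train_t, y_train)

print(search.best_params_)
print(f"best CV F1: {search.best_score_:.3f}")
```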

Common issues/mistakes:

• Arbitrary selection of classifiers without justification/explanation of the choice
• No or little optimisation
• Incorrect hyperparameter search space

iv) Performance evaluation [10%]:


Use the most suitable performance metrics, chosen for the characteristics of this dataset,
to assist the selection of the best model. For example, use a combination of metrics that
evaluate performance from different perspectives, correctly implemented and well justified
in the report (a sketch follows the list below):

• Confusion matrix
• Accuracy
• Precision
• Recall
• F measure (F1 or F-β)

Some students also used ROC and AUC, which is good.
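
A minimal evaluation sketch on the held-out test set, reusing the fitted search and transformed data from the earlier sketches (all assumptions carried over):

```python
from sklearn.metrics import (classification_report, confusion_matrix,
                             roc_auc_score)

best_model = search.best_estimator_
y_pred = best_model.predict(X_test_t)

print(confusion_matrix(y_test, y_pred))       # rows: actual, cols: predicted
print(classification_report(y_test, y_pred))  # precision, recall, F1 per class

# ROC AUC needs continuous scores rather than hard labels
y_score = best_model.decision_function(X_test_t)
print(f"ROC AUC: {roc_auc_score(y_test, y_score):.3f}")
```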

Common issues:

• inadequate attention to the selection of performance metrics
• unsuitable metrics used. Note that accuracy alone isn't enough for this imbalanced
  data (see the baseline example below).
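
To see why, compare accuracy and F1 for a majority-class baseline; the variable names reuse the sketches above, and the high accuracy it achieves simply mirrors the (assumed, roughly 85%) share of non-buying sessions:

```python
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, f1_score

# Always predicting the majority class ("no purchase") scores high accuracy
# on imbalanced data while identifying zero actual buyers
baseline = DummyClassifier(strategy="most_frequent").fit(X_train_t, y_train)
y_base = baseline.predict(X_test_t)

print(f"accuracy: {accuracy_score(y_test, y_base):.3f}")            # high
print(f"F1:       {f1_score(y_test, y_base, zero_division=0):.3f}")  # 0.0
```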

v) Result analysis and discussion [10%]:


This part should provide insightful result analysis and discussion, with conclusions
clearly drawn.

Common issues:

• very little result analysis and comparison
• vague or no conclusions drawn from the results.
