W5.1 Evaluation

1. Evaluation

Data Mining Concepts and Techniques
Chapter 9.5
Partly based on slides prepared by Jiawei Han

2. Evaluation

• Why?
• What?
• How?
• Measures
• Training and test data
• Significance

3. Confusion matrix

4. Two classes

• Two classes: T/F, Positive/Negative

                   Predicted positive   Predicted negative
Actual positive
Actual negative

5. Two classes

• Two classes: T/F, Positive/Negative

                   Predicted positive   Predicted negative
Actual positive    True positives       False negatives
Actual negative    False positives      True negatives

6. Two class measures

True positive (TP) / false positive (FP) / true negative (TN) / false negative (FN)
P = TP + FN (all actual positives), N = FP + TN (all actual negatives)
• Accuracy: (TP + TN) / (P + N)
• Error rate: (FP + FN) / (P + N)
• Sensitivity: TP / P
• Specificity: TN / N
• Precision: TP / (TP + FP)
• Recall: TP / P (same as sensitivity)
• F-score: (2 * precision * recall) / (precision + recall)
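
The two-class measures can be computed directly from the four cells of the confusion matrix. The following is a minimal sketch, not part of the original slides: the function name `two_class_measures` and the example counts are made up for illustration, and the code assumes every denominator is non-zero.

```python
def two_class_measures(tp, fp, tn, fn):
    """Compute the two-class measures from confusion-matrix counts.

    Assumes all denominators are non-zero (hypothetical helper).
    """
    p = tp + fn  # all actual positives
    n = fp + tn  # all actual negatives
    precision = tp / (tp + fp)
    recall = tp / p
    return {
        "accuracy": (tp + tn) / (p + n),
        "error_rate": (fp + fn) / (p + n),
        "sensitivity": tp / p,
        "specificity": tn / n,
        "precision": precision,
        "recall": recall,
        "f_score": (2 * precision * recall) / (precision + recall),
    }


# Illustrative counts only, not taken from any real experiment.
measures = two_class_measures(tp=40, fp=10, tn=45, fn=5)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and error rate always sum to 1, and recall equals sensitivity by definition, so the dictionary reports both names for the same value.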

7. Multi-class measures?

8. Evaluation

• Why?
• What?
• How?
• Measures
• Training and test data
• Significance

9. Training and test data 1: same data for training and testing

Bad idea => why?
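
One way to see why: a model that simply memorizes its training examples scores perfectly on them even when there is nothing to learn. The sketch below is an illustration added here, not from the slides. It assumes a 1-nearest-neighbour classifier on a single numeric feature and labels assigned at random; all names are hypothetical.

```python
import random


def nearest_neighbour_predict(train, x):
    """Return the label of the training example whose feature is closest to x."""
    nearest = min(train, key=lambda example: abs(example[0] - x))
    return nearest[1]


def accuracy(train, test):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(nearest_neighbour_predict(train, x) == y for x, y in test)
    return correct / len(test)


rng = random.Random(0)
# Labels are drawn at random, so the feature carries no information about them.
data = [(i, rng.choice("TF")) for i in range(200)]  # distinct feature values
rng.shuffle(data)
train, test = data[:100], data[100:]

train_acc = accuracy(train, train)  # every example is its own nearest neighbour
test_acc = accuracy(train, test)    # unseen examples, typically near chance level
print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on separate test data: {test_acc:.2f}")
```

The training-data score is 1.0 by construction, which says nothing about how the model behaves on new data.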

10. Training and test data 2: holdout / percentage split

Complete data set:
x     x     x     x     x     x     x     x     x     x
Randomly select x% as test data:
train test  train train train test  train train test  train
Risk? Atypical test set
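
A holdout split can be sketched as follows. This is a minimal illustration, assuming the data set fits in a list; `holdout_split`, `test_fraction` and `seed` are names introduced here, not taken from the slides.

```python
import random


def holdout_split(data, test_fraction, seed=None):
    """Randomly assign a fraction of the items to the test set, the rest to training."""
    rng = random.Random(seed)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    n_test = round(len(data) * test_fraction)
    test_indices = set(indices[:n_test])
    # Keep the original order inside each part.
    train = [item for i, item in enumerate(data) if i not in test_indices]
    test = [item for i, item in enumerate(data) if i in test_indices]
    return train, test


data = list(range(10))  # stand-in for ten examples
train, test = holdout_split(data, test_fraction=0.3, seed=42)
print("train:", train)
print("test:", test)
```

Rerunning with a different `seed` gives a different test set, which makes the risk on the slide concrete: a single random draw may happen to be atypical.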

11. Training and test data 3: k-fold cross-validation

Complete data set:
        x     x     x     x     x     x     x     x     x     x
Fold 1: test  test  train train train train train train train train
Fold 2: train train test  test  train train train train train train
Fold 3: train train train train test  test  train train train train
Fold 4: train train train train train train test  test  train train
Fold 5: train train train train train train train train test  test
Average results over folds
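
The fold layout above can be generated with a few lines of code. The sketch below is an assumption-laden illustration: `k_fold_splits` and `cross_validate` are hypothetical helpers, and `evaluate` stands for any user-supplied function that trains on the training indices and returns a score on the test indices.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(n_items, k, evaluate):
    """Average the score returned by evaluate(train, test) over all k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_splits(n_items, k)]
    return sum(scores) / len(scores)


splits = list(k_fold_splits(10, 5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"Fold {fold_number}: test={test} train={train}")

# Stand-in evaluate function; a real one would train and score a model.
mean_score = cross_validate(10, 5, lambda train, test: len(train) / 10)
```

Every example is used for testing exactly once. In practice the data is usually shuffled before the folds are cut, which this sketch leaves out.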

12. More cross-validation

• Leave-one-out
• Stratified cross-validation
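
Both variants can be sketched with one helper. Stratified cross-validation keeps the class proportions roughly equal across folds, and leave-one-out is k-fold cross-validation with k equal to the number of examples. The function `stratified_folds` and the example labels below are illustrative assumptions, not from the slides.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Deal the indices of each class round-robin over k test folds."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    position = 0  # continues across classes so fold sizes stay balanced
    for indices in by_class.values():
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return folds


labels = ["pos"] * 4 + ["neg"] * 8  # made-up, imbalanced labels
folds = stratified_folds(labels, k=4)
for fold in folds:
    print([labels[i] for i in fold])

# Leave-one-out: one example per test fold.
leave_one_out = stratified_folds(labels, k=len(labels))
```

Each of the four folds ends up with one positive and two negative examples, mirroring the 1:2 ratio of the full data set.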

13. Evaluation

• Why?
• What?
• How?
• Measures
• Training and test data
• Significance

14. Method M1 significantly better than M2?

• 10-fold cross-validation => n=10
• Paired t-test
• H0: performance M1 same as M2
• H1: performance M1 differs from M2
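
The test statistic for this setup can be sketched as follows. The per-fold accuracies are made up for illustration, and `paired_t_statistic` is a hypothetical helper. With n = 10 folds there are 9 degrees of freedom, for which the two-sided critical value at the 5% level is 2.262.

```python
import math
import statistics


def paired_t_statistic(scores_m1, scores_m2):
    """t statistic of the per-fold score differences between two methods."""
    differences = [a - b for a, b in zip(scores_m1, scores_m2)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    std_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    return mean_diff / (std_diff / math.sqrt(n))


# Made-up accuracies of M1 and M2 on the same 10 folds.
m1 = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
m2 = [0.78, 0.77, 0.80, 0.79, 0.80, 0.76, 0.79, 0.81, 0.78, 0.80]

t = paired_t_statistic(m1, m2)
T_CRITICAL = 2.262  # two-sided, alpha = 0.05, 9 degrees of freedom
reject_h0 = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, reject H0: {reject_h0}")
```

The test is paired because both methods are scored on the same folds, so only the per-fold differences enter the statistic.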

16. Other aspects of performance

• Efficiency
• Scalability
• Robustness
• Interpretability

17. And now…

• Do the evaluation exercise