How to analyze a binary classification problem with a confusion matrix, classification report, and ROC curve (AUC)

Published 2023-08-04 11:40:36 | Author: 笨笨和呆呆

ROC

https://zhuanlan.zhihu.com/p/246444894  
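The linked post covers the theory behind the ROC curve; the sketch below shows one common way to compute and plot it with scikit-learn. The arrays `y_true` and `y_score` are made-up placeholders for ground-truth labels and predicted positive-class probabilities, not data from the example that follows.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# Made-up ground-truth labels and predicted positive-class probabilities.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.60, 0.70])

# roc_curve sweeps the decision threshold and returns the false positive rate
# and true positive rate at each threshold; roc_auc_score integrates the curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

plt.plot(fpr, tpr, label=f"ROC curve (AUC = {auc:.3f})")
plt.plot([0, 1], [0, 1], linestyle="--", label="random guess")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking of positives above negatives.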

 

Let's create an example confusion matrix and then explain what each element in the matrix means:

Suppose we have a binary classification problem with the following prediction counts:

True Positive (TP) = 25
False Positive (FP) = 15
False Negative (FN) = 10
True Negative (TN) = 50

The corresponding confusion matrix would be:

```
                  Predicted Positive   Predicted Negative
Actual Positive   25 (TP)              10 (FN)
Actual Negative   15 (FP)              50 (TN)
```
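A matrix with these counts can be reproduced with scikit-learn from synthetic label arrays. This is only an illustrative sketch; `labels=[1, 0]` is passed so the rows and columns follow the Positive-first layout shown above (by default scikit-learn sorts the labels, which would put the negative class first).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic labels reproducing the counts above:
# 25 TP, 10 FN, 15 FP, 50 TN (positive class = 1, negative class = 0).
y_true = np.array([1] * 25 + [1] * 10 + [0] * 15 + [0] * 50)
y_pred = np.array([1] * 25 + [0] * 10 + [1] * 15 + [0] * 50)

# Rows are actual classes, columns are predicted classes,
# ordered [Positive, Negative] to match the table above.
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
print(cm)
# [[25 10]
#  [15 50]]
```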

Let's now break down the elements of this confusion matrix:

1. **True Positive (TP)**:
The number of samples that are correctly predicted as positive (belonging to the positive class). In this example, we correctly predicted 25 positive instances.

2. **False Positive (FP)**:
The number of samples that are incorrectly predicted as positive, but they are actually negative (belonging to the negative class). In this example, we incorrectly predicted 15 negative instances as positive.

3. **False Negative (FN)**:
The number of samples that are incorrectly predicted as negative, but they are actually positive (belonging to the positive class). In this example, we incorrectly predicted 10 positive instances as negative.

4. **True Negative (TN)**:
The number of samples that are correctly predicted as negative (belonging to the negative class). In this example, we correctly predicted 50 negative instances.
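With scikit-learn's default label ordering (negative class first), these four counts can be unpacked directly from the flattened matrix. A minimal sketch, reusing the same made-up labels as above:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Same made-up labels as in the sketch above.
y_true = np.array([1] * 35 + [0] * 65)
y_pred = np.array([1] * 25 + [0] * 10 + [1] * 15 + [0] * 50)

# With the default sorted label order [0, 1], ravel() flattens the matrix
# into the fixed order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=25, FP=15, FN=10, TN=50
```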

Now, let's calculate some performance metrics using this confusion matrix:

- **Accuracy**:
Accuracy is the overall performance of the model and is calculated as the ratio of correctly predicted samples to the total number of samples.

Accuracy = (TP + TN) / (TP + TN + FP + FN) = (25 + 50) / (25 + 15 + 10 + 50) = 75 / 100 = 0.75 or 75%

- **Precision**:
Precision measures the proportion of true positive predictions among the total predicted positive samples. It is a measure of how many of the predicted positive instances are actually positive.

Precision = TP / (TP + FP) = 25 / (25 + 15) = 25 / 40 = 0.625 or 62.5%

- **Recall (Sensitivity)**:
Recall, also known as sensitivity or true positive rate, measures the proportion of true positive predictions among the actual positive samples. It is a measure of how many of the actual positive instances were correctly predicted.

Recall = TP / (TP + FN) = 25 / (25 + 10) = 25 / 35 = 0.714 or 71.4%

- **Specificity (True Negative Rate)**:
Specificity measures the proportion of true negative predictions among the actual negative samples. It is a measure of how many of the actual negative instances were correctly predicted.

Specificity = TN / (TN + FP) = 50 / (50 + 15) = 50 / 65 = 0.769 or 76.9%

These performance metrics provide insights into the model's ability to correctly classify positive and negative instances and overall accuracy. Different metrics may be more important depending on the specific use case and requirements.
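The numbers above can be verified in code. Below is a sketch that computes the four metrics from the raw counts and cross-checks them against scikit-learn where a direct equivalent exists; specificity has no dedicated scikit-learn function, but it equals recall computed on the negative class.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

tp, fp, fn, tn = 25, 15, 10, 50

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 0.75
precision = tp / (tp + fp)                  # 0.625
recall = tp / (tp + fn)                     # ~0.714
specificity = tn / (tn + fp)                # ~0.769
print(accuracy, precision, recall, specificity)

# Cross-check with scikit-learn, using the same made-up labels as above.
y_true = np.array([1] * 35 + [0] * 65)
y_pred = np.array([1] * 25 + [0] * 10 + [1] * 15 + [0] * 50)
assert np.isclose(accuracy, accuracy_score(y_true, y_pred))
assert np.isclose(precision, precision_score(y_true, y_pred))
assert np.isclose(recall, recall_score(y_true, y_pred))
# Specificity is the recall of the negative class.
assert np.isclose(specificity, recall_score(y_true, y_pred, pos_label=0))
```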

How to use classification_report to evaluate model performance

https://www.python100.com/html/86808.html
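The linked article covers this in detail; a minimal sketch of `sklearn.metrics.classification_report`, again using the made-up labels from the example above, looks like this:

```python
import numpy as np
from sklearn.metrics import classification_report

# Made-up labels reproducing the confusion matrix from the example above.
y_true = np.array([1] * 35 + [0] * 65)
y_pred = np.array([1] * 25 + [0] * 10 + [1] * 15 + [0] * 50)

# Prints per-class precision, recall, F1-score, and support,
# plus overall accuracy and macro/weighted averages.
print(classification_report(y_true, y_pred, target_names=["negative", "positive"]))
```

The "positive" row of the report reproduces the precision (0.625) and recall (~0.714) computed by hand above, while the recall of the "negative" row corresponds to the specificity.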