diff --git a/Chapter_6_ImbalancedLearning/Resampling.ipynb b/Chapter_6_ImbalancedLearning/Resampling.ipynb
index 68dadd8..da6fb68 100644
--- a/Chapter_6_ImbalancedLearning/Resampling.ipynb
+++ b/Chapter_6_ImbalancedLearning/Resampling.ipynb
@@ -986,7 +986,7 @@
     "\n",
     "The number of neighbors $k$ is by default set to $k=3$. It is worth noting that, contrary to RUS, the number of majority class samples that are removed depends on the degree of overlap between the two classes. The method does not allow to specify an imbalanced ratio. \n",
     "\n",
-    "The `imblearn` sampler for RUS is [`imblearn.under_sampling.EditedNearestNeighbours`](https://imbalanced-learn.org/stable/references/generated/imblearn.under_sampling.EditedNearestNeighbours.html). Let us illustrate its use and its impact on the classifier decision boundary and the classification performances. \n"
+    "The `imblearn` sampler for ENN is [`imblearn.under_sampling.EditedNearestNeighbours`](https://imbalanced-learn.org/stable/references/generated/imblearn.under_sampling.EditedNearestNeighbours.html). Let us illustrate its use and its impact on the classifier decision boundary and the classification performances. \n"
    ]
   },
   {
@@ -1071,7 +1071,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "On this dataset, the performances of ENN are poor compared to the previsouly tested techniques. The balanced accuracy was slightly improved compared to the baseline classifier. The performance in terms of AP is however lower than the baseline, and the AUC ROC is the worst of all tested tecniques (and on par with ROS). "
+    "On this dataset, the performances of ENN are poor compared to the previously tested techniques. The balanced accuracy was slightly improved compared to the baseline classifier. The performance in terms of AP is, however, lower than the baseline, and the AUC ROC is the worst of all tested techniques (on par with ROS). "
    ]
   },
   {
@@ -1196,7 +1196,7 @@
    "source": [
     "### Combining over and undersampling\n",
     "\n",
-    "Oversampling and undersampling are often complementary. On the one hand, oversampling techniques allow to generate synthetic samples from the minority class, and help a classifier in identifying more precisely the decision boundary between the two classes. On the other hand, undersampling techniques reduce the size of the training set, and allow to speed-up the classifier training time. Combining over and undersampling techniques has often been reported to successfully improve the classifier performances (Chapter 5, Section 6 in {cite}fernandez2018learning).\n",
+    "Oversampling and undersampling are often complementary. On the one hand, oversampling techniques generate synthetic samples from the minority class, and help a classifier identify more precisely the decision boundary between the two classes. On the other hand, undersampling techniques reduce the size of the training set, and speed up the classifier training time. Combining over and undersampling techniques has often been reported to successfully improve the classifier performances (Chapter 5, Section 6 in {cite}`fernandez2018learning`).\n",
     "\n",
     "In terms of implementation, the combination of samplers is obtained by chaining the samplers in a `pipeline`. The samplers can then be chained to a classifer. We illustrate below the chaining of an SMOTE oversampling to a random undersampling to a decision tree classifier. \n",
     "\n",
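For context, the first hunk points to the `EditedNearestNeighbours` sampler. The notebook's own code cells are not shown in this diff, so the following is only a minimal sketch of how that sampler is typically called; the dataset, variable names, and parameter values are illustrative assumptions, not the notebook's actual code.

```python
# Illustrative sketch of ENN undersampling (assumed data and parameters).
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.under_sampling import EditedNearestNeighbours

# Hypothetical imbalanced dataset standing in for the notebook's data.
X, y = make_classification(
    n_samples=5000, n_features=2, n_informative=2, n_redundant=0,
    weights=[0.95, 0.05], random_state=0,
)

# ENN removes majority-class samples whose class disagrees with their k nearest
# neighbors; the number removed depends on class overlap, not on a target ratio.
enn = EditedNearestNeighbours(n_neighbors=3)
X_res, y_res = enn.fit_resample(X, y)

print("Before:", Counter(y), "After:", Counter(y_res))
```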
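Similarly, the last hunk describes chaining SMOTE oversampling, random undersampling, and a decision tree in an `imblearn` pipeline. The sketch below illustrates that chaining under assumed sampling ratios and tree depth; it is not the notebook's exact code.

```python
# Illustrative sketch: SMOTE -> random undersampling -> decision tree,
# chained in a single imblearn pipeline (assumed parameters).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

# Hypothetical imbalanced dataset standing in for the notebook's data.
X, y = make_classification(
    n_samples=5000, n_features=2, n_informative=2, n_redundant=0,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# SMOTE first oversamples the minority class up to half the majority size,
# then RandomUnderSampler trims the majority class to a 1:1 ratio. In an
# imblearn Pipeline, samplers are only applied during fit, so the test set
# is never resampled.
pipe = Pipeline(steps=[
    ("smote", SMOTE(sampling_strategy=0.5, random_state=0)),
    ("rus", RandomUnderSampler(sampling_strategy=1.0, random_state=0)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
])
pipe.fit(X_train, y_train)
print("Test accuracy:", pipe.score(X_test, y_test))
```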