TY - GEN
T1 - RAMEN: A Ratio-Weighted Majority Entropy-Based Decision Tree Algorithm for Classifying Imbalanced Datasets
AU - Afolabi, Doyinsola
AU - Sennaike, Oladipupo
AU - Ogunseye, Shawn
AU - Philips, Adewole
PY - 2023
Y1 - 2023
N2 - Dealing with imbalanced datasets is an essential part of most analytic lifecycles. However, inadequate classification of imbalanced datasets may result in the loss of information and insights, as well as in reduced repurposability of the data. While most well-known classification algorithms, including the decision tree algorithm, can efficiently make predictions from balanced datasets, these algorithms are inefficient when classifying imbalanced datasets. To address this concern, the present study, building on a traditional decision tree classification algorithm (namely, the ID3 algorithm), proposes a ratio-weighted majority entropy-based decision tree algorithm (RAMEN). RAMEN removes bias toward the majority class in imbalanced datasets. The RAMEN algorithm is then tested on two imbalanced datasets: a cancer case dataset and a banknote verification dataset. Comparing the performance of RAMEN with that of the traditional ID3 decision tree algorithm and the minority entropy-based decision tree algorithm, we find that RAMEN outperforms both algorithms on both datasets.
UR - https://ieeexplore.ieee.org/document/10252999
M3 - Conference contribution
BT - 3rd International Conference on Electrical, Computer, Communications (IEEE)
PB - IEEE
ER -