Skin cancer is one of the most common cancers in the world. However, the disease is curable if detected at an early stage. Early detection of malignant lesions through accurate techniques and innovative technologies can significantly reduce skin cancer mortality.
Recently, artificial intelligence has come to the forefront of skin cancer diagnosis based on medical images. Many deep learning models have been studied and developed, but uneven performance across classes in multi-class classification remains a challenging problem. This study proposes a hybrid method for handling class imbalance in skin-disease classification.
The method combines a data-level technique, balanced mini-batch logic followed by real-time image augmentation, with an algorithm-level technique, a newly designed loss function. The training dataset comprises 24,530 dermoscopic images in seven skin-disease categories, which is by far the largest skin cancer dataset.
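As an illustration of the data-level idea only, the following is a minimal sketch of balanced mini-batch sampling, assuming that "balanced" means an equal number of samples per class in every batch and that minority classes are drawn with replacement. The function name `balanced_batches` and its parameters are hypothetical and are not taken from the study; the real-time augmentation step and the new loss function are not shown, because the abstract does not specify them.

```python
import random
from collections import defaultdict


def balanced_batches(labels, batch_size, num_batches, seed=0):
    """Yield lists of sample indices with an equal count per class.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    classes = sorted(by_class)
    per_class = batch_size // len(classes)
    if per_class == 0:
        raise ValueError("batch_size must be at least the number of classes")

    for _ in range(num_batches):
        batch = []
        for c in classes:
            # Sampling with replacement lets a small class fill its quota.
            batch.extend(rng.choices(by_class[c], k=per_class))
        rng.shuffle(batch)
        yield batch


# Toy labels: three classes with a strongly skewed distribution.
labels = [0] * 90 + [1] * 8 + [2] * 2
batches = list(balanced_batches(labels, batch_size=6, num_batches=5))
```

In a training loop, each yielded index would be used to load an image, and a random augmentation would typically be applied at that point, so that repeated draws of the same minority-class image do not produce identical inputs.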
Six proposed methods are evaluated on a test dataset of 2,453 images. Our proposed EfficientNetB4-CLF model achieves the highest accuracy (89.97%) and the highest mean recall (86.13%), with the smallest standard deviation of the recalls (7.60%). Compared to the original methods, our solution raises mean recall by 4.65 percentage points (86.13% vs 81.48%) and reduces the standard deviation of the recalls by 4.24 percentage points (from ±11.84% to ±7.60%).
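The two balance-oriented metrics above can be sketched as follows: recall is computed per class, then summarized by its mean and standard deviation across classes. This is a sketch under assumptions: the abstract does not say whether the population or sample standard deviation is used, so the population form (`statistics.pstdev`) is chosen here, and the function names and toy labels are hypothetical.

```python
from statistics import mean, pstdev


def per_class_recall(y_true, y_pred):
    """Return {class: recall}, where recall = correct / total for that class."""
    recalls = {}
    for c in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls[c] = hits / total
    return recalls


def recall_summary(y_true, y_pred):
    """Return (mean recall, population std of recalls) across classes."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return mean(recalls), pstdev(recalls)


# Toy example with three classes (not data from the study).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2]
mean_recall, std_recall = recall_summary(y_true, y_pred)
```

A high mean recall with a low standard deviation indicates that no class, including the minority classes, is being sacrificed for overall accuracy.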
These results indicate that our hybrid method is highly effective for training deep CNNs on an imbalanced skin-disease dataset. By combining balanced mini-batch logic and real-time image augmentation at the data level with the newly designed loss function at the algorithm level, it addresses the slow learning of minority classes in the network.