Deep Learning-Based Blood Cell Classification with Enhanced Data Preprocessing and Augmentation

Marwa Raid

Abstract

Accurate classification of blood cells is a critical task in hematology, both for diagnosing blood disorders and for guiding clinical decision making. In this paper, we use a high-quality dataset of 17,092 microscopic peripheral blood cell images from the Hospital Clinic of Barcelona, covering eight cell types, all annotated by expert pathologists. To improve model performance and address the class imbalance in the dataset, we developed a robust data preprocessing and augmentation pipeline that includes contrast enhancement, normalization, geometric and photometric transformations, noise injection, and mixup-style synthetic data generation. We train two state-of-the-art deep learning models, EfficientNet-B0 and ResNet50, to benchmark the proposed pipeline. In our experiments, EfficientNet-B0 achieved an overall accuracy of approximately 98.3% and ResNet50 achieved 98.6%, with high precision, recall, and F1-scores across all classes. These preliminary results demonstrate the effectiveness of the proposed preprocessing and augmentation strategies and provide a benchmark for future research on blood cell image classification in hematology.
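
The abstract names the pipeline stages but not their implementation. As an illustration only, the following NumPy sketch shows how contrast enhancement, a geometric transform, noise injection, and mixup might be chained for a batch of images. The function names, the percentile-based stretch, the horizontal flip, and the values of `sigma` and `alpha` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def contrast_stretch(img, low_pct=2.0, high_pct=98.0):
    """Percentile-based stretch to [0, 1]; an assumed stand-in for contrast enhancement."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return np.zeros_like(img, dtype=np.float32)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0).astype(np.float32)


def random_flip(img, rng):
    """One simple geometric transform: random horizontal flip of an HxWxC image."""
    return img[:, ::-1, :] if rng.random() < 0.5 else img


def add_gaussian_noise(img, rng, sigma=0.02):
    """Inject zero-mean Gaussian noise, then clip back to [0, 1]."""
    noisy = img + rng.normal(0.0, sigma, size=img.shape)
    return np.clip(noisy, 0.0, 1.0).astype(np.float32)


def mixup(images, one_hot_labels, rng, alpha=0.2):
    """Mixup: blend every sample (and its label) with a random partner from the batch."""
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(images))   # partner index for each sample
    mixed_x = lam * images + (1.0 - lam) * images[perm]
    mixed_y = lam * one_hot_labels + (1.0 - lam) * one_hot_labels[perm]
    return mixed_x.astype(np.float32), mixed_y.astype(np.float32)


def augment_batch(raw_images, labels, num_classes=8, seed=0):
    """Apply the per-image steps, then mix the batch. Returns soft labels."""
    rng = np.random.default_rng(seed)
    processed = np.stack([
        add_gaussian_noise(random_flip(contrast_stretch(img), rng), rng)
        for img in raw_images
    ])
    one_hot = np.eye(num_classes, dtype=np.float32)[labels]
    return mixup(processed, one_hot, rng)
```

Calling `augment_batch(images, labels)` with a `uint8` array of shape `(N, H, W, 3)` and integer labels in `0..7` returns blended images in `[0, 1]` and soft label vectors that each sum to 1. Because mixup yields soft labels, a training loop built on this sketch would need a loss that accepts label distributions rather than class indices.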


 

How to Cite
Raid, Marwa. (2025). Deep Learning-Based Blood Cell Classification with Enhanced Data Preprocessing and Augmentation. AlKadhim Journal for Computer Science, 3(4), 22–36. https://doi.org/10.61710/kjcs.v3i4.122
Section
Computer Science
