Efficient Image Segmentation and Machine Learning Algorithms for Improved Malaria Detection from Blood Smear Images


  • Kirthi Kumar
  • Soraya Ngarnim
  • Nathalia Peixoto
  • Padmanabhan Seshaiyer




Malaria is one of the deadliest endemic diseases in developing countries, with millions of cases recorded each year. In 2018, there were over 200 million malaria cases and over 400,000 deaths worldwide. Widespread access to efficient malaria detection would help reduce malaria cases in these countries. In recent years, several computational algorithms have been applied to detect malaria through a variety of approaches, including analysis of blood smear images and detection of protein biomarkers in urine and saliva. For blood smear images, recent literature employs a combination of image segmentation techniques and machine learning classifiers. While multiple techniques have been suggested and advances are being made, work remains to identify the best combination of these algorithms. The aim of this study was to perform a comparative analysis of four image segmentation techniques and five machine learning classifiers on a dataset of roughly 27,000 blood smear images, half of which showed malaria infection. The work focused on identifying the combinations of image segmentation and machine learning algorithms that classified a blood smear image as infected or uninfected with the highest accuracy, leading to improved malaria detection. The computational algorithms developed in this work were implemented in the Python programming language with the OpenCV and Scikit-learn libraries. The resulting confusion matrices helped to identify efficient combinations of image segmentation techniques and machine learning classifiers. A random forest classifier with binary thresholding achieved the highest accuracy, at around 90%, but some other combinations also performed reasonably well. Our computational experiments also found that adaptive and K-means thresholding were not effective, averaging slightly above 50% accuracy.
The findings from this work provide insight into identifying the best combination of an image segmentation technique and a machine learning classifier, which could potentially be programmed into a future real-time mobile app for deployment in developing countries.





College of Science: Department of Mathematical Sciences