Scaling Multi-Instance Support Vector Machine to Breast Cancer Detection on the BreaKHis Dataset
Hoon Seo, Lodewijk Brand, Lucia Saldana Barco, Hua Wang
Bioinformatics - ISMB - 2022
Breast cancer is a type of cancer that develops in breast tissue, and, after skin cancer, it is the most commonly diagnosed cancer in women in the United States. Because early diagnosis is imperative to prevent breast cancer progression, many machine learning models have been developed to automate the histopathological classification of the different types of carcinomas. However, many of them do not scale to large datasets. In this study, we propose a novel Primal-Dual Multi-Instance Support Vector Machine (pdMISVM) to determine which tissue segments in an image exhibit an indication of abnormality. We also derive an efficient optimization approach for the proposed method that bypasses the quadratic programming and least-squares problems commonly employed to optimize Support Vector Machine (SVM) models in multi-instance learning. The proposed method is scalable to large datasets, and it is computationally efficient. We applied our method to the public BreaKHis dataset and achieved promising prediction performance and scalability for histopathological classification. Software is publicly available at: https://1drv.ms/u/s!AiFpD21bgf2wgRLbQq08ixD0SgRD?e=OpqEmY
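The multi-instance setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' pdMISVM or its optimization procedure. It assumes a binary toy problem, a linear scorer, and the common multi-instance convention that a bag (an image) is scored by its highest-scoring instance (a tissue segment). The function names, the weights `w` and `b`, and the toy data are all hypothetical.

```python
import numpy as np

def instance_scores(bag, w, b):
    """Linear score w.x + b for every instance (row) in a bag."""
    return bag @ w + b

def bag_score(bag, w, b):
    """Score a bag by its highest-scoring instance (assumed MIL convention)."""
    return float(np.max(instance_scores(bag, w, b)))

def bag_hinge_loss(bags, labels, w, b):
    """Mean hinge loss over bags, with labels in {-1, +1}."""
    losses = [max(0.0, 1.0 - y * bag_score(X, w, b)) for X, y in zip(bags, labels)]
    return float(np.mean(losses))

# Hypothetical weights and two toy bags, each with two 2-D instances.
w = np.array([1.0, -1.0])
b = 0.0
benign = np.array([[0.0, 2.0], [0.5, 1.5]])      # all instances score below zero
malignant = np.array([[0.0, 2.0], [3.0, 0.0]])   # second instance scores above zero

# Index of the instance that drives the positive prediction for the second bag.
most_abnormal = int(np.argmax(instance_scores(malignant, w, b)))
loss = bag_hinge_loss([benign, malignant], [-1, 1], w, b)
```

Here `most_abnormal` points at the segment responsible for the bag-level decision, which is the kind of instance-level localization the abstract refers to. Training the weights at scale is the part the paper addresses and is not shown.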
Links
- View publications from Lucia Saldana Barco
- View publications from Hoon Seo
- View publications from Lodewijk Brand
- View publications from Hua Wang
- View publications in the project, An Intelligence-Driven Patient Care Approach to Reduce Medical Errors
- View publications in the project, Intelligent Prediction of Traffic Conditions via Integrated Data-Driven Crowdsourcing and Learning
- View publications in the project, Mining Brain Imaging Genomics Data for Improved Cognitive Health
- View publications researching Multiple-Instance Learning
- View publications applied to Bioinformatics
- View publications applied to Computer Vision