Feature Selection Benchmarks for Breast Cancer Diagnosis: A Comparative Machine Learning Study

Authors

R. Rossa Alfi Nur Institut Teknologi Sepuluh Nopember
Nashir Abbas Husaini Institut Teknologi Sepuluh Nopember
Moch. Arjunnaja Institut Teknologi Sepuluh Nopember
Az-Zahra Batrisyia Juniarto Institut Teknologi Sepuluh Nopember
Yuri Pamungkas Institut Teknologi Sepuluh Nopember

DOI:

https://doi.org/10.31937/sk.v18i1.4636

Abstract

Breast cancer remains one of the most common causes of death among women, making early and precise detection essential. Yet conventional diagnosis can be limited by specialist shortages, cost, and slow workflows. We therefore assess machine-learning classification combined with feature selection as a way to streamline diagnosis. Our contribution is a comparative benchmark of feature-selection strategies and classifiers on the WDBC dataset. We evaluated five models: SVM, neural network, decision tree, bagged tree, and boosted tree. Three feature-selection methods (Chi², mRMR, and ReliefF) were each used to select subsets of 5, 10, 15, and 30 features, and performance was measured across multiple train–test splits using accuracy, precision, recall, specificity, and F1-score. SVM was the top performer overall and remained stable across splits. The best SVM setting reached 97.81% accuracy, with strong precision and F1-score, indicating reliable benign–malignant separation. Neural networks usually ranked second but were more sensitive to the split. Bagged trees generally improved on a single decision tree, while boosted trees showed mixed gains depending on the feature subset. ReliefF and mRMR often matched or exceeded Chi² with smaller subsets, showing that careful feature reduction can retain accuracy while lowering dimensionality. In conclusion, combining effective feature selection with an appropriate classifier improves breast cancer classification, and SVM with a compact feature set is a practical choice.
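
For readers unfamiliar with the five reported metrics, the following is a minimal sketch of how they are conventionally derived from a binary confusion matrix. It is not the authors' code. The function name `binary_metrics`, the choice of malignant as the positive class, and the example counts are all assumptions made for illustration; the counts are not results from the study.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, specificity, and F1-score.

    Inputs are confusion-matrix counts. Malignant is assumed to be the
    positive class (an assumption, not stated in the abstract):
      tp: malignant predicted malignant
      fp: benign predicted malignant
      tn: benign predicted benign
      fn: malignant predicted benign
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    def ratio(numerator: float, denominator: float) -> float:
        # Return 0.0 instead of dividing by zero when a class is absent.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for illustration only; not taken from the paper.
example = binary_metrics(tp=40, fp=2, tn=70, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Specificity is reported alongside recall because, in a diagnostic setting, the rate of correctly identified benign cases matters separately from the rate of correctly identified malignant ones.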

Author Biographies

R. Rossa Alfi Nur, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Nashir Abbas Husaini, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Moch. Arjunnaja, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Az-Zahra Batrisyia Juniarto, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Yuri Pamungkas, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Published

2026-06-29

How to Cite

Nur, R. R. A., Husaini, N. A., Arjunnaja, M., Juniarto, A.-Z. B., & Pamungkas, Y. (2026). Feature Selection Benchmarks for Breast Cancer Diagnosis: A Comparative Machine Learning Study. Ultima Computing : Jurnal Sistem Komputer, 18(1), 28–37. https://doi.org/10.31937/sk.v18i1.4636

Issue

Vol. 18 No. 1 (2026): Ultima Computing: Jurnal Sistem Komputer

Section

Articles

License

Copyright (c) 2026 R. Rossa Alfi Nur, Nashir Abbas Husaini, Moch. Arjunnaja, Az-Zahra Batrisyia Juniarto, Yuri Pamungkas

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike International License (CC-BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.

Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.

Copyright without Restrictions

The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.

Submitted papers are assumed to contain no proprietary material unprotected by patent or patent application; responsibility for technical content and for protection of proprietary material rests solely with the author(s) and their organizations and is not the responsibility of ULTIMA Computing or its Editorial Staff. The main (first/corresponding) author is responsible for ensuring that the article has been seen and approved by all the other authors. It is the responsibility of the author to obtain all necessary copyright release permissions for the use of any copyrighted materials in the manuscript prior to submission.