Feature Selection Benchmarks for Breast Cancer Diagnosis: A Comparative Machine Learning Study

Authors

  • R. Rossa Alfi Nur, Institut Teknologi Sepuluh Nopember
  • Nashir Abbas Husaini, Institut Teknologi Sepuluh Nopember
  • Moch. Arjunnaja, Institut Teknologi Sepuluh Nopember
  • Az-Zahra Batrisyia Juniarto, Institut Teknologi Sepuluh Nopember
  • Yuri Pamungkas, Institut Teknologi Sepuluh Nopember

DOI: https://doi.org/10.31937/sk.v18i1.4636

Abstract

Breast cancer remains one of the most common causes of death among women, making early and precise detection essential. Conventional diagnosis, however, can be limited by specialist shortages, cost, and slow workflows. We therefore assess machine-learning classification combined with feature selection as a way to streamline diagnosis. Our contribution is a comparative benchmark of feature-selection strategies and classifiers on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset. We evaluated five models: support vector machine (SVM), neural network, decision tree, bagged trees, and boosted trees. Three feature-selection methods (Chi2, mRMR, and ReliefF) were used to select subsets of 5, 10, 15, and 30 features, and performance was measured across multiple train–test splits using accuracy, precision, recall, specificity, and F1-score. SVM was the top performer overall and remained stable across splits. The best SVM configuration reached 97.81% accuracy with strong precision and F1-score, indicating reliable benign–malignant separation. Neural networks usually ranked second but were more sensitive to the split. Bagged trees generally improved on a single decision tree, while boosted trees showed mixed gains depending on the feature subset. ReliefF and mRMR often matched or exceeded Chi2 with smaller subsets, showing that careful feature reduction can retain accuracy while lowering dimensionality. In conclusion, combining effective feature selection with an appropriate classifier improves breast cancer classification, and SVM with a compact feature set is a practical choice.
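
The five evaluation metrics named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' implementation. It assumes the malignant class is treated as the positive label (the abstract does not say which class is positive), and the names `BinaryMetrics` and `binary_metrics` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    """Container for the five metrics listed in the abstract (hypothetical name)."""
    accuracy: float
    precision: float
    recall: float
    specificity: float
    f1: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a denominator is zero
    # (for example, no positive predictions at all).
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred, positive=1):
    """Compute the metrics from true and predicted labels.

    Assumption: `positive` marks the malignant class; every other
    label value is treated as negative (benign).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # also called sensitivity
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        recall=recall,
        specificity=_ratio(tn, tn + fp),
        f1=_ratio(2 * precision * recall, precision + recall),
    )
```

Specificity is reported alongside recall because the two describe different errors: recall falls when malignant cases are missed, while specificity falls when benign cases are flagged as malignant.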

Author Biographies

R. Rossa Alfi Nur, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Nashir Abbas Husaini, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Moch. Arjunnaja, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Az-Zahra Batrisyia Juniarto, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Yuri Pamungkas, Institut Teknologi Sepuluh Nopember

Department of Medical Technology

Published

2026-06-29

How to Cite

Nur, R. R. A., Husaini, N. A., Arjunnaja, M., Juniarto, A.-Z. B., & Pamungkas, Y. (2026). Feature Selection Benchmarks for Breast Cancer Diagnosis: A Comparative Machine Learning Study. Ultima Computing : Jurnal Sistem Komputer, 18(1), 28–37. https://doi.org/10.31937/sk.v18i1.4636