Prostate Cancer Screening for Specific Races Using Bioinformatics and Artificial Intelligence on Genomic Data
Abstract
Prostate cancer is one of the deadliest cancers worldwide. Its high incidence and mortality rates make the fight against it urgent. This study develops an artificial intelligence (AI) system that screens prostate cancer patients from normal patients within a specific race. Gene expression data and the associated phenotype dataset were downloaded from xenabrowser.net. The workflow consists of data preprocessing and filtering by race, bioinformatics computational analysis to determine the features, and machine learning algorithms, namely decision tree and random forest, to develop the AI model. All procedures and analyses were performed in Python. The results show that only the White and Black African American groups have an adequate number of samples, while the Asian and American Indian groups have too few. Differentially expressed gene (DEG) analysis was performed on the White and Black African American datasets, comparing cancer samples against normal samples as the reference. The analysis found 143 DEGs in the White group and 1 DEG in the Black African American group. ENSG00000225937.1 (PCA3) was identified as the most highly up-regulated gene in cancer in both groups. The DEGs then serve as the features for the AI classification system. The AI models were developed using decision tree and random forest with grid search (GridSearch) parameter optimization and stratified 10-fold cross-validation. Both models yield 96% accuracy on the training dataset; on the testing dataset, accuracy is 93% for the decision tree and 91% for the random forest.
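The DEG-based feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `select_deg`, the threshold of 1.0, the toy gene names, and the toy expression values are all assumptions made for illustration. It also assumes the expression values are already on a log2 scale, so that the difference of group means can be read as a log2 fold change. A full DEG analysis would normally add a statistical test with multiple-testing correction, which is omitted here.

```python
from statistics import mean


def select_deg(expr, labels, min_abs_log2fc=1.0):
    """Return {gene: log2 fold change} for genes passing the threshold.

    expr   : dict mapping gene id -> list of log2-scale expression values,
             one value per sample (assumed layout, not from the paper)
    labels : list of "cancer" / "normal", aligned with the sample order
    min_abs_log2fc : hypothetical cutoff; the abstract does not state one
    """
    cancer_idx = [i for i, lab in enumerate(labels) if lab == "cancer"]
    normal_idx = [i for i, lab in enumerate(labels) if lab == "normal"]
    if not cancer_idx or not normal_idx:
        raise ValueError("both cancer and normal samples are required")

    selected = {}
    for gene, values in expr.items():
        # On log2-scale data, the difference of means is the log2 fold
        # change of cancer relative to the normal reference group.
        log2fc = (mean(values[i] for i in cancer_idx)
                  - mean(values[i] for i in normal_idx))
        if abs(log2fc) >= min_abs_log2fc:
            selected[gene] = log2fc
    return selected


# Toy data for one race-specific subset (values invented for illustration).
labels = ["cancer", "cancer", "cancer", "normal", "normal", "normal"]
toy_expr = {
    "TOY_GENE_UP":   [9.0, 8.5, 9.5, 3.0, 2.5, 3.5],
    "TOY_GENE_DOWN": [2.0, 2.5, 1.5, 4.0, 4.5, 3.5],
    "TOY_GENE_FLAT": [5.0, 5.2, 4.8, 5.1, 4.9, 5.0],
}

degs = select_deg(toy_expr, labels)
top_up = max(degs, key=degs.get)  # gene with the largest positive change
print(sorted(degs), top_up)
```

The selected gene ids would then become the feature columns passed to the classifiers. With very few samples in a group, as the abstract reports for some races, the group means are unstable and few or no genes may pass a statistical filter.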
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike International License (CC-BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Copyright without Restrictions
The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.
Submitted papers are assumed to contain no proprietary material unprotected by patent or patent application; responsibility for technical content and for protection of proprietary material rests solely with the author(s) and their organizations and is not the responsibility of ULTIMATICS or its Editorial Staff. The main (first/corresponding) author is responsible for ensuring that the article has been seen and approved by all the other authors. It is the responsibility of the author to obtain all necessary copyright release permissions for the use of any copyrighted materials in the manuscript prior to submission.