Accurate subtype classification of acute leukemia, which includes Acute Lymphoblastic Leukemia (ALL) and Acute Myeloid Leukemia (AML), is challenging because gene expression data are high-dimensional: the number of genes far exceeds the number of samples. Machine learning methods are therefore frequently used to analyze such datasets, and hybrid approaches that combine a feature selection algorithm with a classification model have been increasingly employed in recent years to improve classification performance in biomedical applications. In this study, a hybrid machine learning framework combining Least Absolute Shrinkage and Selection Operator (LASSO) regression for feature selection with a Support Vector Machine (SVM) for classification was applied to a publicly available leukemia dataset containing gene expression profiles of 3,571 genes from 72 bone marrow samples (47 ALL, 25 AML). LASSO regression identified the most informative genes, reducing the dataset’s dimensionality, and these genes were then used as input features for the SVM classifier. The classifier achieved an overall accuracy of 93.33%, indicating that the selected genes retain the information needed to distinguish the two subtypes. These findings support the applicability of combining LASSO and SVM in gene expression-based classification tasks.
Key words: Leukemia Classification, LASSO Regression, Support Vector Machine (SVM), Gene Expression Analysis
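
The feature selection step can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names (`soft_threshold`, `lasso_coordinate_descent`, `select_genes`), the toy expression matrix, and the penalty value `lam` are all hypothetical and chosen only for illustration. The sketch fits a LASSO model by cyclic coordinate descent and keeps the genes whose coefficients remain nonzero. In a full pipeline, only those columns would then be passed to the SVM classifier, and in practice an established library would normally be used for both steps.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values in [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100, tol=1e-10):
    """Minimise (1 / 2n) * ||y - Xw||^2 + lam * ||w||_1 (no intercept term)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, because w starts at zero
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of gene j with the residual, adding back its own fit
            rho = sum(v * r for v, r in zip(col, residual)) / n + z * w[j]
            new_w = soft_threshold(rho, lam) / z
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                w[j] = new_w
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return w


def select_genes(X, y, lam):
    """Return the column indices whose LASSO coefficient is nonzero."""
    w = lasso_coordinate_descent(X, y, lam)
    return [j for j, wj in enumerate(w) if wj != 0.0]


# Hypothetical toy data: 8 samples x 4 standardised genes (not the study data).
# Labels are coded +1 for ALL and -1 for AML.
X = [
    [1, 1, 1, 1],
    [1, 1, -1, 1],
    [1, -1, 1, -1],
    [1, -1, -1, -1],
    [-1, 1, 1, -1],
    [-1, 1, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, -1, 1],
]
y = [1, 1, 1, 1, 1, -1, -1, -1]

selected = select_genes(X, y, lam=0.3)  # assumed penalty, not a tuned value
X_reduced = [[row[j] for j in selected] for row in X]  # input for the classifier
print("selected gene indices:", selected)
```

The penalty controls how aggressively the gene set is pruned: a larger `lam` drives more coefficients to exactly zero, and a sufficiently large value removes every gene. In a real analysis the penalty would typically be chosen by cross-validation on the training samples only, so that the reported accuracy is not biased by the selection step.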