News on current events and specialized topics now reaches audiences worldwide through newspapers, social media, and broadcast channels, and the surge in online content has made it increasingly difficult to distinguish fact from fiction. The proliferation of fake news is a significant obstacle to ensuring that consumers receive accurate information. To address this issue, this study investigates the efficacy of machine learning models in classifying news as genuine or fake, using a dataset sourced from Kaggle that comprises 23,481 fake news records and 21,417 real news records. The study employs Random Forest (RF), Linear Regression (LR), and Decision Tree (DT) classifiers together with four feature selection strategies, including Feature Significance, and identifies the two most influential features out of five. Experimental results demonstrate that the proposed models outperform existing techniques in classification accuracy. Additionally, SHAP (SHapley Additive exPlanations), an explainable AI approach, is used to interpret the models' decisions and to highlight the features that most influence classification outcomes. This approach improves the understanding of fake news classification and underscores the need for robust methodologies to combat misinformation in the digital age.
Keywords: Online fake news, Text classification, Machine learning, Fake news, Social media