We live in an era in which time is a precious resource. Handling the vast amounts of data collected from different sources for various purposes therefore requires systems that can process the data quickly enough for the results to remain useful. Giving machine learning and artificial intelligence models access to big data can make them more efficient and robust. In this work, we present a method for building a multi-class classification model using Apache Spark. The model is trained on the CIC-DDoS2019 dataset to detect DDoS attacks. Ensemble modeling was used to improve the accuracy and robustness of the model, while Apache Spark was used to distribute the large volume of training and testing data across the models that make up the intrusion detection model. The proposed model achieved high accuracy (99.94%) while reducing the execution time to almost half of that required without Apache Spark.
Keywords: Ensemble Model, Random Forest (RF), XGBoost (XGB), Apache Spark, PySpark, Big Data, CIC-DDoS2019, DDoS Attacks