Investigating machine and deep-learning model combinations for a two-stage IDS for IoT networks.
Van der Walt, André.
MetadataShow full item record
By 2025, there will be upwards of 75 billion IoT devices connected to the internet. Notable security incidents have shown that many IoT devices are insecure or misconfigured, leaving them vulnerable, often with devastating results. AI’s learning, adaptable and flexible nature can be leveraged to provide networking monitoring for IoT networks. This work proposes a novel two-stage IDS, using layered machine- and deep-learning models. The applicability of seven algorithms is investigated using the BoT-IoT dataset. After replicating four algorithms from literature, modifications to these algorithms' application are then explored along with their ability to classify in three scenarios: 1) binary attack/benign, 2) multi-class attack with benign and 3) multi-class attack only. Three additional algorithms are also considered. The modifications are shown to achieve higher F1-scores by 22.75% and shorter training times by 35.68 seconds on average than the four replicated algorithms. Potential benefits of the proposed two-stage system are examined, showing a reduction of threat detection/identification time by 0.51s on average and an increase of threat classification F1-score by 0.05 on average. In the second half of the dissertation, algorithm combinations, layered in the two-stage system, are investigated. To facilitate comparison of time metrics, the classification scenarios from the first half of the dissertation are re-evaluated on the test PC CPU. All two-stage combinations are then tested. The results show a CNN binary classifier at stage one and a KNN 4-Class model at stage two performs best, outperforming the 5-Class (attack and benign) system of either algorithm. This system's first stage improves upon the 5-Class system's classification time by 0.25 seconds. The benign class F1-score is improved by 0.23, indicating a significant improvement in false positive rate. The system achieves an overall F1-score of 0.94. This shows the two-stage system would perform well as an IDS. Additionally, investigations arising from findings during the evaluation of the two-stage system are presented, namely GPU data-transfer overhead, the effect of data scaling and the effect of benign samples on stage two, giving a better understanding of how the dataset interacts with AI models and how they may be improved in future work.