Selecting effective and significant features for a Hidden Markov Model (HMM) is very important for detecting anomalies in databases. The goal of this research is to identify the most salient features for building an HMM. To improve the performance of the HMM, a feature pruning approach is proposed. This approach is effective in detecting and classifying anomalies, and it is simple and easy to implement. It also reduces computational complexity and running time without compromising model accuracy. In this work, the proposed approach is applied to the NSL-KDD (the new version of KDD Cup 99), DDoS, IoTPOT and UNSW_NB15 data sets. These data sets are used to perform a comparative study between the full feature set and a subset of significant features. The experimental results show that the reduced feature set, obtained by eliminating irrelevant, redundant, or noisy features, yields better efficiency, higher accuracy, and a lower false positive rate.
Key words: Hidden Markov Models; NSL-KDD; DDoS; UNSW_NB15; IoTPOT.