Research Article

Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı

Year 2021, Volume: 14 Issue: 2, 109 - 119, 22.12.2021
https://doi.org/10.54525/tbbmd.1018465



Genetic Feature Selection Approach in Detection of Web Application Attacks Using Machine Learning Methods


Abstract

Internet-facing applications carry a range of security risks that originate in their code. Weaknesses and vulnerabilities allow attackers to gain direct, public access to databases and steal sensitive data. This study proposes an approach based on heuristic feature selection and machine learning that lets hybrid intrusion detection systems detect web application attacks more easily and more accurately. Web application attack and benign flow samples from the CIC-IDS2017 and CSE-CIC-IDS2018 datasets were passed through a series of preprocessing stages and combined into a new dataset. A Genetic Algorithm paired with Logistic Regression was used to optimize mean squared error and feature count, and the selected features were then tested with five different machine learning algorithms. The results show that the classification success rate remained at around 99% even though the number of features was reduced by 85%.
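To make the idea of a Genetic Algorithm wrapped around Logistic Regression more concrete, the sketch below shows one possible shape of such a search. It is an illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for flow records, the logistic regression is a small hand-rolled gradient-descent fit, the fitness is a weighted sum of hold-out mean squared error and the share of features kept (the weight `alpha` is hypothetical), and tournament selection, one-point crossover and bit-flip mutation are common default operators rather than anything stated in the abstract.

```python
import math
import random


def make_data(n=120, n_features=8, seed=0):
    """Synthetic stand-in for flow records; only columns 0 and 1 carry signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg(X, y, cols, epochs=50, lr=0.5):
    """Logistic regression on the selected columns via batch gradient descent."""
    w, b, n = [0.0] * len(cols), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(cols), 0.0
        for row, t in zip(X, y):
            err = sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols))) - t
            gb += err
            for j, c in enumerate(cols):
                gw[j] += err * row[c]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def fitness(mask, Xtr, ytr, Xte, yte, alpha=0.05):
    """Lower is better: hold-out MSE plus a penalty on the share of features kept."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 2.0  # an empty subset cannot be scored; rank it below any real one
    w, b = fit_logreg(Xtr, ytr, cols)
    mse = sum(
        (sigmoid(b + sum(wj * row[c] for wj, c in zip(w, cols))) - t) ** 2
        for row, t in zip(Xte, yte)
    ) / len(Xte)
    return mse + alpha * len(cols) / len(mask)


def run_ga(X, y, pop_size=10, generations=6, mutation_rate=0.1, seed=1):
    """Search over 0/1 feature masks; returns the best mask and its fitness."""
    rng = random.Random(seed)
    n_features = len(X[0])
    split = int(0.7 * len(X))  # simple 70/30 train/hold-out split
    Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:  # avoid refitting masks already evaluated
            cache[key] = fitness(mask, Xtr, ytr, Xte, yte)
        return cache[key]

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)
        nxt = [pop[0][:], pop[1][:]]  # elitism: carry the two best masks forward
        while len(nxt) < pop_size:
            p1 = min(rng.sample(pop, 3), key=score)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            nxt.append(child)
        pop = nxt
    best = min(pop, key=score)
    return best, score(best)


if __name__ == "__main__":
    X, y = make_data()
    best_mask, best_score = run_ga(X, y)
    print("selected columns:", [i for i, bit in enumerate(best_mask) if bit])
    print("fitness:", round(best_score, 4))
```

Folding both objectives into a single weighted score keeps the sketch short. A study that treats error and feature count as separate objectives would more likely use a multi-objective scheme, and would then evaluate the selected columns with independent classifiers, as the abstract describes.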


Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles (Research)
Authors

Hüseyin Ahmetoğlu 0000-0002-4320-0198

Resul Daş 0000-0002-6113-4649

Publication Date December 22, 2021
Published in Issue Year 2021 Volume: 14 Issue: 2

Cite

APA Ahmetoğlu, H., & Daş, R. (2021). Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı. Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi, 14(2), 109-119. https://doi.org/10.54525/tbbmd.1018465
AMA Ahmetoğlu H, Daş R. Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı. TBV-BBMD. December 2021;14(2):109-119. doi:10.54525/tbbmd.1018465
Chicago Ahmetoğlu, Hüseyin, and Resul Daş. “Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı”. Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi 14, no. 2 (December 2021): 109-19. https://doi.org/10.54525/tbbmd.1018465.
EndNote Ahmetoğlu H, Daş R (December 1, 2021) Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı. Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi 14 2 109–119.
IEEE H. Ahmetoğlu and R. Daş, “Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı”, TBV-BBMD, vol. 14, no. 2, pp. 109–119, 2021, doi: 10.54525/tbbmd.1018465.
ISNAD Ahmetoğlu, Hüseyin - Daş, Resul. “Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı”. Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi 14/2 (December 2021), 109-119. https://doi.org/10.54525/tbbmd.1018465.
JAMA Ahmetoğlu H, Daş R. Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı. TBV-BBMD. 2021;14:109–119.
MLA Ahmetoğlu, Hüseyin and Resul Daş. “Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı”. Türkiye Bilişim Vakfı Bilgisayar Bilimleri Ve Mühendisliği Dergisi, vol. 14, no. 2, 2021, pp. 109-19, doi:10.54525/tbbmd.1018465.
Vancouver Ahmetoğlu H, Daş R. Makine Öğrenmesi Yöntemleri Kullanarak Web Uygulama Saldırılarının Tespitinde Genetik Öznitelik Seçimi Yaklaşımı. TBV-BBMD. 2021;14(2):109-19.


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.