A machine-learning-based model for detecting phishing URLs
Subject areas: Network security
1 - Associate Professor, Faculty of Engineering, Islamic Azad University, West Tehran Branch, Tehran, Iran
2 - Department of Computer Engineering, Faculty of Engineering, Islamic Azad University, West Tehran Branch, Tehran, Iran
Keywords: phishing, URL, deep learning, convolutional layer, multi-head self-attention
Abstract:
Phishing attacks have always posed a significant threat to Internet security. One of the most common forms of phishing relies on URLs: attackers disguise fraudulent URLs as legitimate ones to trick users into clicking on them. Machine learning techniques have shown promise for identifying phishing URLs, but their effectiveness varies with the approach used. Objectives: This research proposes two machine learning methods, convolutional neural networks (CNN) and multi-head self-attention (MHSA), for identifying phishing URLs, and evaluates the effectiveness of this approach against other methods and models. Research method: A dataset of URLs was collected and each URL was labeled as phishing or legitimate. Several models built on different machine learning methods, including CNN and MHSA, were used to classify these URLs, and their performance was evaluated with accuracy, precision, recall, and F1 score. Results: The combination of the CNN and MHSA models performs better than the other individual models and reaches an accuracy of 98.3%, a significant improvement in phishing URL identification over existing state-of-the-art methods. Conclusion: Combining CNN and MHSA is an effective approach to detecting phishing URLs. It outperforms existing state-of-the-art methods and offers a more accurate and more reliable way to detect phishing URLs. The results of this study show the potential of hybrid methods for improving the accuracy and reliability of machine-learning-based phishing URL detection.
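The four evaluation criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the label encoding (1 = phishing, 0 = legitimate), the `BinaryMetrics` container, and the `evaluate` function are hypothetical names chosen for this example, and the labels in the usage lines are toy values, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption for this sketch: `positive` marks the phishing class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # phishing URL correctly flagged
        elif pred == positive:
            fp += 1  # legitimate URL wrongly flagged
        elif truth == positive:
            fn += 1  # phishing URL missed
        else:
            tn += 1  # legitimate URL correctly passed

    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy labels for illustration only (1 = phishing, 0 = legitimate).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(evaluate(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters for phishing detection because the two error types have different costs: a false negative lets a phishing URL through, while a false positive blocks a legitimate site.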