Detecting and Mitigating Silent Data Corruption Based on Predicting the Corruption Occurrence Rate Without Fault Injection
Subjects:
Intelligent Multimedia Processing and Communication Systems
Mona Yakhchi (1), Mehdi Fazeli (2), Seyed Amir Asghari Touchaei (3)
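1 - PhD student, Department of Computer Engineering, Borujerd Branch, Islamic Azad University, Borujerd, Iran.
2 - Associate Professor, Department of Computer Engineering, School of Information Technology, Halmstad University, Halmstad, Sweden.
3 - Assistant Professor, Department of Computer Engineering, Faculty of Electrical and Computer Engineering, Kharazmi University of Technology, Tehran, Iran.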
Submission date: Friday, 27 Safar 1444
Acceptance date: Friday, 18 Sha'ban 1444
Publication date: Friday, 29 Jumada al-Awwal 1444
Keywords:
fault injection,
multi-bit errors,
soft errors,
machine learning,
silent data corruption
Abstract:
Silent data corruption (SDC) seriously threatens the reliability of a system. Current approaches use machine learning to predict the SDC occurrence rate of each instruction. However, most of them lack adequate accuracy and require a training dataset, which is hard to obtain because producing it consumes substantial resources. Meanwhile, the occurrence rate of multi-bit faults in semiconductor devices has increased significantly, so identifying vulnerable instructions in the presence of faults has become important. The gap in existing research is the lack of a high-accuracy, software-based method that needs no fault injection and that examines fault detection for SDCs originating from both data and instructions. To this end, this study computes the SDC occurrence rate of each instruction and proposes an M5rule decision-tree model. An error detection technique is then applied that duplicates the critical instructions, which are selected by sorting, and finally the proposed model is evaluated on the MiBench benchmark suite with several test programs. The evaluation results show that, compared with other state-of-the-art methods, the proposed method achieves better detection accuracy, with an overhead of about 99 percent for an SDC coverage rate of 58 percent.
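The abstract says only that critical instructions are chosen "by sorting" and then duplicated; it does not give the selection procedure. The sketch below is one plausible reading, not the authors' implementation: it ranks instructions by a predicted SDC rate and greedily selects them under a duplication budget. The `Instruction` fields, the `budget` parameter, and the coverage estimate are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    name: str        # hypothetical instruction label
    sdc_rate: float  # predicted SDC occurrence rate (e.g. from a regression model)
    cost: float      # assumed relative cost of duplicating this instruction


def select_for_duplication(instructions, budget):
    """Pick instructions in descending predicted SDC rate while the budget allows."""
    ranked = sorted(instructions, key=lambda ins: ins.sdc_rate, reverse=True)
    chosen, spent = [], 0.0
    for ins in ranked:
        if spent + ins.cost <= budget:
            chosen.append(ins)
            spent += ins.cost
    return chosen


def estimated_coverage(chosen, instructions):
    """Share of the total predicted SDC rate carried by the chosen instructions."""
    total = sum(ins.sdc_rate for ins in instructions)
    return sum(ins.sdc_rate for ins in chosen) / total if total else 0.0


program = [
    Instruction("load_a", 0.5, 1.0),
    Instruction("add_b", 0.1, 1.0),
    Instruction("mul_c", 0.3, 2.0),
    Instruction("store_d", 0.1, 1.0),
]
protected = select_for_duplication(program, budget=3.0)
print([ins.name for ins in protected], estimated_coverage(protected, program))
```

Raising `budget` trades more duplication overhead for higher estimated coverage, which is the kind of overhead-versus-coverage trade-off the reported results describe.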