The Risk of the Metropolis-Hastings Robbins-Monro Algorithm in Multidimensional Item Response Theory Models for Dichotomous Data, Considering Test Length
Subject areas: Research
Mehdi Molaei Yasavoli 1, Ali Delavar 2, Mohammad Asgari 3, Jalil Younesi 4, Vahid Rezaei Tabar 5
1 - PhD student in Assessment and Measurement, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran.
2 - Full Professor, Department of Assessment and Measurement, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran.
3 - Member of the Department of Assessment and Measurement, Faculty of Psychology and Educational Sciences.
4 - Department of Assessment and Measurement, Faculty of Psychology and Educational Sciences, Allameh Tabataba'i University, Tehran, Iran.
5 - Associate Professor, Department of Statistics, Faculty of Statistics, Mathematics and Computer Science, Allameh Tabataba'i University, Tehran, Iran.
Keywords: MHRM algorithm, risk, test length, multidimensional item response theory models, dichotomous data
Abstract:
This study examined the risk of the Metropolis-Hastings Robbins-Monro (MHRM) algorithm in multidimensional item response theory models for dichotomous data, across different numbers of dimensions and test lengths. The research used a true experimental, multi-group posttest-only design. Data were generated by simulation under 27 conditions defined by the independent variables, with 100 replications per condition. The model was the multidimensional two-parameter logistic model, and the parameters examined were the item slopes and difficulties. The risk of each parameter estimate under each experimental condition was measured with the mean squared error (MSE). Data were generated and analyzed in R using the mirt, interactions, car, and psych packages. The results showed that the MHRM algorithm has lower risk than the EM and MCEM algorithms. This advantage was most evident with high-dimensional data (5 dimensions) and a short test (15 items). The results also showed that the risk of parameter estimation increases significantly as the number of dimensions increases and the test length decreases. It can therefore be concluded that the MHRM algorithm is necessary for data with many dimensions and a short test length, and researchers are advised to use it when analyzing data with a complex structure, such as a high number of dimensions.
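The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code (the study used R and the mirt package); it is a standard-library Python illustration that assumes the slope-intercept form of the multidimensional two-parameter logistic model, P(x = 1 | θ) = 1 / (1 + exp(-(a·θ + d))), where the multidimensional difficulty is usually obtained as -d / ‖a‖. The function names, the sample size of 200, the parameter ranges, and the placeholder estimates are all hypothetical; only the 5 dimensions and 15 items come from the abstract.

```python
import math
import random


def m2pl_prob(theta, a, d):
    """Probability of a correct (1) response under the multidimensional 2PL
    model in slope-intercept form: logistic(a . theta + d)."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def simulate_responses(thetas, slopes, intercepts, rng):
    """Draw one dichotomous (0/1) response for every person-item pair."""
    return [
        [1 if rng.random() < m2pl_prob(theta, a, d) else 0
         for a, d in zip(slopes, intercepts)]
        for theta in thetas
    ]


def mean_squared_error(estimates, true_value):
    """Risk index: average squared deviation of the replicated estimates
    from the generating (true) parameter value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


rng = random.Random(2024)  # fixed seed so the sketch is reproducible

# One condition: 5 dimensions and 15 items (from the abstract).
# The sample size and the parameter ranges below are assumptions.
n_dims, n_items, n_persons = 5, 15, 200
slopes = [[rng.uniform(0.5, 2.0) for _ in range(n_dims)] for _ in range(n_items)]
intercepts = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
thetas = [[rng.gauss(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_persons)]

data = simulate_responses(thetas, slopes, intercepts, rng)

# Placeholder estimates of one slope whose true value is 1.0. In a real study
# these would come from fitting the model to each replicated data set.
placeholder_estimates = [1.1, 0.9, 1.3, 0.8]
mse = mean_squared_error(placeholder_estimates, 1.0)
```

In a full study, the generate-and-estimate step would be repeated 100 times in each of the 27 conditions, and the MSE of each slope and difficulty would then be compared across estimation algorithms.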