DEVELOPMENT AND VALIDATION OF A SENIOR HIGH SCHOOL STATISTICS AND PROBABILITY ACHIEVEMENT TEST USING ITEM RESPONSE THEORY

Authors

  • Jerome L. Buhay, Department of Mathematics and Statistics, De La Salle University-Dasmariñas, Cavite, Philippines

DOI:

https://doi.org/10.18173/2354-1075.2026-0029

Keywords:

Item Response Theory (IRT), cognitive assessment, Statistics and Probability, test development, senior high school, 3PL IRT models, Rasch model

Abstract

This study addressed the critical need for sophisticated diagnostic tools in Philippine mathematics education by developing and validating the Senior High School Statistics and Probability Achievement Test (SHSTAT). Validation in this study refers to establishing internal structural evidence and psychometric calibration, assessed through unidimensionality, local independence, and item-model fit. Using a descriptive-developmental research design grounded in Item Response Theory (IRT), the study moved beyond the limitations of Classical Test Theory to provide precise measurement across the ability continuum. The instrument was administered to 1,703 Grade 11 students from public and private schools in Cavite. Results from a Modified Parallel Analysis (MPA) confirmed the instrument's unidimensionality, with all item factor loadings (λ ≥ 0.90) substantially exceeding the 0.30 criterion. Comparative model fitting identified the Three-Parameter Logistic (3PL) model as the best-fitting model for the multiple-choice data, effectively accounting for item discrimination (a), difficulty (b), and pseudo-guessing (c). Iterative refinement produced a 30-item scale that satisfied the assumption of local independence (|Q3| < 0.20) and exhibited excellent item-fit indices (RMSEA < 0.05). In contrast to traditional procedural assessments, the SHSTAT demonstrates high conditional measurement precision: item- and test-level IRT analyses show the Test Information Function (TIF) peaking at 17.5 around θ ≈ 1.3, indicating optimal precision at moderately high proficiency levels. This precision is driven by highly discriminating items, whose Item Characteristic Curves (ICCs) exhibit steep slopes and whose Item Information Functions (IIFs) concentrate information in the range 0.8 < θ < 1.8. Consistently, the Test Characteristic Curve (TCC) displays a pronounced slope within this same range, confirming strong differentiation among higher-ability examinees. This study contributes a psychometrically robust instrument for competitive academic placement and offers educators a reliable and accurate means of assessing students' cognitive skills in Statistics and Probability, supporting data-driven instruction and curriculum refinement.
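
The abstract summarizes the 3PL calibration in terms of item discrimination (a), difficulty (b), and pseudo-guessing (c), and reports precision through item and test information. As a minimal illustrative sketch only (the study's calibration code and item parameters are not published on this page), the following Python snippet computes the standard 3PL item characteristic curve, item information, and Test Information Function for a few hypothetical item parameters, following textbook IRT formulations such as those in de Ayala [11] and Baker & Kim [18]:

import numpy as np

def icc_3pl(theta, a, b, c):
    # 3PL item characteristic curve: P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def iif_3pl(theta, a, b, c):
    # 3PL item information: a^2 * ((P - c)^2 / (1 - c)^2) * ((1 - P) / P)
    p = icc_3pl(theta, a, b, c)
    return (a ** 2) * ((p - c) ** 2 / (1.0 - c) ** 2) * ((1.0 - p) / p)

# Hypothetical (a, b, c) triples for illustration; these are NOT the calibrated SHSTAT item parameters.
items = [(1.8, 1.0, 0.20), (2.1, 1.3, 0.15), (1.6, 1.5, 0.25)]

theta = np.linspace(-4.0, 4.0, 161)                      # ability grid
tif = sum(iif_3pl(theta, a, b, c) for a, b, c in items)  # TIF = sum of item information functions
se = 1.0 / np.sqrt(tif)                                  # conditional standard error of measurement

print(f"TIF peaks at {tif.max():.2f} around theta = {theta[tif.argmax()]:.2f}")

With moderately difficult, highly discriminating items like these assumed ones, information concentrates at moderately high theta, mirroring the pattern described in the abstract; the reported TIF peak of 17.5 at θ ≈ 1.3 comes from the calibrated 30-item SHSTAT, not from this sketch.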

References

[1] Department of Education, (2013). K to 12 Basic Education Curriculum: Senior high school core subject—Statistics and probability (Curriculum guide). Republic of the Philippines. https://www.deped.gov.ph/wp-content/uploads/2022/02/SHS-Core_Statistics-and-Probability-CG.pdf.

[2] Mamba M, Tamayao A & Vecaldo R, (2020). College readiness of Filipino K to 12 graduates: Insights from a criterion-referenced test. International Journal of Education and Practice, 8(4), 625-637. https://doi.org/10.18488/journal.61.2020.84.625.637.

[3] Paat FMG, (2023). School motivation, learning strategies and college readiness of senior high school graduates in the Philippines. Journal for Educators, Teachers and Trainers, 14(3), 1-8. https://jett.labosfor.com/index.php/jett/article/view/1620.

[4] Calma JD, Salvador IGO & Supan AM, (2022). Knowledge and attitude toward statistics and probability of senior high school students. Asia Pacific Journal of Educational Perspective, 9(1). https://research.lpubatangas.edu.ph/wp-content/uploads/2022/09/3-APJEP-2022-38-Calma-et-al..pdf.

[5] Tan SH & Vighnarajah V, (2025). Exploring students’ misconceptions in the probability topic of Form 4 mathematics. Malaysian Journal of Social Sciences and Humanities, 9(S1), 251-262. https://doi.org/10.47405/mjssh.v9iS1.3005.

[6] Sari DP, Suryadi D & Dasari G, (2024). Learning obstacle of probability learning based on the probabilistic thinking level. Journal on Mathematics Education, 15(1), 207-228. https://doi.org/10.22342/jme.v15i1.

[7] Makonye JP & Fakude J, (2016). A study of errors and misconceptions in the learning of addition and subtraction of directed numbers in Grade 8. SAGE Open, 6(4), 1-10. https://doi.org/10.1177/2158244016671375.

[8] Organisation for Economic Co-operation and Development, (2023). PISA 2022 results (Volume I): The state of learning and equity in education. OECD Publishing. https://doi.org/10.1787/53f23881-en.

[9] Batanero C & Álvarez-Arroyo R, (2024). Teaching and learning of probability. ZDM–Mathematics Education, 56, 5-17. https://doi.org/10.1007/s11858-023-01511-5.

[10] Mamolo LA, (2021). Development of an achievement test to measure students’ competency in general mathematics. Anatolian Journal of Education, 6(1), 79-90. https://doi.org/10.29333/aje.2021.616a.

[11] de Ayala RJ, (2022). The theory and practice of item response theory (2nd ed.). Guilford Press.

[12] Bezirhan U & von Davier M, (2024). TIMSS achievement scaling methodology: Item response theory and population models. In von Davier M, Fishbein B & Kennedy AM (Eds.), “TIMSS 2023 technical report: Methods and procedures”. International Association for the Evaluation of Educational Achievement (IEA).

[13] American Educational Research Association, American Psychological Association & National Council on Measurement in Education, (2014). Standards for educational and psychological testing. American Educational Research Association.

[14] Calderon JF & Gonzales EC, (2019). Methods of research and thesis writing. MG Reprographics Supply & Services Inc.

[15] OECD, (2023). PISA 2022 results (Volume I): The state of learning and equity in education. OECD Publishing. https://doi.org/10.1787/53f2383c-en.

[16] DeMars C, (2010). Item response theory. Oxford University Press.

[17] Glorfeld LW, (1995). An improvement on Horn’s parallel analysis methodology for selecting the correct number of factors to retain. Educational and Psychological Measurement, 55(3), 377-393. https://doi.org/10.1177/0013164495055003002.

[18] Baker FB & Kim SH, (2017). The basics of item response theory using R. Springer. https://doi.org/10.1007/978-3-319-54205-8.

[19] Browne MW & Cudeck R, (1993). Alternative ways of assessing model fit. In Bollen KA & Long JS (Eds.), “Testing structural equation models”, p. 136-162. Sage Publications.

[20] Christensen KB, Makransky G & Horton M, (2017). Critical values for Yen’s Q3: Identification of local dependence in the Rasch model using residual correlations. Applied Psychological Measurement, 41(3), 178-194. https://doi.org/10.1177/0146621616677520.

[21] Lim H & Jahng S, (2019). Scale linking for the testlet item response theory model. Educational and Psychological Measurement, 79(6), 1081-1106. https://doi.org/10.1177/0013164419844287.

[22] Kang T & Cohen AS, (2007). IRT model selection methods for dichotomous items. Applied Psychological Measurement, 31(4), 331-358. https://doi.org/10.1177/0146621606292213.

[23] Embretson SE & Reise SP, (2025). Item response theory: Foundations for psychologists and social scientists (2nd ed.). Routledge. https://doi.org/10.4324/9781315726557.

[24] Gyamfi A & Acquaye R, (2023). Parameters and models of item response theory (IRT): A review of literature. Acta Educationis Generalis, 13(3), 68-78. https://doi.org/10.2478/atd-2023-0022.

[25] Zhang S, Wang W & Tao J, (2018). Estimating the 3PL model parameters with the maximum likelihood method and the Bayesian method. Journal of Applied Statistics, 45(12), 2244-2261. https://doi.org/10.1080/02664763.2017.1414163.

[26] Orlando M & Thissen D, (2000). Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24(1), 50-64. https://doi.org/10.1177/01466216000241003.

[27] MacCallum RC, Browne MW & Sugawara HM, (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1(2), 130-149. https://doi.org/10.1037/1082-989X.1.2.130.

[28] Baker FB, (2001). The basics of item response theory (2nd ed.). ERIC Clearinghouse on Assessment and Evaluation. https://eric.ed.gov/?id=ED458219.

Published

2026-03-30

Issue

Section

Educational Sciences: Natural Science

How to Cite

Buhay, J. L. (2026) “DEVELOPMENT AND VALIDATION OF A SENIOR HIGH SCHOOL STATISTICS AND PROBABILITY ACHIEVEMENT TEST USING ITEM RESPONSE THEORY”, Journal of Science Educational Science, 71(2), pp. 101–114. doi:10.18173/2354-1075.2026-0029.
