AI-Supported Authentic Assessment in Science Education: Overcoming Logistical Barriers and Enhancing Outcomes

Marwan Abualrob; Deema Ghannam

Keywords:

authentic assessment; generative AI; student engagement; academic achievement; instructional design; teacher-support tools

Abstract

As modern education shifts toward 21st-century skills, practical application often struggles to keep pace. To support the implementation of Authentic Assessment (AA), this study investigates its impact on Grade 9 science students’ achievement and engagement in Palestine, relative to Traditional Assessment (TA), while exploring the role of generative AI as a teacher-support tool. Adopting an explanatory sequential mixed-methods design with a purposive sample of 59 female students, the study combined quantitative testing (Mann-Whitney U) with qualitative thematic analysis of interviews, focus groups, and structured teacher reflections. To ensure rigor and replicability, a prompt engineering strategy was used alongside blind grading to reduce teacher bias. The results indicated that the AA group significantly outperformed the TA group in academic achievement (p = 0.007) while also displaying higher cognitive, behavioral, and emotional engagement. Qualitatively, the findings revealed that students learned concepts more deeply by creating tangible products to present to the class than through memorization. In addition, teacher reflections revealed that implementing AA posed significant logistical and time-related challenges, particularly in rubric construction and instructional planning. Importantly, generative AI helped overcome these obstacles by simplifying the design of rubrics, clarifying performance criteria, and supporting lesson preparation. The study concludes that teacher-mediated, AI-supported AA offers a practical model for enhancing educational quality and student agency and proposes it as a scalable solution for settings with logistical constraints, although further studies are needed to extend these findings beyond this teacher- and gender-specific context.

https://doi.org/10.26803/ijlter.25.5.11
