Abstract: In practical educational assessment, different tests targeting the same ability usually do not share a common set of anchor items. Equating scores in the absence of anchor items remains a challenge that has not been fully overcome. This study proposes a low-cost solution, based on large language models, for score equating without anchor items. The main principle is to use large language models to construct a linking group between the assessments to be equated, thereby achieving equating. Taking two sets of reading comprehension questions as examples, the study selects GPT3.5, GPT4.0, and iFlytek Spark 3.5 to construct linking groups and compares their equating effectiveness along two dimensions: prompt engineering (zero-shot, one-shot, and few-shot) and the linking group sample size (500 and 1 000). The results show that GPT4.0 performs well in the one-shot and few-shot settings with a large linking group sample, indicating that using large language model agents as a linking group in the absence of anchor items is feasible.
Keywords: test equating; no common item; large language models
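The abstract above does not state which equating method is applied once the linking group has answered both tests. As a minimal sketch, assuming a single-group design and plain linear (mean and standard deviation) equating, the mapping from one form's scale to the other could look like the following. The function name `linear_equate` and the score lists are hypothetical and purely illustrative; they are not taken from the study.

```python
from statistics import mean, pstdev


def linear_equate(x_scores, y_scores):
    """Return a function mapping a Form X score onto the Form Y scale.

    Assumption: the same linking group (for example, simulated LLM
    respondents) answered both forms, so differences in mean and spread
    are attributed to the forms rather than to the group.
    """
    mu_x, mu_y = mean(x_scores), mean(y_scores)
    sd_x, sd_y = pstdev(x_scores), pstdev(y_scores)
    if sd_x == 0:
        raise ValueError("Form X scores have zero variance; cannot equate.")
    slope = sd_y / sd_x
    # Linear equating: match the mean and standard deviation of the two forms.
    return lambda x: mu_y + slope * (x - mu_x)


# Hypothetical linking-group scores on the two forms (illustrative only).
form_x = [10, 12, 14, 16, 18]
form_y = [20, 23, 26, 29, 32]

to_y_scale = linear_equate(form_x, form_y)
print(to_y_scale(15))  # Form Y equivalent of a Form X score of 15
```

Other methods, such as equipercentile or IRT-based equating, would slot into the same place; linear equating is used here only because it is the simplest to show.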
Abstract: Physical fitness is an important part of students’ comprehensive quality, and assessing it accurately helps advance the reform of educational evaluation. Traditional physical fitness evaluation relies mostly on manual experience, which often leads to delayed results and makes large-scale application difficult. To address these problems, this paper proposes an automatic physical fitness assessment model for students that integrates text and video information. First, the model establishes the connotation and index dimensions of physical fitness through literature analysis and expert scoring. Second, to handle complex assessment environments and subject confusion in open settings, a plug-and-play bidirectional feature enhancement approach for moving object detection is proposed: text features are introduced to resolve the ambiguity of visual features, and the region sequence of the assessed subject is accurately inferred. Finally, based on an action recognition network, invalid frames are filtered out of the subject region sequence to obtain the valid frame sequence, computable physical fitness metrics are calculated by analyzing changes in skeletal keypoints, and the physical fitness score is obtained. Tests on a self-constructed dataset for automatic physical fitness assessment show that the combined text and video model processes a 1-minute video in 2.6 s with an average accuracy of 91.22%. The model has been applied to the physical fitness assessment of 2.8 million students with good accuracy and robustness.
Keywords: comprehensive quality of primary and secondary school students; automatic physical fitness assessment; bidirectional feature enhancement; deep learning; multimodal information
YANG Kaifang, XIAO Zhe, WANG Zihao, GONG Yanchao, LÜ Zhanghao
DOI: 10.14188/j.1671-8836.2025.0030
Abstract: Effective assessment of normal university students’ competence in ideological and political education is key to safeguarding the quality of curriculum-based ideological and political reform in basic education. To address the lack of an indicator system for this competence, the shortage of digital and intelligent evaluation methods, and the absence of digital evaluation platforms, this article first constructs an evaluation indicator system for the ideological and political education competence of normal university students and designs a corresponding evaluation questionnaire. Next, a digital evaluation method for this competence is constructed. Based on the evaluation results, a BERT-TextCNN cascaded feature enhancement network (BERT-CFEN) and an educational intelligent agent for analyzing ideological and political education competence are built, respectively. A dual-channel evaluation model is then proposed to enable quantitative assessment of the ideological and political education competence of normal university students. Tests on the constructed AIPEA dataset show that the method achieves an F1 score of 95.45% in identifying educational elements. Finally, a digital platform is built for digital evaluation and visual analysis of this competence, and the evaluation method and platform are applied to normal university students in three different grades, verifying the effectiveness of the proposed method.
Keywords: normal university students; ideological and political education competence; evaluation system; digital evaluation