Introduction
With the continuous development of artificial intelligence (AI), intelligent technology is gradually entering all aspects of production and life. Computers can replace humans in performing a variety of tasks, significantly reducing the dependence on human labor (Wang et al., 2023). This study applies this technology to education, analyzing student behavior in the classroom to evaluate the quality of classroom teaching management. Smart education, one of the inevitable trends of AI, is also gradually moving from theory to campus (Yao & Liu, 2024; Lü & Hou, 2024).
Traditional approaches to assessing classroom attention each have drawbacks. The manual evaluation method relies on teachers to judge student attention levels from classroom performance and, from that, to assess the quality of teaching management. This places high demands on teachers and is limited by their subjective perceptions, which can lead to overburdened teachers, inaccurate evaluations, and inefficiency, problems whose severity depends largely on the teacher-student ratio (Schwartz et al., 2012). The after-school question-and-answer method, which assesses classroom attention through test questions, faces numerous uncertainties that make it difficult to measure classroom learning efficiency accurately. The brainwave detection method (Ismail et al., 2016), which measures attention by monitoring student brainwaves in the classroom through a wearable device, has potential but has not yet been widely used in educational practice because of its high cost and the limitations it imposes on student activities.
To overcome the problems of traditional attention assessment methods, some researchers have begun to apply deep learning techniques from computer vision. Student classroom behavior contains complex and valuable information. In recent years, with the rapid evolution of deep learning theory, student classroom behavior recognition based on deep neural networks has gradually become a research hotspot. It consists mainly of two steps: detecting individual students in the video and classifying the state of each detected student. J. Wang et al. (2024) presented an efficient real-time target detection model, with the speed and accuracy of YOLOv5, that identifies individual students in a video stream and lays the foundation for subsequent behavioral analysis. Bewley et al. (2016) developed the simple online and realtime tracking (SORT) algorithm by combining the Kalman filter with the Hungarian algorithm. Wojke et al. (2017) then proposed the deep simple online and realtime tracking (DeepSORT) algorithm, which integrates a deep appearance model on top of SORT.
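The association step that links detections to existing tracks can be illustrated with a minimal sketch. It is not the implementation used by Bewley et al. (2016) or by this study. It assumes boxes in `(x1, y1, x2, y2)` form, assumes the predicted track boxes have already been produced by a motion model such as a Kalman filter, and replaces the Hungarian algorithm with an exhaustive search that finds the same optimal assignment for a handful of boxes. The function names and the threshold of 0.3 are hypothetical choices for illustration.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(predicted, detections, iou_threshold=0.3):
    """Match predicted track boxes to detections by maximizing total IoU.

    Exhaustive search stands in for the Hungarian algorithm here (an
    assumption for brevity); it is only practical for a few boxes.
    Returns a sorted list of (track_index, detection_index) pairs.
    """
    n_t, n_d = len(predicted), len(detections)
    if n_t == 0 or n_d == 0:
        return []

    best_pairs, best_score = [], -1.0
    if n_t <= n_d:
        # Assign every track to a distinct detection.
        candidates = (list(zip(range(n_t), p)) for p in permutations(range(n_d), n_t))
    else:
        # Assign every detection to a distinct track.
        candidates = (list(zip(p, range(n_d))) for p in permutations(range(n_t), n_d))

    for pairs in candidates:
        score = sum(iou(predicted[t], detections[d]) for t, d in pairs)
        if score > best_score:
            best_pairs, best_score = pairs, score

    # Discard matches whose overlap is too small to be the same student.
    return sorted(
        (t, d) for t, d in best_pairs
        if iou(predicted[t], detections[d]) >= iou_threshold
    )


if __name__ == "__main__":
    tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(associate(tracks, dets))
```

In this example the third detection overlaps no track, so a tracker would typically start a new track for it; a track left without a match would typically be kept for a few frames and then dropped.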
He et al. (2016) presented a deep learning model that performs well in image recognition tasks and can efficiently process student behavior images extracted from video data to capture key features. Graves and Graves (2012) presented an approach suited to time-series modeling that can be used to process and predict video data, analyzing temporal variations in student behavior and identifying patterns of distraction. DeepLab can accurately segment student posture, enabling more detailed behavioral analysis. Although primarily used for natural language processing, the approach by Bao et al. (2021) can also be used to assess the quality of instruction and student engagement by analyzing the language of classroom interactions.