HSE researchers have proposed a new neural network method for recognizing people's emotions and engagement. The algorithms analyze video images of faces and are more accurate than known counterparts. The developed models are suitable for low-performance equipment, including mobile devices. The results of the work can be implemented in teleconferencing and online training systems to analyze the engagement and emotions of the participants. The results of the study are published in IEEE Transactions on Affective Computing.
The COVID-19 pandemic has led to the active development of online video conferencing tools and e-learning systems. Artificial intelligence technologies can help teachers remotely monitor how engaged the participants of an event are. Algorithms that analyze student behavior and identify engagement in the online environment are now being studied by specialists in educational data mining. Among the analysis tools, the most popular are automatic methods based on computer vision. In particular, it is believed that recognizing participants' emotions and engagement from video can greatly improve the quality of many e-learning systems.
As part of the HSE Center for Artificial Intelligence project “Neural network algorithms to analyze the dynamics of emotional state and student engagement based on video surveillance data”, scientists have developed a new neural network algorithm to recognize emotions and engagement from video images of faces.
The scientists taught the neural network to extract the characteristic features of emotions, using a special “robust” way of training the network and processing only the most important areas of the face. The method works in several steps. First, faces are detected and their characteristic features are extracted, after which the faces of each participant are grouped together. Next, specially trained efficient neural network models extract the emotional features of each selected person; these features are aggregated using statistical functions and classified. In the final phase, fragments of the video lesson with the most pronounced emotions and with different degrees of engagement of each listener are visualized. As a result, the researchers created a new model that, for several faces in a video, determines each person's emotions and degree of engagement at the same time.
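The grouping, aggregation and classification steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the researchers' implementation: the function names, the choice of statistics (mean, standard deviation, minimum, maximum), the toy two-dimensional feature vectors and the linear classifier are all hypothetical. In the actual system the per-frame features would come from the trained neural network.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_by_participant(detections):
    """Group (participant_id, feature_vector) pairs into per-participant tracks."""
    tracks = defaultdict(list)
    for participant_id, features in detections:
        tracks[participant_id].append(features)
    return dict(tracks)


def aggregate_track(frame_features):
    """Collapse per-frame feature vectors into one video-level descriptor.

    For every feature dimension, compute the mean, population standard
    deviation, minimum and maximum across frames, then concatenate them.
    The set of statistics is an assumption made for this sketch.
    """
    if not frame_features:
        raise ValueError("a track needs at least one frame")
    columns = list(zip(*frame_features))  # one tuple per feature dimension
    descriptor = []
    for stat in (mean, pstdev, min, max):
        descriptor.extend(stat(column) for column in columns)
    return descriptor


def classify(descriptor, weights, labels):
    """Hypothetical linear classifier: return the label with the top score."""
    scores = [sum(w * x for w, x in zip(row, descriptor)) for row in weights]
    return labels[scores.index(max(scores))]


if __name__ == "__main__":
    # Toy per-frame features; real ones would be neural network embeddings.
    detections = [
        ("participant_1", [0.9, 0.1]),
        ("participant_2", [0.2, 0.8]),
        ("participant_1", [0.7, 0.3]),
        ("participant_2", [0.4, 0.6]),
    ]
    labels = ["engaged", "distracted"]
    # Made-up weights that look only at the two mean values.
    weights = [
        [1, 0, 0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0],
    ]
    for pid, frames in group_by_participant(detections).items():
        print(pid, classify(aggregate_track(frames), weights, labels))
```

Aggregating with simple statistics gives a fixed-length descriptor regardless of how many frames a participant appears in, which is one reason such functions are a common choice for video-level classification.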
“For various data sets, we have shown that the proposed algorithms are more accurate than known counterparts. At the same time, unlike most known technologies, the developed models can process video in real time even on low-performance equipment, including the mobile devices of each participant in an online event,” comments the project leader, Andrey Savchenko, professor of the Department of Information Systems and Technologies at the Higher School of Economics in Nizhny Novgorod. “Together with Ilya Makarov of the Research Institute of Artificial Intelligence (AIRI), we have created a fairly easy-to-use computer program that processes a video recording of a webinar or online class and produces a set of video clips with the most pronounced emotions of each participant.”
The results of the work can be implemented in teleconferencing and online training systems to analyze the engagement and emotions of the participants. For example, during the preliminary test of an online course, the listeners' reactions can show which parts of the lecture were more interesting and which were difficult to understand and need to be adjusted. The possibility of integrating the developed models into the Jazz by Sber videoconferencing service is currently being investigated. There are also plans to label video data in order to improve the accuracy of analyzing participant behavior in online events.
HSE Press Office