MACHINE LEARNING METHODS FOR DETECTING ANOMALIES IN UNIVERSITY DATA
DOI: https://doi.org/10.56525/8w3jgv24
Keywords: machine learning, anomaly detection, university data, isolation forest, autoencoder, LOF, classification algorithms
Abstract
This article develops and comprehensively analyzes machine learning methods for automatically detecting anomalies in heterogeneous university data. The research is motivated by the rapid growth of digital data in the academic environment and by the need to quickly identify atypical behavior patterns that may indicate academic dishonesty, failures in network traffic data, or threats to information security.
The paper systematizes the theoretical foundations of anomaly detection, including the classification of anomalies by type (point, contextual, collective) and an overview of existing detection approaches. The mathematical foundations of three key algorithms are described in detail: the Isolation Forest method, which is based on random partitioning of the feature space; a neural network approach based on autoencoders, which uses the reconstruction error as a measure of anomaly; and the Local Outlier Factor (LOF) algorithm, which evaluates how strongly an object deviates from its local neighborhood.
Experimental studies were conducted on real university data covering academic performance, attendance, LMS activity, and financial transactions. An ensemble approach is proposed that integrates the outputs of all three algorithms through weighted voting. The methods were compared using the Precision, Recall, and F1-score metrics. The results show high anomaly detection accuracy with a low rate of false alarms, which supports the practical applicability of the proposed approach in a real university environment.
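
As a rough illustration of the weighted-voting idea and the evaluation metrics named above, the following minimal Python sketch combines binary anomaly flags from three detectors and scores the result against ground-truth labels. The detector flags, the weights, the 0.5 decision threshold, and the function names are all hypothetical values chosen for illustration; they are not the configuration or the data used in the article.

```python
from typing import Dict, List, Tuple


def weighted_vote(flags: Dict[str, List[int]],
                  weights: Dict[str, float],
                  threshold: float = 0.5) -> List[int]:
    """Combine per-detector binary flags (1 = anomaly) by weighted voting.

    A record is flagged when the weighted share of detectors that voted
    "anomaly" reaches the threshold (assumed value, not from the article).
    """
    total_weight = sum(weights.values())
    n_records = len(next(iter(flags.values())))
    combined = []
    for i in range(n_records):
        score = sum(weights[name] * votes[i] for name, votes in flags.items())
        combined.append(1 if score / total_weight >= threshold else 0)
    return combined


def precision_recall_f1(y_true: List[int],
                        y_pred: List[int]) -> Tuple[float, float, float]:
    """Compute Precision, Recall and F1-score for the anomaly class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy flags for six records (hypothetical, for illustration only).
detector_flags = {
    "isolation_forest": [1, 0, 0, 1, 0, 1],
    "autoencoder":      [1, 0, 1, 0, 0, 1],
    "lof":              [0, 0, 1, 1, 0, 1],
}
# Hypothetical weights; in practice they might be tuned on validation data.
detector_weights = {"isolation_forest": 2, "autoencoder": 2, "lof": 1}

y_true = [1, 0, 0, 1, 0, 1]
y_pred = weighted_vote(detector_flags, detector_weights)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(y_pred, precision, recall, round(f1, 3))
```

In this toy run the third record is flagged by two of the three detectors and becomes a false alarm, which lowers Precision while Recall stays at 1.0. Raising the threshold or reweighting the detectors trades one metric against the other.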




