MACHINE LEARNING METHODS FOR DETECTING ANOMALIES IN UNIVERSITY DATA

Authors

  • Berik Akhmetov, Yessenov University, Aktau, Kazakhstan
  • Temirlan Zharylgassyn, Yessenov University, Aktau, Kazakhstan

DOI:

https://doi.org/10.56525/8w3jgv24

Keywords:

machine learning, anomaly detection, university data, isolation forest, autoencoder, LOF, classification algorithms

Abstract

This article develops and comprehensively analyzes machine learning methods for automatically detecting anomalies in heterogeneous university data. The research is motivated by the rapid growth of digital data in the academic environment and by the need to quickly identify atypical behavior patterns that may indicate academic dishonesty, failures in network traffic data, or threats to information security.

The paper systematizes the theoretical foundations of anomaly detection, including the classification of anomalies by type (point, contextual, collective) and an overview of existing detection approaches. The mathematical apparatus of three key algorithms is described in detail: the Isolation Forest method, which is based on random partitioning of the feature space; a neural network approach based on autoencoders, which uses the reconstruction error as a measure of anomaly; and the Local Outlier Factor (LOF) algorithm, which evaluates how strongly an object deviates from its local neighborhood.
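To make the idea of deviation from a local neighborhood concrete, the following is a minimal sketch of the LOF score in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the choice of k = 3, and the toy two-dimensional points are all hypothetical, ties in the k-distance are ignored, and the points are assumed to contain no duplicates.

```python
import math


def k_nearest(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points))
        if j != i
    )
    return dists[:k]


def local_outlier_factor(points, k=3):
    """Compute a simplified LOF score for every point (ties in k-distance ignored)."""
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour
    k_dist = [nb[-1][0] for nb in neighbours]

    def local_reachability_density(i):
        # Reachability distance of i from neighbour j: max(k_dist[j], d(i, j))
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        return len(reach) / sum(reach)  # assumes no duplicate points

    lrd = [local_reachability_density(i) for i in range(n)]
    # LOF: average neighbour density divided by the point's own density
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical toy data: five points in a tight cluster and one distant point
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.2, 1.0), (1.0, 1.2), (5.0, 5.0)]
scores = local_outlier_factor(points, k=3)
most_anomalous = scores.index(max(scores))
print(most_anomalous, [round(s, 2) for s in scores])
```

Scores close to 1 indicate a point whose density matches that of its neighbours, while scores well above 1 indicate a point in a much sparser region than its neighbours.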

Experimental studies were conducted on real university data covering academic performance, attendance, activity in learning management systems (LMS), and financial transactions. An ensemble approach is proposed that integrates the results of all three algorithms through weighted voting. The methods were compared using the Precision, Recall and F1-score metrics. The results demonstrate high anomaly detection accuracy with a minimal level of false alarms, which confirms the practical applicability of the proposed approach in a real university environment.
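The weighted-voting step and the evaluation metrics can be sketched as follows. This is a minimal illustration, not the ensemble described in the paper: the weights, the threshold of 0.5, the assumption that each detector's score is already normalized to [0, 1], and all scores and labels below are hypothetical values chosen only to show the arithmetic.

```python
def weighted_vote(scores_by_detector, weights, threshold=0.5):
    """Combine per-detector anomaly scores in [0, 1] into binary anomaly flags."""
    total = sum(weights.values())
    n = len(next(iter(scores_by_detector.values())))
    # Weighted average of the detector scores for each record
    combined = [
        sum(weights[name] * scores_by_detector[name][i] for name in weights) / total
        for i in range(n)
    ]
    flags = [1 if s >= threshold else 0 for s in combined]
    return flags, combined


def precision_recall_f1(y_true, y_pred):
    """Compute Precision, Recall and F1-score for the anomaly class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical normalized scores for five records from the three detectors
scores_by_detector = {
    "isolation_forest": [0.9, 0.2, 0.7, 0.1, 0.4],
    "autoencoder": [0.8, 0.1, 0.3, 0.2, 0.9],
    "lof": [0.7, 0.3, 0.6, 0.1, 0.2],
}
# Hypothetical weights; the abstract does not state the values used
weights = {"isolation_forest": 0.4, "autoencoder": 0.4, "lof": 0.2}
y_true = [1, 0, 1, 0, 0]  # hypothetical ground-truth labels (1 = anomaly)

flags, combined = weighted_vote(scores_by_detector, weights)
precision, recall, f1 = precision_recall_f1(y_true, flags)
print(flags, round(precision, 3), round(recall, 3), round(f1, 3))
```

In this toy case the last record is flagged because one detector gives it a very high score. That produces a false alarm, which lowers Precision while Recall stays at 1.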

Published

2026-05-29