Predicting the Risk of Failure in Higher Education Using Machine Learning Algorithms
DOI:
https://doi.org/10.21714/2238-104X2020v10i2-51124
Abstract
Goal: this research aims to identify students at risk of failing in higher education using Machine Learning (ML) algorithms, based on administrative records from the Universidade Federal da Paraíba (UFPB) and the Plataforma Lattes for the period 2010-2016, for the course Differential and Integral Calculus I. Methodology: the models with the best predictive performance were Ridge, Logistic Regression, LASSO and Elastic Net, with no statistically significant differences in performance among them. Result: after the models were fitted on the training data, they were evaluated on a new data set of 1,903 observations (the test set); the share of students whose status (failed or approved) was correctly predicted (Accuracy) was 69% in both models. In turn, 72% of the students who failed were correctly predicted as failing (Sensitivity). Contribution: these findings confirm that ML algorithms can be viable instruments to support preventive academic management and pedagogical actions aimed at reducing failure rates in higher education.
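As a reading aid for the two metrics reported above, the following is a minimal sketch of how Accuracy and Sensitivity are computed from a confusion matrix, treating "failed" as the positive class. The counts are hypothetical toy values chosen only so that the arithmetic reproduces the reported percentages; they are not the study's confusion matrix, and the function names are illustrative.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all students whose status (failed or approved) was predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp: int, fn: int) -> float:
    """Share of students who actually failed and were predicted as failing."""
    return tp / (tp + fn)


# Hypothetical counts (NOT taken from the study), with "failed" as the positive class.
tp = 72  # failed, predicted failed
fn = 28  # failed, predicted approved
fp = 34  # approved, predicted failed
tn = 66  # approved, predicted approved

acc = accuracy(tp, tn, fp, fn)
sens = sensitivity(tp, fn)

print(f"Accuracy:    {acc:.2f}")
print(f"Sensitivity: {sens:.2f}")
```

With these toy counts, accuracy is 138/200 and sensitivity is 72/100. The two metrics answer different questions: accuracy covers both classes, while sensitivity looks only at the students who actually failed, which is the group a preventive intervention would target.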