Impact of Training Data Size on Model Accuracy: Mediating Role of Algorithm Complexity
Keywords:
Training data size, model accuracy, algorithm complexity, machine learning, mediating effect, overfitting, underfitting, predictive performance, artificial intelligence

Abstract
The rapid growth of machine learning applications has heightened the need to understand how training data size affects model performance. This paper investigates the effect of training data size on model accuracy, with algorithm complexity as a mediating factor. Larger datasets typically yield better predictive performance, but the relationship is not always linear, since a model's structure and complexity affect how effectively it can exploit the available data. The paper adopts a conceptual, analytical approach grounded in the current machine learning literature to characterize algorithm complexity as a mediator between data availability and predictive accuracy. Results indicate that, although more training data enhances generalization, excessively simple algorithms underfit and excessively complex ones overfit, diminishing the benefits of additional training data. The paper concludes that model performance is best when training data size and algorithm complexity are well matched, balancing learning capacity against prediction reliability.
