Handling Overfitting in Machine Learning

Indie Quant
6 min read · Dec 6, 2024

Overfitting is one of the key challenges in machine learning. It occurs when a model performs significantly better on training data than on unseen data, which implies that instead of learning general patterns, the model has learned the noise or anomalies specific to the training data. In this article, we will learn how to detect and prevent overfitting.

What is overfitting?

[Figure: a wavy green line and a black line drawn through the same data. Source: Wikipedia]

As shown in the graph above, if a model learns patterns specific to the training data, such as the wavy green line in this case, there is a good chance that it will not perform as well on test data, because the pattern is not general enough. Now take the black line: it is far more likely to represent a general trend. In bias-variance terms, the green line has low bias on the training data but high variance, meaning it would change substantially if the model were trained on a different sample.
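One way to see why a wavy fit generalizes poorly is to refit a simple and a flexible model on many freshly drawn training sets and measure how much their predictions move around. The sketch below is only an illustration in a regression setting, not the setup behind the figure: the sine-shaped ground truth, the noise level, and the polynomial degrees (1 and 5) are all assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Assumed ground-truth trend, chosen only for this illustration.
    return np.sin(2 * np.pi * x)

def sample_training_set(n=30, noise=0.3):
    # Draw a fresh noisy training set from the same underlying trend.
    x = rng.uniform(0.0, 1.0, n)
    y = true_fn(x) + rng.normal(0.0, noise, n)
    return x, y

# Fixed points at which every refitted model is evaluated.
x_grid = np.linspace(0.05, 0.95, 50)

def bias_and_variance(degree, n_repeats=200):
    preds = np.empty((n_repeats, x_grid.size))
    for i in range(n_repeats):
        x, y = sample_training_set()
        coeffs = np.polyfit(x, y, degree)        # least-squares polynomial fit
        preds[i] = np.polyval(coeffs, x_grid)
    mean_pred = preds.mean(axis=0)
    # Squared bias: how far the average prediction is from the truth.
    bias_sq = float(np.mean((mean_pred - true_fn(x_grid)) ** 2))
    # Variance: how much predictions change from one training set to the next.
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

simple_bias, simple_var = bias_and_variance(degree=1)
flexible_bias, flexible_var = bias_and_variance(degree=5)

print(f"degree 1: bias^2={simple_bias:.4f}, variance={simple_var:.4f}")
print(f"degree 5: bias^2={flexible_bias:.4f}, variance={flexible_var:.4f}")
```

Under these assumptions the flexible polynomial tracks the underlying trend more closely on average, but its predictions vary more from one training sample to the next. That sensitivity to the particular sample is what hurts performance on unseen data.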

How to Identify Overfitting?

1. Performance on training set & validation/test set
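A minimal sketch of this check, assuming a synthetic two-class dataset and a small k-nearest-neighbours classifier written in NumPy. Both are illustrative choices rather than anything prescribed by the article, and the helper names `make_data`, `knn_predict`, and `accuracy` are hypothetical. The idea is to score the same model on the data it was trained on and on held-out data, then look at the gap.

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n):
    # Two overlapping Gaussian blobs, so some label noise is unavoidable.
    X0 = rng.normal(loc=[-1.0, 0.0], scale=1.0, size=(n // 2, 2))
    X1 = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(n // 2, 2))
    X = np.vstack([X0, X1])
    y = np.concatenate([np.zeros(n // 2, dtype=int), np.ones(n // 2, dtype=int)])
    return X, y

def knn_predict(X_train, y_train, X_query, k):
    # Squared Euclidean distance from every query point to every training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]      # indices of the k closest points
    votes = y_train[nearest].mean(axis=1)        # fraction of neighbours in class 1
    return (votes > 0.5).astype(int)             # majority vote (k is odd, so no ties)

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

X_train, y_train = make_data(200)
X_val, y_val = make_data(200)   # held-out data from the same distribution

results = {}
for k in (1, 15):
    train_acc = accuracy(y_train, knn_predict(X_train, y_train, X_train, k))
    val_acc = accuracy(y_val, knn_predict(X_train, y_train, X_val, k))
    results[k] = (train_acc, val_acc)
    print(f"k={k:2d}  train={train_acc:.3f}  val={val_acc:.3f}  gap={train_acc - val_acc:.3f}")
```

With k=1 the classifier memorizes the training set, so its training accuracy is perfect while its validation accuracy is noticeably lower. A large gap of this kind is the signal to look for. With a larger k the two scores should sit much closer together.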


Written by Indie Quant

Passionate About Data Science, Programming & Finance; Also Available at: https://indiequant.data.blog
