OVERFITTING AND UNDERFITTING
Overfitting
- Overfitting is a modeling error which occurs when a model learns its training data too closely, including the noise in it. An overfitted model performs well on its training data but poorly on new, unseen data.
Handling Overfitting:
- Cross-validation
- This is done by splitting your dataset into ‘train’ data and ‘test’ data. Build the model using the ‘train’ set; the held-out ‘test’ set is then used for validation on data the model has never seen. Because you know the expected outputs for the test set, you can easily judge how well your model generalizes. Full k-fold cross-validation repeats this split so that every point is used for validation once (see the sketch after this list).
- Regularization
- This is a form of regression that regularizes, or shrinks, the coefficient estimates towards zero. This technique discourages learning a more complex model; common variants are L1 (lasso) and L2 (ridge) regularization (see the sketch after this list).
- Early stopping
- When training a learner with an iterative method, you stop the training process before the model has fully converged on the training set, typically as soon as the error on a held-out validation set stops improving. This prevents the model from memorizing the training data (see the sketch after this list).
- Pruning
- This technique applies to decision trees.
- Pre-pruning: stop ‘growing’ the tree early, before it perfectly classifies the training set.
- Post-pruning: allow the tree to ‘grow’ and perfectly classify the training set, then prune it back (see the sketch after this list).
- Dropout
- This is a technique for neural networks where randomly selected neurons are ignored during training, so the network cannot come to rely too heavily on any single neuron (see the sketch after this list).
- Regularize the weights (a direct application of the regularization idea above).
- Remove irrelevant input features.
- Remove outliers or anomalies (see the sketch after this list).
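Below are short, self-contained sketches of each technique, assuming scikit-learn (and, for dropout, Keras) is available; the datasets and model choices are stand-ins for illustration, not part of any particular recipe. First, hold-out validation and k-fold cross-validation:

    # Hold-out split plus 5-fold cross-validation with scikit-learn.
    # The Iris dataset and logistic regression are placeholder choices.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score, train_test_split

    X, y = load_iris(return_X_y=True)

    # Hold-out: train on 80% of the data, judge accuracy on the unseen 20%.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print("hold-out accuracy:", model.score(X_test, y_test))

    # 5-fold cross-validation: every point is used for validation exactly once.
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
    print("cross-validation accuracy:", scores.mean())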
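Next, regularization, shown here as L2 (ridge) regression; the synthetic data is a made-up example where only the first feature actually matters:

    # L2 (ridge) regularization shrinks coefficients towards zero.
    # alpha controls the strength; a larger alpha means a simpler model.
    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 10))
    y = X[:, 0] + 0.1 * rng.normal(size=50)  # only feature 0 is relevant

    plain = LinearRegression().fit(X, y)
    ridge = Ridge(alpha=10.0).fit(X, y)

    # The ridge coefficients are pulled towards zero relative to plain OLS.
    print("OLS coefficients:  ", np.abs(plain.coef_).round(3))
    print("Ridge coefficients:", np.abs(ridge.coef_).round(3))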
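Early stopping, sketched with scikit-learn's SGDClassifier, which can hold out part of the training data internally and stop once the validation score stagnates:

    # Early stopping: training halts once the score on an internal
    # validation split stops improving for n_iter_no_change epochs.
    from sklearn.datasets import load_iris
    from sklearn.linear_model import SGDClassifier

    X, y = load_iris(return_X_y=True)
    model = SGDClassifier(
        early_stopping=True,      # hold out part of the training data
        validation_fraction=0.2,  # size of that internal validation set
        n_iter_no_change=5,       # patience before stopping
        max_iter=1000,
        random_state=0,
    ).fit(X, y)
    print("stopped after", model.n_iter_, "of 1000 possible iterations")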
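Pre- and post-pruning of a decision tree; in scikit-learn, pre-pruning maps to growth limits such as max_depth, and post-pruning to minimal cost-complexity pruning via ccp_alpha:

    # Pruning a decision tree with scikit-learn.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    # Pre-pruning: cap the depth so the tree stops growing early.
    pre = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5).fit(X, y)

    # Post-pruning: grow the tree fully, then prune it back with
    # cost-complexity pruning (a larger ccp_alpha prunes more).
    post = DecisionTreeClassifier(ccp_alpha=0.02).fit(X, y)

    full = DecisionTreeClassifier().fit(X, y)
    print("unpruned leaves:   ", full.get_n_leaves())
    print("pre-pruned leaves: ", pre.get_n_leaves())
    print("post-pruned leaves:", post.get_n_leaves())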
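Dropout, sketched with Keras (assuming TensorFlow is installed); the layer sizes and the 20-feature input are arbitrary illustration choices:

    # Each Dropout layer randomly zeroes 50% of the previous layer's
    # activations during training only; at inference all neurons are used.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),  # ignore half the neurons each training step
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    model.summary()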
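Finally, a simple z-score rule for removing outliers; the cut-off of 2 standard deviations is an arbitrary choice for this toy example:

    # Drop values lying more than 2 standard deviations from the mean.
    import numpy as np

    X = np.array([1.0, 1.2, 0.9, 1.1, 15.0, 1.05, 0.95])  # 15.0 is an outlier
    z = np.abs((X - X.mean()) / X.std())
    print("kept:", X[z < 2])  # 15.0 is removed, the rest survive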
Underfitting
- Underfitting is a modeling error which occurs when a function does not fit the data points well enough. It is the result of a model that is too simple, or of an insufficient number of training points. A model that is underfit is inaccurate because the trend it captures does not reflect the reality of the data.
Handling Underfitting:
- Get more training data.
- Increase the size or number of parameters in the model.
- Increase the complexity of the model, for example by adding features or using a more flexible model class (see the sketch below).
- Increase the training time, until the cost function is minimized.
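As a minimal sketch of the ‘increase complexity’ fix, assuming scikit-learn: a straight line underfits quadratic data, while adding polynomial features gives the model enough capacity to fit it:

    # A linear model underfits quadratic data; polynomial features fix it.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.default_rng(0)
    X = rng.uniform(-3, 3, size=(100, 1))
    y = X[:, 0] ** 2 + 0.1 * rng.normal(size=100)  # quadratic relationship

    simple = LinearRegression().fit(X, y)
    richer = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

    print("linear model R^2:   ", round(simple.score(X, y), 3))  # underfits
    print("quadratic model R^2:", round(richer.score(X, y), 3))  # fits well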
With these techniques, you should be able to improve your models and correct any overfitting or underfitting issues.