Hyperparameter Optimization Techniques
- The process of finding the best-performing hyperparameters for a machine learning algorithm is called hyperparameter optimization.
- Common algorithms include:
- Grid Search
- Random Search
Grid search
- Grid search is the traditional technique for tuning hyperparameters. It evaluates every combination of the candidate values by brute force, and a validation technique such as cross-validation scores each combination so that the chosen model captures most of the patterns in the dataset.
- Grid search is simple to use, but it suffers from the curse of dimensionality: the number of combinations grows multiplicatively with every added hyperparameter, so the search becomes expensive when the hyperparameter space is high-dimensional. This matters because the performance of the entire model depends on the hyperparameter values chosen.
- Python implementation of grid search for the KNN algorithm using scikit-learn's GridSearchCV:
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the original does not say which dataset is used,
# so the built-in Iris dataset stands in here.
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# List the hyperparameters that we want to tune
leaf_size = list(range(1, 50))
n_neighbors = list(range(1, 30))
p = [1, 2]
# Convert to a dictionary
hyperparameters = dict(leaf_size=leaf_size, n_neighbors=n_neighbors, p=p)
# Create a new KNN object
knn = KNeighborsClassifier()
# Use grid search with 10-fold cross-validation
clf = GridSearchCV(knn, hyperparameters, cv=10)
# Fit the model (fit returns the fitted GridSearchCV object)
best_model = clf.fit(X_train, y_train)
# Print the best hyperparameter values
print('Best leaf_size:', best_model.best_estimator_.get_params()['leaf_size'])
print('Best p:', best_model.best_estimator_.get_params()['p'])
print('Best n_neighbors:', best_model.best_estimator_.get_params()['n_neighbors'])
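To make the cost of brute force concrete, here is a minimal sketch that enumerates the same grid by hand and counts the work involved. The helper name enumerate_grid is hypothetical and is not part of scikit-learn; the figure of 10 fits per combination assumes cv=10 as in the example above and ignores the final refit on the full training set.

```python
from itertools import product

# Same candidate values as the grid search example above.
param_grid = {
    "leaf_size": list(range(1, 50)),    # 49 values
    "n_neighbors": list(range(1, 30)),  # 29 values
    "p": [1, 2],                        # 2 values
}


def enumerate_grid(grid):
    """Yield every combination in the grid as a dict (hypothetical helper)."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


combinations = list(enumerate_grid(param_grid))
n_combinations = len(combinations)  # 49 * 29 * 2
n_fits = n_combinations * 10        # one fit per fold with cv=10

print("Combinations:", n_combinations)
print("Model fits with 10-fold CV:", n_fits)
```

Adding one more hyperparameter with k candidate values multiplies both numbers by k, which is why grid search scales poorly as the number of hyperparameters grows.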
Random search
- Random search tries random combinations of hyperparameter values, each drawn from a specified range, to find the best configuration for the model.
- To optimize with random search, the objective function is evaluated at a fixed number of random configurations in the parameter space, rather than at every possible combination.
from sklearn.model_selection import RandomizedSearchCV
# List the hyperparameters that we want to tune
leaf_size = list(range(1, 50))
n_neighbors = list(range(1, 30))
p = [1, 2]
# Convert to a dictionary
hyperparameters = dict(leaf_size=leaf_size, n_neighbors=n_neighbors, p=p)
# Apply random search (n_iter defaults to 10 sampled combinations).
# X_train, y_train and KNeighborsClassifier are reused from the grid search example above.
random_search_knn = RandomizedSearchCV(
    KNeighborsClassifier(), hyperparameters, cv=10, random_state=42
)
random_search_knn.fit(X_train, y_train)
# Print the best hyperparameter values found by the random search
print('Best leaf_size:', random_search_knn.best_estimator_.get_params()['leaf_size'])
print('Best p:', random_search_knn.best_estimator_.get_params()['p'])
print('Best n_neighbors:', random_search_knn.best_estimator_.get_params()['n_neighbors'])
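The idea behind random search can also be sketched without any library. The following is an illustrative sketch, not how RandomizedSearchCV is implemented: random_search and toy_score are hypothetical names, and toy_score is a made-up stand-in for a cross-validated score. For simplicity the sketch samples with replacement, so the same configuration may be drawn twice.

```python
import random

# Same candidate values as the examples above.
param_space = {
    "leaf_size": list(range(1, 50)),
    "n_neighbors": list(range(1, 30)),
    "p": [1, 2],
}


def toy_score(params):
    """Hypothetical objective: highest (0) at n_neighbors=7 and p=2."""
    penalty = 0 if params["p"] == 2 else 1
    return -abs(params["n_neighbors"] - 7) - penalty


def random_search(space, score_fn, n_iter, seed=0):
    """Evaluate score_fn at n_iter random configurations and keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly.
        candidate = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(candidate)
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


best_params, best_score = random_search(param_space, toy_score, n_iter=20)
print("Best configuration found:", best_params)
print("Best score:", best_score)
```

Here only 20 configurations are evaluated out of 2842 possible combinations. The result is not guaranteed to be the global optimum, but the cost is fixed by n_iter rather than by the size of the grid.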