October 7, 2022


This article was published as a part of the Data Science Blogathon.


In this article, we will train a classification model using the feature-extraction-plus-classification approach: we first extract the relevant features from each image and then feed those feature vectors to a machine learning classifier to perform the final classification.

We will extract features from a pre-trained ResNet model, use those features to train a multiclass SVM classifier on the STL-10 dataset, and then evaluate the accuracy, confusion matrix, and ROC curve on that dataset.


What is ResNet-50?
ResNet-50 is a convolutional neural network with 50 layers. A version of the network pre-trained on more than a million images from the ImageNet database is available. This pre-trained network can classify images into 1000 object categories, including many animals as well as everyday objects such as a keyboard, a mouse, and a pencil.

Image source: www.analyticssteps.com

You can download the entire .ipynb code file used in this article here. The code is well commented, so you can understand it easily.

Dataset for Image Classification

In this section, we will discuss some basic information about datasets:

1. The STL-10 dataset is an image recognition dataset that can be used to develop algorithms for unsupervised feature learning, deep learning, and self-taught learning.

2. There are 10 classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, and truck.

3. Images are 96×96 pixels in size and color.

4. Each class has 500 training images (10 pre-defined folds) and 800 test images.

5. There are 100,000 unlabeled images for unsupervised learning. These examples are drawn from a similar but broader distribution of images: in addition to the animals and vehicles in the labeled set, they include other animals (bears, rabbits, etc.) and vehicles (trains, buses, etc.).


6. Images were collected from tagged instances of ImageNet.

In code, we can load the dataset with torchvision using the snippet below:

from torchvision.datasets import STL10

trainset = STL10('/content', split='train', transform=transform, download=True)
testset = STL10('/content', split='test', transform=transform, download=True)

Model Training for Image Classification

This section will discuss the entire machine learning pipeline for classifying different classes of STL-10 datasets.

Steps to extract features from a pre-trained ResNet model:
1. The ResNet-50 model is trained on the ImageNet classification dataset.
2. The pre-trained ResNet-50 model is downloaded through the PyTorch framework.
3. The features obtained from the last fully connected layer are used to train a multiclass SVM classifier.
4. Data loaders are used to load the training and test datasets.
5. Features are extracted using the model and the loaded data.
6. The extracted features are visualized.
7. Each image from the training and test sets is fed through the network, and each embedding is saved.
8. The images are fed into the pre-trained ResNet-50 with its weights frozen via param.requires_grad = False.
9. We load the model after that.
10. The train and test features are scaled with StandardScaler; the two sets are transformed independently of each other.

Data Visualization:
We print one of the images from the dataset to see what kind of images we have on hand before applying the algorithms:
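One minimal way to display a sample is sketched below; `trainset` would come from the loading snippet above, and here a random tensor stands in for a real sample so the sketch runs without downloading the dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import torch

classes = ["airplane", "bird", "car", "cat", "deer",
           "dog", "horse", "monkey", "ship", "truck"]

def show_sample(img, label):
    """img: CHW float tensor in [0, 1], as produced by transforms.ToTensor()."""
    plt.imshow(img.permute(1, 2, 0))   # CHW -> HWC for matplotlib
    plt.title(classes[label])
    plt.axis("off")

# In practice: img, label = trainset[0]
img, label = torch.rand(3, 96, 96), 0
show_sample(img, label)
plt.savefig("sample.png")
```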

Grid Search CV:
We will apply Grid Search CV to find the best value of the hyperparameters (C, Gamma and Kernel).

We can implement grid-search cross-validation by defining the candidate parameter values in lists and then training the same sklearn SVM over those combinations to get the best performance from our model.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

l1 = [0.1, 1, 10, 100]
l2 = [1, 0.1, 0.01, 0.001]
l3 = ['poly']

# Defining the parameter grid
param_grid = {'C': l1, 'gamma': l2, 'kernel': l3}

# GridSearch cross-validation to find the best set of parameters
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3)
grid.fit(X_trainv2, y_train)


After running the above code, the parameters on which I trained my model are given below:

clf2 = SVC(kernel="poly", C=C_param, gamma=gamma_param, probability=True).fit(X_trainv2, y_train)

ROC Curve:
Here we plot the ROC curve for one class versus all the other classes, i.e., we use the One-vs-Rest strategy, and then compare those curves with the baseline model: the 45-degree line through the origin, which corresponds to a random classifier.
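The one-vs-rest ROC computation can be sketched with scikit-learn as below. The synthetic data is a placeholder for the extracted STL-10 features, and the scores stand in for the probabilities from `clf2.predict_proba`.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_curve, auc
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize
from sklearn.svm import SVC

# Synthetic stand-in for the extracted feature vectors and labels
X, y = make_classification(n_samples=300, n_classes=3, n_informative=6,
                           random_state=0)
Y = label_binarize(y, classes=[0, 1, 2])  # one indicator column per class

clf = OneVsRestClassifier(
    SVC(kernel="poly", probability=True, random_state=0)).fit(X, y)
scores = clf.predict_proba(X)

# ROC of class 0 vs the rest; the 45-degree line is the random baseline
fpr, tpr, _ = roc_curve(Y[:, 0], scores[:, 0])
roc_auc = auc(fpr, tpr)
```

Plotting `fpr` against `tpr` (and repeating per class) gives the curves discussed above.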

Confusion Matrix:
I have attached confusion matrices for class 0 versus all other classes, where class 0 is treated as positive and every other class as negative. The same is implemented in the code for the remaining classes, but to show the result I have rendered only a few of them.

[Confusion matrix images for Class 0, Class 2, and Class 1 versus the rest]


These confusion matrices show the numbers of false positives, false negatives, true positives, and true negatives. From these values we can derive the overall and class-wise accuracy and compare the results.

Overall accuracy achieved: 76.229%

Fine-tune ResNet-50:
We start with the default model, which has not yet been fine-tuned, and inspect its layers. We then fine-tune selectively, one layer at a time: only the weights of the chosen layer are modified, while the remaining layers stay frozen. On the pre-trained model, a parameter is updated only if its layer name matches the name we want to fine-tune, in which case its requires_grad attribute is set to True.


We can do either of two types of tuning to tweak our model:
1. Single Layer Tuning
2. Multi-layer Tuning


The training and test losses after fine-tuning the downsampling sub-layer are the lowest, and the accuracy after fine-tuning it is higher, on both the training and test sets, than after fine-tuning any of the other sub-layers.

ResNet addresses the vanishing gradient problem in extremely deep CNNs. It works through skip connections that allow some layers to be bypassed, based on the insight that a deep network should have no higher training error than its shallower counterpart.

ResNet has been shown to be effective in a variety of applications, but a significant downside is that such deep networks can take weeks to train, which makes them impractical for some real-world uses.

Conclusion:
1. First, we discussed the ResNet-50 architecture, how it works, and its advantages and disadvantages.
2. Then, we discussed the STL-10 dataset used in this tutorial: the number of images, the number of classes, the image sizes, etc.
3. Then, we trained the ResNet-50 model and applied feature extraction.
4. After extracting the features, we ran Grid Search CV to find the best hyperparameters.
5. Finally, we concluded the article by discussing the accuracy and the ROC-AUC curves.

That’s all for today. I hope you enjoyed the article. If you have any doubts or suggestions, don’t hesitate to comment below. You can also connect with me on LinkedIn; I would be happy to connect with you.

See also my other articles.

Thanks for reading!

The media shown in this article is not owned by Analytics Vidhya and is used at the sole discretion of the author.


