Image separation using Transfer Learning and K-Means clustering

Vipin Katara
10 min read · Apr 25, 2023


Agenda

In this article, we aim to develop an image separation tool. We will mainly focus on two techniques, transfer learning and clustering, and apply them to an image dataset.

Transfer Learning

Transfer learning is a machine learning technique where a model trained on one task is re-purposed on a second related task. Transfer learning is useful because it can significantly improve the performance of the target model by leveraging the knowledge learned from the source model. This is particularly useful when the target task has a small amount of data available, as the model can use the knowledge learned from the source task to initialize the weights and biases of the model and improve its performance.

Let's try to understand it with a real-world example:

Suppose a programmer wants to predict the species of a flower from images. One way of doing this is to use a Convolutional Neural Network in a similar way as we did for malaria detection, or we can use a pretrained model with pretrained weights. The programmer can take a neural network that was trained by somebody else and adapt it to their work, adding or removing layers as needed. This method is referred to as transfer learning.

Advantages of Transfer Learning

  • Better starting point: the model begins from weights already learned on a similar task, so it improves more quickly during training.
  • Higher accuracy after training: with a better starting point and faster progress, transfer learning allows a machine learning model to converge at a higher performance level, enabling more accurate output.
  • Faster training: because the model leverages a pre-trained model that was trained for a similar task, it is easier to achieve the desired performance faster.

Clustering

Clustering is the process of dividing a set of data points into groups, or clusters, so that data points in the same cluster are more similar to each other than data points in other clusters.

There are many different clustering algorithms that can be used, including k-means clustering, hierarchical clustering and DBSCAN.

k-means clustering

k-means clustering is a method of clustering data points into k clusters, where k is a user-specified parameter. The algorithm works by iteratively assigning each data point to the cluster with the nearest mean, or center, until the clusters stop changing. The mean of a cluster is the average of all the data points in the cluster.

Here is a brief summary of the steps in the k-means algorithm:

  1. Choose the number of clusters, k.
  2. Randomly initialize k centroids.
  3. Assign each data point to the closest centroid.
  4. Recompute the centroids as the mean of the points in each cluster.
  5. Reassign each data point to the closest centroid.
  6. Repeat steps 4 and 5 until the clusters stop changing.

One of the advantages of k-means is that it is computationally efficient and easy to implement. However, it can be sensitive to the initial choice of centroids and may not always find the optimal solution.
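As a minimal sketch, this is how k-means could be run with scikit-learn on a small toy dataset (the data and parameters here are purely illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D data: two loose groups of points.
points = np.array([
    [1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
    [5.0, 5.2], [5.1, 4.9], [4.8, 5.0],
])

# Fit k-means with k=2; several restarts (n_init) reduce
# sensitivity to the initial choice of centroids.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = kmeans.fit_predict(points)

print(labels)                   # e.g. [0 0 0 1 1 1]
print(kmeans.cluster_centers_)  # the two centroid coordinates
```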

DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering algorithm that is used to identify clusters of points that are densely packed together. It is particularly useful for identifying clusters of arbitrary shape and for detecting outliers, or points that do not belong to any cluster.

One of the advantages of DBSCAN is that it does not require the user to specify the number of clusters in advance, as the number of clusters is determined automatically based on the density of the points. It is also relatively robust to noise and can handle datasets with high levels of missing or incorrect data. However, DBSCAN can be sensitive to the choice of the parameters MinPts and Eps, and it may not work well for datasets with highly varying densities.

DBSCAN algorithm can be abstracted in the following steps:

  1. Find all the neighbouring points within eps of each point and identify the core points, i.e. those with more than MinPts points in their neighbourhood.
  2. For each core point, if it is not already assigned to a cluster, create a new cluster.
  3. Recursively find all of its density-connected points and assign them to the same cluster as the core point.
    Two points a and b are said to be density connected if there exists a point c that has a sufficient number of points in its neighbourhood and both a and b are within eps distance of it. This is a chaining process: if b is a neighbour of c, c is a neighbour of d, and d is a neighbour of e, which in turn is a neighbour of a, then b is density connected to a.
  4. Iterate through the remaining unvisited points in the dataset. Those points that do not belong to any cluster are noise.
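A small illustrative sketch of DBSCAN with scikit-learn, where eps and min_samples play the roles of Eps and MinPts described above (the data is a toy example):

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense groups plus one isolated point that should come out as noise.
points = np.array([
    [1.0, 1.0], [1.1, 1.0], [0.9, 1.1],
    [5.0, 5.0], [5.1, 5.1], [4.9, 5.0],
    [9.0, 0.0],              # outlier
])

# eps is the neighbourhood radius, min_samples the minimum
# number of points for a core point.
db = DBSCAN(eps=0.5, min_samples=2)
labels = db.fit_predict(points)

print(labels)  # e.g. [0 0 0 1 1 1 -1]; -1 marks noise points
```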

Hierarchical clustering

Hierarchical clustering is a method of clustering data points into a tree-like structure called a dendrogram. There are two main types of hierarchical clustering: agglomerative and divisive.

In agglomerative hierarchical clustering, the algorithm starts by treating each data point as a separate cluster and then iteratively merges the closest clusters until all the data points are contained in a single cluster. The distance between clusters can be measured using various similarity measures, such as the Euclidean distance or the Pearson correlation coefficient.

In divisive hierarchical clustering, the algorithm starts by treating the entire dataset as a single cluster and then iteratively splits the clusters until each data point is contained in a separate cluster.

One of the advantages of hierarchical clustering is that it allows the user to visualize the clusters and the relationships between them in the dendrogram. However, it can be computationally expensive and may not be suitable for very large datasets.
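For illustration, here is a short sketch of agglomerative clustering with SciPy on toy data; linkage builds the merge tree and fcluster cuts it into a chosen number of flat clusters (the data and parameters are assumptions made for the example):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 2-D data with two obvious groups.
points = np.array([
    [1.0, 1.0], [1.2, 1.1], [0.9, 0.8],
    [6.0, 6.0], [6.2, 5.9], [5.8, 6.1],
])

# Agglomerative clustering: 'ward' merges the pair of clusters that
# minimises the increase in within-cluster variance at each step.
Z = linkage(points, method="ward")

# Cut the tree into two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 1 2 2 2]

# scipy.cluster.hierarchy.dendrogram(Z) would draw the tree
# when matplotlib is available.
```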

The Development

Dependencies

Project is created with:

Keras: pip3 install keras

Numpy: pip3 install numpy

Scikit-learn: pip install scikit-learn

OpenCV: pip install opencv-python

Tensorflow: pip install tensorflow

Importing Necessary Libraries
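The original import listing is not reproduced in this text; based on the dependencies above, a plausible set of imports for this project might look like the following (the exact list in the original code may differ):

```python
import os
import shutil

import cv2
import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.models import Model
from sklearn.cluster import KMeans
```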

Load Model

As our model, we have used the pre-trained VGG16 model.

VGG16 is a convolutional neural network model proposed by K. Simonyan and A. Zisserman from the University of Oxford in the paper “Very Deep Convolutional Networks for Large-Scale Image Recognition”. The model achieves 92.7% top-5 test accuracy on ImageNet, a dataset of over 14 million images belonging to 1000 classes. It was one of the well-known models submitted to ILSVRC-2014.

VGG16 model architecture (source: https://www.geeksforgeeks.org/vgg-16-cnn-model/)
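Here is a sketch of how the pre-trained VGG16 model can be loaded with Keras. Using the 'fc2' fully connected layer as the output gives a 4096-dimensional feature vector per image; the exact layer chosen in the original code is an assumption:

```python
from keras.applications.vgg16 import VGG16
from keras.models import Model

# Load VGG16 with ImageNet weights, keeping the fully connected layers.
base_model = VGG16(weights="imagenet", include_top=True)

# Use the output of the second fully connected layer ('fc2') as a
# 4096-dimensional feature vector instead of the 1000-way softmax.
model = Model(inputs=base_model.input,
              outputs=base_model.get_layer("fc2").output)

model.summary()
```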

VGG16 consists mainly of five types of layers:

  1. Convolution
  2. Pooling
  3. Flattening
  4. Dense or Fully Connected
  5. Dropout

Convolution

The primary purpose of convolution is to extract features from the input image. Convolution preserves the relationship between pixels by learning image features using small squares of input data.

In convolution operation, we start with a kernel, which is simply a small matrix of weights. This kernel “slides” over the input data, performing an elementwise multiplication with the part of the input it is currently on, and then summing up the results into a single output pixel.

Here is an example of the convolution operation with a 3x3 kernel and a 7x7 input image.

Source: www.superdatascience.com
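To make the sliding-kernel description concrete, here is a minimal NumPy sketch of a 3x3 kernel convolved over a 7x7 input with no padding (the values are illustrative, not taken from the figure):

```python
import numpy as np

image = np.arange(49, dtype=float).reshape(7, 7)   # 7x7 input
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)       # 3x3 edge-like kernel

out_h = image.shape[0] - kernel.shape[0] + 1       # 5
out_w = image.shape[1] - kernel.shape[1] + 1       # 5
feature_map = np.zeros((out_h, out_w))

# Slide the kernel over the image: elementwise multiply, then sum.
for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(patch * kernel)

print(feature_map.shape)  # (5, 5)
```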

Pooling

The pooling layer reduces the number of parameters when the image is too large. In this project, we have used a max pooling layer.

In max pooling, the maximum pixel value in each window is selected. A window here means a group of pixels whose size equals the filter size, which is chosen based on the size of the image. In the following example, a 2x2 filter is chosen. The output of the pooling operation varies with the filter size.

Source: computersciencewiki.org
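A tiny NumPy sketch of 2x2 max pooling with stride 2 on a 4x4 feature map (the numbers are made up for illustration):

```python
import numpy as np

feature_map = np.array([[1, 3, 2, 4],
                        [5, 6, 1, 2],
                        [7, 2, 9, 1],
                        [3, 4, 6, 8]], dtype=float)

# 2x2 max pooling with stride 2: keep the maximum of each 2x2 block.
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6. 4.]
#  [7. 9.]]
```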

Flattening

Flattening converts the data into a 1-dimensional array for input to the next layer. We flatten the output of the convolutional layers to create a single long feature vector, which is then connected to the final classification model.

Dense or fully connected layer

In the fully connected layer, the flattened feature-map matrix is fed into a layer that works like a regular neural network, where every input is connected to every output.

With the fully connected layers, we combine these features together to create a model. Finally, we have an activation function such as softmax to classify the outputs, for example as infected or uninfected in the malaria example mentioned earlier.

Neural network with many convolutional layers

Dropout

Dropout is used to prevent the model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
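To tie the flattening, dense, and dropout pieces together, here is a small illustrative Keras model of the kind described above; the layer sizes are placeholders, not the actual VGG16 configuration:

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(64, 64, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),                       # 2-D feature maps -> 1-D feature vector
    Dense(128, activation="relu"),   # fully connected layer
    Dropout(0.5),                    # randomly drop half the units while training
    Dense(2, activation="softmax"),  # e.g. two classes such as cat vs dog
])

model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```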

Load Data

The dog and cat image dataset we have used, available here, contains 12.5k images of dogs and cats. We have used 4 images from each category as our dataset to cluster.

In this project, we have used cv2 to read and store the pixel values of the images. Here is one approach to loading and storing the data.
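The original loading code is not shown in this text; a sketch of one possible approach with cv2, assuming the images sit in a single folder, might look like this (the folder path and image size are placeholders):

```python
import os
import cv2

def load_images(path_to_files, size=(224, 224)):
    """Read every image in a folder, resize it, and return paths and pixel arrays."""
    paths, images = [], []
    for name in sorted(os.listdir(path_to_files)):
        full_path = os.path.join(path_to_files, name)
        img = cv2.imread(full_path)          # BGR pixel values, or None on failure
        if img is None:
            continue
        img = cv2.resize(img, size)          # VGG16 expects 224x224 inputs
        paths.append(full_path)
        images.append(img)
    return paths, images

# paths, images = load_images("dataset/")
```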

Feature Vectors

A feature vector is a n-dimensional vector of numerical features that represent some object. In machine learning, feature vectors are used to represent samples for analysis and interpretation. The features in a feature vector could be anything that describes an object. For example, if you wanted to classify types of fruit, the features in your feature vector might include things like the fruit’s colour, shape, and texture. These features are chosen because they are relevant for the task at hand and are typically chosen through domain knowledge or experimental design. Feature vectors are used as input to machine learning algorithms, which can use them to make predictions or decisions.

feature_vector.py contains the function that extracts the feature vector from a single image, and feature_vectors.py contains the function that uses feature_vector.py to extract the feature vectors for all the images.
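The gists feature_vector.py and feature_vectors.py are not reproduced here; the sketch below shows what such functions might look like, given the VGG16 'fc2' model loaded earlier (the details are reconstructions, not the original code):

```python
import cv2
import numpy as np
from keras.applications.vgg16 import preprocess_input

def feature_vector(model, image):
    """Extract a single VGG16 feature vector from one BGR image array."""
    img = cv2.resize(image, (224, 224))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)       # preprocess_input expects RGB
    x = preprocess_input(np.expand_dims(img.astype("float32"), axis=0))
    return model.predict(x, verbose=0).flatten()     # e.g. a 4096-d vector from 'fc2'

def feature_vectors(model, images):
    """Extract feature vectors for a list of images."""
    return np.array([feature_vector(model, img) for img in images])
```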

Separate

One way to separate the images into different groups is to use image features together with a learning algorithm. This involves extracting features from the images, such as colour, texture, and shape; with labelled data one could train a classifier, such as a support vector machine or a decision tree, to predict a label for each image. Here we have no labels, so we instead extract feature vectors with VGG16 and group them with k-means clustering.

For this, we need to define how many clusters we want the images to be divided into.

n_cluster is the number of clusters the images are to be divided into
cluster_path is the output path for the results
path_to_files is the path to the images that need to be separated
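Here is a sketch of the separation step under these assumptions: the feature vectors are clustered with k-means and each image file is copied into a sub-folder named after its cluster. The function and parameter names mirror the description above but are reconstructions:

```python
import os
import shutil
from sklearn.cluster import KMeans

def separate(vectors, paths, n_cluster, cluster_path):
    """Cluster feature vectors with k-means and copy images into per-cluster folders."""
    kmeans = KMeans(n_clusters=n_cluster, n_init=10, random_state=42)
    labels = kmeans.fit_predict(vectors)

    for label, path in zip(labels, paths):
        folder = os.path.join(cluster_path, f"cluster_{label}")
        os.makedirs(folder, exist_ok=True)
        shutil.copy(path, folder)

    return labels
```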

Combining all functions

Let's combine all the functions to get the results.
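A hedged end-to-end sketch that ties the reconstructed helpers above together; the paths and the cluster count are placeholders, and load_images, feature_vectors, separate, and model refer to the earlier sketches rather than the original gists:

```python
# Hypothetical driver script assembling the sketches above.
path_to_files = "dataset/"     # images to be separated
cluster_path = "results/"      # where the cluster folders are written
n_cluster = 2                  # e.g. cats vs dogs

paths, images = load_images(path_to_files)
vectors = feature_vectors(model, images)   # 'model' is the VGG16 'fc2' model
labels = separate(vectors, paths, n_cluster, cluster_path)
print(labels)
```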

Input

For the input, we have used a subset of the dog and cat dataset to cluster into two different categories. Here is a sample of the input images that we have used in our example:

Results

We were able to cluster the images into two different categories, one being cats and the other being dogs, without any human intervention.

Cluster_0

Cluster_1

Conclusion

The advantage of these techniques for image separation is that they can be effective at organizing and categorizing images based on their visual features. Transfer learning can improve accuracy by leveraging the knowledge learned by a pre-trained model to extract feature vectors from images, and k-means clustering can identify groups of similar images. These techniques can be useful in a variety of applications, such as image classification, object detection, and image retrieval. However, their effectiveness may depend on the specific characteristics of the images being analyzed and the goals of the image separation task.
