Image Compression using K-means Clustering : Colour Quantization

This post is a simple yet illustrative application of K-means clustering technique. Using K-means clustering, we will perform quantization of colours present in the image which will further help in compressing the image.

In a coloured image, each pixel is of size 3 bytes (RGB), where each colour can have intensity values from 0 to 255. Following combinatorics, the total number of colours which can be represented are 256*256*256. Practically, we are able to visualize only a few colours in an image. Shown below is an image of 1280 x 720 pixels taking 1.71 MB in PNG format. PNG is a lossless compression technique for images. Our objective is to compress the image further using colour quantization, though the compression will be lossy.

tiger

K-means clustering

It is basically an optimization algorithm to find ‘k’ clusters in the given set of data points. Initially, it randomly assigns k-cluster centers and then on the basis of some distance metric (for example, euclidean distance) it aims to minimize within cluster sum of squared distance of the data points from the cluster center. There are two steps in k-means clustering algorithm:

a) Assignment step – Each data point is assigned to the cluster whose center is nearest to it.
b) Update step – New means (centroids) are calculated from the data points assigned to the new clusters.

Just to give you an idea of clustering data points, Below picture has been taken from internet to depict data points before and after K-means clustering.

image

K-means clustering with K=3

In our problem of image compression, K-means clustering will group similar colours together in ‘k’ clusters (say ‘k’ = 128). Therefore, the centroid of each cluster is representative of the 3 dimensional colour vectors (RGB) falling in the respective cluster. By now you might have understood what are we trying to do. These ‘k’ centroids will replace all the colour vectors in their clusters, thereby keeping only ‘k’ colour combinations for the whole image. Thus, we need to keep only the label of each pixel in the image that tells about the cluster in which that pixel falls. Also, we keep the ‘k’ centroids as codebook which are the only colours seen in the compressed image.

Compression

We will write a simple python code to compress the image and store the compressed image along with the code book. The compressed image saved here is nothing but the cluster label of each pixel of the original image. Codebook is the fancy name given to the list of cluster centers (3-d RGB) achieved after running k-means algorithm. Afterwards, both the arrays (the cluster labels and the codebook) are saved in data type ‘unsigned integer’ as the range of intensity values (0-255) and value of ‘k’ is always going to be less than 255 . The code given below does all this.

from skimage import io
from sklearn.cluster import KMeans
import numpy as np

image = io.imread('tiger.png')
io.imshow(image)
io.show()

rows = image.shape[0]
cols = image.shape[1]
 
image = image.reshape(image.shape[0]*image.shape[1],3)
kmeans = KMeans(n_clusters = 128, n_init=10, max_iter=200)
kmeans.fit(image)

clusters = np.asarray(kmeans.cluster_centers_,dtype=np.uint8) 
labels = np.asarray(kmeans.labels_,dtype=np.uint8 )  
labels = labels.reshape(rows,cols); 

np.save('codebook_tiger.npy',clusters)    
io.imsave('compressed_tiger.png',labels);

We can select ‘k’ sufficient enough to represent the colours of image well. Here, ‘k’ has been chosen as 128. This means all the colour combinations in the original image have been quantized to 128 distinct colours only. These colours will be present in reconstructed image (after decompression) and it should be visually similar to original image.

Decompression

We also need to decompress the image in order to visualise the reconstructed image which is obviously an outcome of lossy compression performed. Below code does the decompression by assigning the 3-d colours from the code book to the each pixel depending upon its label.

from skimage import io
import numpy as np

centers = np.load('codebook_tiger.npy')
c_image = io.imread('compressed_tiger.png')

image = np.zeros((c_image.shape[0],c_image.shape[1],3),dtype=np.uint8 )
for i in range(c_image.shape[0]):
    for j in range(c_image.shape[1]):
            image[i,j,:] = centers[c_image[i,j],:]
io.imsave('reconstructed_tiger.png',image);
io.imshow(image)
io.show()

We can see the reconstructed image after decompression below. Though the reconstructed image has lost a lot of pixel colour information but still you won’t find any major difference visually.

reconstructed_tiger

Also, you can visualise these 128 colours found in the reconstructed image by viewing the colours in the codebook separately (may be by displaying mono-coloured square box). These colours are the centroids of clusters formed after performing k-means on original image.

Caution

  1. If you will try to compress a ‘jpeg’ image in exactly the same way as followed in the blog-post, you will incur errors as jpeg does a lossy compression. Compression algorithm of jpeg changes the intensity values of the pixel, so the pixels in the compressed image containing the label may become more than ‘k’ which leads to error.
  2. K-means algorithm is an optimization problem of finding the clusters in the given data-set. Execution time increases as the image dimensions increases or ‘K’ increases. So, initially you can start with a lesser value of ‘k’ in order to quickly get results.
  3. There is a trade off between the execution time and the number of colours represented in reconstructed image. Higher ‘k’ will produce better quality of compressed image but will take longer to execute.

Conclusion

You can check the disk space taken by the images which were considered in the blog-post here. I have posted the snapshot of working directory. The original png image was of 1757 KB (tiger.png) whereas the compressed tiger image and codebook are of only 433 KB all together. The reconstructed image is also taking less space because png runs its own compression algorithm. As there are only 128 unique colours now, png is able to get compression ratio of more than 2.

spaces

One can conclude that the compression applied here is done only by reducing the number of colours in the image which is also called as Colour Quantization. We have not reduced either the size of image or the intensity ranges of pixels.

Hope it was easy to follow this blog-post. The full python implementation of image compression with K-means clustering can be found on Github link here.

If you liked the post, follow this blog to get updates about the upcoming articles. Also, share this article so that it can reach out to the readers who can actually gain from this. Please feel free to discuss anything regarding the post. I would love to hear feedback from you.

Happy machine learning 🙂

Advertisements

29 thoughts on “Image Compression using K-means Clustering : Colour Quantization

  1. Why no response is shown on the screen when inputting “kmeans.fit(image)”? Does it consume too much time with k=128?

      • aha, maybe i should wait for it patiently. At first i guess it will take couple of hours and i can’t finish it before shutdown

  2. I mean, can you evaluate the time when k is as big as 128, like how many hours approximately does it need to execute

    • Umm….You need to check that. I am sure it would take a lot of time.

      You can fix the number of iterations or number of init of kmeans training algorithm to be less inorder to execute it fast.

  3. Hi Abhijeet,

    I tried running your code and I’m getting compressed grey scale image.
    Can you please help?

  4. Hello,
    Can this be done as a college final year project? As a beginner in machine learning, I am searching for a method to implement image compression which one would be the best? Please suggest

  5. Can this be used as final year project? As a beginner in machine learning, I am searching for a method to implement image compression. Can you please suggest me how should I do it?

    • It is very simple. Loop through all the images of your folder and apply the same way of compressing single image.

      You may like to take lower value of K otherwise it would be very slow.

  6. image = image.reshape(image.shape[0] * image.shape[1], 3)
    ValueError: cannot reshape array of size 1093824 into shape (273456,3)
    Why I am getting this error with the compression code

  7. for f in os.listdir(‘.’):
    if f.endswith(‘.png’):
    image = io.imread(f)
    rows = image.shape[0]
    cols = image.shape[1]
    I tried for loop as given but it gives an error: MemoryError. Does K means work well with around 40 image at a time?

  8. Hello,

    I ran the code with a tif image with resolution of 8181*7221, and it appeared:
    ValueError: cannot reshape array of size 59075001 into shape (59075001,3)

    Why I am getting this error?

    • Hi,
      It occurs to me that you are using a gray scale image but the codes used in the blog is written for colored images.
      For colored images, there are 3 channels i.e, RGB.

      Thanks.

  9. Hello,

    I ran this code with a tif image with resolution of 8181*7221, but a error appeared:
    ValueError: cannot reshape array of size 59075001 into shape (59075001,3)

    Why I am getting this error?

    Thank you very much.

    • May be not, You may want to modify the python codes for compressing grey scale images.

      for example:
      image = np.zeros((c_image.shape[0],c_image.shape[1],3),dtype=np.uint8 )
      would become
      image = np.zeros((c_image.shape[0],c_image.shape[1]),dtype=np.uint8 )

      Similarly, everywhere the code uses 3 channels, you can modify it to 1 channel and check if it works for you.

      Also, tiff images are of large sizes, downsize it so that K-means can run in less time, otherwise system will take huge time to complete the execution.

  10. Hello,

    I modified the code in your method. I ran it with Lena.bmp 258KB) and it worked. But for a larger image of 57,748KB it appeared error:

    Traceback (most recent call last):
    File “compress.py”, line 56, in
    kmeans.fit(image)
    File “/home/zsf/anaconda2/envs/tensorflow/lib/python3.6/site-packages/sklearn/cluster/k_means_.py”, line 972, in fit
    return_n_iter=True)
    File “/home/zsf/anaconda2/envs/tensorflow/lib/python3.6/site-packages/sklearn/cluster/k_means_.py”, line 381, in k_means
    random_state=random_state)
    File “/home/zsf/anaconda2/envs/tensorflow/lib/python3.6/site-packages/sklearn/cluster/k_means_.py”, line 445, in _kmeans_single_elkan
    max_iter=max_iter, verbose=verbose)
    File “sklearn/cluster/_k_means_elkan.pyx”, line 150, in sklearn.cluster._k_means_elkan.k_means_elkan
    MemoryError

    Thank you very much.

  11. Hello,

    Right. I know it is a memory error. I am revising the code to deal with this error and not yet complete.

    I tried to read the big original image or the image(nd.array) with chunksize, but it not worked. Do you have any suggestion?

    Thank you very much.

Leave a Reply