Image hashing using Python

What is image hashing?

Image (or photo) hashing converts the visual content of an image into a short, fixed-length hash value. Unlike a cryptographic hash, an image hash is designed so that images that look alike produce identical or nearly identical hash values. This is why an image hash is often called a "digital fingerprint": it can be used to find duplicate or similar images in a gallery.


Applications of image hashing


1. To find duplicate images. Once every image has a hash value, duplicate copies can be found by looking for identical hash values instead of comparing the images pixel by pixel.

2. To find similar images. If the hash value of one image matches, or nearly matches, the hash value of another image, the two images are likely to be similar to each other.

3. To measure the distance between images. The number of positions in which two hash values differ tells you how similar or dissimilar the two images are.
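The three applications above can be sketched with plain Python once the hash values are available as hex strings. This is a minimal sketch, not a library API: the function names, the file names, and the hash values in the gallery are all made up for illustration, and the distance is assumed to be the number of differing bits.

```python
from collections import defaultdict
from itertools import combinations


def bit_distance(hex_a, hex_b):
    """Hamming distance in bits between two equal-length hex hash strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def find_duplicates(hashes):
    """Group file names that share exactly the same hash value."""
    groups = defaultdict(list)
    for name, value in hashes.items():
        groups[value].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def find_similar_pairs(hashes, max_distance):
    """Return (name_a, name_b, distance) for pairs within max_distance bits."""
    pairs = []
    for (name_a, hash_a), (name_b, hash_b) in combinations(sorted(hashes.items()), 2):
        distance = bit_distance(hash_a, hash_b)
        if distance <= max_distance:
            pairs.append((name_a, name_b, distance))
    return pairs


# Hypothetical gallery: file names and hash values are invented for this sketch.
gallery = {
    "cat.png": "ffff00000000ffff",
    "cat_copy.png": "ffff00000000ffff",
    "cat_resized.jpg": "ffff00000001ffff",
    "dog.png": "00ff00ff00ff00ff",
}

print(find_duplicates(gallery))
print(find_similar_pairs(gallery, max_distance=4))
```

Comparing every pair is fine for a small gallery; a large collection would need an index, which is outside the scope of this sketch.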


How does the image hashing algorithm work?


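A good image hash should stay the same, or change only slightly, when an image is modified in ways that do not change what it shows. Such modifications include resizing, rotating, changing the gamma value of a color, changing the image format (PNG, JPG, BMP, etc.), adding image noise, and adding watermarks. Each algorithm tolerates these changes to a different degree. To achieve this, the hash is computed from a reduced, simplified version of the image rather than directly from its raw byte values.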


Image hashing algorithms


a-hash: The full meaning of a-hash is 'average hash'. It is the simplest image hashing algorithm (easier than the other image hashing algorithms) and uses only a few transformation steps. The steps of a-hash are:

(a) Scaling of an image

(b) Converting the image into grayscale

(c) Calculating the mean value

(d) Binarization of the converted grayscale image based on the mean value.
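The four steps above can be sketched in pure Python. This is a minimal sketch under stated assumptions, not the code of any library: the image is assumed to be a grid (list of rows) of (r, g, b) tuples whose height and width are multiples of the hash size, scaling is done by simple block averaging, and all function names are hypothetical. The grayscale step is done first here for brevity; with block averaging the order does not change the result.

```python
def to_grayscale(rgb):
    """(b) Convert a grid of (r, g, b) tuples to luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def downscale(gray, size):
    """(a) Shrink a grayscale grid to size x size by averaging equal blocks.

    Assumes the height and width of the grid are multiples of `size`.
    """
    block_h = len(gray) // size
    block_w = len(gray[0]) // size
    small = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [
                gray[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash_sketch(rgb, hash_size=8):
    """Return the average hash of an RGB grid as a hex string."""
    small = downscale(to_grayscale(rgb), hash_size)
    pixels = [p for row in small for p in row]
    mean = sum(pixels) / len(pixels)                            # (c) mean value
    bits = "".join("1" if p > mean else "0" for p in pixels)    # (d) binarization
    width = hash_size * hash_size // 4                          # hex digits needed
    return format(int(bits, 2), f"0{width}x")


# Test image: 16 x 16 pixels, left half white, right half black.
half = [[(255, 255, 255)] * 8 + [(0, 0, 0)] * 8 for _ in range(16)]
print(average_hash_sketch(half))
```

Every row of the scaled image is four bright pixels followed by four dark ones, so every row contributes the bits 11110000 to the hash.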


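p-hash: The full meaning of p-hash is 'perceptual hash'. This algorithm applies a discrete cosine transform (DCT) to the scaled grayscale image and builds the hash from the low-frequency coefficients, which describe the overall structure of the image.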
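The idea behind p-hash can be sketched in pure Python as follows. This is a simplified illustration, not the implementation of any library: it assumes the image has already been scaled to a square grayscale grid (for example 32 x 32), uses a slow, unnormalized DCT written out by hand, and thresholds the low-frequency block at its median. All function names are hypothetical.

```python
import math
import statistics


def dct_1d(values):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def dct_2d(grid):
    """2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in grid]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def perceptual_hash_sketch(gray, hash_size=8):
    """gray: square grayscale grid, already scaled (e.g. 32 x 32)."""
    freq = dct_2d(gray)
    # Keep only the top-left block, which holds the lowest frequencies.
    low = [freq[y][x] for y in range(hash_size) for x in range(hash_size)]
    med = statistics.median(low)
    bits = "".join("1" if c > med else "0" for c in low)
    width = hash_size * hash_size // 4
    return format(int(bits, 2), f"0{width}x")


# Hypothetical 32 x 32 test pattern, invented for this sketch.
pattern = [[(x * y) % 256 for x in range(32)] for y in range(32)]
brighter = [[2 * v for v in row] for row in pattern]

print(perceptual_hash_sketch(pattern))
print(perceptual_hash_sketch(pattern) == perceptual_hash_sketch(brighter))
```

Doubling every pixel value doubles every DCT coefficient and the median along with them, so the comparison results, and therefore the hash, do not change. This shows why the hash ignores a uniform change in contrast.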


d-hash: The full meaning of d-hash is 'difference hash'. It starts with the same scaling and grayscale steps as average hashing (a-hash), but instead of comparing each pixel with the mean value, it compares each pixel with its neighbor. The hash therefore records the brightness gradients of the image.
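The neighbor comparison used by d-hash can be sketched in a few lines. This is a minimal sketch that assumes the image has already been scaled and converted to a grayscale grid with 8 rows and 9 columns, so that each row yields 8 comparisons; the function name is hypothetical.

```python
def difference_hash_sketch(gray):
    """gray: grayscale grid with hash_size rows and hash_size + 1 columns.

    Each bit is 1 when a pixel is brighter than its left-hand neighbor.
    """
    bits = "".join(
        "1" if right > left else "0"
        for row in gray
        for left, right in zip(row, row[1:])
    )
    return format(int(bits, 2), f"0{len(bits) // 4}x")


# Test grids invented for this sketch: 8 rows, 9 columns each.
rising = [list(range(9)) for _ in range(8)]             # brightness grows left to right
zigzag = [[0, 1, 0, 1, 0, 1, 0, 1, 0] for _ in range(8)]

print(difference_hash_sketch(rising))
print(difference_hash_sketch(zigzag))
```

Because only the direction of each brightness change is stored, adding the same amount of brightness to every pixel leaves the hash unchanged.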


w-hash: The full meaning of w-hash is 'wavelet hash'. It works like p-hash, but it uses a discrete wavelet transform (DWT) instead of the discrete cosine transform.
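A heavily simplified sketch of the wavelet idea is shown below. It assumes Haar wavelets, keeps only the low-frequency (low-low) band at each level, and thresholds the final 8 x 8 band at its median. Real implementations include further steps that are not reproduced here, so treat this as an illustration of the principle only; the function names are hypothetical.

```python
import statistics


def haar_ll(grid):
    """One level of the 2-D Haar transform, keeping only the low-low band.

    Each output value is (a + b + c + d) / 2 for one 2 x 2 block of inputs.
    """
    return [
        [
            (grid[y][x] + grid[y][x + 1] + grid[y + 1][x] + grid[y + 1][x + 1]) / 2
            for x in range(0, len(grid[0]), 2)
        ]
        for y in range(0, len(grid), 2)
    ]


def wavelet_hash_sketch(gray, hash_size=8):
    """gray: square grayscale grid whose side is hash_size times a power of two."""
    band = gray
    while len(band) > hash_size:
        band = haar_ll(band)
    if len(band) != hash_size:
        raise ValueError("grid side must be hash_size times a power of two")
    coeffs = [c for row in band for c in row]
    med = statistics.median(coeffs)
    bits = "".join("1" if c > med else "0" for c in coeffs)
    width = hash_size * hash_size // 4
    return format(int(bits, 2), f"0{width}x")


# Test image invented for this sketch: 32 x 32, left half white, right half black.
half = [[255] * 16 + [0] * 16 for _ in range(32)]
print(wavelet_hash_sketch(half))
```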


Now we are going to apply these algorithms in Python. We use the following two images in the examples.

Image 1: School of Beginners logo, sob.png (img1)

Image 2: School of Beginners QR code, sob-qr.png (img2)


a-hash and hash distance using python ('PhotoHash' library)


For this, we need to install a Python package. Using a package saves time and keeps the code short.


Requirement

PhotoHash


Installation

# Type this command in a terminal (the command shell on Windows)
pip install PhotoHash


a-hash in python

import photohash as ph

def average_hash(img):
    return ph.average_hash(img)

if __name__ == '__main__':
    a_hash1=average_hash('sob.png')
    a_hash2=average_hash('sob-qr.png')
    print('Image hash value of image 1: ', a_hash1)
    print('Image hash value of image 2: ', a_hash2)

Output:

Image hash value of image 1:  f7ff7e424242bdff

Image hash value of image 2:  ff81819d818181ff


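Distance between two hashes in python

Note that this example does not compute a difference hash (d-hash). It computes the distance between the average hashes of the two images.

import photohash as ph

def distance(img1, img2):
    # Compare the average hashes of the two image files
    return ph.hash_distance(ph.average_hash(img1), ph.average_hash(img2))

if __name__ == '__main__':
    hash_dist=distance('sob.png', 'sob-qr.png')
    print('Distance between two images: ', hash_dist)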

Output:

Distance between two images:  13
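The value 13 can be reproduced by hand from the two hash values printed in the a-hash example. The sketch below assumes that PhotoHash counts the hex characters that differ between the two hash strings; this is an assumption about the library's internals, but it is consistent with the output above. For comparison, the sketch also counts the differing bits, which gives 34, the same value the 'imagehash' example later in this article reports for these two hashes. Both function names are hypothetical.

```python
def hex_char_distance(hash_a, hash_b):
    """Count the positions where two equal-length hex strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(1 for a, b in zip(hash_a, hash_b) if a != b)


def bit_distance(hash_a, hash_b):
    """Count the bits that differ between two equal-length hex strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


h1 = "f7ff7e424242bdff"  # sob.png, from the a-hash output
h2 = "ff81819d818181ff"  # sob-qr.png, from the a-hash output

print(hex_char_distance(h1, h2))  # 13 of 16 hex characters differ
print(bit_distance(h1, h2))       # 34 of 64 bits differ
```

The two numbers are not directly comparable: a 16-character hash can differ in at most 16 characters but in up to 64 bits, so a tolerance chosen for one measure does not carry over to the other.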



Find the similarity between two images using a-hash


The hashes_are_similar function returns a boolean value: True if the distance between the hashes of the two images is small enough to fall within the tolerance, and False otherwise.

import photohash as ph

def similar(img1, img2):
    return ph.hashes_are_similar(ph.average_hash(img1), ph.average_hash(img2))

if __name__ == '__main__':
    s_hash=similar('sob.png', 'sob-qr.png')
    print('Are the images similar? ', s_hash)

Output:

Are the images similar?  False


Find the similarity between two images and set the limit of tolerance


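The tolerance limit decides how far apart two hashes may be before the images stop counting as look-alikes. A low tolerance accepts only near-identical images, while a high tolerance also accepts images that differ considerably.


With a low tolerance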

import photohash as ph

def IsLookAlike(img1, img2):
    return ph.is_look_alike(img1, img2, tolerance=2)

if __name__=='__main__':
    img1='sob.png'
    img2='sob-qr.png'
    similar=IsLookAlike(img1, img2)
    print(similar)

Output:

False (That means the images are not similar at this tolerance)


With a high tolerance

import photohash as ph

def IsLookAlike(img1, img2):
    return ph.is_look_alike(img1, img2, tolerance=15)

if __name__=='__main__':
    img1='sob.png'
    img2='sob-qr.png'
    similar=IsLookAlike(img1, img2)
    print(similar)

Output:

True (That means the images count as similar at this tolerance)
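The effect of the tolerance can be sketched without the library. The rule assumed here, that two images count as look-alikes when the distance between their hashes does not exceed the tolerance, is an assumption about how PhotoHash decides; it is consistent with both outputs above, since the distance between the two hashes is 13. The function names are hypothetical.

```python
def hex_char_distance(hash_a, hash_b):
    """Count the positions where two equal-length hex strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(1 for a, b in zip(hash_a, hash_b) if a != b)


def look_alike_sketch(hash_a, hash_b, tolerance):
    """Assumed rule: similar when the distance does not exceed the tolerance."""
    return hex_char_distance(hash_a, hash_b) <= tolerance


h1 = "f7ff7e424242bdff"  # sob.png, from the a-hash output
h2 = "ff81819d818181ff"  # sob-qr.png, from the a-hash output

for tolerance in (2, 15):
    print(tolerance, look_alike_sketch(h1, h2, tolerance))
```

A tolerance of 15 is close to the maximum possible distance of 16 for these hashes, so almost any pair of images would pass it. The logo and the QR code are clearly different images, which shows that a high tolerance produces false matches.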


a-hash, d-hash, p-hash, and w-hash using python ('imagehash' library)


Why this library? It is easy to use, needs very little code, and provides all four image hashing algorithms described above.


Requirement

imagehash
Pillow (imported as PIL)

Installation

# Type these commands in a terminal (the command shell on Windows)
pip install imagehash
pip install pillow


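a-hash, the distance between images, and the similarity between images in python

The code in this section uses average_hash. To get the d-hash instead, call imgh.dhash(Image.open(img1)) in the same way.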


import imagehash as imgh
from PIL import Image

def Average_Hash(img1):
    # imagehash works on PIL images, so open the file first
    return imgh.average_hash(Image.open(img1))

if __name__=='__main__':
    img1='sob.png'
    img2='sob-qr.png'
    a_hash1=Average_Hash(img1)
    a_hash2=Average_Hash(img2)
    print('Image hash value of image 1: ', a_hash1)
    print('Image hash value of image 2: ', a_hash2)
    # Subtracting two hashes gives the number of bits in which they differ
    print('Distance between two images: ', a_hash1-a_hash2)
    # == is True only when the two hashes are exactly equal
    print('Are the images similar? ', a_hash1==a_hash2)

Output:

Image hash value of image 1:  f7ff7e424242bdff

Image hash value of image 2:  ff81819d818181ff

Distance between two images:  34

Are the images similar?  False


p-hash in python


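Set the required parameters and apply the p-hash algorithm to the image. Like average_hash, phash expects an opened PIL image rather than a file name. With hash_size=8 the hash has 64 bits, and with highfreq_factor=4 the image is scaled to 32 x 32 pixels (hash_size times highfreq_factor) before the transform is applied. The imports are the same as in the previous example.

imgh.phash(Image.open('sob.png'), hash_size=8, highfreq_factor=4)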


w-hash in python


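Set the required parameters in a single call and apply the w-hash algorithm to the image. As with phash, pass an opened PIL image rather than a file name. With image_scale=None the library chooses the scaling size from the image itself.

imgh.whash(Image.open('sob.png'), hash_size=8, image_scale=None, remove_max_haar_ll=True)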



I would like to thank the 'hackerfactor' website for its research on image hashing.
