Introduction to Hashing

3 min readMay 27, 2021

Hashing is the procees of converting a key into another value using a hash function. At the moment, imagine a hash function as a blackbox which takes a key as input and outputs a different value. This output from the hash function is called “Hash”

Hashing Data Structure:

It is a way of storing (key,value) pairs in a hash table for a faster access of elements. Imagine a scenario where you want to check whether a word is present in a list containg 1,00,000 words. It is O(n) for looking and if it is sorted and we use a binary search, it would still be O(logn). This is where hash table provides an advantage by making the lookup O(1).

We just take the string as input, use the hash function to output an integer (hash) and use this integer to store the value in the list. Consider an example where the University is storing the details of all its students. So, the student name can be passed through a hash function which generates a hash. This hash can be used to find the index of the array where the student’s info can be stored.

hash=hashfunction(key)
index=(hash)%length_of_array.

It is not guaranteed that different keys always produce different hashes. Depending on the hash function, different keys may produce the same hash. This is leads to the problem of collisions

Collisions

There are a couple of ways to handle collisions.

Separate Chaining
Open Addressing

Seperate Chaining: In this method, each index in the array uses linked lists for dynamic allocation of data. Whenever different keys produce same hash, the values are chained through linked list. So, to find an item we go the respective index and traverse through the linked list. This makes the complexity O(k) where k is the maximum number of elements in a linked list. However this would be negligible compared to O(n) and closer to O(1)

Open Addressing: In case of Open Addressing, we look for alternate free indices in the array to store the value. It can be done through linear probing,quadratic probing or double hashing.

Example: In case of linear probing,a fixed window is used between two probes. If index=(hash)%length_of_array is not free then we try index=(hash+1)%length_of_array and so on till we find a free index to fill.

Good Hashing Function

Should not lead to many collisions.
Should make sure the distribution of elements in the data structure are uniformly distributed and do not have any clusters.

Dictionary in python v/s Hash Table

In python, a dictionary is used as a data structure that stores (key, value) pairs where the keys are mapped to the values. A hash table on the otherhand is a data structure that maps the hash of the key to a bucket which stores one or more values. Both are quite similar in a way and it can be considered that hash tables are a way to implement dictionaries or maybe that dictionaries in python are implemented using hash tables.

Hashing v/s Encryption and Decryption

In Encryption and decryption, a message is encrypted using a key and decrypted at the receiver. In hash tables, there is no need to decrypt the message. So, Encryption and decryption is a two way process while hashing can be thought of as a one-way function without a need to get the original value.

References

What is Hashing? How Hash Codes Work - with Examples