How Hashing is useful in Recent Technologies? Illustration with its applications.

 


  • What is Hashing?

     Hashing is a technique or process of mapping keys, values into the hash table by using a hash function. It is done for faster access to elements. The efficiency of mapping depends on the efficiency of the hash function used.

     Hashing is also known as Hashing Algorithm or Message Digest Function. It is a technique to convert a range of key values into a range of indexes of an array. Hashing is one of the searching techniques that use constant time. The time complexity in hashing is O(1).


  • What is Hash Function?

     A hash function is a function that takes a set of inputs of any arbitrary size and fits them into a table or other data structure that contains fixed-size elements.

·       A fixed process that converts a key to a hash key is known as a Hash Function.

·       This function takes a key and maps it to a value of a certain length which is called a Hash value or Hash. It represents the original string of characters, but it is normally smaller than the original.

·       It transfers the digital signature and then both hash value and signature are sent to the receiver. The receiver uses the same hash function to generate the hash value and then compares it to that received with the message.

·       If the hash values are the same, the message is transmitted without errors.


  • What is Hash Table?

     In a Hash Table, to store data we use Hash function that takes the data as input, and based on that data it generates some key and we store the data based on that key.


     The above figure shows the hash table with the size of n = 10.

     Our data can be of any form i.e., it can be in integer or string or some other data type. But our main aim should be to convert that data into an integer by doing some pre-processing and then apply some hash function to it and map the result in the hash table.

  • How does Hashing in Data Structure Works?

     In hashing, the hashing function maps strings or numbers to a small integer value. Hash tables retrieve the item from the list using a hashing function. The objective of hashing technique is to distribute the data evenly across an array. Hashing assigns all the elements a unique key. The hash table uses this key to access the data in the list.


     Hash table stores the data in a key-value pair. The key acts as an input to the hashing function. Hashing function then generates a unique index number for each value stored. The index number keeps the value that corresponds to that key. The hash function returns a small integer value as an output. The output of the hashing function is called the hash value.

     In Hashing technique, the hash table and hash function are used. Using the hash function, we can calculate the address at which the value can be stored.

     A collision occurs when more than one value is to be hashed by a particular hash function hash to the same slot in the table or data structure (hash table) being generated by the hash function.

  • Importance of Hashing -

     Hashing gives a more secure and adjustable method of retrieving data compared to any other data structure. It is quicker than searching for lists and arrays. In the very range, Hashing can recover data in 1.5 probes, anything that is saved in a tree. Hashing, unlike other data structures, doesn’t define the speed. A balance between time and space has to be maintained while hashing. There are two ways of maintaining this balance.

·       Controlling speed by selecting the space to be allocated for the hash table

·       Controlling space by choosing a speed of recovery


  • Applications of Hashing -

1. Blockchain:

     The foundation of cryptocurrency is the blockchain, which is nothing but the connection of individual blocks of transaction data. When we are dealing with cryptocurrency (for example bitcoin) the details related to the transactions are given to a hashing algorithm (SHA-256) the result of which is a hash value having a constant length using which we can’t guess what the actual data was. Since all these blocks are linked with each other, if we need to change the data in a particular block we need to change the data in its previous block and it goes on. This is how blockchain becomes immutable as well as trustworthy of its data.

2. Hashing Passwords:

     Whenever we enter the password for any account, it is converted to a string that has no connection with the password and stored in the database. So when a user wants to sign up for an account that would require them to choose a password, instead of storing this password as the text it is stored as a hash generated from the one user entered.

     Why shouldn’t we store the password as text? Because if in case an attacker gets access to the database, he would, in turn, get access to the real password which the user may have used for different accounts (the reason why most websites warn not to use the same password everywhere). When any user wants to login, they enter their password which gets hashed and checked with the values in their database to verify if the password entered was correct.

3. Tokenization:

     Credit card tokens are used to exchange the customer the data with an alphanumeric ID that has no definite value and relation to the account's owner. Tokens themselves do not contain any useful customer data is stored in the customer's bank. For instance, if we want to check if a token corresponds to a card, it is enough to compare the hash of the card with the token. Also, there is no way to reverse the token because it hasn't been stored in a table with both the card and token.

4. Rabin-Karp Algorithm:

     One of the most famous applications of hashing is the Rabin-Karp algorithm. This is basically a string-searching algorithm that uses hashing to find anyone set of patterns in a string. A practical application of this algorithm is detecting plagiarism.

     It is basically used to search for a string. Depending on the size of the string we keep on calculating the hash values for a pattern. One is for the original string and the other is for the sequence of characters in the string which we keep on shifting to towards the end of the string. Once we get matching of the two hashes, we need to check the original strings again as different strings may have the same hash. We continue this until we reach the end of the string.

5. Compiler Operation:

     The keywords of a programming language are processed differently than other identifiers. The accumulator adds all these accesses. in a collection that is implemented using a hash table to distinguish between the access. To differentiate between the keywords of a programming language (if, else, for, return, etc.) and other identifiers and to successfully compile the program, the compiler stores all these keywords in a set that is implemented using a hash table.

6. Linking File name and path together:

     When moving through files on our local system, we observe two very crucial components of a file i.e., file_name and file_path. In order to store the correspondence between file_name and file_path the system uses a map (file_name, file_path) which is implemented using a hash table.


Team:
1. Rauf Jamadar
2. Sahil Kakurle
3. Kalyani Laddha

Department of SY-E&TC Engineering,
Vishwakarma Institute of Technology, Pune.

Comments

  1. Great work guys๐Ÿ‘๐Ÿ‘

    ReplyDelete
  2. Nice helped me in understanding hashing part of data structures bit better

    ReplyDelete
  3. Great work.... Thanks for information

    ReplyDelete
  4. Cleared all my doubts ๐Ÿ‘Œ

    ReplyDelete
  5. Very informative and nice work

    ReplyDelete
  6. Informative and Precise. Great work guys!

    ReplyDelete

Post a Comment

Popular posts from this blog

Flynn’s architectural classification scheme

Cost Structures in Business