Hasing and Hash Table

Hashing

Hashing is an important Data Structure that is designed to use a special function called the Hash function which is used to map a given value with a particular key for faster access of elements. The efficiency of mapping depends on the efficiency of the hash function used.

In hashing, a string of characters is transformed into a shorter length value or key that represents the original string.

Hash Table

Hash table is a table (array) where we store the original string. The index of the table is the hashed key while the value is the original string.

Hash Function

A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The values are used to index a fixed-size table called a hash table.

A good hash function should have the following properties :

1) Efficiently computable.

2) Should uniformly distribute the keys (Each table position equally likely for each key)

There are many ways to hash a string into a key. The following are some of the important methods for constructing hash functions:

Mid-square
Division (common)
Folding
Digit Extraction
Rotating Hash

Mid-square

Square the string or identifier and then using an appropriate number of bits from the middle of the square to obtain the hash key.

Steps:

Square the value of the key (k^2)
Extract the middle n bits of the result obtained in step 1

If the key is a string, it is converted to a number.

Division

It is the most simple method of hashing an integer x
Divide the string or identifier by using the modulus operator

Folding

The folding method works in two steps

Divide the key value into a number of parts where each part has the same number of digits except the last part which may have lesser digits than the other parts.
Add the individual parts. That obtains the sum of part1 + part2 +... + part n. The hash value produced by ignoring the last carry, if any.

Digit Extraction

A predefined digit of the given number is considered as the hash address

Rotating Hash

Use any hash method
After getting the hash code or address from the hash method, do a rotation
Rotation is performed by shifting the digits to get a new hash address

Collision

Since a hash function gets us a small number for a key which is a big integer or string, there is a possibility that two keys result in the same value. The situation where a newly inserted key maps to an already occupied slot in the hash table is called collision and must be handled using some collision handling technique.

There are two general ways to handle collisions:

Linear Probing: search the next empty slot and put the string there
Chaining: Put the string in a slot as a chained list (linked list)

Linear Probing

Linear probing is a scheme in computer programming for resolving collisions in hash tables, data structures for maintaining a collection of key-value pairs and looking up the value associated with a given key. In linear probing, we linearly probe for the next slot. Linear probing has a bad search complexity if there are a lot of collisions.

Let us consider a simple hash function as “key mod 7” and sequence of keys as 50, 700, 76, 85, 92, 73, 101.

Chaining

In chaining, we store each string in a chain (linked list). So if there is a collision, we only need to iterate on that chain.

Let us consider a simple hash function as “key mod 7” and sequence of keys as 50, 700, 76, 85, 92, 73, 101.

Hashing in Blockchain

Blockchain technology is without a doubt, one of the most defining technological innovations of our times. It has defined how digital transactions are verified and stored through the use of Distributed Ledger Technologies (DLT). But to understand how blockchain works in cryptocurrency, we need to have in mind one basic concept: hashing.

Hashing in blockchain refers to the process of having an input item of whatever length reflecting an output item of a fixed length. If we take the example of blockchain use in cryptocurrencies, transactions of varying lengths are run through a given hashing algorithm, and all give an output that is of a fixed length. This is regardless of the length of the input transaction. The output is what we call a hash.

A hash function, will take any transaction/data input and rehash it to produce an output of a fixed size. The process of using a given hash function to process a transaction is called hashing. The transactional output of that given hash function is what we call a hash. And that should be it. There is more we need to expound on to demystify hashing in the blockchain. At this point, I want to emphasize that it is good to remember that the basic characteristic of any given hash function lies in the size of its output. This is what gives us the different hash functions.

Cryptographic Hash Function

SHA 256: an output of a 256-bit hash and currently in use on the Bitcoin network
Keccak-256: an output of a 256-bit hash; currently in on the Ethereum network

The cryptographic hash function is an integral part of the blockchain innovation. It is essentially a feature that gives security capabilities to the processed transactions, making them immutable. Hashing is also at the center of “Merkle Trees”, which is an advanced approach to blockchain hashing. It is useful in issues of scalability, and mobile/light wallets.

Search This Blog

Data Structure