Simply put, a hash function is a way to map an unbounded range of values to a fixed range. For example, say you want to write down only the first 2 letters of a person's name. Your hash function will give you 2 letters back, regardless of the length of the person's name.
This is useful for grouping information and searching it later. These groups are called hash buckets.
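As a rough sketch of the bucket idea in Python (the function names `two_letter_hash` and `group_into_buckets` and the sample names are made up for illustration), the first two letters of a name can serve as the bucket key:

```python
from collections import defaultdict


def two_letter_hash(name: str) -> str:
    """Hypothetical hash: map a name of any length to its first 2 letters."""
    return name[:2].lower()


def group_into_buckets(names):
    """Group names into hash buckets keyed by their 2-letter hash."""
    buckets = defaultdict(list)
    for name in names:
        buckets[two_letter_hash(name)].append(name)
    return dict(buckets)


# Sample data, invented for this example.
buckets = group_into_buckets(["Alice", "Alan", "Bob", "Alexander", "Bonnie"])
print(buckets)

# Searching for "Alan" only needs to look inside the "al" bucket.
print("Alan" in buckets[two_letter_hash("Alan")])
```

Searching becomes cheaper because you only scan one bucket instead of the whole list.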
The modulus operator is one of the simplest hash functions.
f(x) = x % 10
f(129314) = 129314 % 10 = 4
As you can see in the previous example, you can't get 129314 back from 4, because every number ending in 4 hashes to f(x) = 4.
This makes it unsuitable for encryption. For encryption, you need to get the value back.
Hashing is not encryption.
You can, however, verify a number against a hash: hash the candidate with the same function and compare the results. Beyond that, you can only try guessing the original number from the hash.
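Here is a small Python sketch of that difference, using the modulus hash from above. The helper names `verify` and `guess` are my own, chosen for illustration:

```python
def f(x: int) -> int:
    """The modulus hash from the text: keep only the last decimal digit."""
    return x % 10


def verify(candidate: int, stored_hash: int) -> bool:
    """Check a claimed original by hashing it and comparing."""
    return f(candidate) == stored_hash


def guess(stored_hash: int, limit: int) -> list:
    """Brute force: every number below `limit` that hashes to stored_hash."""
    return [x for x in range(limit) if f(x) == stored_hash]


stored = f(129314)
print(stored)                  # 4
print(verify(129314, stored))  # True
print(verify(129315, stored))  # False
print(guess(stored, 50))       # [4, 14, 24, 34, 44]
```

Verification gives a clear yes or no, while guessing only narrows things down to a set of equally plausible candidates.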
A cryptographic hash function is designed with collision resistance in mind. The very property of grouping things together in hash buckets is lost with collision resistance.
Cryptographic hash functions are designed to make a large change in the hash for a slight change in the input value.
SHA256(100) = "ad57366865126e55649ecb23ae1d48887544976efea46a48eb5d85a6eeb4d306"
SHA256(101) = "16dc368a89b428b2485484313ba67a3912ca03f2b2b42429174a4f8b3dc84e44"
The hash in the above example is the hexadecimal encoding of the 256-bit digest, 64 hex characters long. You can try this yourself with an online SHA-256 generator.
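If you would rather try it locally, here is a minimal sketch with Python's standard `hashlib`. It assumes the numbers are hashed as their decimal text ("100" and "101"), and the helper `differing_bits` is my own addition for counting how many of the 256 bits changed:

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


h100 = sha256_hex("100")
h101 = sha256_hex("101")

print(h100)
print(h101)
print(len(h100) * 4, "bits in each digest")
print(differing_bits(h100, h101), "bits differ between the two digests")
```

With a well-designed hash you would expect roughly half of the bits to flip even though the inputs differ by a single character.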
You can't reverse such a hash because the domain is too large and the hash value is too small (usually 64 or 128 characters long). It's like trying to get the document back from the shredder. These hash functions are also called one-way hash functions.
Let's see how these hash functions are useful.
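Convert the message to a short digest with a hash function. The classic example is MD5, although it is no longer considered collision resistant, so SHA-256 is the safer choice today. Sign the digest with your private key and send it to the receiver, who verifies it with your public key. See "Sign & Verify Digital Signatures with ECDSA".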
Amazon uses a related scheme, a keyed hash (HMAC) over a canonical string, to authorize REST API calls.
Authorization = "AWS" + " " + AWSAccessKeyId + ":" + Signature;
Signature = Base64( HMAC-SHA1( YourSecretAccessKeyID, UTF-8-Encoding-Of( StringToSign ) ) );
StringToSign = HTTP-Verb + "\n" +
    Content-MD5 + "\n" +
    Content-Type + "\n" +
    Date + "\n" +
    CanonicalizedAmzHeaders +
    CanonicalizedResource;
CanonicalizedResource = [ "/" + Bucket ] +
    <HTTP-Request-URI, from the protocol name up to the query string> +
    [ subresource, if present. For example "?acl", "?location", "?logging", or "?torrent"];
CanonicalizedAmzHeaders = <described below>
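The Signature line above can be sketched with Python's standard `hmac`, `hashlib` and `base64` modules. This is only an illustration of the formula, not Amazon's client code: the key, the access key ID, the date and the resource path are placeholders I made up, and CanonicalizedAmzHeaders is assumed to be empty.

```python
import base64
import hashlib
import hmac


def sign(secret_key: str, string_to_sign: str) -> str:
    """Base64( HMAC-SHA1( secret_key, UTF-8( string_to_sign ) ) )."""
    mac = hmac.new(secret_key.encode("utf-8"),
                   string_to_sign.encode("utf-8"),
                   hashlib.sha1)
    return base64.b64encode(mac.digest()).decode("ascii")


def verify(secret_key: str, string_to_sign: str, signature: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(secret_key, string_to_sign), signature)


# Placeholder values, invented for this example (not real credentials).
access_key_id = "EXAMPLE_ACCESS_KEY_ID"
secret_key = "example-secret-key"

string_to_sign = "\n".join([
    "GET",                              # HTTP-Verb
    "",                                 # Content-MD5 (empty here)
    "",                                 # Content-Type (empty here)
    "Tue, 27 Mar 2007 19:36:42 +0000",  # Date
    "/example-bucket/photo.jpg",        # CanonicalizedResource (no Amz headers)
])

signature = sign(secret_key, string_to_sign)
authorization = "AWS" + " " + access_key_id + ":" + signature
print(authorization)
print(verify(secret_key, string_to_sign, signature))
```

The server holds the same secret key, rebuilds the string from the request it received, and accepts the call only if its own HMAC matches the one you sent.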
The same building blocks (message digests, signatures and keyed hashes) are also used in SSL.
Let's say your password is stored in the database as a hash. Since the hash is irreversible, the people operating the database can't read your password.
But if the hash is not collision resistant, an attacker can enter another password, not even closely related to yours, that hashes to the same value and gives him access to your account.
So the more hash collisions a function produces, the easier a brute-force attack becomes.
Cryptographic hash functions are better here: collisions still exist in principle, but finding one is computationally infeasible, whereas the simple modulus function collides for every number with the same last digit. The bcrypt algorithm generates a 192-bit password hash by encrypting three 64-bit blocks using a password-derived Blowfish key. Bcrypt is a one-way hash function, and it is preferred for storing passwords because it is deliberately slow, which slows down brute-force attacks.
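To make the store-and-check flow concrete, here is a minimal sketch. Note that it does not use bcrypt: bcrypt is not in Python's standard library, so I swapped in PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in slow hash. The iteration count and the function names are assumptions for illustration only.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor (an assumption): more iterations means each
# guess costs an attacker more time.
ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)


# The database keeps only the salt and the digest, never the password itself.
salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))
print(check_password("password123", salt, stored))
```

The login check never reverses the hash. It hashes the attempt the same way and compares the results.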
Because a hash is a fixed-length value, you can represent records of any length by their hashes. Since these are fixed-length records, this saves storage space and helps in navigation.
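A quick sketch of the fixed-length property, using SHA-256 from Python's `hashlib`. The two sample records and the `index` dictionary are invented for this example:

```python
import hashlib

# Two records of very different sizes, invented for this example.
records = [b"a", b"a much longer record " * 1000]

# Every SHA-256 digest is 32 bytes, whatever the input size.
digests = [hashlib.sha256(record).digest() for record in records]

print([len(record) for record in records])  # [1, 21000]
print([len(digest) for digest in digests])  # [32, 32]

# A fixed-length key makes it easy to build an index over the records.
index = {digest: position for position, digest in enumerate(digests)}
print(index[hashlib.sha256(b"a").digest()])  # 0
```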
Store only the cryptographic hash if you will never need the original data again.
That's because you will never get any part of the original back. Storing just the hash amounts to a 100% loss of the data.
Let's say you give your friend a number and ask him to multiply it by any number he likes, such that the product has a certain pattern in it.
For example, you give your friend 130312 and expect him to give you back a product with 555 after the first 3 digits.
130312 x ? = XXX555XXX
Well, it's easy for him. He can take any number with that pattern, divide it by 130312, and give you the answer back.
But he can't use that shortcut if you ask him to hash the product and want the pattern to appear in the hash instead.
SHA256( 130312 x ? ) = XXX555XXX
Now you are sure your friend has tried a lot of numbers to multiply before he got the hash function to give him that pattern. That is proof of work to you.
You know he can't cheat. You can trust him this time.
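Here is a rough Python sketch of what your friend has to do. It assumes the product is hashed as its decimal text and that "after the first 3 digits" means hex characters 4 to 6 of the digest. Both are my own reading of the puzzle, and the function names are made up for illustration:

```python
import hashlib
from typing import Optional


def sha256_hex(n: int) -> str:
    """Hash the decimal text of n and return the hex digest."""
    return hashlib.sha256(str(n).encode("ascii")).hexdigest()


def has_pattern(digest: str, pattern: str = "555", offset: int = 3) -> bool:
    """True if `pattern` appears right after the first `offset` characters."""
    return digest[offset:offset + len(pattern)] == pattern


def find_multiplier(base: int, max_tries: int = 1_000_000) -> Optional[int]:
    """Try multipliers 1, 2, 3, ... until the hash of the product fits."""
    for multiplier in range(1, max_tries + 1):
        if has_pattern(sha256_hex(base * multiplier)):
            return multiplier
    return None  # gave up within max_tries


multiplier = find_multiplier(130312)
print(multiplier, sha256_hex(130312 * multiplier))

# Checking the answer takes a single hash, however long the search took.
print(has_pattern(sha256_hex(130312 * multiplier)))
```

Finding the multiplier takes thousands of attempts on average, while checking it takes one hash. That asymmetry is what makes the work provable.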
Proof of work is at the heart of cryptocurrencies. Bitcoin calls this process mining: finding the bits required to make a block produce a hash with the desired pattern.
SHA256( Block XOR Magic Number ) = Hash with pattern
I won't go into the details of that, but a cryptographic hash can be used for puzzle solving. The good thing is that there is no formula to solve this puzzle. You just have to keep trying numbers until you find one that, combined with the transactions, produces the answer to the puzzle.
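To get a feel for the trial-and-error, here is a toy mining loop in Python. It is a deliberately simplified sketch and not Bitcoin's actual procedure: the block is a made-up byte string, the number is appended to the block rather than combined as in the formula above, and the desired pattern is assumed to be a run of leading zeros in the hex digest.

```python
import hashlib


def mine(block: bytes, zero_hex_digits: int, max_nonce: int = 10_000_000):
    """Try nonces 0, 1, 2, ... until the hash starts with enough zeros."""
    prefix = "0" * zero_hex_digits
    for nonce in range(max_nonce):
        digest = hashlib.sha256(block + str(nonce).encode("ascii")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
    raise RuntimeError("no nonce found within max_nonce tries")


# A made-up "block" of transactions, invented for this example.
block = b"alice pays bob 5 coins;"

# Each extra zero digit makes the puzzle about 16 times harder on average.
for difficulty in (1, 2, 3, 4):
    nonce, digest = mine(block, difficulty)
    print(difficulty, nonce, digest[:16])
```

The nonce printed for each difficulty is the number of failed attempts before the first success, which is why a stricter pattern is a stronger proof of work.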