A HashMap is a data structure that stores key-value pairs, where the key is used to locate the value quickly. The HashMap class in Java is implemented using a hash table, which is a data structure that uses a hash function to map keys to indices in an array.
When a key-value pair is added to a HashMap, the hash function is used to calculate the index in the array where the value should be stored. This is called the bucket. If two or more keys have the same hash value, they are said to have collided and are stored in the same bucket in the form of a linked list.
When a value is retrieved from a HashMap, the hash function is used again to calculate the index of the bucket where the key-value pair should be located. The HashMap then checks all the keys in that bucket to find the correct value.
The time complexity of the basic operations in a HashMap is as follows:
get: O(1) average and O(n) worst case put: O(1) average and O(n) worst case remove: O(1) average and O(n) worst case It is important to note that the performance of a HashMap depends on the quality of the hash function used to map the keys to indices, and on the load factor, which is the ratio of the number of elements in the map to the number of buckets. A good hash function and a low load factor can help improve the performance of a HashMap.
HashMap is not thread safe, if multiple threads try to access the same HashMap object and modify it at the same time, it may lead to unexpected behavior, it's better to use ConcurrentHashMap in concurrent environments.
In Java 8, the HashMap implementation has a feature called "treeification". When a bucket in the HashMap contains a certain number of elements, the bucket is dynamically replaced with an ad-hoc implementation of TreeMap to improve performance. This is done to maintain a good balance between space and time complexity.
The idea behind this is that as the number of elements in a bucket increases, the performance of the HashMap degrades and the probability of collisions increases. This results in longer linked lists and longer search times. By replacing the bucket with a TreeMap, which has a logarithmic time complexity for search operations, the performance of the HashMap is improved.
The threshold for when a bucket is replaced with a TreeMap is called the TREEIFY_THRESHOLD and it is 8 by default, which means when a bucket has more than 8 elements, it is replaced with a TreeMap.
It's important to note that this feature of HashMap, improves the performance of the HashMap in certain cases, but it doesn't change its average time complexity, which is still O(1) for get and put operations and O(n) in the worst case scenario.
Uses linked list, but it is not the same java.util.LinkedList implementation. it's a custom implementation with a simplified structure to handle collisions in the HashMap.
Imagine, if your hashmap is poorly hashed, worst case all the element can lie in the hash bucket. In that case, your hashmap becomes a linked list, technically speaking.
The designers of the HashMap class in Java understood that the hash codes generated by the user-provided key objects may not be evenly distributed, which can lead to poor performance if many keys end up in the same bucket. To address this, the HashMap class applies an additional round of hashing on the key's hash code to create a more evenly distributed set of indices in the underlying array.
In particular, the internal hash function used by HashMap applies the following operations to the key's hash code:
It uses the bitwise AND operator to reduce the hash code to a valid index in the underlying array. It applies the multiplication method by a prime number to the hash code to ensure that the hash code is distributed uniformly among all the indices of the array. It uses the bitwise OR operator to combine the hash code with a constant value, which helps to further spread out the values in the array. By applying these additional operations, the internal hash function in HashMap increases the likelihood that the keys will be distributed evenly across the array, which improves the performance of the HashMap in the long run.
It's important to note that the implementation of the hash function used by HashMap may change between Java versions and vendors.