February 15, 2014

HashMap vs. equals()


You might face exam questions asking what happens when adding certain objects to a map. For this you have to understand what the equals() and hashCode() methods do, what is the contract of the Map collection and how its widely used implementation HashMap works.


Hash code and equals


The first tricky thing is that creating objects with the same "content" will not automatically yield the same hash code or or the same equality by the equals() method.

@Test
public void testEqualsAndHashCode() {

    class StringHolder {

        String string;

        public StringHolder(final String string) {
            this.string = string;
        }
    }

    StringHolder s1 = new StringHolder("a");
    StringHolder s2 = new StringHolder("a");

    Assert.assertFalse(s1 == s2);
    Assert.assertFalse(s1.equals(s2));
    Assert.assertFalse(s1.hashCode() == s2.hashCode());
}

If you want that the objects contained by the class determine the equality and the hash code of the class, you have to implement the equals() and the hashCode() methods. (And actually both of them, see at the end.)

Map


Map Interface


You might expect that more values with the same key cannot be added to a map. And it is true, but how can a map implementation check which keys are the same? It uses a combination of
  • equals()
  • hashCode()
This is what the contract of the Map interface says:
"This specification should not be construed to imply that [...] will cause key.equals(k) to be invoked for any key k. Implementations are free to implement optimizations whereby the equals invocation is avoided, for example, by first comparing the hash codes of the two keys."

HashMap


To understand what happens in the next example, you have to check out the implementation of the HashMap class, and see the following:
  • HashMap is an array of linked lists ("buckets").
  • One bucket (i.e. one array index) corresponds to one range of hash codes.
So what happens when calling put(key, value)?
  1. The hashCode() method will be called first to determine the proper bucket. This is why the HashMap offers a fast access to an element.
  2. Inside the bucket it walks through the linked list (since no other choice) and call equals() to find, whether the key is already existing in the map.

Example


The example shows indeed that not only the equals() method will be used to check the key equality.

Keys that are equal by the equals() method will be added, because they accidentally fall into different buckets. There might be more tricky examples where they would go into the same bucket and so override each-other.

@Test
public void testHashMap() {

    class AdjustableKey {

        int equals;
        int hashCode;

        public AdjustableKey(final int equals, final int hashCode) {
            this.equals = equals;
            this.hashCode = hashCode;
        }

        @Override
        public int hashCode() {
            return hashCode;
        }

        @Override
        public boolean equals(final Object obj) {
            return equals == ((AdjustableKey) obj).equals;
        }
    }

    final Map<AdjustableKey, String> map = new HashMap<AdjustableKey, String>();

    map.put(new AdjustableKey(1, 1), "a11"); // New
    map.put(new AdjustableKey(1, 2), "a12"); // New because of different hashCode
    map.put(new AdjustableKey(2, 1), "a21"); // New because of different equals
    map.put(new AdjustableKey(2, 2), "a22"); // New because of different hashCode
    map.put(new AdjustableKey(2, 2), "b22"); // Replaces a22

    Assert.assertEquals(4, map.size());
}

Outlook


When implementing the equals() or hashCode() methods always implement them both. See:
Implementations of the Set interface often use a Map implementation inside. The objects in the Set will become the keys of the backing Maps. So you can expect the same adding behavior for them as seen in the above example. For example:
  • HashSet -> HashMap
  • TreeSet -> TreeMap

Warning


I discovered that this article is not about a realistic situation. It suggests that "equal keys can be added to a Map multiple times", but this is not true in general, do not memorize it as a rule. It is limited only to show how Maps are working.

In the above experimental example it is possible only because the keys do not obey the contract of the hashCode() and equals() methods, which states that "equal objects must have equal hash codes".

When implementing the equals() and hashCode() methods in the keys correctly the above situation never happens. Equals keys can be added only once, just in the case of key "b22" in the example.

No comments :

Post a Comment