A recent post to the Java Subreddit discussed creating a high performance counter [sadly now deleted (cached copy)] for gather statistics. Unfortunately it contains a number of issues.
The first issue to resolve is that CounterConcurrentHashMap
extends ConcurrentHashMap
, rather than encapsulating the map. This violates the “favour composition over inheritance” advice. The main problems are that the counter doesn’t have a “is a” relationship with a concurrent hash map, it exposes implementation details, and it exposes a much larger API than is required. The last one is a problem because it means that a bad actor can interfere with your underlying data. Classes should only expose what they need to. So let’s rewrite CounterConcurrentHashMap
to fix that.
In this version, we no longer extend the ConcurrentHashMap
and instead have a private field for the map. This requires updating the call to put()
to use the field rather than a local method. Additionally, we’ve added a getCount()
method to get the current value of the counter for a particular key. We return an int
rather than the MutableInteger
, because we don’t want external users modifying the data behind our backs. At this point, we could make MutableInteger
an inner class of CounterConcurrentHashMap
. We should probably also rename CounterConcurrentHashMap
, but we’ll keep the name for the rest of the article.
You’ll also notice that we use getOrDefault()
, which protects us from a NPE
at the expense of creating a new object and not indicating that there’s no such key. Unknown keys return 0 rather than throwing an exception or null. The best alternative is shown below.
I don’t recommend using a Map.containsKey()
check instead of the null check, as a future improvement to the class might be to remove counters for a key, which could result in the containsKey()
check passing, but the get()
returning null. The above approach prevents these issues by only performing one operation.
The next issue is that the code creates a new MutableInteger
for every count. As this was the explicit reason for creating the MutableInteger
class rather than use Integer, this is somewhat ironic. We can resolve this by using Map.computeIfAbsent()
instead of put()
. Map.computeIfAbsent()
looks to see if a value exists for the the key, and if not calls the lambda to create the value, before returning the value that either already existed or was created. With ConcurrentHashMap
, this is an atomic operation, but isn’t guaranteed to be in other implementations.
In this version, we’ve completely removed initValue
and only create a new MutableInteger
object if we don’t already have the key. This means that we need to update oldValue
and then return the result.
This brings us to by far the biggest issue with the original code. It’s not thread-safe. The issue is that we don’t have an atomic increment of the count; we get the value and then set it. If two threads try to increment the same counter, it’s possible that they both call the get()
before either calls the set()
, meaning we have lost updates. This can be fixed by implementing an atomic increment in MutableInteger
, but that would be reinventing the wheel. Java has shipped AtomicInteger
(and friends) since Java 5. We can replace every instance of MutableInteger
with that class:
AtomicInteger
‘s no-argument constructor creates an AtomicInteger
with a value of 0;
We can shorten the up()
method though by removing intermediate variables.
We do have an alternative method to atomically update the value, while still using MutableInteger
by using the ConcurrentMap.compute()
method. Returning to the map of MutableIntegers
, we can write:
This moves the locking to ConcurrentHashMap
, as the compute()
method is atomic. This does require modifying MutableInteger.set()
to return a copy of this
.
However, there is one final improvement that can be made . If we expect to update the counter from many different threads, but only read the result infrequently, we can improve performance by using a LongAdder
instead. A LongAdder
uses an array of values, with one entry in the array per thread, allowing each thread to independently update the counter. The result can be queried by calling the relatively expensive sum()
method.
We’ve had to make a few changes to the API though. Most importantly, up()
now returns void rather than the current count. Less importantly, getCount()
returns a long
due to using the LongAdder
. We could have truncated the value ourselves if we wanted to keep that method returning an int.
So there we have it: A shorter class that performs better, creates fewer objects, is more secure and most importantly, is thread-safe.