Skip to main content

Size Maps When Possible

Java stores Maps as entries into buckets of singly linked lists or further buckets. Its max storage is constrained by the JVMs memory allocation, but theoretically can hold up to 231 values. Resizing a Map is an expensive operation, as existing buckets may have to be redistributed, or new buckets may need to be created. Dependent on the number of insertions to the Map, this can happen many times, requiring a large number of repositions.

Some key terms

  • Capacity: The number of buckets (length of the array of nodes) in the hash table. The initial capacity is set to 16 (24) and doubled each time it is reached.
  • Load factor: The measure that decides when to increase the capacity of the Map. By default, this is set to 0.75 ( 75%) of capacity.

Indexes

Map indexes in Java are initialised and extended in powers of 2 - and the load factor decides how many elements can be inserted before the map must be resized. Given a load factor of 0.75, for an initial capacity of 16, up to 12 values can be inserted before a resize occurs.

If you know that you will be inserting at least 40 values, it is optimal to select an initial capacity (of 64) that won't require a resize:

Exponent (n)Capacity (2n)Resize at
41612
53224
66448

Benchmark

Definition

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 200, time = 1, timeUnit = TimeUnit.MILLISECONDS)
@Fork(1)
@State(Scope.Benchmark)
public class HashMapBenchmark {

private Map<Integer, Integer> map;

@Param({"8", "16", "32", "64"})
public int capacity;

@Param({"8", "16", "32"})
public int numberOfInserts;

@Setup
public void setup() {
map = new HashMap<>(capacity);
}

@Benchmark
public Map<Integer, Integer> insert_to_hash_map() {
for (int i = 0; i < numberOfInserts; i++) {
map.put(i, i);
}
return map;
}
}

Results

As can be seen from the results below - for an initial map capacity of 8, inserting 32 keys takes 11.5% longer than the same amount of inserts to a 32 capacity map.

BenchmarkCapacityNo. of InsertsModeCountScoreErrorUnits
HashMapBenchmark.insert_to_hash_map88avgt20099.360± 3.490ns/op
HashMapBenchmark.insert_to_hash_map816avgt200198.932± 9.344ns/op
HashMapBenchmark.insert_to_hash_map832avgt200400.597± 15.615ns/op
HashMapBenchmark.insert_to_hash_map168avgt20093.933± 3.532ns/op
HashMapBenchmark.insert_to_hash_map1616avgt200198.945± 7.892ns/op
HashMapBenchmark.insert_to_hash_map1632avgt200378.605± 15.225ns/op
HashMapBenchmark.insert_to_hash_map328avgt20094.030± 3.626ns/op
HashMapBenchmark.insert_to_hash_map3216avgt200190.595± 7.072ns/op
HashMapBenchmark.insert_to_hash_map3232avgt200358.341± 15.057ns/op
HashMapBenchmark.insert_to_hash_map648avgt20092.010± 4.958ns/op
HashMapBenchmark.insert_to_hash_map6416avgt200187.479± 8.061ns/op
HashMapBenchmark.insert_to_hash_map6432avgt200359.244± 13.924ns/op

Although this is not always possible at runtime, ensuring to handle the cases where it is will speed up your Java functions.