The Low-Level Mechanics of C# Dictionary Lookups

Dictionary<TKey, TValue> feels simple from the outside. You give it a key, it gives you a value.

var customer = customersById[customerId];

That one line hides a fair bit of work. Usually, it’s extremely fast, which is why dictionaries show up everywhere in .NET code. They’re used for caches. They’re used for quick lookups. They’re often sitting underneath routing, configuration, and in-memory state. They feel boring, but they quietly support a lot of application behaviour. But the interesting part is this, a dictionary lookup is not fast because it scans cleverly. It is fast because it tries very hard to avoid scanning at all.

The rough shape is this. Take the key. Turn it into a hash code. Use that hash code to pick a bucket. Use the bucket to jump into an internal entries array. Check whether the entry is actually the key you asked for. If it is, return the value. If it is not, follow a short chain of collided entries until the key is found or the chain ends.

That is the whole trick.

The public API hides the lookup path

Most of the time, we access a dictionary through the indexer:

var value = dictionary[key];

That looks like array access, but it is doing a keyed lookup. If the key exists, you get the value. If the key does not exist, the indexer throws KeyNotFoundException. Thats fine when missing keys are exceptional. Its a poor fit when missing keys are expected.

if (dictionary.TryGetValue(key, out var value))
{
    Use(value);
}

TryGetValue is the better shape when absence is normal. It avoids using exceptions as control flow and keeps the lookup explicit. It also avoids the common double lookup pattern:

if (dictionary.ContainsKey(key))
{
    var value = dictionary[key];
}

That code looks standard, but it usually asks the dictionary the same question twice. First, "is this key here?" Then, "get me the value for this key." When you already need the value, TryGetValue says both things in one pass.

The dictionary is built around buckets and entries

Internally, a modern .NET dictionary is built around two main arrays. One array stores buckets. The other stores entries. The bucket array does not store your values directly. It stores indexes into the entries array. The entries array is where the actual key/value records live. Each entry carries enough information for lookup: the stored hash code, the key, the value, and a next field used when multiple keys land in the same bucket.

Conceptually, it looks like this:

buckets
+---+---+---+---+---+
| 0 | 3 | 0 | 1 | 0 |
+---+---+---+---+---+
      |       |
      v       v
entries
+-------+------+-------+--------+
| hash  | next | key   | value  |
+-------+------+-------+--------+
| ...   | -1   | A     | ...    |
| ...   |  0   | B     | ...    |
| ...   | -1   | C     | ...    |
+-------+------+-------+--------+

The real implementation has details that are there for performance and safety, but the idea is straightforward. A bucket points at an entry. If there was a collision, that entry can point at another entry. The dictionary walks that chain until it finds the matching key. One implementation detail worth knowing is that modern .NET stores bucket indexes internally using a one-based value. That lets zero mean “empty bucket”. So when the runtime reads a bucket value, it subtracts one before indexing into the entries array.

You do not need to care about that for normal app code. But it is a useful reminder that these collections are tuned hard. Even small layout choices matter when the type is used everywhere.

A lookup starts with `GetHashCode`

When you ask for a key, the dictionary first needs a hash code.

var hashCode = comparer.GetHashCode(key);

That hash code is not the final answer. It is just a fast way of narrowing the search area. The dictionary uses the hash code to choose a bucket. If the dictionary has many buckets and the hash function spreads keys well, most buckets have either no entry or a very short chain of entries. That is where the usual "near O(1)" behaviour comes from. The important word there is "usual". A dictionary lookup is not guaranteed to be one comparison. It depends heavily on the quality of the key’s hash code and equality behaviour. A good hash code spreads different keys across different buckets. A bad hash code dumps too many unrelated keys into the same bucket. This is why key design is important.

Hash code match is not enough

A common misunderstanding is that the hash code identifies the key. It doesnt. Different keys can produce the same hash code. Thats called a collision, and it is expected. The dictionary must still call equality to prove the key is really the key you asked for. So the lookup has two checks. First, it checks whether the stored hash code matches the requested key’s hash code. If the hash code does not match, the key cannot be equal. Then, if the hash code does match, it checks equality.

entry.hashCode == hashCode && comparer.Equals(entry.key, key)

That ordering is deliberate. Integer comparison is cheap. Equality can be more expensive, especially for strings or custom objects. The hash code acts as a quick filter before the dictionary pays for the stronger equality check.

Collisions turn one jump into a short walk

Imagine three different keys all land in the same bucket.

bucket[5] -> entry 12 -> entry 7 -> entry 3

A lookup for a key in that bucket starts at the first entry. If it is not the right key, it follows next. Then it checks the next entry. Then the next.

In a healthy dictionary, this chain is short. In a bad dictionary, it can get long. When it gets long, your supposedly constant time lookup starts behaving more like a small linear search inside a bucket.

That is why this kind of key is toxic:

public sealed class BadKey
{
    public int Id { get; }
    public BadKey(int id)
    {
        Id = id;
    }

    public override int GetHashCode() => 42;
    public override bool Equals(object? obj)
    {
        return obj is BadKey other && other.Id == Id;
    }
}

The equality implementation is fine. The hash code is the problem. Every key goes into the same bucket group. The dictionary still works, but it loses the thing that made it fast. A better key gives the dictionary useful distribution:

public sealed class BetterKey
{
    public int Id { get; }
    public BetterKey(int id)
    {
        Id = id;
    }

    public override int GetHashCode() => Id;
    public override bool Equals(object? obj)
    {
        return obj is BetterKey other && other.Id == Id;
    }
}

In real code, you would usually lean on records, tuples, or HashCode.Combine for multi-field keys:

public sealed class CustomerRegionKey
{
    public int CustomerId { get; }
    public string Region { get; }
    public CustomerRegionKey(int customerId, string region)
    {
        CustomerId = customerId;
        Region = region;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(CustomerId, Region);
    }

    public override bool Equals(object? obj)
    {
        return obj is CustomerRegionKey other &&
               other.CustomerId == CustomerId &&
               other.Region == Region;
    }
}

The key point is not that you should hand-write this everywhere. It is that Equals and GetHashCode are part of the lookup cost. If they are wrong, slow, or unstable, the dictionary pays for it.

Mutable keys are a quiet way to break lookups

The worst dictionary bug is not always a bad hash function. Sometimes it is a key that changes after insertion.

var key = new MutableKey { Id = 123 };
var dictionary = new Dictionary<MutableKey, string>();
dictionary[key] = "stored value";

key.Id = 456;

Console.WriteLine(dictionary.ContainsKey(key));

If Id is used by GetHashCode or Equals, changing it after insertion can make the key unreachable. The object is still in the dictionary. The dictionary has not lost it. But the key now leads the lookup to the wrong bucket. Thats a miserable bug because the collection looks corrupt when the real problem is the key contract.

Dictionary keys should be stable for as long as they are in the dictionary. If a key needs to change, remove the old key and insert a new one. Better yet, use immutable key types.

Strings deserve special care

String keys are common, and they bring another important detail, comparison rules.

This dictionary:

var routes = new Dictionary<string, Handler>();

uses the default string equality comparer. That may be exactly what you want. But if the key is a protocol value, route, header name, identifier, code, or anything else that should be compared using ordinal rules, it is better to be explicit:

var routes = new Dictionary<string, Handler>(StringComparer.Ordinal);

For case insensitive lookups:

var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

This is partly about correctness and partly about performance predictability. Culture aware string comparison can behave differently from ordinal comparison. Most technical keys are not natural language. They are identifiers. Treat them like identifiers.

Resizing changes the shape of the dictionary

A dictionary has capacity. That capacity is the amount its internal data structure can hold before it needs to expand. When you add enough items, the dictionary resizes. Resizing allocates larger internal arrays and redistributes entries across the new bucket layout. Existing values are still there, but the internal shape changes. For normal code, you do not worry about this. For hot paths or large dictionaries, avoid repeated growth when you already know the likely size.

var lookup = new Dictionary<int, Customer>(customers.Count);
foreach (var customer in customers)
{
    lookup[customer.Id] = customer;
}

That small constructor argument can avoid several resize steps. It is not a magic performance trick. It is just telling the collection what you already know.

Why the value is returned by reference internally

One neat implementation detail in modern .NET is that the internal lookup helper returns a reference to the value slot when it finds a key. That lets the dictionary avoid extra copying internally and gives the indexer a direct route to the stored value.

You normally see the safe public API:

var value = dictionary[key];

Inside the runtime, the lookup can work closer to the storage. If the key is found, it returns a reference to the entry’s value. If the key is not found, it returns a null reference internally and the public API decides what to do. The indexer throws. TryGetValue returns false. This is one of those details that explains why the collection is fast without needing you to write unsafe-looking application code.

The lookup path in plain English

When you call dictionary.TryGetValue(key, out value), the runtime roughly does this:

Check the key is not null.
Get a hash code using the dictionary's comparer.
Map the hash code to a bucket.
Read the first entry index from that bucket.
Check the entry's hash code.
If the hash matches, check key equality.
If equality matches, return the value.
If not, follow the entry's next pointer.
Repeat until found or the chain ends.

That is why dictionary performance usually comes down to a handful of things. The hash must be cheap enough. The hash must spread keys well. Equality must be correct. The dictionary should not be forced to resize repeatedly. And you should avoid doing two lookups when one will do.

A small benchmark worth running

If you want to make this visible, benchmark a good key against a deliberately bad key.

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<DictionaryLookupBenchmarks>();

[MemoryDiagnoser]
public class DictionaryLookupBenchmarks
{
    private readonly Dictionary<GoodKey, int> _good = new();
    private readonly Dictionary<BadKey, int> _bad = new();
    private readonly GoodKey _goodKey = new(9_999);
    private readonly BadKey _badKey = new(9_999);

    public DictionaryLookupBenchmarks()
    {
        for (var i = 0; i < 10_000; i++)
        {
            _good[new GoodKey(i)] = i;
            _bad[new BadKey(i)] = i;
        }
    }

    [Benchmark]
    public int GoodKeyLookup()
    {
        return _good[_goodKey];
    }

    [Benchmark]
    public int BadKeyLookup()
    {
        return _bad[_badKey];
    }
}

public readonly record struct GoodKey(int Id)
{
    public override int GetHashCode() => Id;
}

public readonly record struct BadKey(int Id)
{
    public override int GetHashCode() => 42;
}

This is an artificial benchmark, but that is the point. It isolates the failure mode. Both keys can be compared correctly. Only one gives the dictionary a useful hash distribution. You will usually see the bad key version fall apart as the dictionary grows, because lookup has to walk through a long collision chain.

What this means in production code

Most dictionary use does not need cleverness. Use Dictionary<TKey, TValue> and move on. It is well-tuned, heavily used, and much faster than hand-rolled lookup code in almost every normal case. The problems usually come from the edges. A custom key type with poor GetHashCode can quietly turn fast lookups into slow ones. A mutable key can make values look like they have disappeared. A ContainsKey followed by the indexer can do duplicate work. A string dictionary without an explicit comparer can hide comparison rules that matter later. A large dictionary built one item at a time without capacity can pay for unnecessary resizing.

None of these mean dictionaries are dangerous. They mean dictionaries are real data structures, not magic syntax. The nice thing is that once you understand the lookup path, the advice becomes boring and practical. Make keys stable. Make equality correct. Use a comparer intentionally. Prefer TryGetValue when missing keys are normal. Pre-size when you know the size. And when a dictionary shows up in a hot path, look at the key before you blame the collection.

The Low-Level Mechanics of C# Dictionary Lookups

The public API hides the lookup path

The dictionary is built around buckets and entries

A lookup starts with `GetHashCode`

Hash code match is not enough

Collisions turn one jump into a short walk

Mutable keys are a quiet way to break lookups

Strings deserve special care

Resizing changes the shape of the dictionary

Why the value is returned by reference internally

The lookup path in plain English

A small benchmark worth running

What this means in production code

Comments

More from this blog

Building a Bytecode VM in C#

Cache Lines: The Invisible Boundary Your .NET Code Keeps Crossing

MemoryCache finally gets proper visibility in .NET 11

Expression Trees vs DSL Trees in .NET

Command Palette

The public API hides the lookup path

The dictionary is built around buckets and entries

A lookup starts with GetHashCode

Hash code match is not enough

Collisions turn one jump into a short walk

Mutable keys are a quiet way to break lookups

Strings deserve special care

Resizing changes the shape of the dictionary

Why the value is returned by reference internally

The lookup path in plain English

A small benchmark worth running

What this means in production code

Comments

More from this blog

A lookup starts with `GetHashCode`