Building a QUIC service in .NET with MsQuic

If you only ever touch QUIC through HTTP/3, it is easy to miss how different the transport really is. HTTP keeps you thinking in requests, responses, headers, and status codes. MsQuic forces you to think in connections, streams, flow control, and lifetimes. Once you accept that shift, you can build systems that simply do not map cleanly onto HTTP at all.

I wrote about MsQuic a few months ago here. This time we are going to build a real QUIC-based service in .NET using MsQuic directly. The service is stateful, supports multiple concurrent bidirectional streams per client, and allows clients to reconnect and resume logical sessions. Along the way, we will look at concrete code patterns that actually work under load, not just demos that compile.

This assumes you already understand async and await, memory ownership, TLS, and basic networking. We will focus on how MsQuic changes the shape of the code you write.

Starting with the right mental model

The most important thing to understand before writing any code is that MsQuic is callback driven and aggressively concurrent. You do not call ReadAsync and wait. MsQuic calls you and tells you that something happened. Your job is to translate those events into a form that your application logic can consume safely.

If you try to fight this and pretend MsQuic is just another stream abstraction, you will lose. You should treat MsQuic as an event source and build a thin translation layer that feeds async-friendly primitives such as Channels or Pipes.

Once you adopt that model, the rest of the design starts to fall into place.
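As a concrete illustration of that translation layer, here is a minimal sketch. The event shape and names are hypothetical; the point is that callbacks publish synchronously while application code drains asynchronously:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;

// Hypothetical event type representing "MsQuic told us something happened".
public sealed record TransportEvent(long StreamId, byte[] Payload);

public sealed class EventBridge
{
    // Unbounded for brevity; a bounded channel lets you surface backpressure
    // instead of buffering without limit.
    private readonly Channel<TransportEvent> _events =
        Channel.CreateUnbounded<TransportEvent>();

    // Called from MsQuic callbacks: synchronous, never blocks.
    public void Publish(TransportEvent evt) => _events.Writer.TryWrite(evt);

    // Consumed by ordinary async application code.
    public IAsyncEnumerable<TransportEvent> ReadAllAsync(CancellationToken ct = default)
        => _events.Reader.ReadAllAsync(ct);

    public void Complete() => _events.Writer.Complete();
}
```

The callback side stays cheap and non-blocking, while the consuming side is plain async code.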

Initialising MsQuic correctly in a .NET host

MsQuic has two global concepts you must get right from the beginning: the registration and the configuration. These are not cheap objects, and they are not request scoped. They belong at the same level as your host itself.

In a typical .NET application, you would initialise these during startup and keep them alive for the lifetime of the process.

A simplified example looks like this.

using Microsoft.Quic;
using System.Net.Security;

var registration = new QuicRegistration(
    new QuicRegistrationOptions
    {
        AppName = "SuperQuicService",
        ExecutionProfile = QuicExecutionProfile.LowLatency
    });

var serverCertificate = LoadCertificate();

var configuration = new QuicConfiguration(
    registration,
    new QuicConfigurationOptions
    {
        AlpnProtocols = new[] { "super-quic" },
        MaxInboundBidirectionalStreams = 100,
        MaxInboundUnidirectionalStreams = 0,
        IdleTimeout = TimeSpan.FromMinutes(2),
        ServerAuthenticationOptions = new SslServerAuthenticationOptions
        {
            ApplicationProtocols = new List<SslApplicationProtocol>
            {
                new SslApplicationProtocol("super-quic")
            },
            ServerCertificate = serverCertificate
        }
    });

The important detail here is the intent. You are explicitly choosing stream limits. You are explicitly choosing an idle timeout. You are explicitly controlling ALPN. These decisions affect how your service behaves under stress.

Once this configuration is in use, every connection created from it inherits these characteristics. You cannot patch this later without restarting the process.

Listening for incoming connections

With configuration in place, you can create a listener. This listener accepts incoming QUIC connections and hands them to you via callbacks.

var listener = new QuicListener(
    registration,
    configuration,
    new IPEndPoint(IPAddress.Any, 5555));

listener.Start();

At this point, nothing interesting has happened yet. The real work begins when a client connects.

MsQuic will surface an incoming connection as a QuicConnection instance. You should immediately associate it with application state.

Attaching application state to a connection

When a new connection arrives, you should create a connection state object that represents everything your application knows about that peer. This is where many designs go wrong by putting too much logic into the state object itself.

A good connection state is mostly data.

sealed class ConnectionState
{
    public Guid ConnectionId { get; } = Guid.NewGuid();
    public ConcurrentDictionary<long, StreamState> Streams { get; } = new();
    public CancellationTokenSource Lifetime { get; } = new();
    public SessionState? Session { get; set; }
}

When you accept a connection, you create one of these and associate it with the QuicConnection using a GCHandle or a dictionary keyed by the connection handle.

From this point on, every stream event for this connection can find its owning state.
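One way to sketch that association is a small registry keyed by the native handle. The handle and registry are stand-ins for whatever identifier your MsQuic layer actually hands you:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical registry mapping a native connection handle to application state.
sealed class HandleRegistry<TState> where TState : class
{
    private readonly ConcurrentDictionary<IntPtr, TState> _map = new();

    // Called when MsQuic reports a new connection.
    public void Register(IntPtr handle, TState state) => _map[handle] = state;

    // Called from every subsequent callback to find the owning state.
    public bool TryGet(IntPtr handle, out TState? state)
        => _map.TryGetValue(handle, out state);

    // Called when the connection is fully closed.
    public bool Unregister(IntPtr handle) => _map.TryRemove(handle, out _);
}
```

Unregister must only run once the connection is fully closed, otherwise a late callback can look up state that no longer exists.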

Accepting and handling streams

In QUIC, streams are the unit of work. A client can open many of them concurrently, and they are independent of one another.

When MsQuic notifies you of a new incoming stream, you create a stream handler. This handler owns the lifetime of that stream and is responsible for reading and writing data.

Here is a simplified pattern that works well in practice.

sealed class StreamState
{
    public long StreamId { get; }
    public QuicStream Stream { get; }
    public ConnectionState Connection { get; }
    public Pipe Pipe { get; } = new();

    public StreamState(long streamId, QuicStream stream, ConnectionState connection)
    {
        StreamId = streamId;
        Stream = stream;
        Connection = connection;
    }
}

The key idea here is the Pipe. MsQuic delivers buffers via callbacks. Pipes give you an async reader and writer pair that integrate naturally with modern .NET code.

When MsQuic tells you data has arrived, you write it into the PipeWriter. Your application logic reads from the PipeReader at its own pace.

Translating MsQuic receive callbacks into async reads

When data arrives on a stream, MsQuic invokes a callback with one or more buffers. You must copy or reference that data and then tell MsQuic when you are done with it.

A typical receive handler might look like this.

void OnStreamDataReceived(StreamState state, ReadOnlySpan<byte> data)
{
    var writer = state.Pipe.Writer;

    writer.Write(data);

    // Blocking on the flush is deliberate in this simplified sketch: if the
    // pipe is full, the receive path stalls, which is what propagates
    // backpressure back to MsQuic. A production handler would instead return
    // a pending status and complete the receive asynchronously.
    var result = writer.FlushAsync().GetAwaiter().GetResult();

    if (result.IsCompleted)
    {
        state.Stream.ShutdownRead();
    }
}

This code looks simple, but it hides an important behaviour. If the reader is slow, FlushAsync will eventually apply backpressure. That backpressure propagates all the way back to MsQuic, which slows the sender down. You are no longer lying to the transport.

On the reading side, your application logic can now be written as straightforward async code.

async Task ProcessStreamAsync(StreamState state)
{
    var reader = state.Pipe.Reader;

    while (true)
    {
        var result = await reader.ReadAsync();
        var buffer = result.Buffer;

        if (buffer.Length > 0)
        {
            await HandleApplicationMessage(buffer);
        }

        // This consumes everything we were given. A framed protocol would
        // advance only past complete messages.
        reader.AdvanceTo(buffer.End);

        if (result.IsCompleted)
            break;
    }

    await reader.CompleteAsync();
}

This is the point where MsQuic stops feeling alien. You have turned callbacks into something that fits naturally into the async model you already understand.

Defining an application protocol on top of streams

At this point, you have raw byte streams. You still need a protocol.

In our service, the first stream a client opens is a control stream. The client sends a session token. The server responds with either a resume acknowledgement or a new session id.

A simple framing format might be length-prefixed JSON messages. That is not flashy, but it is effective.

record ControlMessage(string Type, string? Token);

async Task HandleControlStream(StreamState state)
{
    var reader = state.Pipe.Reader;

    var message = await ReadJsonMessage<ControlMessage>(reader);

    if (message.Type == "resume")
    {
        state.Connection.Session = ResumeSession(message.Token);
    }
    else
    {
        state.Connection.Session = CreateNewSession();
    }

    await SendJsonMessage(state.Stream, state.Connection.Session);
}
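The ReadJsonMessage and SendJsonMessage helpers above are left undefined. A minimal implementation might look like the following, written against PipeReader and plain Stream to stay transport-agnostic; the 4-byte little-endian length prefix is an arbitrary choice:

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;
using System.IO;
using System.IO.Pipelines;
using System.Text.Json;
using System.Threading.Tasks;

static class JsonFraming
{
    // Reads one length-prefixed JSON message, or default if the stream ends first.
    public static async Task<T> ReadJsonMessage<T>(PipeReader reader)
    {
        while (true)
        {
            var result = await reader.ReadAsync();
            var buffer = result.Buffer;

            if (TryReadFrame(ref buffer, out var payload))
            {
                // Consume exactly one frame, leave the rest for the next call.
                reader.AdvanceTo(buffer.Start);
                return JsonSerializer.Deserialize<T>(payload)!;
            }

            // Not enough data for a full frame: consume nothing, mark all examined.
            reader.AdvanceTo(buffer.Start, buffer.End);

            if (result.IsCompleted)
                return default!;
        }
    }

    static bool TryReadFrame(ref ReadOnlySequence<byte> buffer, out byte[] payload)
    {
        payload = Array.Empty<byte>();
        if (buffer.Length < 4)
            return false;

        Span<byte> prefix = stackalloc byte[4];
        buffer.Slice(0, 4).CopyTo(prefix);
        var length = BinaryPrimitives.ReadInt32LittleEndian(prefix);

        if (buffer.Length < 4 + length)
            return false;

        payload = buffer.Slice(4, length).ToArray();
        buffer = buffer.Slice(4 + length);
        return true;
    }

    // Writes one length-prefixed JSON message to any writable stream.
    public static async Task SendJsonMessage<T>(Stream stream, T message)
    {
        var payload = JsonSerializer.SerializeToUtf8Bytes(message);
        var prefix = new byte[4];
        BinaryPrimitives.WriteInt32LittleEndian(prefix, payload.Length);
        await stream.WriteAsync(prefix);
        await stream.WriteAsync(payload);
        await stream.FlushAsync();
    }
}
```

The consumed and examined positions passed to AdvanceTo are what make partial frames work: data stays buffered in the pipe until a complete message has arrived.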

Once the session is established, subsequent streams implicitly belong to it. The transport does not know or care. This is purely application logic layered on top of QUIC.

Supporting reconnection and resumability

QUIC can survive IP changes, but not process restarts. If you want resumability across reconnects, you must build it yourself.

The pattern that works is simple. Session state lives independently of connections. Connections attach to sessions.

sealed class SessionState
{
    public string Token { get; }
    public ConcurrentDictionary<long, object> LogicalState { get; } = new();
    public DateTime LastSeen { get; set; }

    public SessionState(string token)
    {
        Token = token;
        LastSeen = DateTime.UtcNow;
    }
}

When a connection drops, you do not immediately destroy the session. You mark it as detached. If a client reconnects and presents the same token within a timeout window, you reattach.

Reconnecting and reopening streams is cheap. You are not fighting TCP TIME_WAIT or connection storms.
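A sketch of that detach and reattach pattern might look like this. The store and its names are hypothetical, but the shape is the important part: sessions outlive connections, and resumption is just a token lookup inside a time window:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical session store implementing the detach/reattach pattern.
sealed class SessionStore
{
    private readonly ConcurrentDictionary<string, SessionEntry> _sessions = new();
    private readonly TimeSpan _resumeWindow;

    public SessionStore(TimeSpan resumeWindow) => _resumeWindow = resumeWindow;

    public string CreateSession()
    {
        var token = Guid.NewGuid().ToString("N");
        _sessions[token] = new SessionEntry { LastSeen = DateTime.UtcNow };
        return token;
    }

    // Called when a connection drops: keep the session, just stamp it.
    public void Detach(string token)
    {
        if (_sessions.TryGetValue(token, out var entry))
            entry.LastSeen = DateTime.UtcNow;
    }

    // Called when a reconnecting client presents a token.
    public bool TryResume(string token)
    {
        if (!_sessions.TryGetValue(token, out var entry))
            return false;

        if (DateTime.UtcNow - entry.LastSeen > _resumeWindow)
        {
            _sessions.TryRemove(token, out _);
            return false;
        }

        entry.LastSeen = DateTime.UtcNow;
        return true;
    }

    private sealed class SessionEntry
    {
        public DateTime LastSeen { get; set; }
    }
}
```

A background sweep that evicts sessions past the resume window would complete the picture; it is omitted here for brevity.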

Writing back to the client without blocking everything else

Sending data in MsQuic is also asynchronous and stream scoped. Each stream has its own flow control. Writing too much on one stream does not block others.

A typical send method.

async Task SendAsync(QuicStream stream, ReadOnlyMemory<byte> data)
{
    await stream.WriteAsync(data, endStream: false);
}

Because streams are independent, you can fire off writes on multiple streams concurrently without fear of head-of-line blocking. This is one of the core advantages of QUIC that you only feel when you work at this level.
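As a sketch of that pattern, with plain Stream standing in for the stream type used above, a broadcast helper can issue every write at once and await them together:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Plain Stream stands in here for the QuicStream wrapper used elsewhere.
async Task BroadcastAsync(IEnumerable<Stream> streams, ReadOnlyMemory<byte> data)
{
    // Each write is independent; flow control on one stream cannot stall the others.
    await Task.WhenAll(streams.Select(s => s.WriteAsync(data).AsTask()));
}
```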

Handling shutdowns cleanly

One of the hardest parts of any transport code is shutdown. Streams may still be active. Connections may be half closed. Callbacks may still be in flight.

The rule with MsQuic is simple. Never assume callbacks have stopped until the connection is fully closed. Use cancellation tokens to signal intent, but always code defensively.

When shutting down a connection, cancel its lifetime token, stop accepting new streams, and let existing streams drain if possible. Then close the connection.
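Those steps can be sketched as a drain helper. The per-stream tasks and the timeout are the only inputs; the actual connection close call is left out because it depends on your MsQuic wrapper:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical drain helper: streamTasks are the per-stream processing tasks.
async Task ShutdownConnectionAsync(
    CancellationTokenSource lifetime,
    IReadOnlyCollection<Task> streamTasks,
    TimeSpan drainTimeout)
{
    // 1. Signal intent: handlers observe this token and stop starting new work.
    lifetime.Cancel();

    // 2. Let in-flight streams drain, but never wait forever.
    var drained = Task.WhenAll(streamTasks);
    await Task.WhenAny(drained, Task.Delay(drainTimeout));

    // 3. Only now close the underlying QUIC connection (not shown here).
}
```

The bounded wait matters: a stream handler stuck on a dead peer must not be able to hold the whole connection open indefinitely.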

If you get this wrong, you will see use-after-free bugs and random crashes. This is where discipline matters.

Observability with real identifiers

Because you are not using HTTP, you must create your own observability story.

A simple but effective approach is to generate a connection id and a stream id and include them in every log entry.

logger.LogInformation(
    "Received data on connection {ConnectionId}, stream {StreamId}",
    connectionState.ConnectionId,
    streamState.StreamId);

When something goes wrong under load, these identifiers give you a narrative. Without them, you are blind.

Metrics matter just as much. Active connections, active streams, and backpressure events tell you far more about system health than CPU usage ever will.
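Those three metrics map naturally onto System.Diagnostics.Metrics instruments. The metric names below are illustrative, not a convention the article prescribes:

```csharp
using System.Diagnostics.Metrics;

// Illustrative metric set; adapt the names to your own scheme.
static class QuicMetrics
{
    private static readonly Meter Meter = new("SuperQuicService");

    // Incremented on accept, decremented on close.
    public static readonly UpDownCounter<long> ActiveConnections =
        Meter.CreateUpDownCounter<long>("quic.connections.active");

    public static readonly UpDownCounter<long> ActiveStreams =
        Meter.CreateUpDownCounter<long>("quic.streams.active");

    // Incremented whenever a flush stalls on a full pipe.
    public static readonly Counter<long> BackpressureEvents =
        Meter.CreateCounter<long>("quic.backpressure.events");
}
```

Any OpenTelemetry exporter or a MeterListener can then observe these without the transport code knowing about either.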

Testing with real QUIC traffic

The only tests that really matter here are integration tests that use real MsQuic connections.

You can spin up the server in process and connect using a QuicConnection from a test project. Because everything is async and UDP based, these tests are fast enough to run in CI.

What you should not do is mock QuicStream or QuicConnection. That gives you confidence in code paths that will never behave the same way under real conditions.

If you choose MsQuic, you choose realism.

When this level of control is worth it

Direct MsQuic usage is not for every service. It is worth it when you need fine-grained control over concurrency, latency, and state. It is worth it when HTTP semantics get in your way. It is worth it when you own both ends of the connection. It is not worth it for simple CRUD APIs or public-facing endpoints where ecosystem compatibility matters more than raw capability.

The key is intentionality. MsQuic is a powerful tool. Used deliberately, it lets you build systems that were previously awkward or impossible. Used casually, it will punish you.

Working directly with MsQuic in .NET forces you to think like a systems programmer again, but with modern language tools at your disposal. You deal with lifetimes, concurrency, and flow control explicitly, but you also get async, memory safety, and rich diagnostics.

If you are building serious distributed systems and you are willing to meet the transport layer on its own terms, this approach opens doors that HTTP keeps closed. That is the real payoff.