.NET 6: Benchmark performance of JsonSerializer.DeserializeAsyncEnumerable

This should have been part of my earlier post on System.Text.Json support for IAsyncEnumerable, but it slipped my mind. So here we are.

To understand the significance of this feature in .NET 6, one needs to understand the circumstances under which it might be useful. The first of those is, of course, that we can start consuming the data while the rest of the JSON is yet to be deserialized.

The significance is further amplified when you are only interested in some earlier part of the data. Now you do not need to deserialize the entire JSON (considering it is a huge one), hold it in your buffers, and then use only a fraction of it. This could provide a considerable performance boost to the application.

Let us compare and benchmark the performance of various methods exposed by System.Text.Json for deserialization and attempt to understand it better.

We will be placing three methods under the hammer.

  • JsonSerializer.Deserialize<T>
  • JsonSerializer.DeserializeAsync<T>
  • JsonSerializer.DeserializeAsyncEnumerable<T>

Let us write some code to benchmark them.

[Benchmark]
public void TestDeseriliaze()
{
    foreach(var item in DeserializeWithoutStreaming().TakeWhile(x => x.Id < DATA_TO_COMSUME))
    {
        // DoSomeWork
    }
}

public IEnumerable<Data> DeserializeWithoutStreaming()
{
    var deserializedData = JsonSerializer.Deserialize<IEnumerable<Data>>(serializedString);
    return deserializedData;
}

[Benchmark]
public async Task TestDeseriliazeAsync()
{
    foreach (var item in (await DeserializeAsync()).TakeWhile(x => x.Id < DATA_TO_COMSUME))
    {
        // DoSomeWork
    }
}

public async Task<IEnumerable<Data>> DeserializeAsync()
{
    // Dispose the stream once the whole payload has been deserialized.
    using var memStream = new MemoryStream(Encoding.UTF8.GetBytes(serializedString));
    var deserializedData = await JsonSerializer.DeserializeAsync<IEnumerable<Data>>(memStream);
    return deserializedData;
}



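[Benchmark]
public async Task TestDeserializeAsyncEnumerable()
{
    // Note: TakeWhile over IAsyncEnumerable<T> is not built into .NET 6;
    // this assumes the System.Linq.Async package is referenced.
    await foreach (var item in DeserializeWithStreaming().TakeWhile(x => x.Id < DATA_TO_COMSUME))
    {
        // DoSomeWork
    }
}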

public async IAsyncEnumerable<Data> DeserializeWithStreaming()
{
    using var memStream = new MemoryStream(Encoding.UTF8.GetBytes(serializedString));
    await foreach(var item in  JsonSerializer.DeserializeAsyncEnumerable<Data>(memStream))
    {
        yield return item;
    }
}
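The snippets above rely on a serializedString field and a DATA_TO_COMSUME constant that are not shown. A minimal sketch of how such test data might be produced follows. The BenchmarkData helper, its Build method, and the fixed seed are assumptions made for illustration, not the setup actually used for the results in this post.

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Same shape as the Data type used by the benchmarks.
public class Data
{
    public int Id { get; set; }
    public string Value { get; set; } = "";
}

// Hypothetical helper, not part of the original benchmark code.
public static class BenchmarkData
{
    // Builds a JSON array of `count` items with sequential Ids, so that
    // TakeWhile(x => x.Id < limit) consumes exactly the first `limit` items.
    public static string Build(int count)
    {
        var random = new Random(42); // fixed seed: every run sees the same payload
        var items = Enumerable.Range(0, count)
            .Select(i => new Data { Id = i, Value = random.Next().ToString() });
        return JsonSerializer.Serialize(items);
    }
}
```

In the benchmark class this could be wired up from a BenchmarkDotNet [GlobalSetup] method, for example serializedString = BenchmarkData.Build(10_000) with DATA_TO_COMSUME set to 2_000 for a 20% scenario. Those sizes are illustrative only.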

Scenario 1 : Consuming only first 20% of the JSON Data

The first scenario to consider is one where only a fairly small part of the JSON data is consumed, say the first 20%. Deserialize<T> and DeserializeAsync<T> have to deserialize the entire JSON even if the client consumes only the first 20% of it. DeserializeAsyncEnumerable<T>, on the other hand, deserializes on demand. This is evident in the benchmark results as well, where DeserializeAsyncEnumerable is roughly 3 times faster (1.531 ms against 4.810 ms and 5.166 ms).

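| Method                         | Mean     | Error     | StdDev    |
|--------------------------------|----------|-----------|-----------|
| TestDeseriliaze                | 4.810 ms | 0.0952 ms | 0.2573 ms |
| TestDeseriliazeAsync           | 5.166 ms | 0.1008 ms | 0.1161 ms |
| TestDeserializeAsyncEnumerable | 1.531 ms | 0.0305 ms | 0.0825 ms |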
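
The early exit that makes this scenario fast can also be written without any LINQ operator, using a plain break inside await foreach. The following is only a sketch: the Item type and the EarlyExit.CountBelowAsync helper are hypothetical names introduced for illustration.

```csharp
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical stand-in for the Data type; only Id matters here.
public class Item
{
    public int Id { get; set; }
}

public static class EarlyExit
{
    // Counts the leading items whose Id is below `limit`,
    // stopping at the first item that does not qualify.
    public static async Task<int> CountBelowAsync(string json, int limit)
    {
        using var stream = new MemoryStream(Encoding.UTF8.GetBytes(json));
        var consumed = 0;

        await foreach (var item in JsonSerializer.DeserializeAsyncEnumerable<Item>(stream))
        {
            if (item is null || item.Id >= limit)
            {
                // Leaving the loop disposes the enumerator, so nothing
                // further is read from the stream after this point.
                break;
            }
            consumed++;
        }

        return consumed;
    }
}
```

Calling await EarlyExit.CountBelowAsync(json, 2) on an array whose Ids start at 0 and increase by one would stop at the third element. Items that were already buffered by the serializer may still have been deserialized, so the saving is per buffer rather than strictly per item.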

Scenario 2: Consuming about 80% of the JSON Data

In the second scenario, we consider the case where the client consumes 80% of the data. As one would expect, a larger part of the JSON data now has to be deserialized, and hence the performance margin decreases.

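| Method                         | Mean     | Error     | StdDev    |
|--------------------------------|----------|-----------|-----------|
| TestDeseriliaze                | 4.960 ms | 0.0974 ms | 0.1877 ms |
| TestDeseriliazeAsync           | 5.238 ms | 0.0997 ms | 0.1297 ms |
| TestDeserializeAsyncEnumerable | 4.851 ms | 0.0859 ms | 0.0804 ms |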

This is expected too: as more of the JSON is deserialized, the performance difference becomes hardly significant, if not non-existent. But there is still an advantage to using DeserializeAsyncEnumerable: you do not have to wait for the entire JSON to be deserialized. The on-demand streaming approach lets you consume the data as soon as parts of the JSON are deserialized.

I feel this is a huge improvement, especially when the JSON concerned is significantly large. Like many others, I am excited to see the improvements in .NET in recent years and am looking forward to the release of .NET 6.

.NET 6: System.Text.Json support for IAsyncEnumerable

With Preview 4 of .NET 6 now available, one of the things that excites me is the System.Text.Json support for IAsyncEnumerable. IAsyncEnumerable, introduced with .NET Core 3 and C# 8, lets us iterate over asynchronous streams of data. The newer version extends this support to System.Text.Json.

Consider the following data.

[{"Id":0,"Value":"915777539"},{"Id":1,"Value":"1332243482"},{"Id":2,"Value":"306207588"},
 {"Id":3,"Value":"1413388423"},{"Id":4,"Value":"2145941621"},{"Id":5,"Value":"1041779876"},
 {"Id":6,"Value":"1121436961"},{"Id":7,"Value":"520045044"},{"Id":8,"Value":"1357859915"},
 {"Id":9,"Value":"1340510964"},{"Id":10,"Value":"1183306988"},{"Id":11,"Value":"502467538"},
 {"Id":12,"Value":"31513434"},{"Id":13,"Value":"999086707"},{"Id":14,"Value":"961728759"},
 {"Id":15,"Value":"1756662810"},{"Id":16,"Value":"1018107007"},{"Id":17,"Value":"433502262"},
 {"Id":18,"Value":"1784715926"},{"Id":19,"Value":"1418088822"},{"Id":20,"Value":"645106286"},
 {"Id":21,"Value":"1720929044"},{"Id":22,"Value":"1102142546"},{"Id":23,"Value":"2138442183"},
 {"Id":24,"Value":"208176799"},{"Id":25,"Value":"1700100438"},{"Id":26,"Value":"769308703"},
 {"Id":27,"Value":"1558581057"},{"Id":28,"Value":"352810944"},{"Id":29,"Value":"299925316"}]

We could now write a streaming deserialization method using JsonSerializer.DeserializeAsyncEnumerable, for example:

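public async IAsyncEnumerable<T> DeserializeStreaming<T>(string data)
{
    using var memStream = new MemoryStream(Encoding.UTF8.GetBytes(data));

    await foreach(var item in JsonSerializer.DeserializeAsyncEnumerable<T>(memStream))
    {
        // Trace line matching the "Inside Deserializing.." entries in the output below.
        Console.WriteLine("Inside Deserializing..");
        yield return item;
        // Artificial delay so the on-demand behaviour is visible in the console.
        await Task.Delay(1000);
    }
}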

// Data
public class Data
{
    public int Id { get; set; }
    public string Value { get; set; }

}

The async stream of deserialized data provides an opportunity to deserialize on demand, which could be a great addition, particularly when deserializing large data.

var instance = new StreamingSerializationTest();
await foreach (var item in instance.DeserializeStreaming<Data>(dataString))
{
    Console.WriteLine($"Data Item: {nameof(Data.Id)}={item.Id} , {nameof(Data.Value)}={item.Value}");

    if(item.Id > 5)
    {
        break;
    }
}

As you can observe in the output below, this deserializes only on demand.

Streaming Deserialize Demo
Inside Deserializing..
Data Item: Id=0 , Value=915777539
Inside Deserializing..
Data Item: Id=1 , Value=1332243482
Inside Deserializing..
Data Item: Id=2 , Value=306207588
Inside Deserializing..
Data Item: Id=3 , Value=1413388423
Inside Deserializing..
Data Item: Id=4 , Value=2145941621
Inside Deserializing..
Data Item: Id=5 , Value=1041779876
Inside Deserializing..
Data Item: Id=6 , Value=1121436961

At the moment, deserialization is limited to root-level JSON arrays, but I guess that would change over time as .NET 6 approaches release. Let us take a look at serialization as well now. It turns out that is easy too.

private async IAsyncEnumerable<Data> Generate(int maxItems)
{
    var random = new Random();
    for (int i = 0; i < maxItems; i++)
    {
        yield return new Data
        {
            Id = i,
            Value = random.Next().ToString()
        };
    }
}

public async Task SerializeStream()
{
    using var stream = Console.OpenStandardOutput();
    var data = new { Data = Generate(30) };
    await JsonSerializer.SerializeAsync(stream, data);
}
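
To inspect what streaming serialization produces without writing to the console, the same call can target a MemoryStream instead. This is a minimal sketch under stated assumptions: the SerializeSketch class and its Numbers and ToJsonAsync members are hypothetical names, and it serializes the IAsyncEnumerable directly as the root value rather than as a property of an anonymous object.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class SerializeSketch
{
    // Produces the numbers 0..count-1 as an async stream.
    private static async IAsyncEnumerable<int> Numbers(int count)
    {
        for (var i = 0; i < count; i++)
        {
            await Task.Yield(); // stands in for real asynchronous work
            yield return i;
        }
    }

    // Serializes the async stream into memory and returns the JSON text.
    public static async Task<string> ToJsonAsync(int count)
    {
        using var stream = new MemoryStream();
        await JsonSerializer.SerializeAsync(stream, Numbers(count));
        return Encoding.UTF8.GetString(stream.ToArray());
    }
}
```

The items are written as a JSON array as they are produced, so the whole sequence never has to be materialized in a list first.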

At this point, I am not as excited about streaming serialization as I am about streaming deserialization, mainly because I see fewer use cases for it. I am not denying that there could be use cases, though, and over time I might become equally excited about it.

The complete sample code for this demo can be found on my GitHub.

We will continue exploring the .Net 6 features in the upcoming posts as well. Until then, enjoy coding…