Why Initialize Collection Size

In possibly every code base you might have seen, usage of Collections would have been a common sight. The introduction of Generic Collections in early stages of evolutin of language has made is all the more favourite of developers comparing to the non-generic cousins that existed before. One of the most common patterns seen in any code base is as follows.

var persons = GetPersonList();
var studentList = new List<Student>();
foreach(var person in persons)
{
    studentList.Add(new Student
    {
        FirstName = person.FirstName,
        LastName = person.LastName
    });
}

At the first glance, there is nothing wrong with the code. In fact, it works well too. Unlike arrays, one doesn’t need to declare the size of List beforehand, and it sort of magically does the trick to expand itself. However, there is a hidden problem caused due to the optimized way the Collection is implemented to magically expand.

Beneath the List declaration, there is still an array, which is made to resize according to the ever growing requirements of the Collection. This is done in the Add method, which is defined as

public void Add(T item) 
{
    if (_size == _items.Length) EnsureCapacity(_size + 1);
    _items[_size++] = item;
    _version++;
}

private void EnsureCapacity(int min) 
{
    if (_items.Length < min) {
        int newCapacity = _items.Length == 0? _defaultCapacity : _items.Length * 2;

        if ((uint)newCapacity > Array.MaxArrayLength) newCapacity = Array.MaxArrayLength;
        if (newCapacity < min) newCapacity = min;
        Capacity = newCapacity;
    }
}

The Add() method checks if the array is full, and if so, it calls the EnsureCapacity method to double the Capacity.

The resizing of course happens in the Capacity Property.

public int Capacity 
{
    get {
        Contract.Ensures(Contract.Result<int>() >= 0);
        return _items.Length;
    }
    set {
        if (value < _size) {
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value, ExceptionResource.ArgumentOutOfRange_SmallCapacity);
        }
        Contract.EndContractBlock();

        if (value != _items.Length) {
            if (value > 0) {
                T[] newItems = new T[value];
                if (_size > 0) {
                    Array.Copy(_items, 0, newItems, 0, _size);
                }
                _items = newItems;
            }
            else {
                _items = _emptyArray;
            }
        }
    }
}

The key take away from the Add/EnsureCapcity methods is that every time the size of array is breached, the capacity of the array is doubled. This leads to couple of undesired things.

  • Expensive operation to resize the array

As the List grows, there would innumerous expensive resizing operations, which is quite undesired.

  • Possibility that a lot of memory spaces left allocated but unused.

Consider if the number of Person in the personList is 258. Due to the logic shown above, the Capacity of Array would now 512. That is a whole lot of unused allocated memory space when you are working with memory intensive application.

// Default Behavior
Console.WriteLine($"StudentList(Count={studentList.Count},Capacity={studentList.Capacity})");
// Output
StudentList(Count=258,Capacity=512)

The ideal way to counter this would be to use overloaded constructor and initialize the size of array.

var studentList = new List<Student>(persons.Count());

This would reduce the need for resizing of internal array.

// With Collection Size Iniitialized as above
 Console.WriteLine($"StudentList(Count={studentList.Count},Capacity={studentList.Capacity})");
// Output
StudentList(Count=258,Capacity=258)

Of course there could be places where you might not be having the convenience of knowing the collection size beforehand. But still, if you are able to be pick an initial size well, you could eliminate or reduce the need to resize the array.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s