Thursday, September 7, 2017

Deferred sorting of collections

Recently I faced a problem of storing sorting information for later use. Sorting is easy if you already have a collection to sort. Just use LINQ extension methods. But what if you don't have the collection yet? What if you'll get the collection later, but you need to store sorting rules now? Here I'll tell you, how it can be done.

Let's say, we need to sort a collection of simple 'Person' class:

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

If we have a collection, it is easy to apply any sorting:

IEnumerable<Person> people = new List<Person>();

var sortedPeople = people.OrderBy(p => p.Name).ThenByDescending(p => p.Age);

But what if we don't have a collection yet? I want to have the following class Sorter:

public enum SortingDirection
{
    Ascending,
    Descending
}

public class Sorter
{
    public void AddSortingExpression<T>(Func<Person, T> keySelector, SortingDirection direction)
    {
        ...
    }

    public IEnumerable<Person> Sort(IEnumerable<Person> people)
    {
        ...
    }
}

And I want to use it like this:

var sorter = new Sorter();
sorter.AddSortingExpression(p => p.Name, SortingDirection.Ascending);
sorter.AddSortingExpression(p => p.Age, SortingDirection.Descending);

var sortedPeople = sorter.Sort(people);

How should the Sorter class be implemented?

Solution 1

One possible solution is in dynamic creation of function, which will actually implement sorting:

public class Sorter
{
    private Func<IEnumerable<Person>, IOrderedEnumerable<Person>> _sortingFunction;

    public void AddSortingExpression<T>(Func<Person, T> keySelector, SortingDirection direction)
    {
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        switch (direction)
        {
            case SortingDirection.Ascending:
                if (_sortingFunction == null)
                {
                    _sortingFunction = people => people.OrderBy(keySelector);
                }
                else
                {
                    var oldSorter = _sortingFunction;
                    _sortingFunction = people => oldSorter(people).ThenBy(keySelector);
                }
                break;
            case SortingDirection.Descending:
                if (_sortingFunction == null)
                {
                    _sortingFunction = people => people.OrderByDescending(keySelector);
                }
                else
                {
                    var oldSorter = _sortingFunction;
                    _sortingFunction = people => oldSorter(people).ThenByDescending(keySelector);
                }
                break;
            default:
                throw new ArgumentOutOfRangeException(nameof(direction), direction, "Unknown sorting direction");
        }
    }

    public IEnumerable<Person> Sort(IEnumerable<Person> people)
    {
        if (_sortingFunction == null)
            return people;

        return _sortingFunction(people).ToArray();
    }
}

Here we construct sorting function (field _sortingFunction). On each invocation of AddSortingExpression method, we change this function using its previous value. I would like to pay your attention to the usage of 'oldSorter' variable. We can't use the following expression:

_sortingFunction = people => _sortingFunction(people).ThenByDescending(keySelector);

because it will lead to the endless recurrent calls of the same function. This is why we store the current value of sorting function into a temporary variable.

Solution 2

Another possible solution uses the fact, that LINQ offers lazy evaluation:

public class Sorter
{
    private class InternalEnumerable : IEnumerable<Person>
    {
        public IEnumerable<Person> People { get; set; } = Enumerable.Empty<Person>();

        public IEnumerator<Person> GetEnumerator()
        {
            return People.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    private readonly InternalEnumerable _initialEnumerable = new InternalEnumerable();

    private IOrderedEnumerable<Person> _sortedPeople;

    public void AddSortingExpression<T>(Func<Person, T> keySelector, SortingDirection direction)
    {
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        switch (direction)
        {
            case SortingDirection.Ascending:
                if (_sortedPeople == null)
                {
                    _sortedPeople = _initialEnumerable.OrderBy(keySelector);
                }
                else
                {
                    _sortedPeople = _sortedPeople.ThenBy(keySelector);
                }
                break;
            case SortingDirection.Descending:
                if (_sortedPeople == null)
                {
                    _sortedPeople = _initialEnumerable.OrderByDescending(keySelector);
                }
                else
                {
                    _sortedPeople = _sortedPeople.ThenByDescending(keySelector);
                }
                break;
            default:
                throw new ArgumentOutOfRangeException(nameof(direction), direction, "Unknown sorting direction");
        }
    }

    public IEnumerable<Person> Sort(IEnumerable<Person> people)
    {
        if (_sortedPeople == null)
            return people;

        _initialEnumerable.People = people;
        return _sortedPeople.ToArray();
    }
}

Here we always start from the instance of InternalEnumerable class, implementing IEnumerable<Person>. So it looks like we have our collection to sort from the beginning. But in the Sort method, we change actual collection, which the instance of this class returns.

I hope this small tip will be useful for you in your programs.

No comments:

Post a Comment