[ANSWER] Here are Common Mistakes When Working with Enumerables. Can you see it?

Write code is not a simple task. It is easy to make mistakes that result in bad performance. The last post, I asked what is wrong with the following code:

public static IEnumerable<T> ElementsFromTheFirstWhichAreAlsoInTheSecond<T>(
    IEnumerable<T> enumerable1,
    IEnumerable<T> enumerable2
)
{
    var result = new List<T>();

    enumerable1.ToList().ForEach(a =>
    {
        if (enumerable2.Contains(a))
        {
            result.Add(a);
        }
    });

    return result;
}

Let’s try to explain what is wrong.

Mistake #1 – ToList!

From last post comments:

For some reason, a few developers like so much to write inline code. So, this pattern enumerable1.ToList().ForEach is very common. Problem is ToList will iterate all elements of enumerable to create a new object (more complex) List. – Tiago Borba

I think the “developers reason” is to try to adopt a “functional style.” Unfortunately in a wrong way.

Let me explain what Tiago was talking about with an example. When running the following code:

class Program
{
    static void Main(string[] args)
    {
        var enumerable = TenInstances().ToList();

        foreach (var foo in enumerable.Take(3))
        {
            Console.WriteLine(foo.Value);
        }
    }

    public static IEnumerable<Foo> TenInstances()
    {
        for (var i = 0; i < 10; i++)
        {
            yield return new Foo(i);
        }
    }    
}

class Foo
{
    public int Value { get; }

    public Foo(int value)
    {
        Value = value;
        Console.WriteLine($"Created {value}.");
    }
}

The output is:

As you can see, we are creating ten objects when we need only 3. Stop calling ToList() would change the result dramatically:

Did you get the point? We should never call ToList without a good reason.

It’s time to return to the code which motivated this post. Let’s dig a little deeper.

 
.method public hidebysig static 
	class [System.Runtime]IEnumerable`1<!!T> ElementsFromTheFirstWhichAreAlsoInTheSecond<T> (
		class [System.Runtime]IEnumerable`1<!!T> enumerable1,
		class [System.Runtime]IEnumerable`1<!!T> enumerable2
	) cil managed 
{
	.maxstack 3
	.locals init (
		[0] class Program/'<>c__DisplayClass1_0`1'<!!T>
	)

	IL_0000: newobj instance void class Program/'<>c__DisplayClass1_0`1'<!!T>::.ctor()
	IL_0005: stloc.0
	IL_0006: ldloc.0
	IL_0007: ldarg.1
	IL_0008: stfld class [System.Runtime]IEnumerable`1<!0> class Program/'<>c__DisplayClass1_0`1'<!!T>::enumerable2
	IL_000d: ldloc.0
	IL_000e: newobj instance void class [System.Collections]List`1<!!T>::.ctor()
	IL_0013: stfld class [System.Collections]List`1<!0> class Program/'<>c__DisplayClass1_0`1'<!!T>::result
	IL_0018: ldarg.0
	IL_0019: call class [System.Collections]List`1<!!0> [System.Linq]Enumerable::ToList<!!T>(class [System.Runtime]IEnumerable`1<!!0>)
	IL_001e: ldloc.0
	IL_001f: ldftn instance void class Program/'<>c__DisplayClass1_0`1'<!!T>::'<ElementsFromTheFirstWhichAreAlsoInTheSecond>b__0'(!0)
	IL_0025: newobj instance void class [System.Runtime]Action`1<!!T>::.ctor(object, native int)
	IL_002a: callvirt instance void class [System.Collections]List`1<!!T>::ForEach(class [System.Runtime]System.Action`1<!0>)
	IL_002f: ldloc.0
	IL_0030: ldfld class [System.Collections]List`1<!0> class ConsoleApp1.Program/'<>c__DisplayClass1_0`1'<!!T>::result
	IL_0035: ret
}

As you can see, there are two unnecessary allocations here: 1) for the List object and; 2) for the closure. The problem is it is not really necessary.

My recommendation? Goodbye, ToList().

public static IEnumerable<T> ElementsFromTheFirstWhichAreAlsoInTheSecond<T>(
    IEnumerable<T> enumerable1,
    IEnumerable<T> enumerable2)
{
    var result = new List<T>();
    foreach (var a in enumerable1)
    {
        if (enumerable2.Contains(a))
        {
            result.Add(a);
        }
    }
    return result;
}

But there are other considerations.

Mistake #2 – Unnecessary Iterations Caused by Contains Method

Please consider the following code:

static void Main(string[] args)
{
    var result = ElementsFromTheFirstWhichAreAlsoInTheSecond(
        Enumerable1(),
        Enumerable2()
    );
}

public static IEnumerable<int> Enumerable1()
{
    for (int i = 5; i >= 0; i--)
    {
        Console.WriteLine($"Enumerable 1: {i}");
        yield return i;
    }
}

public static IEnumerable<int> Enumerable2()
{
    for (int i = 0; i < 5; i++)
    {
        Console.WriteLine($"Enumerable 2: {i}");
        yield return i;
    }
}

public static IEnumerable<T> ElementsFromTheFirstWhichAreAlsoInTheSecond<T>(
    IEnumerable<T> enumerable1,
    IEnumerable<T> enumerable2)
{
    var result = new List<T>();
    foreach (var a in enumerable1)
    {
        if (enumerable2.Contains(a))
        {
            result.Add(a);
        }
    }
    return result;
}

Here is the output:

 

As you can see, the Contains will restart the enumeration in every call.

My recommendation? Ensure that the enumeration will run only one time.

public static IEnumerable ElementsFromTheFirstWhichAreAlsoInTheSecond(
    IEnumerable enumerable1,
    IEnumerable enumerable2)
{
    var result = new List();
    var enumerable2AsArray = enumerable2 as T[] ?? enumerable2.ToArray();
    foreach (var a in enumerable1)
    {
        if (enumerable2AsArray.Contains(a))
        {
            result.Add(a);
        }
    }
    return result;
}

Got it?

Mistake #3 – Returning a list

If the method’s returning type is IEnumerable, use yield return instead of creating a List. That will prevent unnecessary iterations.

public static IEnumerable<T> ElementsFromTheFirstWhichAreAlsoInTheSecond<T>(
    IEnumerable<T> enumerable1,
    IEnumerable<T> enumerable2)
{
    var enumerable2AsArray = enumerable2 as T[] ?? enumerable2.ToArray();
    foreach (var a in enumerable1)
    {
        if (enumerable2AsArray.Contains(a))
        {
            yield return a;
        }
    }
}

That’s it.

If you have additional considerations, comment it.

Compartilhe este insight:

2 respostas

  1. I know the method shown in the post is just an example, but for future readers’ sake I think it’s interesting to add that you can achieve the same result using the “Intersect” extension method from LINQ.

    Anyway, great post, as usual.

    1. Not really. Here is the Intersect implementation.

      private static IEnumerable<TSource> IntersectIterator<TSource>(
          IEnumerable<TSource> first, 
          IEnumerable<TSource> second, 
          IEqualityComparer<TSource> comparer
          )
      {
          Set<TSource> set = new Set<TSource>(comparer);
          foreach (TSource element in second)
          {
              set.Add(element);
          }
      
          foreach (TSource element in first)
          {
              if (set.Remove(element))
              {
                  yield return element;
              }
          }
      }
      

      Agree?

Deixe um comentário

O seu endereço de e-mail não será publicado. Campos obrigatórios são marcados com *

Elemar Júnior

Sou fundador e CEO da EximiaCo e atuo como tech trusted advisor ajudando diversas empresas a gerar mais resultados através da tecnologia.

Elemar Júnior

Sou fundador e CEO da EximiaCo e atuo como tech trusted advisor ajudando diversas empresas a gerar mais resultados através da tecnologia.

Mais insights para o seu negócio

Veja mais alguns estudos e reflexões que podem gerar alguns insights para o seu negócio:

Há pouco menos de um ano, aceitei o desafio de liderar o time de tecnologia e estratégia de negócios da...
Poucas empresas foram tão impactadas por inovações disruptivas quanto a Encyclopaedia Britannica – empresa com mais de 250 anos. Entretanto,...
Retomar em plenitude todas as atividades econômicas, aumentando o risco de contágio pelo novo coronavírus e, eventualmente, sacrificar uma parcela...
Software em funcionamento é mais relevante que documentação abrangente. Concordo com esse princípio expresso no manifesto ágil. Entretanto, acho que...
No post anterior, compartilhei um exemplo de como containers podem nos ajudar a deixar o código mais claro sobre os...
Mais uma vez, tive a honra de participar de um excelente bate-papo sobre microsserviços com o pessoal da Lambda 3....
× Precisa de ajuda?