Sunday, October 12, 2014

C#'s yield Keyword

Introduction

The yield keyword in C# represents the only C# interview question I failed to answer -- "What does the yield keyword do in C#?" All I remembered was that it had something to do with enumerators.

This post demonstrates the use of the C# keyword, yield. It also covers how to implement the IEnumerable and IEnumerator interfaces by hand in support of C#'s foreach keyword. Implementing them manually shows the true power of the yield keyword: it takes significantly less code to implement an enumerator with yield than it does to write the IEnumerable/IEnumerator implementations yourself.

Example

The following code is a loop that displays the factorials 1! through 5! (where 5 is the value of the parameter passed to the Factorial method):

foreach (var factor in FactorialManager.Factorial(5))
{
  Console.WriteLine(factor);
}

The output is as follows:

1
2
6
24
120

Under the covers, the Factorial method returns an object that implements the IEnumerable or IEnumerable<int> interface. As part of the foreach statement, the compiler emits a call to GetEnumerator, which returns an IEnumerator or IEnumerator<int> instance. The IEnumerable and IEnumerator interfaces can be implemented manually, or the compiler can generate the implementations when the yield keyword is used.
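As a rough sketch of what foreach does with those interfaces, the loop can be written out by hand as below. The exact code the compiler emits differs in detail, and the Enumerate helper, the ForeachExpansion class and the sample list are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
  // Roughly what "foreach (var item in source) { body(item); }" does.
  public static void Enumerate(IEnumerable<int> source, Action<int> body)
  {
    IEnumerator<int> enumerator = source.GetEnumerator();
    try
    {
      while (enumerator.MoveNext())
      {
        body(enumerator.Current);
      }
    }
    finally
    {
      // IEnumerator<int> is IDisposable, so it is disposed when the loop ends.
      enumerator.Dispose();
    }
  }

  public static void Main()
  {
    // Stand-in data for this sketch; any IEnumerable<int> would do.
    var values = new List<int> { 1, 2, 6, 24, 120 };
    Enumerate(values, value => Console.WriteLine(value));
  }
}
```

Anything that can supply GetEnumerator, MoveNext and Current in this way can be consumed by foreach, which is why the rest of this post concentrates on those members.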

Implementing IEnumerable/IEnumerator

The foreach keyword expects the FactorialManager.Factorial method to return a type that implements IEnumerable<int>. The implementation of the FactorialManager class is as follows:

using System;
using System.Collections.Generic;

class FactorialManager : IEnumerable<int>
{
  private int _factorialToCompute;

  public FactorialManager(int factorialToCompute)
  {
    _factorialToCompute = factorialToCompute;
  }

  public static IEnumerable<int> Factorial(int factorialToCompute)
  {
    return new FactorialManager(factorialToCompute);
  }

  public IEnumerator<int> GetEnumerator()
  {
    return new FactorialSupport(_factorialToCompute);
  }

  System.Collections.IEnumerator 
    System.Collections.IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}

The previous code is not complex, but it is verbose. The Factorial method (a static factory method) returns an instance of FactorialManager, which implements IEnumerable<int>:

  public static IEnumerable<int> Factorial(int factorialToCompute)
  {
    return new FactorialManager(factorialToCompute);
  }


The FactorialManager class implements the GetEnumerator method of IEnumerable<int>:

  public IEnumerator<int> GetEnumerator()
  {
    return new FactorialSupport(_factorialToCompute);
  }

The GetEnumerator method returns an instance of FactorialSupport. The FactorialSupport class implements the IEnumerator<T> interface, so it is an appropriate return value for the GetEnumerator method. The IEnumerator<T> interface is defined as follows:

public interface IEnumerator<out T> : IDisposable, IEnumerator


Because IEnumerator<T> derives from IDisposable, any class that implements IEnumerator<T> must implement:

void Dispose()

The Dispose method is used to release resources such as unmanaged handles. A factorial implementation holds no such resources, so Dispose can simply return without doing anything.

IEnumerator<T> itself declares the following property (the non-generic IEnumerator declares an equivalent property of type object):

T Current { get; }

The Current property returns values such as 1, 2, 6, 24 and 120, depending on how far the factorial computation has advanced.

The non-generic IEnumerator interface, from which IEnumerator<T> derives, declares the following methods:

void Reset()
bool MoveNext()

The Reset method returns the enumerator to its initial state. The MoveNext method advances to the next value and returns false once there are no more values. So if the value of Current was 6 (1 * 2 * 3), then MoveNext sets Current to 24 (1 * 2 * 3 * 4).
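To make the stepping concrete, here is a minimal sketch that drives an enumerator by hand. It uses a List<int> preloaded with the first five factorials as stand-in data (an assumption of this sketch, as is the ManualStepping class name), so it runs without any of the classes in this post.

```csharp
using System;
using System.Collections.Generic;

public static class ManualStepping
{
  public static void Main()
  {
    // Stand-in data for this sketch: the first five factorials.
    IEnumerable<int> factorials = new List<int> { 1, 2, 6, 24, 120 };

    using (IEnumerator<int> e = factorials.GetEnumerator())
    {
      // A new enumerator sits before the first element, so three
      // MoveNext calls are needed to reach the third value.
      e.MoveNext();
      e.MoveNext();
      e.MoveNext();
      Console.WriteLine(e.Current);    // 6

      e.MoveNext();
      Console.WriteLine(e.Current);    // 24

      e.MoveNext();                    // Current is now 120
      Console.WriteLine(e.MoveNext()); // False: no values remain
    }
  }
}
```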

The FactorialSupport class is implemented as follows:

public class FactorialSupport : IEnumerator<int>
{
  private int _index;

  private int _factorialToCompute;

  private int _factorial;

  public FactorialSupport(int factorialToCompute)
  {
    if (factorialToCompute < 0)
    {
      throw new ArgumentException("Cannot compute factorial for " + 
                                  factorialToCompute.ToString());
    }

    _factorialToCompute = factorialToCompute;
    Reset();
  }

  public int Current
  {
    get { return _factorial; }
  }

  public void Dispose()
  {
    return; // nothing to do here. 
  }

  object System.Collections.IEnumerator.Current
  {
    get { return this.Current; }
  }

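  public bool MoveNext()
  {
    // Past the last factor: signal the end of the sequence.
    if (_index > _factorialToCompute)
    {
      return false;
    }

    // Fold the next factor into the running product, then advance.
    _factorial *= _index;
    _index++;

    return true;
  }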

  public void Reset()
  {
    _index = 1;
    _factorial = 1;
  }
}

The previous code implements an IEnumerator<int> that returns factorial values using Current, MoveNext and Reset.

Implementing support for foreach using FactorialManager and FactorialSupport requires approximately seventy lines of code. This is a lot of code for something as trivial as a factorial computation. 

Implementing an Enumerator with yield

A method, operator or property get accessor whose return type is IEnumerable, IEnumerable<T>, IEnumerator or IEnumerator<T> may contain the yield keyword. The yield keyword is placed before the return or break keyword and is used as follows:

yield return <expression>;
yield break;
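The two forms can be seen side by side in the following sketch. FactorialsUpTo and the YieldBreakDemo class are hypothetical names invented for this illustration: the method yields factorials with yield return and uses yield break to end the sequence before a value would exceed the given limit.

```csharp
using System;
using System.Collections.Generic;

public static class YieldBreakDemo
{
  // Hypothetical helper: yields 1!, 2!, 3!, ... and stops before the
  // next factorial would exceed limit.
  public static IEnumerable<int> FactorialsUpTo(int limit)
  {
    int value = 1;

    for (int i = 1; ; i++)
    {
      // Same test as "value * i > limit", written so it cannot overflow.
      if (value > limit / i)
      {
        yield break; // ends the sequence; the loop does not continue
      }

      value *= i;
      yield return value; // hands one value to the caller, then pauses here
    }
  }

  public static void Main()
  {
    // Prints 1, 2, 6 and 24; the next factorial (120) is above the limit.
    foreach (var factorial in FactorialsUpTo(100))
    {
      Console.WriteLine(factorial);
    }
  }
}
```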

An example that uses yield return is as follows:

public static IEnumerable<int> Factorial(int n)
{
  if (n == 0)
  {
    n = 1;
  }

  int value = 1;

  for (int i = 1; i <= n; i++)
  {
    value *= i;
    yield return value;
  }
}

static void Main(string[] args)
{
  foreach (var factor in Factorial(5))
  {
    Console.WriteLine(factor);
  }
}

The Factorial method above (approximately ten lines of code) is all that is required to implement the IEnumerable and IEnumerator interfaces required by foreach. The beauty of the yield keyword is that it instructs the compiler to implement:
  • IEnumerable.GetEnumerator
  • IDisposable.Dispose
  • IEnumerator.Current
  • IEnumerator.MoveNext
  • IEnumerator.Reset
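Two properties of the compiler-generated implementation are worth seeing in action: the body of the method does not run until MoveNext is first called, and the generated Reset throws NotSupportedException rather than restarting the sequence. The following is a small sketch of both; Numbers and GeneratedEnumeratorDemo are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GeneratedEnumeratorDemo
{
  // Hypothetical iterator used only to observe the generated enumerator.
  public static IEnumerable<int> Numbers()
  {
    Console.WriteLine("body started");
    yield return 1;
    yield return 2;
  }

  public static void Main()
  {
    IEnumerable<int> sequence = Numbers(); // the method body has not run yet
    Console.WriteLine("sequence created");

    using (IEnumerator<int> e = sequence.GetEnumerator())
    {
      e.MoveNext();                 // "body started" is printed here
      Console.WriteLine(e.Current); // 1

      try
      {
        e.Reset();
      }
      catch (NotSupportedException)
      {
        Console.WriteLine("Reset is not supported");
      }
    }
  }
}
```

To enumerate the values again, call GetEnumerator again (or simply run another foreach) instead of calling Reset.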

Conclusion

The yield keyword is not a feature of the .NET runtime. It is a feature built into the C# compiler, designed to make supporting foreach simpler for developers.
