Iterator Specifics in C#

Iterators in C# are an elaborate piece of syntactic sugar. The compiler takes a function containing yield return statements (and possibly some yield break statements) and transforms it into a state machine. When you yield return, the state of the function is recorded, and execution resumes from that state the next time the iterator is asked to produce another object. The local variables and parameters of the iterator, including the hidden this parameter, become member variables of a helper class. The helper class also has an internal state member that keeps track of where execution was interrupted and an internal current member that holds the object most recently enumerated.
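To make the "record the state, resume later" behavior concrete, here is a small sketch. The names LazyDemo, Log, TwoValues, and Run are made up for illustration. It records the order in which the iterator body and the caller run.

```csharp
using System;
using System.Collections.Generic;

static class LazyDemo
{
    // Hypothetical log so the interleaving of caller and iterator is visible.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> TwoValues()
    {
        Log.Add("body started");
        yield return 1;
        Log.Add("resumed after first yield");
        yield return 2;
        Log.Add("resumed after second yield");
    }

    public static void Run()
    {
        // Calling the iterator method only creates the helper object;
        // none of the body has executed yet.
        IEnumerable<int> sequence = TwoValues();
        Log.Add("iterator created");

        using (IEnumerator<int> e = sequence.GetEnumerator())
        {
            // Each MoveNext runs the body up to the next yield return.
            while (e.MoveNext())
            {
                Log.Add("caller received " + e.Current);
            }
        }
    }
}
```

After LazyDemo.Run(), the log reads: iterator created, body started, caller received 1, resumed after first yield, caller received 2, resumed after second yield. Note that "iterator created" comes before "body started": the body does not begin until the first MoveNext.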

class TestClass
{
    int end = 0;
    public TestClass(int end)
    {
        this.end = end;
    }

    public IEnumerable<int> DoCount(int begin)
    {
        for (int i = begin; i <= end; i++)
        {
            yield return i;
        }
    }
}

The DoCount method returns an enumerable sequence of integers that starts at begin and continues until end is reached. The compiler internally converts this iterator into something like this:

class TestClass_Enumerator : IEnumerable<int>, IEnumerator<int> {
    int state$0 = 0;  // internal member
    int current$0;    // internal member
    TestClass this$0; // implicit parameter to DoCount
    int begin;        // explicit parameter to DoCount
    int i;            // local variable of DoCount

    public int Current { get { return current$0; } }

    public bool MoveNext()
    {
        switch (state$0)
        {
            case 0: goto resume$0;
            case 1: goto resume$1;
            case 2: return false;
        }

        resume$0:;

        for (i = begin; i <= this$0.end; i++) {
            current$0 = i;
            state$0 = 1;
            return true;
            resume$1:;  // the next MoveNext re-enters the loop here (fine in IL, not legal in C# source)
        }

        state$0 = 2;
        return false;
    }
    ...
}

public IEnumerable<int> DoCount(int begin)
{
    TestClass_Enumerator e = new TestClass_Enumerator();
    e.this$0 = this;
    e.begin = begin;
    return e;
}
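The listing above is pseudocode: C# identifiers cannot contain $, and C# source cannot goto into the middle of a loop. As a rough approximation in legal C#, here is a hand-written version that behaves the same way for this one method. The names Counter, CountEnumerator, Owner, and Begin are my own, and returning this from GetEnumerator is a simplification that only supports a single enumeration; it is not a claim about what the real generated class does.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class Counter
{
    public int End;
    public Counter(int end) { End = end; }

    public IEnumerable<int> DoCount(int begin)
    {
        CountEnumerator e = new CountEnumerator();
        e.Owner = this;     // plays the role of this$0
        e.Begin = begin;
        return e;
    }
}

// Hand-written stand-in for the helper class (a sketch, not compiler output).
class CountEnumerator : IEnumerable<int>, IEnumerator<int>
{
    int state;              // 0 = not started, 1 = suspended at the yield, 2 = finished
    int current;
    int i;                  // the loop variable, hoisted into a field
    public Counter Owner;
    public int Begin;

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0: i = Begin; break;   // first call: run the loop initializer
            case 1: i++; break;         // resuming: run the loop increment
            default: return false;      // already finished
        }

        if (i <= Owner.End)             // loop condition, read through the owner
        {
            current = i;
            state = 1;
            return true;
        }

        state = 2;
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }

    // Simplification: hand out this object itself, so enumerate only once.
    public IEnumerator<int> GetEnumerator() { return this; }
    IEnumerator IEnumerable.GetEnumerator() { return this; }
}
```

With this in place, foreach (int v in new Counter(6).DoCount(3)) visits 3, 4, 5, 6, just as the yield return version would.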

The compiler generates the enumerator class automatically. It contains two internal members for the state and the current object, plus a member for each parameter (including the hidden this parameter) and a member for each local variable. The Current property returns the current object. All the real work happens in MoveNext. To generate the MoveNext method, the compiler takes the code you wrote and performs a few transformations. First, all references to variables and parameters need to be adjusted, since the code has moved to a helper class.

  • this becomes this$0, because inside the rewritten function, this refers to the auto-generated class, not the original class.
  • m becomes this$0.m when m is a member of the original class (a member variable, member property, or member function). This rule is actually redundant with the previous rule, because writing the name of a class member m without a prefix is just shorthand for this.m.
  • v becomes this.v when v is a parameter or local variable. This rule is actually redundant, since inside the helper class writing v is the same as writing this.v, but I call it out explicitly so you’ll notice that the storage for the variable has changed.
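One visible consequence of these rewriting rules: the helper holds a reference to the original object rather than a copy of its fields, so the loop condition reads the field afresh every time it is evaluated. The sketch below uses made-up names (MutableCounter, CaptureDemo) and a public field purely so the effect is easy to trigger.

```csharp
using System.Collections.Generic;

class MutableCounter
{
    public int End;   // hypothetical public field so the demo can change it
    public MutableCounter(int end) { End = end; }

    public IEnumerable<int> DoCount(int begin)
    {
        // "End" here is shorthand for this.End, which the rewrite turns
        // into a read through the captured object reference.
        for (int i = begin; i <= End; i++)
        {
            yield return i;
        }
    }
}

static class CaptureDemo
{
    public static List<int> Run()
    {
        MutableCounter counter = new MutableCounter(2);
        IEnumerable<int> sequence = counter.DoCount(1); // captures counter and begin
        counter.End = 4;                 // changed before enumeration starts
        return new List<int>(sequence);  // enumeration sees the new value
    }
}
```

CaptureDemo.Run() returns 1, 2, 3, 4 rather than 1, 2, because the captured object's End field is consulted during enumeration, not when DoCount was called.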

The compiler also has to deal with all those yield return statements. Each yield return x becomes:

    current$0 = x;
    state$0 = n;
    return true;
    resume$n:;

where n is a counter that starts at 1 and increases with each yield return.

And then there are the yield break statements. Each yield break becomes:

    state$0 = n2;
    return false;

where n2 is one greater than the highest state number used by all the yield return statements.
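Seen from the caller's side, the final state means that once the iterator has hit a yield break, every later MoveNext keeps returning false. A minimal sketch, with the hypothetical names BreakDemo, UntilNegative, and StaysFinished:

```csharp
using System.Collections.Generic;

static class BreakDemo
{
    // Yields values until the first negative one, then ends the sequence.
    public static IEnumerable<int> UntilNegative(int[] values)
    {
        foreach (int v in values)
        {
            if (v < 0)
            {
                yield break;    // moves to the final state and returns false
            }
            yield return v;
        }
    }

    public static bool StaysFinished()
    {
        IEnumerator<int> e = UntilNegative(new[] { 1, -1, 2 }).GetEnumerator();
        bool first = e.MoveNext();    // true, Current is 1
        bool second = e.MoveNext();   // false, the yield break was reached
        bool third = e.MoveNext();    // false again, the final state is sticky
        e.Dispose();
        return first && !second && !third;
    }
}
```

The trailing 2 in the input is never produced: the yield break ends the sequence for good.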

Finally, the compiler puts a big state dispatcher at the top of the function:

switch (state$0)
{
    case 0: goto resume$0;
    case 1: goto resume$1;
    case 2: goto resume$2;
    ...
    case n: goto resume$n;
    case n2: return false;
}

with one case label for each resume point, plus the initial zero state and the final n2 state.
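When the yield statements are not nested inside a loop, the whole recipe can be applied by hand in legal C#, because every resume label sits at the top level of the method. The sketch below does this for a hypothetical Greet iterator with two yield return statements (so n is 2 and n2 is 3). The names Original, Greet, and GreetEnumerator are mine, the labels drop the $ because C# does not allow it in identifiers, and the fall-off-the-end case is handled the same way as a yield break, as in the DoCount listing earlier.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class Original
{
    public static IEnumerable<string> Greet(bool shortForm)
    {
        yield return "hello";
        if (shortForm) yield break;
        yield return "world";
    }
}

// Hand-applied version of the transformation (a sketch, not compiler output).
class GreetEnumerator : IEnumerator<string>
{
    int state;                  // 0 initial, 1 and 2 resume points, 3 final
    string current;
    readonly bool shortForm;    // the parameter, now a field

    public GreetEnumerator(bool shortForm) { this.shortForm = shortForm; }

    public string Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)          // the state dispatcher
        {
            case 0: goto resume0;
            case 1: goto resume1;
            case 2: goto resume2;
            case 3: return false;
        }

    resume0:
        current = "hello";      // yield return "hello"
        state = 1;
        return true;
    resume1:
        if (shortForm)
        {
            state = 3;          // yield break
            return false;
        }
        current = "world";      // yield return "world"
        state = 2;
        return true;
    resume2:
        state = 3;              // end of the function body
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

Driving new GreetEnumerator(false) with MoveNext and Current produces hello and then world, and new GreetEnumerator(true) produces only hello, matching what Original.Greet yields for the same arguments.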