Tuesday, May 22, 2012

Explain C# yield keyword with an example

Yesterday I was helping another developer with an external API that had a weird limitation. The API only “processed” 75 strings per call, but users could have tens of thousands of strings. So the question became how to chunk a list of, say, 20,000 strings into batches of 75.

There are a lot of ways to do this. I decided to use the C# yield keyword to solve this problem. The basic idea of yield is to return a value from the middle of an iterator and to resume right after that point on the next call, with the local state preserved. The function in which you yield return needs a return type of IEnumerable, IEnumerable<T>, IEnumerator, or IEnumerator<T>; the example below uses System.Collections.IEnumerable.
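To make the "keep coming back" behaviour concrete, here is a minimal sketch. The names YieldDemo and CountToThree and the log list are made up for illustration. It shows that calling the iterator method runs none of its body, and that each step of the foreach resumes the method right after the previous yield return.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Hypothetical iterator: yields 1, 2, 3 and records when each part of the body runs.
    public static IEnumerable<int> CountToThree(List<string> log)
    {
        log.Add("start");
        yield return 1;
        log.Add("after 1");
        yield return 2;
        log.Add("after 2");
        yield return 3;
        log.Add("done");
    }

    public static void Main()
    {
        var log = new List<string>();

        // Calling the method only creates the iterator object; the body has not run yet.
        IEnumerable<int> numbers = CountToThree(log);
        Console.WriteLine(log.Count); // prints 0

        // Each iteration resumes the body until the next yield return.
        foreach (int n in numbers)
        {
            Console.WriteLine(n + " (log entries so far: " + log.Count + ")");
        }

        Console.WriteLine(log.Count); // prints 4
    }
}
```

The log count grows by one per iteration, which is the deferred, step-by-step execution that makes yield a good fit for handing out batches one at a time.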

There are a few things to note when defining a function that uses yield. The yield statement can only appear inside an iterator block, which can be implemented as the body of a method, operator, or accessor. The body of such methods, operators, or accessors is controlled by the following restrictions:

  • Unsafe blocks are not allowed.

  • Parameters to the method, operator, or accessor cannot be ref or out.

  • A yield return statement cannot be located anywhere inside a try-catch block. It can be located in a try block if the try block is followed by a finally block.

  • A yield break statement may be located in a try block or a catch block but not a finally block.
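The try-related restrictions above can be sketched as follows. This is an illustration under assumptions, not a complete treatment: TryFinallyDemo, ReadItems, and ParseAll are hypothetical names. ReadItems shows the allowed try/finally form, and ParseAll shows one common way to work around the try-catch restriction by doing the risky work inside the try-catch and yielding outside of it.

```csharp
using System;
using System.Collections.Generic;

public static class TryFinallyDemo
{
    // Allowed: yield return inside a try block that has only a finally block.
    public static IEnumerable<string> ReadItems(List<string> log)
    {
        try
        {
            yield return "first";
            yield return "second";
        }
        finally
        {
            // Runs when iteration completes or when the enumerator is disposed early.
            log.Add("cleanup");
        }
    }

    // Workaround sketch: keep the yield return outside the try-catch.
    public static IEnumerable<int> ParseAll(IEnumerable<string> inputs)
    {
        foreach (string s in inputs)
        {
            int value;
            try
            {
                value = int.Parse(s);
            }
            catch (FormatException)
            {
                continue; // skip input that is not a number
            }
            yield return value;
        }
    }

    public static void Main()
    {
        var log = new List<string>();
        foreach (string item in ReadItems(log))
        {
            Console.WriteLine(item); // prints first
            break; // leaving the loop disposes the enumerator, which runs the finally block
        }
        Console.WriteLine(log.Count); // prints 1

        foreach (int n in ParseAll(new[] { "1", "x", "3" }))
        {
            Console.WriteLine(n); // prints 1, then 3
        }
    }
}
```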

Enough of theory. Now let’s see an example in action:

 

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            // load from some data source
            List<string> mydataset = new List<string>()
                { "aaa", "bbb", "ccc",
                  "ddd", "eee", "fff",
                  "ggg", "hhh", "iii" };

            // process the data in chunks of 2
            foreach (List<string> chunks in ChunkMe(mydataset, 2))
            {
                // process the data
                Console.WriteLine("Processing batch...size of " + chunks.Count);
                chunks.ForEach(s => Console.WriteLine(s));
            }
        }

        /// <summary>
        /// chunks a long list into smaller pieces
        /// </summary>
        /// <param name="data">long list of strings</param>
        /// <param name="chunkSize">the maximum number of rows per chunk</param>
        /// <returns>a sequence of List&lt;string&gt; chunks</returns>
        public static IEnumerable ChunkMe(List<string> data, int chunkSize)
        {
            // do error checks on input params
            // (in an iterator these run when iteration starts, not when ChunkMe is called)
            if (data == null) throw new ArgumentNullException("data");
            if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

            // start at the beginning of the list
            int currentChunkStart = 0;

            // the chunk of data to return
            List<string> currentChunk = null;

            // while the list has more data
            while (currentChunkStart < data.Count)
            {
                // get data to return
                currentChunk = data
                    // skip already processed entries
                    .Skip(currentChunkStart)
                    // take the next batch
                    .Take(chunkSize)
                    // materialize it
                    .ToList();

                // set the next return point
                currentChunkStart += chunkSize;

                // return in the middle; execution resumes here on the next iteration
                yield return currentChunk;
            }
        }
    }
}





As you can see from the above code, ChunkMe returns an IEnumerable, and the calling code simply iterates over the values it yields to get chunks of data to process. Below is an example output:

Processing batch...size of 2
aaa
bbb
Processing batch...size of 2
ccc
ddd
Processing batch...size of 2
eee
fff
Processing batch...size of 2
ggg
hhh
Processing batch...size of 1
iii
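For comparison, here is a hedged sketch of a generic variant; it is not the code that produced the output above. The names Chunker and ChunkIterator are made up for illustration. It returns IEnumerable<List<T>> so the caller needs no cast, uses List<T>.GetRange instead of Skip/Take so each chunk is copied directly from its start index, and validates arguments in a non-iterator wrapper so bad input fails at the call rather than on first iteration. On .NET 6 and later, the built-in Enumerable.Chunk may be a simpler choice.

```csharp
using System;
using System.Collections.Generic;

public static class Chunker
{
    // Non-iterator wrapper: argument checks run immediately at the call site.
    public static IEnumerable<List<T>> ChunkMe<T>(List<T> data, int chunkSize)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return ChunkIterator(data, chunkSize);
    }

    // The actual iterator: yields one chunk per step.
    private static IEnumerable<List<T>> ChunkIterator<T>(List<T> data, int chunkSize)
    {
        for (int start = 0; start < data.Count; start += chunkSize)
        {
            // The last chunk may be shorter than chunkSize.
            int size = Math.Min(chunkSize, data.Count - start);
            yield return data.GetRange(start, size);
        }
    }

    public static void Main()
    {
        var data = new List<string> { "aaa", "bbb", "ccc", "ddd", "eee" };
        foreach (List<string> chunk in ChunkMe(data, 2))
        {
            // prints batch sizes 2, 2, 1
            Console.WriteLine("Processing batch...size of " + chunk.Count);
        }
    }
}
```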

3 comments:

  1. looks like a cool technique to have handy

  2. very good, i had the need of a solution that allows to do it within try catch, and i ended up with generics without yield...

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                // load from some data source
                List<string> mydataset = new List<string>()
                    { "aaa", "bbb", "ccc",
                      "ddd", "eee", "fff",
                      "ggg", "hhh", "iii", "jjj", "kkk" };

                // process the data in chunks of 4
                foreach (var chunks in ChunkMe(mydataset, 4))
                {
                    // process the data
                    Console.WriteLine("Processing batch...size of " + chunks.Count);
                    chunks.ForEach(s => Console.WriteLine(s));
                }
            }

            public static IEnumerable<List<string>> ChunkMe(
                List<string> mydataset, int chunkSize)
            {
                int chunkCount = mydataset.Count / chunkSize;

                int lastChunkSize = mydataset.Count % chunkSize;

                var retVal = new List<List<string>>(chunkCount + (lastChunkSize == 0 ? 0 : 1));

                for (int index = 0; index < chunkCount; index++)
                {
                    retVal.Add(ChunkOne(mydataset, chunkSize, index * chunkSize));
                }

                if (lastChunkSize > 0)
                {
                    retVal.Add(ChunkOne(mydataset, lastChunkSize, mydataset.Count - lastChunkSize));
                }

                return retVal;
            }

            public static List<string> ChunkOne(List<string> mydataset, int chunkSize, int position)
            {
                var retVal = new List<string>(chunkSize);

                for (int cIndex = 0; cIndex < chunkSize; cIndex++)
                {
                    retVal.Add(mydataset[position + cIndex]);
                }

                return retVal;
            }
        }
    }
