Building the RandomElement Operator
To demonstrate how to build a custom operator that returns a single
element from a source sequence, let’s look at an operator that returns
an element at random. Picking a candidate at random from a collection
is often necessary, whether to generate test data or to draw a random
sample from a customer list for targeted advertising or business
offers. This new operator will be called RandomElement and will return
one element at random from a given sequence (pseudo-random, to be
exact; it can be difficult to get the randomness you need for a small
collection or when retrieving more than one element in a tight loop).
The RandomElement operator has the following requirements:
• Take an input source IEnumerable<T> sequence and return one element at random.
• Allow control over the random number generator by optionally accepting a seed value.
• Throw an ArgumentNullException if the source sequence is null.
• Throw an InvalidOperationException if the source sequence is empty.
As seen when implementing the Last operator, although every sequence you extend is certain to implement the IEnumerable<T> interface, many sequences also implement the IList<T> and ICollection<T> interfaces, which can be used to improve performance. Any collection type that implements IList<T> allows elements to be accessed by index position (for example, list[2] returns the third element), and any collection that implements ICollection<T> exposes a Count property that returns the element count without enumerating the sequence. Use these performance enhancements whenever possible.
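The interface checks described above can be sketched as a pair of helpers. This is a minimal illustration rather than code from the original listings; the SequenceProbe class and the FastCount and FastElementAt names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class SequenceProbe
{
    // Returns the element count, reading ICollection<T>.Count when the
    // source supports it and counting by enumeration otherwise.
    public static int FastCount<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        ICollection<T> collection = source as ICollection<T>;
        if (collection != null)
            return collection.Count;          // no enumeration needed

        int count = 0;
        foreach (T item in source)            // slower: visits every element
            count++;
        return count;
    }

    // Returns the element at a zero-based index, using the IList<T>
    // indexer when available and walking the sequence otherwise.
    public static T FastElementAt<T>(IEnumerable<T> source, int index)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));

        IList<T> list = source as IList<T>;
        if (list != null)
            return list[index];               // direct index access

        using (IEnumerator<T> e = source.GetEnumerator())
        {
            // The enumerator starts before the first element, so it
            // must advance index + 1 times to reach the target.
            for (int i = 0; i <= index; i++)
            {
                if (!e.MoveNext())
                    throw new ArgumentOutOfRangeException(nameof(index));
            }
            return e.Current;
        }
    }
}
```

The standard Count() and ElementAt() operators already apply the same kind of type checks internally, so in practice calling them is often enough; the helpers here only make the checks visible.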
The code shown in Listing 3 uses the built-in .NET Framework random
number generator to find an index position that is within a sequence’s
bounds and returns that element. It first tries to use the IList<T>
indexer to retrieve the element (because it is faster), and if that
interface isn’t supported by the collection, this operator uses a
slower enumeration looping pattern (completely valid code, just slower).
Listing 3. Sample RandomElement operator returns an element at random from a source sequence and takes an optional seed value for the random number generator
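A minimal sketch of an operator along these lines follows. It is not the listing's exact code: the RandomElementExtensions class name and the choice of a nullable int for the optional seed are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed class name; extension methods must live in a static class.
public static class RandomElementExtensions
{
    // Assumption: the optional seed is modeled as a nullable int, where
    // null means "let System.Random choose its own seed".
    public static T RandomElement<T>(this IEnumerable<T> source, int? seed = null)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        Random random = seed.HasValue ? new Random(seed.Value) : new Random();

        // Fast path: IList<T> gives both Count and an indexer.
        IList<T> list = source as IList<T>;
        if (list != null)
        {
            if (list.Count == 0)
                throw new InvalidOperationException("Sequence contains no elements.");

            // Next(n) returns a value from 0 to n - 1, so every index,
            // including the last one, can be selected.
            return list[random.Next(list.Count)];
        }

        // Slow path: count the elements, then walk to the chosen index.
        // Note that this enumerates the source twice.
        int count = source.Count();
        if (count == 0)
            throw new InvalidOperationException("Sequence contains no elements.");

        int index = random.Next(count);
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            // The enumerator starts before the first element.
            for (int i = 0; i <= index; i++)
                e.MoveNext();
            return e.Current;
        }
    }
}
```

Passing the same seed makes repeated calls return the same element, which is useful in tests; omitting it leaves seeding to System.Random.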
Although it is completely
up to the implementer to decide behavior for empty and null source
collections, following the same pattern used for the standard query
operators is highly recommended, because it lets consumers use your
custom operators in the same way as the other LINQ operators. The
general conventions are as follows:
- If the source collection being extended is null, throw an ArgumentNullException.
- If the source collection is empty, throw an InvalidOperationException.
- Provide a variation of the operator with the suffix of “OrDefault” that doesn’t raise the InvalidOperationException, but returns the value default(T) instead.
Listing 4 shows the basic unit tests written to confirm correct behavior of the RandomElement
operator. (I used NUnit, see www.nunit.org, but
any unit testing framework would have worked.) The basic test cases for
specific input sequence types that should be confirmed at a minimum are
as follows:
• The source sequence is null.
• The source sequence has no elements (is empty).
• The source sequence implements IList<T>.
• The source sequence implements only IEnumerable<T>.
These tests, shown in Listing 4, are the minimum required for the RandomElement
operator, and lightly confirm both the IList<T> and IEnumerable<T>
implementations with a single-element source and a two-element source.
Pay attention to boundary conditions when writing these tests; my first
attempt at coding this operator never returned the last element, and the
unit tests allowed me to find and correct this error quickly.
Listing 4. Basic set of unit tests for the RandomElement operator. These tests are written for the NUnit testing framework (www.nunit.org); more tests are included in the sample code.
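The test cases described above can be sketched as follows. To stay runnable without the NUnit package, this sketch uses plain checks that throw on failure instead of NUnit attributes and assertions. The compact RandomElement stand-in, its nullable seed parameter, and all class and method names here are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RandomElementSketch
{
    // Compact stand-in for the operator under test (assumed signature).
    public static T RandomElement<T>(this IEnumerable<T> source, int? seed = null)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        int count = source.Count();
        if (count == 0) throw new InvalidOperationException("Sequence contains no elements.");
        Random random = seed.HasValue ? new Random(seed.Value) : new Random();
        return source.ElementAt(random.Next(count));
    }
}

public static class RandomElementChecks
{
    // Iterator method: the result implements IEnumerable<T> but not IList<T>.
    private static IEnumerable<int> Lazy(params int[] values)
    {
        foreach (int value in values) yield return value;
    }

    private static bool Throws<TException>(Action action) where TException : Exception
    {
        try { action(); return false; }
        catch (TException) { return true; }
    }

    private static void Check(bool condition, string name)
    {
        if (!condition) throw new Exception("Failed: " + name);
    }

    // Boundary check: across many seeds, every element (including the
    // last one) must be returned at least once.
    private static bool ReachesAll(IEnumerable<int> source)
    {
        var seen = new HashSet<int>();
        for (int seed = 1; seed <= 100; seed++)
            seen.Add(source.RandomElement(seed));
        return seen.SetEquals(source);
    }

    public static void RunAll()
    {
        List<int> nullSource = null;
        Check(Throws<ArgumentNullException>(() => nullSource.RandomElement()), "null source");
        Check(Throws<InvalidOperationException>(() => new List<int>().RandomElement()), "empty IList<T>");
        Check(Throws<InvalidOperationException>(() => Lazy().RandomElement()), "empty IEnumerable<T>");
        Check(new List<int> { 7 }.RandomElement() == 7, "single-element IList<T>");
        Check(Lazy(7).RandomElement() == 7, "single-element IEnumerable<T>");
        Check(ReachesAll(new List<int> { 1, 2 }), "two-element IList<T>");
        Check(ReachesAll(Lazy(1, 2)), "two-element IEnumerable<T>");
    }
}
```

Calling RandomElementChecks.RunAll() runs every check. The two-element cases loop over explicit seeds because a freshly constructed, unseeded generator may repeat values when created in a tight loop.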
The code for our RandomElement operator in Listing 3
satisfies the requirements listed earlier; however, there is one more
requirement that will make this single element operator conform to the
patterns common in the standard query operators. We need to add a
variation called RandomElementOrDefault to cater to empty
source sequences without throwing an exception. This is necessary to
allow the single element operators to be used on sequences that might
legitimately be empty. Rather than throwing an exception, the
operator returns default(T), which is null for reference types and
zero for numeric value types.
To add this variation of our new operator, I simply copied and pasted the code from Listing 3, changed its name to RandomElementOrDefault, and then altered the empty source check to return default(T); rather than throw new InvalidOperationException.
The remaining part of the operator is unchanged (although good coding
practice would have you refactor out the common code into a separate
method). The additional operator begins with the following code:
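A minimal sketch of such a variation is shown here. It is not the original code: the class name and the nullable seed parameter are assumptions, and the body leans on the standard Count() and ElementAt() operators instead of repeating the hand-written indexing logic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed class name, for illustration only.
public static class RandomElementOrDefaultExtensions
{
    // Assumption: the optional seed is modeled as a nullable int.
    public static T RandomElementOrDefault<T>(this IEnumerable<T> source, int? seed = null)
    {
        // A null source is still an error, as in the standard operators.
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        // Count() reads ICollection<T>.Count directly when available.
        int count = source.Count();

        // The only behavioral change: an empty source yields default(T)
        // instead of an InvalidOperationException.
        if (count == 0)
            return default(T);

        Random random = seed.HasValue ? new Random(seed.Value) : new Random();

        // ElementAt() uses the IList<T> indexer when available and
        // enumerates otherwise.
        return source.ElementAt(random.Next(count));
    }
}
```

With this in place, an empty List<int> returns 0 and an empty string array returns null, while a null source still throws.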
Following are a few tips for developing new single element operators:
• Remember to test for and make use of any performance efficiencies offered by collections that implement the IList<T> and ICollection<T> interfaces in addition to IEnumerable<T>. Make certain to support plain IEnumerable<T> sequences first before looking for performance optimizations.
• Implement an additional variation of these operators (suffixed with OrDefault) that returns a legal default value for a given type, rather than throwing an InvalidOperationException (change the throw new InvalidOperationException to return default(T);).
• Keep your extension methods in a consistent namespace so that consumers of your operators only need to add a single using directive to get access to your custom operators.