2. One-to-One Joins
A one-to-one join allows data from two different
sequences related to each other by a common key value to be accessible
in the select projection of a query. One-to-one joins are very
commonplace in relational database systems, where the central theme is
separating data into multiple tables related by parent and child key
relationships. This process is called normalization.
Although LINQ to Objects supports most of the common join types
available in a relational database, it is not always the most efficient
way to work with large volumes of data.
Note
A relational database will out-perform LINQ to
Objects in cases where data volume is large. The join features across
object collections should be used for local fine-tuning of data
retrieved from a database, not in place of that database engine.
There are a number of ways to achieve one-to-one joins using LINQ, which include
Using the join (or Join) operator
Using a subquery in the select projection
Using the SingleOrDefault operator in the select projection
Using a cross join with a where filter
Each technique for one-to-one joining is covered in
detail in this section, including the relative performance of each .
Table 1. One-to-One Join Techniques and Their Performance Comparisons
Technique | Many Record Performance | Single Record Performance | Comments |
---|
Using the join (or Join) operator | Best | Worst | Use the join
syntax whenever you are working with five or more outer element
records. Builds a lookup on the inner keys first, avoiding repetitive
looping in subsequent outer iterations.
|
Using a subquery in the select projection | Good | Good | Use the SingleOrDefault syntax rather than a subquery. There is no benefit to using a subquery.
|
Using the SingleOrDefault operator in the select projection | Good | Best | Use SingleOrDefault syntax whenever you are working with a single (or minimal number of) outer records.
|
Using a cross join with a where filter | Worst | Good | Only
use when a cross join is actually needed. Never use when there are many
records—will result in very poor performance. |
To add clarity it is necessary to depart from our
running example data and use a very simple set of sample data to allow
you to focus on the mechanics of the join syntax. The key field in this case is a string type, but
any type can be used (integer values for instance if you retrieved data
from primary keys in a database table).
Table 2. Sample Data Used for Demonstrating Joining Multiple Data Sources
Customers | Orders |
---|
Key | Name | Key | Order Number |
---|
GOT1 | Gottshall | GOT1 | Order 1 |
VAL1 | Valdes | GOT1 | Order 2 |
GAU1 | Gauwain | DEA1 | Order 3 |
DEA1 | Deane | DEA1 | Order 4 |
ZEE1 | Zeeman | ZEE1 | Order 5 |
2.1 The join (or Join) Operator
The simplest join of this type is built using the join operator in the query expression syntax, or the Join operator in extension method syntax. The query expression syntax for a join takes the following general form:
from [outer sequence
join [inner sequence] on
[outer key selector] equals [inner key selector]
select [result selector]
To understand the behavior of this operator, the following code shows a basic example of linking two arrays of strings:
string[] outer = new string[] { "a", "b", "c", "d" };
string[] inner = new string[] { "b", "c", "d", "e" };
var q = from s1 in outer
join s2 in inner on s1 equals s2
select string.Format(
"Outer: {0} Inner: {1}", s1, s2);
foreach (string s in q)
Console.WriteLine(s);
The output of this code is
Outer: b Inner: b
Outer: c Inner: c
Outer: d Inner: d
Notice that any outer element with a matching inner
element is included in the result sequence, and the select projection
has access to both strings in this case (s1 and s2).
Also notice that any elements where the key values didn’t match are
skipped. This is called an inner join by relational database
dictionaries, and you can simply think of it in these terms: Any record
without a matching key value (a null key value for instance) is skipped.
To demonstrate how to use the join syntax over the data shown in Table 2, Listing 2 shows how to achieve an inner join from an order(s) to access all the customer data (in this case to retrieve the LastName property in the Select projection). The output from this code is shown in Output 2.
Listing 2. Simple join example between two collections—see Output 2. Notice there are no entries for Valdes or Gauwain.
var customers = Customer.SampleData(); var orders = Order.SampleData();
var q1 = from o in orders join c in customers on o.CustomerID equals c.CustomerID select new { o.OrderNumber, c.LastName };
foreach (var order in q1) Console.WriteLine( "Customer: {0} Order Number: {1}", order.LastName.PadRight(9, ' '), order.OrderNumber);
|
Output 2.
Customer: Gottshall Order Number: Order 1
Customer: Gottshall Order Number: Order 2
Customer: Deane Order Number: Order 3
Customer: Deane Order Number: Order 4
Customer: Zeeman Order Number: Order 5