5 The Join with Projection
Projection refers to
the idea that you might want to see only a limited number of columns in
the results of a query. It is very useful for join queries, which can
lead to a very large number of columns in the results.
Suppose you wanted to find the
actorID, lastName, firstName, roleID, and roleName of all rows of the
Actor and Role tables that matched on actorID. It is necessary to use
qualified column names in two places in this query, in the joining
condition and in the SELECT
expression. The need for dot notation in the joining condition was
explained in the previous example. The need for dot notation in the SELECT expression is similar. When joining the Actor and Role tables, you could specify either Actor.actorID or Role.actorID in the SELECT
expression because the value retrieved by either would be the same;
however, you have to use one or the other, and not just the actorID.
Whenever a column name is used in a query and more than one table in the
query has a column with that name, you must qualify the name using dot
notation.
Example Query 5 illustrates a join with projection.
Example Query 5
SELECT Actor.actorID, lastName, firstName, roleID, roleName
FROM Actor, Role
WHERE Actor.actorID = Role.actorID;
The output of Example Query 5 would start as shown in the Example Query 5 Results.
Example Query 5 Results
actorID | lastName | firstName | roleID | roleName |
---|
00001 | Hanks | Tom | 00057 | Forrest Gump |
00001 | Hanks | Tom | 00063 | Jim Lovell |
00002 | Paxton | Bill | 00064 | Fred Haise |
00003 | Bacon | Kevin | 00054 | Captain Jack Ross |
00003 | Bacon | Kevin | 00065 | Jack Swigert |
. . . | . . . | . . . | . . . | . . . |
6 The Cartesian Product
The term "Cartesian product"
refers to a situation where each row in one table is matched with each
row in another table, regardless of the column values they contain. The
Cartesian product is of theoretical importance, but it is rarely of
practical use. If you are trying to write a join query and you forget to
include the WHERE keyword and the joining condition, the result will be a Cartesian product. This is one of the easiest mistakes to make.
Consider Example Query 6.
There are 100 actors in the Actor table and 138 roles in the Role table.
Example Query 6 says to match each actor with each role, regardless
of whether the actor played the role. There are 100 times 138 rows
(13,800 rows) in the complete results. When you create a Cartesian
product by accident, you will typically recognize your mistake by the
number of rows in the result. The fact that the number of rows in the
result is the arithmetic product of the numbers of rows in the two
tables is a reminder of why this result is known as a product.
Example Query 6
SELECT Actor.actorID, lastName, firstName, roleID, roleName
FROM Actor, Role;
A representative subset of
the output of the Cartesian product is shown in the Example Query 6
Results. Every actor is matched with the role Dorothy Gale. Buried in
this list is Judy Garland, the actor who actually played that role.
Example Query 6 Results
actorID | lastName | firstName | roleID | roleName |
---|
. . . | . . . | . . . | . . . | . . . |
00059 | Furlong | Edward | 00001 | Dorothy Gale |
00046 | Gable | Clark | 00001 | Dorothy Gale |
00061 | Garland | Judy | 00001 | Dorothy Gale |
00029 | Getz | John | 00001 | Dorothy Gale |
00027 | Goldblum | Jeff | 00001 | Dorothy Gale |
. . . | . . . | . . . | . . . | . . . |
7 Joins with Additional WHERE Conditions
The idea of complex queries
involving multiple parts was introduced in the previous sections. Some
of the most useful queries are those that join two tables and include WHERE
conditions on one or more columns of either of the two tables involved.
Suppose you wanted to find the actorID, lastName, firstName, roleID,
and roleName for Jack Nicholson only. Such a query would join the Actor
and Role tables. It would also have a condition on the Actor table,
which would be included with the keyword AND. Example Query 7 illustrates this idea.
Example Query 7
SELECT Actor.actorID, lastName, firstName, roleID, roleName
FROM Actor, Role
WHERE Actor.actorID = Role.actorID
AND lastName = 'Nicholson' AND firstName = 'Jack';
The complete output of Example Query 7 is shown in the Example Query 7 Results.
Example Query 7 Results
actorID | lastName | firstName | roleID | roleName |
---|
00019 | Nicholson | Jack | 00052 | Colonel Nathan R. Jessep |
00019 | Nicholson | Jack | 00097 | The Joker |
00019 | Nicholson | Jack | 00098 | Jack Napier |
It is possible to create an
interesting example of a join query that includes a condition testing
for inequality between columns of the two joined tables. Suppose you
wanted to find the actorID, lastName, actor's gender, roleID, role-Name,
and role's gender for those cases where the gender of the actor and the
gender of the role were not the same. Example Query 8 illustrates
this idea. Note that dot notation must be used to distinguish the two
columns named "gender."
Example Query 8
SELECT Actor.actorID, lastName, Actor.gender, roleID,
roleName, Role.gender
FROM Actor, Role
WHERE Actor.actorID = Role.actorID
AND Actor.gender <> Role.gender;
The complete output of Example Query 8 is shown in the Example Query 8 Results.
Example Query 8 Results
actorID | lastName | actor.gender | roleID | roleName | Role.gender |
---|
00070 | Hoffman | M | 00078 | Dorothy Michaels (Tootsie) | F |
00073 | Williams | M | 00082 | Mrs. Doubtfire | F |
8 Joins with ORDER BY
Just as simple selection
queries can be combined with ordering, grouping, and calculating, so,
too, can join queries be combined with these features. For example, ORDER BY can be used with a join.
Suppose you wanted to find the
movieID, lastName, firstName, roleID, and roleName for all actors and
roles, ordered by movieID and lastName. Notice that movieID comes from
the Role table, whereas lastName comes from the Actor table. The join is
formed on the two tables Role and Actor, and the ordering of the
results depends on the columns taken from each of these two tables.
Example Query 9 illustrates this idea.
Example Query 9
SELECT movieID, lastName, firstName, roleID, roleName
FROM Actor, Role
WHERE Actor.actorID = Role.actorID
ORDER BY movieID, lastName;
The output of Example Query 9 would start as shown in the Example Query 5-9 Results.
Example Query 9 Results
movieID | lastName | firstName | roleID | roleName |
---|
001 | Bacon | Kevin | 00065 | Jack Swigert |
001 | Hanks | Tom | 00063 | Jim Lovell |
001 | Harris | Ed | 00067 | Gene Kranz |
001 | Paxton | Bill | 00064 | Fred Haise |
001 | Quinlan | Kathleen | 00068 | Marilyn Lovell |
001 | Sinise | Gary | 00066 | Ken Mattingly |
002 | Bergman | Ingrid | 00017 | Ilsa Lund (Laszlo) |
002 | Bogart | Humphrey | 00016 | Rick Blaine |
002 | Greenstreet | Sydney | 00021 | Signor Ferrari |
002 | Henreid | Paul | 00018 | Victor Laszlo |
002 | Lorre | Peter | 00022 | Ugarte |
002 | Rains | Claude | 00019 | Captain Renault |
. . . | . . . | . . . | . . . | . . . |