11 Joins with Calculated Columns and Built-In Functions
In this section, the use of
built-in functions and calculated columns is illustrated with join
queries. Recall that the Movie and Role tables are in a one-to-many
relationship, with the primary key of the Movie table, movieID, embedded
as a foreign key in the Role table. Suppose you wanted to find the sum
of the discountPrice values of all movies in which Batman appears as a
role. You have to join the two tables on movieID. The discountPrice
column in the SELECT expression comes from the Movie table, whereas the roleName column in the WHERE expression comes from the Role table. Example Query 13 illustrates this idea.
Example Query 13
SELECT SUM(discountPrice)
FROM Movie, Role
WHERE Movie.movieID = Role.movieID
AND roleName = 'Batman';
The complete output of Example Query 13 is shown in the Example Query 13 Results.
Example Query 13 Results
It is possible to create an
interesting example of a three-way join query that includes a calculated
column with a built-in date function. Suppose you wanted to find the
actorID and lastName, the movieID and title, and the age of the actor
when the movie was made. To match the actors and the movies, it is
necessary to join them through the Role table, and extract the year from
the actor's birthDate, then subtract that year from the year the movie
was made. Example Query 14 shows the resulting query.
Example Query 14
SELECT Actor.actorID, lastName, Movie.movieID, title,
year - YEAR(birthDate)
FROM Actor, Role, Movie
WHERE Actor.actorID = Role.actorID
AND Role.movieID = Movie.movieID;
The output of Example Query 14 would start as shown in the Example Query 14 Results.
Example Query 14 Results
actorID | lastName | movieID | title | Expr1004 |
---|
00061 | Garland | 016 | Wizard of Oz, The | 17 |
00062 | Morgan | 016 | Wizard of Oz, The | 49 |
00063 | Bolger | 016 | Wizard of Oz, The | 35 |
. . . | . . . | . . . | . . . | . . . |
Because movies have years instead
of dates, it is possible that the calculated age value is off by one.
Whether this error occurs depends on whether the actor's date of birth
came before or after the release date of the movie—but this information
can't be determined from the data in the database.
12 Subqueries
Because a query returns a set
of results in tabular form, in principle, wherever a table occurs in a
query, it is possible to substitute a query for that table. You can
write a single SQL query that consists of a combination of two queries,
one of which is the outer, main query, and the other of which is the
inner, subquery. The subquery plays the role of a table in the outer
query.
As pointed out earlier,
depending on the version of SQL, Example Query 15 may not find the
difference between DVDPrice and the average DVDPrice; however, using a
subquery, it is possible to generate the desired results. As shown in
Example Query 5-16, the subquery is enclosed in parentheses. In the
subquery, you would find the average DVDPrice for all of the rows in the
Movie table. It is important to use the keyword AS
to give this result column its own name. The outer query then finds the
difference between the DVDPrice column of the Movie table and the named
result column of the inner query. In effect, the outer query involves
two tables; however, it is not necessary to have a joining condition in
the query overall because there is only one value in the results of the
inner query.
Example Query 15
SELECT DVDPrice - AVG(DVDPrice)
FROM Movie;
Example Query 16
SELECT DVDPrice - avgDVDPrice
FROM Movie,
(SELECT AVG(DVDPrice) AS avgDVDPrice
FROM Movie);
The output of Example Query 16
would start as shown in the Example Query 16 Results. Depending on
the system you're using, the results of Example Query 16 may show
negative currency values in parentheses—a convention from the accounting
world.
Example Query 16 Results
Expr1000 |
---|
($5.00) |
$2.00 |
($1.00) |
. . . |
Another aspect of the
example should also be noted. If you are using Microsoft Access, you can
enter the query as shown previously, and it will work correctly. If you
save the query, however, the next time you open it with the Access SQL
editor, it will be given in this much less understandable form:
SELECT DVDPrice - avgDVDPrice
FROM Movie, [SELECT AVG(DVDPrice) AS avgDVDPrice
FROM Movie]. AS [%$##[at]_Alias];
A query is given below, which shows the use of DISTINCT nested inside the parentheses of the COUNT
function. This syntax does not work in Microsoft Access, and it is not
possible to save a query containing this error. For this reason, this
example query is unnumbered and does not exist in the example database.
Example Query 5-x.
SELECT COUNT(DISTINCT lastName)
FROM Actor;
Using a subquery,
however, it becomes possible to generate the desired results. The
subquery is enclosed in parentheses, as shown in Example Query 17. In
the inner query, you find the distinct values of lastName in the Actor
table. The outer query then counts the number of rows in the results of
the inner query.
Example Query 17
SELECT COUNT(*)
FROM
(SELECT DISTINCT lastName
FROM Actor);
The complete output of Example Query 5-17 is shown in the Example Query 5-17 Results.
Example Query 17 Results
If you did a query with COUNT(lastName), no null values would be counted, but duplicates would. With Example Query 17, duplicates are not counted, but COUNT(*) means that the count of 95 includes one distinct last-Name value which happens to be null.
If you are using
Microsoft Access, you can enter the query as shown earlier, and it will
work correctly. If you save the query, however, the next time you open
it with the Access SQL editor, it will be given in this much less
understandable form:
SELECT COUNT(*)
FROM [SELECT DISTINCT lastName
FROM Actor]. AS [%$##[at]_Alias];