DATABASE

SQL Programming Language : Ordering, Calculating, and Grouping in Queries

10/14/2010 9:35:13 AM

1 The Keyword DISTINCT

Not only is the information retrieved from a database important, but how the results are presented to users is also important. Although tables themselves are not allowed to have duplicate rows, query results can easily contain duplicates. Suppose you ran Example Query 1.

Example Query 1

SELECT roleID
FROM RoleQuote;

The output of Example Query 1 would start as shown in the Example Query 1 Results. There would be as many rows in the results as appear in the RoleQuote table. Because individual roles can have more than one quote, the query results would include duplicate roleID values.

Example Query 1 Results

roleID
00001
00001
00001
00002
00003
...

If a user is interested in identifying only those roles that had quotes, the duplication in the results is not helpful: You can't see the forest for the trees. The keyword DISTINCT has the effect of removing such duplicate rows from query results. Example Query 2 is the same as Example Query 1 except for the inclusion of DISTINCT after the keyword SELECT.

Example Query 2

SELECT DISTINCT roleID
FROM RoleQuote;

The output of Example Query 2 would start as shown in the Example Query 4-2 Results.

Example Query 2 Results

roleID
00002
00003
00005
00007
...

The keyword DISTINCT can also be used when the query selects more than one column. In such a query, DISTINCT comes before the list of columns and does not require parentheses or other punctuation. In this situation, rows are considered to be duplicates if they contain the same complete set of column values. DISTINCT has the effect of leaving only one row with a given set of values in it and removing any duplicate rows from the query results.

2 The Keywords ORDER BY

Presenting query results in a particular order can be very helpful to users. For example, a user might want to see rows of query results alphabetized by one of the text columns in it or arranged in ascending or descending order by one of the number columns in it. In other words, it might be desirable to see query results in sorted order. SQL does not have a keyword "alphabetize" or "sort." Instead, presenting query results in sorted order is accomplished with the keywords ORDER BY. These keywords appear at the end of the query, along with the name of the column to sort on.

Suppose you wanted to find the movieID and title of all movies in the Movie table, in order by title. Example Query 3 shows this query.

Example Query 3

SELECT movieID, title
FROM Movie
ORDER BY title;

The output of Example Query 3 would start as shown in the Example Query 3 Results.

Example Query 3 Results

movieIDtitle
024Amadeus
001Apollo 13
019Batman
022Batman & Robin
. . .. . .

The movie A Few Good Men is in the database, but it is listed as "Few Good Men, A." Example Query 3 illustrates why: If the movie title were given with the article "A" first, it would appear in the results of the query along with the other movie titles starting with the letter "A."

3 The Keywords ASC and DESC

When ordering results, the default is to sort them in ascending order. If you want to make this order explicit, you put the keyword ASC after the name of the column you're ordering on. If you would like results to appear in descending order, use the keyword DESC instead. Example Query 4 illustrates the use of this keyword.

Example Query 4

SELECT movieID, title
FROM Movie
ORDER BY title DESC;

The output of Example Query 4 would start as shown in the Example Query 4 Results.

Example Query 4 Results

movieIDtitle
016Wizard of Oz, The
017Tootsie
014Terminator, The
015Terminator 2: Judgment Day
. . .. . .

Example Query 4 again illustrates the usefulness of removing an article from the beginning of a movie title and putting it at the end. If The Wizard of Oz were listed with "The" first, "Tootsie" would appear at the top of the results and "The Wizard of Oz" would come second.

4 Ordering on More Than One Column

It is possible to order the results of a query on more than one column. The columns of interest are given after the keywords ORDER BY, separated by commas. The results are ordered first by the first column given, then by the next column given, and so on. It is possible to mix ascending and descending order by putting one of the ASC and DESC keywords after the names of the columns. The default is ascending order.

Suppose you wanted to find the actorID, lastName, and firstName of all actors, ordered first by lastName and then by firstName. In other words, you wanted the actors' names alphabetized, last name first. To keep the query manageable, middleName and suffix are not included in the ordering, although they would need to be included in a complete example. Example Query 5 shows this query.

Example Query 5

SELECT actorID, lastName, firstName
FROM Actor
ORDER BY lastName, firstName;

Partial output of Example Query 5 is given in the Example Query 5 Results. This output includes some of the rows where actors have the same last name, showing that overall, rows are ordered by lastName, and rows with the same lastName value are ordered by firstName.

Example Query 5 Results

actorIDlastNamefirstName
00069 Terry
00098AbrahamF.
00003BaconKevin
. . .. . .. . .
00058HamiltonLinda
00067HamiltonMargaret
. . .. . .. . .
00070HoffmanDustin
00025HoffmanGaby
. . .. . .. . .

Notice the lastName column of the very first row; it is null. When presenting the results in order, the null value comes first. Null is treated as "less than" any actual value.

5 Calculated Columns in Results

It can be convenient to have query results showing the results of calculations on number columns in tables. Arithmetic expressions can be built out of the names of number columns, numeric constants, addition, subtraction, multiplication, division, and parentheses. The symbols +, -, *, and / are used for addition, subtraction, multiplication, and division, respectively.

Suppose you wanted to find the movieID, title, DVDPrice, and discount-Price of movies, along with the difference between the regular DVDPrice and the discountPrice. DVDPrice and discountPrice are both currency columns, so the difference can be found by subtracting the one price column from the other. Example Query 6 shows this query.

Example Query 6

SELECT movieID, title, DVDPrice, discountPrice,
DVDPrice - discountPrice
FROM Movie;

The output of Example Query 6 would start as shown in the Example Query 6 Results.

Example Query 6 Results

movieIDtitleDVDPricediscountPriceExpr1004
001Apollo 13$12.99$9.99$3.00
002Casablanca$19.99$14.99$5.00
003Dirty Harry$16.99$16.99$0.00
. . .. . .. . .. . .. . .

Notice that the system makes up an arbitrary expression name for the calculated column in the results. This is a good place to use the keyword AS and create a more informative column name. The next example does so and also illustrates a more complicated arithmetic expression.

Suppose these conditions apply when buying or renting a movie:

  1. You estimate that if you buy a movie outright, on average you watch it four times. Also, if you buy a movie, you buy it at the discount price.

  2. You estimate that the total cost to you of renting a movie is $3.00. This may include gas money if you go to a video store, the value of the time spent searching online if you use such a service, and so on.

Now suppose that you wanted to find the movieID and title of a movie, along with the difference between the average cost of viewing it if bought at the discount price and the cost of viewing it if rented. This difference is the actual additional cost per viewing if you add the movie to your collection. Let the calculated column have the heading "cost per viewing if bought." Example Query 7 shows the query you need.

Example Query 7

SELECT movieID, title, (discountPrice / 4) - 3 AS [cost per
viewing if bought]
FROM Movie;

The output of Example Query 7 would start as shown in the Example Query 7 Results.

Example Query 7 Results

movieIDtitlecost per viewing if bought
001Apollo 13-0.5025
002Casablanca0.7475
003Dirty Harry1.2475
. . .. . .. . .

It is apparent from Example Query 7 that if you bought Apollo 13, you would save money. If you bought Casablanca or Dirty Harry, you would lose money. Of course, after seeing the results, you might conclude that you would watch these movies more than the average number of times, and it would pay off to buy them anyway.

6 The Date Functions YEAR, MONTH, and DAY

SQL also includes a number of built-in functions. The date data type illustrates the kind of function that computes or extracts a value from a column for each row included in the query. The names of the functions that apply to dates are YEAR, MONTH, and DAY. In a query, these keywords are followed by parentheses containing the name of a date column. The functions extract that part of the date for which they are named.

Suppose you wanted to find the actorID, lastName, and year of birth of all actors. Example Query 8 shows this query.

Example Query 8

SELECT actorID, lastName, YEAR(birthDate)
FROM Actor;

The output of Example Query 8 would start as shown in the Example Query 8 Results.

Example Query 8 Results

actorIDlastNameExpr1002
00001Hanks1956
00002Paxton1955
00003Bacon1958
. . .. . .. . .

Because the component parts of dates have a numeric nature, it is possible to do arithmetic on them. You can subtract dates directly, although you will get results that you may not expect. Suppose you wanted to find the actorID, last-Name, and the difference between the deathDate and the birthDate of all actors where the deathDate is not null. Example Query 9 shows this query.

Example Query 9

SELECT actorID, lastName, deathDate - birthDate
FROM Actor
WHERE deathDate IS NOT NULL;

The output of Example Query 9 would start as shown in the Example Query 9 Results.

Example Query 9 Results

actorIDlastNameExpr1002
00008Bergman25202
00068Blandick29899
00007Bogart21175
. . .. . .. . .

The difference between deathDate and birthDate is an unexpectedly large number because it is expressed in days instead of years. If you wanted to find the difference in years, you could rewrite the query using the YEAR function. Example Query 10 shows the new query.

Example Query 10

SELECT actorID, lastName, YEAR(deathDate) - YEAR(birthDate)
FROM Actor
WHERE deathDate IS NOT NULL;

The output of Example Query 10 would start as shown in the Example Query 10 Results.

Example Query 10 Results

actorIDlastNameExpr1002
00008Bergman69
00068Blandick82
00007Bogart58
. . .. . .. . .

Although the results in years are convenient for review purposes, Example Query 10 is not entirely precise. It does not take into account whether the month and day of a person's death fall before or after the month and day of that individual's birth. The results shown may be off by a year. It would be necessary to write yet another query to solve this problem.

7 The Built-In Function COUNT

The keyword COUNT is a built-in function that makes it possible to find the number of rows that would be retrieved by a query with a given set of conditions. Suppose you wanted to find the number of rows in the Actor table. Example Query 11 shows this query.

Example Query 11

SELECT COUNT(*)
FROM Actor;

The complete output of Example Query 11 is shown in the Example Query 11 Results.

Example Query 11 Results

Expr1000
100

The count is a single number, so the results of Example Query 11 are a table consisting of a single row and a single column. The column heading is an arbitrary expression unless the keyword AS is used in the query to define a specific heading.

In Query 11, the wildcard symbol * does not signify "count all rows." Rather, it signifies that when doing the count, all columns are taken into consideration. A row will be included in the count as long as at least one of the columns contained in it is not null. All rows should meet this condition, and there is no WHERE condition restricting the number of rows included in the query results, so all rows in the table are counted.

Suppose you wanted to find the number of actors whose last names start with the letter "S." Example Query 12 shows this query.

Example Query 12

SELECT COUNT(*)
FROM Actor
WHERE lastName LIKE 'S*';

The complete output of Example Query 12 is shown in the Example Query 12 Results. Example Query 12 counts the number of rows in the Actor table that meets the condition following the WHERE.

Example Query 12 Results

Expr1000
6

The keyword COUNT can also be applied to one or more listed columns rather than to *. COUNT will return a count of all rows where those columns are not null. Suppose you'd like to find the number of actors for whom one column, lastName, is not null. Example Query 13 shows this query.

Example Query 13

SELECT COUNT(lastName)
FROM Actor;

The complete output of Example Query 13 is shown in the Example Query 13 Results.

Example Query 13 Results

Expr1000
99

Notice that it would be redundant to include the condition WHERE lastName IS NOT NULL in Example Query 13.

8 The Built-In Functions MAX, MIN, SUM, and AVG

The MAX, MIN, SUM, and AVG keywords refer to the mathematical concepts of maximum, minimum, sum, and average, respectively. These keywords are the names of built-in functions that can be applied to number columns in tables. To use one of these functions, you put the number column of interest in parentheses after the function name. These functions give values that are based on the contents of more than one row in a table.

Suppose you wanted to find the average DVDPrice of all movies in the Movie table. Example Query 14 shows this query.

Example Query 14

SELECT AVG(DVDPrice)
FROM Movie;

The complete output of Example Query 14 is shown in the Example Query 14 Results.

Example Query 14 Results

Expr1000
$17.99

Suppose you wanted to find the sum of the discountPrice values of all movies in the Movie table. Example Query 15 shows this query.

Example Query 15

SELECT SUM(discountPrice)
FROM Movie;

The complete output of Example Query 15 is shown in the Example Query 15 Results.

Example Query 15 Results

Expr1000
$361.76

As you saw earlier, calculated columns may be defined using built-in functions such as the date functions, because those functions extract a single value for each row in the query. However, built-in functions such as AVG, which group together values of more than one row, do not mix with arithmetic calculations in some implementations of SQL.

For example, suppose you wanted to find the difference between the DVD-Price of each movie and the average DVDPrice of all of the movies. Example Query 16 illustrates this idea.

Example Query 16

SELECT DVDPrice - AVG(DVDPrice)
FROM Movie;

Unfortunately, depending on the system you are using, this query might not work. This example is saved as Example Query 16 in this book's Microsoft Access database, but trying to run it will generate an error message rather than results.

9 The Keywords GROUP BY

Grouping is a powerful capability in SQL. It allows you to generate query results that summarize table information in a way that is similar to a report. Grouping is based on SQL's built-in functions that aggregate the information found in tables.

Grouping has similarities to ordering in its logic. The general plan of a query using GROUP BY is this: From a given table, the query selects a column of interest, column X, for example. The query also selects one of the built-in functions, COUNT, for example. At the end of the query, the keywords GROUP BY are followed by the name of the column of interest, X. Different values for column X may occur more than once in the table. This GROUP BY query will count how many times each different value of X occurs.

In the Movie table, each movie appears only once; however, different movies are produced by the same company, so the companies may appear more than once in the table. Suppose you wanted find out how many movies each company produced. In other words, you want to count how many times each company appears in the Movie table. Example Query 17 shows this query.

Example Query 17

SELECT company, COUNT(*)
FROM Movie
GROUP BY company;

The output of Example Query 17 would end as shown in the Example Query 17 Results. The expression column contains the counts.

Example Query 17 Results

companyExpr1001
. . .. . .
Steve Tisch/Wendy Finerman1
The Malpaso Company2
TriStar Pictures1
Warner Brothers6

Now suppose instead that you wanted to find the average DVDPrice grouped by company. Example Query 18 shows this query.

Example Query 18

SELECT company, AVG(DVDPrice)
FROM Movie
GROUP BY company;

The output of Example Query 18 would end as shown in the Example Query 18 Results. The expression column contains the averages.

Example Query 18 Results

companyExpr1001
. . .. . .
Steve Tisch/Wendy Finerman$14.99
The Malpaso Company$13.49
TriStar Pictures$14.99
Warner Brothers$22.66

A GROUP BY query has to contain a built-in function that aggregates values from different rows. The SELECT operation can include only what you're grouping on and the built-in function. Technically, the SELECT expression doesn't have to include the grouping column—but without it, the meaning of the results may not be clear.

10 GROUP BY and ORDER BY

GROUP BY and ORDER BY can be used in the same query. In the previous example, the query results happened to be ordered by company. You can change the order of the results if you like—for example, ordering them by the value of the results of the built-in function.

Suppose you wanted to find the average DVDPrice grouped by company, presented in ascending order by the average DVDPrice. Example Query 19 illustrates this.

Example Query 19

SELECT company, AVG(DVDPrice)
FROM Movie
GROUP BY company
ORDER BY AVG(DVDPrice);

The output of Example Query 19 would start as shown in the Example Query 19 Results.

Example Query 19 Results

companyExpr1001
CarolCo Pictures$9.99
Imagine Entertainment and Universal Pictures$12.99
Gordon Company$12.99
The Malpaso Company$13.49
. . .. . .

11 The Keyword HAVING

In a simple query, you can limit the number of rows in the results by using the keyword WHERE, which puts a condition on one or more of the columns of the table. In a GROUP BY query, you can use the keyword HAVING to put a condition on those values of the built-in function that will be included in the results. HAVING appears at the end of the query and can use a constant value and the comparison operators (<, <=, =, >=, >, and <>).

Running the previous query would have produced results that included all companies and the average values of their DVDPrice. To save space, only partial results were shown. Suppose, however, that you wanted to see just the results for those companies where the average DVDPrice is less than or equal to $14.00. Example Query 20 illustrates this idea.

Example Query 20

SELECT company, AVG(DVDPrice)
FROM Movie
GROUP BY company HAVING AVG(DVDPrice) <= 14.00
ORDER BY AVG(DVDPrice);

The complete output of Example Query 20 is shown in the Example Query 20 Results.

Example Query 20 Results

companyExpr1001
CarolCo Pictures$9.99
Gordon Company$12.99
Imagine Entertainment and Universal Pictures$12.99
The Malpaso Company$13.49

12 Grouping by More Than One Column

Just as it's possible to order by more than one column at a time, it's also possible to group by more than one column at a time. The different grouping columns are included in the SELECT expression, followed by the built-in function. The different grouping columns are also included after the keywords GROUP BY, separated by commas. Every different combination of values in the grouping columns will give a separate row in the results.

In the example database, different movies might have the same DVDPrice. Among those movies with the same DVDPrice, there may be different values for the discountPrice. Suppose you wanted to find the number of movie DVDs grouped by their DVDPrice and their discountPrice. Example Query 21 shows this query.

Example Query 21

SELECT DVDPrice, discountPrice, COUNT(*)
FROM Movie
GROUP BY DVDPrice, discountPrice;

The output of Example Query 21 would start as shown in the Example Query 21 Results.

Example Query 21 Results

DVDPricediscountPriceExpr1002
$9.99$7.991
$9.99$9.991
$12.99$9.992
$14.99$9.997
$14.99$14.992
. . .. . .. . .

The last two lines of the Example Query 21 Results table illustrate the idea of multiple groups most clearly. There are two groups of movies where the DVDPrice is $14.99. The first group consists of seven movies with a DVDPrice of $14.99 and a discountPrice of $9.99. The second group consists of two movies with a DVDPrice of $14.99 and a discountPrice of $14.99.


Other  
 
Video
Top 10
SG50 Ferrari F12berlinetta : Prancing Horse for Lion City's 50th
The latest Audi TT : New angles for TT
Era of million-dollar luxury cars
Game Review : Hearthstone - Blackrock Mountain
Game Review : Battlefield Hardline
Google Chromecast
Keyboards for Apple iPad Air 2 (part 3) - Logitech Ultrathin Keyboard Cover for iPad Air 2
Keyboards for Apple iPad Air 2 (part 2) - Zagg Slim Book for iPad Air 2
Keyboards for Apple iPad Air 2 (part 1) - Belkin Qode Ultimate Pro Keyboard Case for iPad Air 2
Michael Kors Designs Stylish Tech Products for Women
REVIEW
- First look: Apple Watch

- 3 Tips for Maintaining Your Cell Phone Battery (part 1)

- 3 Tips for Maintaining Your Cell Phone Battery (part 2)
Popular Tags
Video Tutorail Microsoft Access Microsoft Excel Microsoft OneNote Microsoft PowerPoint Microsoft Project Microsoft Visio Microsoft Word Active Directory Exchange Server Sharepoint Sql Server Windows Server 2008 Windows Server 2012 Windows 7 Windows 8 Adobe Flash Professional Dreamweaver Adobe Illustrator Adobe Photoshop CorelDRAW X5 CorelDraw 10 windows Phone 7 windows Phone 8 Iphone