![]() | ![]() |
Home |
|
|
Transact-SQL User's Guide |
|
| Chapter 5 Subqueries: Using Queries Within Other Queries |
|
| Types of subqueries |
There are two basic types of subqueries:
Expression subqueries are introduced with an unmodified comparison operator, must return a single value, and can be used almost anywhere an expression is allowed in SQL.
Quantified predicate subqueries operate on lists introduced with in or with a comparison operator modified by any or all. Quantified predicate subqueries return 0 or more values. This type is also used as an existence test, introduced with exists.
Subqueries of either type are either noncorrelated or correlated (repeating).
A noncorrelated subquery can be evaluated as if it were an independent query. Conceptually, the results of the subquery are substituted in the main statement, or outer query. This is not how Adaptive Server actually processes statements with subqueries. Noncorrelated subqueries can alternatively be stated as joins and are processed as joins by Adaptive Server.
A correlated subquery cannot be evaluated as an independent query, but can reference columns in a table listed in the from list of the outer query. Correlated subqueries are discussed in detail at the end of this chapter.
Expression subqueries include:
Subqueries in a select list (introduced with in)
Subqueries in a where or having clause connected by a comparison operator (=, !=, >, >=, <, <=)
Expression subqueries take the general form:
[Start of select, insert, update, delete statement or subquery]
where expression comparison_operator (subquery)
[End of select, insert, update, delete statement or subquery]
An expression consists of a subquery or any combination of column names, constants, and functions connected by arithmetic or bitwise operators.
The comparison_operator is one of:
Operator | Meaning |
= | Equal to |
> | Greater than |
< | Less than |
>= | Greater than or equal to |
<= | Less than or equal to |
!= | Not equal to |
<> | Not equal to |
!> | Not greater than |
!< | Not less than |
If you use a column name in the where or having clause of the outer statement, make sure a column name in the subquery_select_list is join compatible with it.
A subquery that is introduced with an unmodified comparison operator (that is, a comparison operator that is not followed by any or all) must resolve to a single value. If such a subquery returns more than one value, Adaptive Server returns an error message.
For example, suppose that each publisher is located in only one city. To find the names of authors who live in the city where Algodata Infosystems is located, write a statement with a subquery that is introduced with the comparison operator =:
select au_lname, au_fname
from authors
where city =
(select city
from publishers
where pub_name = "Algodata Infosystems")au_lname au_fname -------------- -------------- Carson Cheryl Bennet Abraham (2 rows affected)Using scalar aggregate functions to guarantee a single value
Subqueries that are introduced with unmodified comparison operators often include scalar aggregate functions, because these return a single value.
For example, to find the names of books that are priced higher than the current minimum price:
select title
from titles
where price >
(select min(price)
from titles)title
---------------------------------------------------
The Busy Executive's Database Guide
Cooking with Computers: Surreptitious Balance
Sheets
Straight Talk About Computers
Silicon Valley Gastronomic Treats
But Is It User Friendly?
Secrets of Silicon Valley
Computer Phobic and Non-Phobic Individuals:
Behavior Variations
Is Anger the Enemy?
Life Without Fear
Prolonged Data Deprivation: Four Case Studies
Emotional Security: A New Algorithm
Onions, Leeks, and Garlic: Cooking Secrets of the
Mediterranean
Fifty Years in Buckingham Palace Kitchens
Sushi, Anyone?
(14 rows affected)group by and having in
expression subqueriesBecause subqueries that are introduced by unmodified comparison operators must return a single value, they cannot include group by and having clauses unless you know that the group by and having clauses will return a single value.
For example, this query finds the books that are priced higher than the lowest priced book in the trad_cook category:
select title
from titles
where price >
(select min(price)
from titles
group by type
having type = "trad_cook")Using distinct with expression
subqueriesSubqueries that are introduced with unmodified comparison operators often include the distinct keyword to ensure the return of a single value.
For example, without distinct, this subquery would fail because it would return more than one value:
select pub_name from publishers
where pub_id =
(select distinct pub_id
from titles
where pub_id = publishers.pub_id)Quantified predicate subqueries, which return a list of 0 and higher values, are subqueries in a where or having clause that are connected by any, all, in, or exists. The any or all subquery operators modify comparison operators.
There are three types of quantified predicate subqueries:
any/all subqueries. Subqueries introduced with a modified comparison operator, which may include a group by or having clause, take the general form:
[Start of select, insert, update, delete statement; or subquery]
where expression comparison_operator [any | all]
(subquery) [End of select, insert, update, delete statement; or subquery]
in/not in subqueries. Subqueries introduced with in or not in take the general form:
[Start of select, insert, update, delete statement; or subquery]
where expression [not] in (subquery)
[End of select, insert, update, delete statement; or subquery]
exists/not exists subqueries. Subqueries introduced by exists or not exists are existence tests which take the general form:
[Start of select, insert, update, delete statement; or subquery]
where [not] exists (subquery)
[End of select, insert, update, delete statement; or subquery]
Though Adaptive Server allows the keyword distinct in quantified predicate subqueries, it always processes the subquery as if distinct were not included.
Subqueries with any and allThe keywords all and any modify a comparison operator that introduces a subquery.
When any is used with <, >, or = with a subquery, it returns results when any value retrieved in the subquery matches the value in the where or having clause of the outer statement.
When all is used with < or > in a subquery, it returns results when all values retrieved in the subquery match the value in the where or having clause of the outer statement.
The syntax for any and all is:
{where | having} [not]
expression comparison_operator {any | all} (subquery) Using the > comparison operator as an example:
> all means greater than every value, or greater than the maximum value. For example, > all (1, 2, 3) means greater than 3.
> any means greater than at least one value, or greater than the minimum value. Therefore, > any (1, 2, 3) means greater than 1.
If you introduce a subquery with all and a comparison operator does not return any values, the entire query fails.
all and any can be tricky. For example, you might ask "Which books commanded an advance greater than any book published by New Age Books?"
You can paraphrase this question to make its SQL "translation" more clear: "Which books commanded an advance greater than the largest advance paid by New Age Books?" The all keyword, not the any keyword, is required here:
select title
from titles
where advance > all
(select advance
from publishers, titles
where titles.pub_id = publishers.pub_id
and pub_name = "New Age Books") title ---------------------------------------- The Gourmet Microwave (1 row affected)
For each title, the outer query gets the titles and advances from the titles table, and it compares these to the advance amounts paid by New Age Books returned from the subquery. The outer query looks at the largest value in the list and determines whether the title being considered has commanded an even greater advance.
The > all operator means that the value in the column that introduces the subquery must be greater than each of the values returned by the subquery, for a row to satisfy the condition in the outer query.
For example, to find the books that are priced higher than the highest-priced book in the mod_cook category:
select title from titles where price > all
(select price from titles
where type = "mod_cook")title
---------------------------------------------------
But Is It User Friendly?
Secrets of Silicon Valley
Computer Phobic and Non-Phobic Individuals:
Behavior Variations
Onions, Leeks, and Garlic: Cooking Secrets of
the Mediterranean
(4 rows affected)However, if the set returned by the inner query contains a NULL, the query returns 0 rows. This is because NULL stands for "value unknown," and it is impossible to tell whether the value you are comparing is greater than an unknown value.
For example, try to find the books that are priced higher than the highest-priced book in the popular_comp category:
select title from titles where price > all
(select price from titles
where type = "popular_comp")title --------------------------------------------------- (0 rows affected)
No rows were returned because the subquery found that one of the books, Net Etiquette, has a null price.
The = all operator means that the value in the column that introduces the subquery must be the same as each value in the list of values returned by the subquery, for a row to satisfy the outer query.
For example, the following query finds out which authors live in the same city by looking at the postal code:
select au_fname, au_lname, city
from authors
where city = all
(select city
from authors
where postalcode like "946%")> any means that the value in the column that introduces the subquery must be greater than at least one of the values in the list returned by the subquery, for a row to satisfy the outer query.
The following example is introduced with a comparison operator modified by any. It finds each title that has an advance larger than any advance amount paid by New Age Books.
select title
from titles
where advance > any
(select advance
from titles, publishers
where titles.pub_id = publishers.pub_id
and pub_name = "New Age Books") title
---------------------------------------------------
The Busy Executive's Database Guide
Cooking with Computers: Surreptitious Balance
Sheets
You Can Combat Computer Stress!
Straight Talk About Computers
The Gourmet Microwave
But Is It User Friendly?
Secrets of Silicon Valley
Computer Phobic and Non-Phobic Individuals:
Behavior Variations
Is Anger the Enemy?
Life Without Fear
Emotional Security: A New Algorithm
Onions, Leeks, and Garlic: Cooking Secrets of
the Mediterranean
Fifty Years in Buckingham Palace Kitchens
Sushi, Anyone?
(14 rows affected) For each title selected by the outer query, the inner query finds a list of advance amounts paid by New Age Books. The outer query looks at all the values in the list and determines whether the title being considered has commanded an advance that is larger than any of those values. In other words, this example finds titles with advances as large as or larger than the lowest value paid by New Age Books.
If the subquery does not return any values, the entire query fails.
The = any operator is an existence check; it is equivalent to in. For example, to find authors that live in the same city as any publisher, you can use either =any or in:
select au_lname, au_fname
from authors
where city = any
(select city
from publishers) select au_lname, au_fname
from authors
where city in
(select city
from publishers) au_lname au_fname -------------- -------------- Carson Cheryl Bennet Abraham (2 rows affected)
However, the != any operator is different from not in. The != any operator means "not = a or not = b or not = c"; not in means "not = a and not = b and not = c".
For example, to find the authors who live in a city where no publisher is located:
select au_lname, au_fname
from authors
where city != any
(select city
from publishers) The results include all 23 authors. This is because every author lives in some city where no publisher is located, and each author lives in only one city.
The inner query finds all the cities in which publishers are located, and then, for each city, the outer query finds the authors who do not live there.
Here is what happens when you substitute not in in the same query:
select au_lname, au_fname
from authors
where city not in
(select city
from publishers)au_lname au_fname -------------- ------------ White Johnson Green Marjorie O'Leary Michael Straight Dick Smith Meander Dull Ann Gringlesby Burt Locksley Chastity Greene Morningstar Blotchet-Halls Reginald Yokomoto Akiko del Castillo Innes DeFrance Michel Stringer Dirk MacFeather Stearns Karsen Livia Panteley Sylvia Hunter Sheryl McBadden Heather Ringer Anne Ringer Albert (21 rows affected)
These are the results you want. They include all the authors except Cheryl Carson and Abraham Bennet, who live in Berkeley, where Algodata Infosystems is located.
You get the same results if you use !=all, which is equivalent to not in:
select au_lname, au_fname
from authors
where city != all
(select city
from publishers) Subqueries that are introduced with the keyword in return a list of 0 and higher values. For example, this query finds the names of the publishers who have published business books:
select pub_name
from publishers
where pub_id in
(select pub_id
from titles
where type = "business") pub_name ---------------------------------------- New Age Books Algodata Infosystems (2 rows affected)
This statement is evaluated in two steps. The inner query returns the identification numbers of the publishers who have published business books, 1389 and 0736. These values are then substituted in the outer query, which finds the names that go with the identification numbers in the publishers table. The query looks like this:
select pub_name
from publishers
where pub_id in ("1389", "0736") Another way to formulate this query using a subquery is:
select pub_name
from publishers
where "business" in
(select type
from titles
where pub_id = publishers.pub_id) Note that the expression following the where keyword in the outer query can be a constant as well as a column name. You can use other types of expressions, such as combinations of constants and column names.
The preceding queries, like many other subqueries, can be alternatively formulated as a join query:
select distinct pub_name from publishers, titles where publishers.pub_id = titles.pub_id and type = "business"
Both this query and the subquery versions find publishers who have published business books. All are equally correct and produce the same results, though you may need to use the distinct keyword to eliminate duplicates.
However, one advantage of using a join query rather than a subquery for this and similar problems is that a join query shows columns from more than one table in the result. For example, to include the titles of the business books in the result, you would need to use the join version:
select pub_name, title from publishers, titles where publishers.pub_id = titles.pub_id and type = "business"
pub_name title
-------------------- ----------------------------------------
Algodata Infosystems The Busy Executive's Database Guide
Algodata Infosystems Cooking with Computers: Surreptitious
Balance Sheets
New Age Books You Can Combat Computer Stress!
Algodata Infosystems Straight Talk About Computers
(4 rows affected) Here is another example of a statement that can be formulated either with a subquery or a join query: "Find the names of all second authors who live in California and receive less than 30 percent of the royalties on a book." Using a subquery, the statement is:
select au_lname, au_fname
from authors
where state = "CA"
and au_id in
(select au_id
from titleauthor
where royaltyper < 30
and au_ord = 2)au_lname au_fname ------------------------ ------------ MacFeather Stearns (1 row affected)
The outer query produces a list of the 15 authors who live in California. The inner query is then evaluated, producing a list of the IDs of the authors who meet the qualifications.
More than one condition can be included in the where clause of both the inner and the outer query.
Using a join, the query is expressed like this:
select au_lname, au_fname from authors, titleauthor where state = "CA" and authors.au_id = titleauthor.au_id and royaltyper < 30 and au_ord = 2
A join can always be expressed as a subquery. A subquery can often be expressed as a join.
Subqueries that are introduced with the keyword phrase not in also return a list of 0 and higher values. not in means "not = a and not = b and not = c".
This query finds the names of the publishers who have not published business books, the inverse of the example in "Subqueries used with in":
select pub_name from publishers
where pub_id not in
(select pub_id
from titles
where type = "business")pub_name ---------------------------------------- Binnet & Hardley (1 row affected)
The query is the same as the previous one except that not in is substituted for in. However, you cannot convert this statement to a join; the "not equal" join finds the names of publishers who have published some book that is not a business book. The difficulties interpreting the meaning of joins that are not based on equality are discussed in detail in Chapter 4, "Joins: Retrieving Data from Several Tables."
A subquery using not in returns a set of values for each row in the outer query. If the value in the outer query is not in the set returned by the inner query, the not in evaluates to TRUE, and the outer query puts the record being considered in the results.
However, if the set returned by the inner query contains no matching value, but it does contain a NULL, the not in returns UNKNOWN. This is because NULL stands for "value unknown," and it is impossible to tell whether the value you are looking for is in a set containing an unknown value. The outer query discards the row. For example:
select pub_name
from publishers
where $100.00 not in
(select price
from titles
where titles.pub_id = publishers.pub_id)pub_name ------ New Age Books (1 row affected)
New Age Books is the only publisher that does not publish any books that cost $100. Binnet & Handley and Algodata Infosystems were not included in the query results because each publishes a book for which the price is undecided.
Use the exists keyword with a subquery to test for the existence of some result from the subquery:
{where | having} [not] exists (subquery) That is, the where clause of the outer query tests for the existence of the rows returned by the subquery. The subquery does not actually produce any data, but returns a value of TRUE or FALSE.
For example, this query finds the names of all the publishers who publish business books:
select pub_name
from publishers
where exists
(select *
from titles
where pub_id = publishers.pub_id
and type = "business")pub_name ---------------------------------------- New Age Books Algodata Infosystems (2 rows affected)
To conceptualize the resolution of this query, consider each publisher's name in turn. Does this value cause the subquery to return at least one row? In other words, does it cause the existence test to evaluate to TRUE?
In the results of the preceding query, the second publisher's name is Algodata Infosystems, which has an identification number of 1389. Are there any rows in the titles table in which pub_id is 1389 and type is business? If so, "Algodata Infosystems" should be one of the values selected. The same process is repeated for each of the other publisher's names.
A subquery that is introduced with exists is different from other subqueries, in these ways:
The keyword exists is not preceded by a column name, constant, or other expression.
The subquery exists evaluates to TRUE or FALSE rather than returning any data.
The select list of the subquery usually consists of the asterisk (*). There is no need to specify column names, since you are simply testing for the existence or nonexistence of rows that meet the conditions specified in the subquery. Otherwise, the select list rules for a subquery introduced with exists are identical to those for a standard select list.
The exists keyword is very important, because there is often no alternative non-subquery formulation. In practice, a subquery introduced by exists is always a correlated subquery (see "Using correlated subqueries").
Although you cannot express some queries formulated with exists in any other way, you can express all queries that use in or a comparison operator modified by any or all with exists. Some examples of statements using exists and their equivalent alternatives follow.
Here are two ways to find authors that live in the same city as a publisher:
select au_lname, au_fname
from authors
where city = any
(select city
from publishers)select au_lname, au_fname
from authors
where exists
(select *
from publishers
where authors.city = publishers.city)au_lname au_fname -------------- -------------- Carson Cheryl Bennet Abraham (2 rows affected)
Here are two queries that find titles of books published by any publisher located in a city that begins with the letter "B":
select title
from titles
where exists
(select *
from publishers
where pub_id = titles.pub_id
and city like "B%") select title
from titles
where pub_id in
(select pub_id
from publishers
where city like "B%") title
---------------------------------------------------
You Can Combat Computer Stress!
Is Anger the Enemy?
Life Without Fear
Prolonged Data Deprivation: Four Case Studies
Emotional Security: A New Algorithm
The Busy Executive's Database Guide
Cooking with Computers: Surreptitious Balance
Sheets
Straight Talk About Computers
But Is It User Friendly?
Secrets of Silicon Valley
Net Etiquette
(11 rows affected) not exists is just like exists except that the where clause in which it is used is satisfied when no rows are returned by the subquery.
For example, to find the names of publishers who do not publish business books, the query is:
select pub_name
from publishers
where not exists
(select *
from titles
where pub_id = publishers.pub_id
and type = "business")pub_name ---------------------------------------- Binnet & Hardley (1 row affected)
This query finds the titles for which there have been no sales:
select title
from titles
where not exists
(select title_id
from salesdetail
where title_id = titles.title_id) title ----------------------------------------- The Psychology of Computer Cooking Net Etiquette (2 rows affected)
You can use subqueries that are introduced with exists and not exists for two set theory operations: intersection and difference. The intersection of two sets contains all elements that belong to both of the original sets. The difference contains the elements that belong only to the first set.
The intersection of authors and publishers over the city column is the set of cities in which both an author and a publisher are located:
select distinct city from authors where exists (select * from publishers where authors.city = publishers.city)
city -------------------- Berkeley (1 row affected)
The difference between authors and publishers over the city column is the set of cities where an author lives but no publisher is located, that is, all the cities except Berkeley:
select distinct city from authors where not exists (select * from publishers where authors.city = publishers.city)
city -------------------- Gary Covelo Oakland Lawrence San Jose Ann Arbor Corvallis Nashville Palo Alto Rockville Vacaville Menlo Park Walnut Creek San Francisco Salt Lake City (15 rows affected)
|
|