Sybase Technical Library - Product Manuals Home
[Search Forms] [Previous Section with Hits] [Next Section with Hits] [Clear Search] Expand Search

Nested subqueries [Table of Contents] Chapter 8:  Adding, Changing, and Deleting Data

ASA User's Guide

[-] Part 1 Working with Databases
[-] Chapter 7: Using Subqueries
[-] How subqueries work

How subqueries work

Understanding which queries are valid and which ones aren't can be complicated when a query contains a subquery. Similarly, figuring out what a multi-level query does can also be very involved, and it helps to understand how the database server processes subqueries. For general information about processing queries, see Summarizing, Grouping, and Sorting Query Results

Correlated subqueries

In a simple query, the database server evaluates and processes the query's WHERE clause once for each row of the query. Sometimes, though, the subquery returns only one result, making it unnecessary for the database server to evaluate it more than once for the entire result set.

Uncorrelated subqueries

Consider this query:

SELECT name, description 
FROM product 
WHERE quantity <  2 * (
	SELECT avg(quantity) 
	FROM sales_order_items)

In this example, the subquery calculates exactly one value: the average quantity from the sales_order_items table. In evaluating the query, the database server computes this value once, and compares each value in the quantity field of the product table to it to determine whether to select the corresponding row.

Correlated subqueries

When a subquery contains an outer reference, you cannot use this shortcut. For instance, the subquery in the query

SELECT name, description 
FROM product 
WHERE quantity < 2 * (
	SELECT avg(quantity) 
	FROM sales_order_items 
	WHERE product.id=sales_order_items.prod_id)

returns a value dependent upon the active row in the product table. Such subqueries are called correlated subqueries. In these cases, the subquery might return a different value for each row of the outer query, making it necessary for the database server to perform more than one evaluation.

Converting subqueries in the WHERE clause to joins

In general, a query using joins executes faster than a multi-level query. For this reason, whenever possible, the Adaptive Server Anywhere query optimizer converts a multi-level query to a query using joins. The conversion is carried out without any user action. This section describes which subqueries can be converted to joins so you can understand the performace of queries in your database.

Example

The question "When did Mrs. Clarke and Suresh place their orders, and by which sales representatives?" can be written as a two-level query:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id IN (
	SELECT id 
	FROM customer 
	WHERE lname = 'Clarke' OR fname = 'Suresh')

An alternate, and equally correct way to write the query uses joins:

SELECT fname, lname, order_date, sales_rep 
FROM sales_order, customer 
WHERE cust_id=customer.id AND (lname = 'Clarke' OR fname = 'Suresh')

The criteria that must be satisfied in order for a multi-level query to be able to be rewritten with joins differ for the various types of operators. Recall that when a subquery appears in the query's WHERE clause, it is of the form

SELECT select-list

FROM table

WHERE 

	[NOT] expression comparison-operator (subquery) |

	[NOT] expression comparison-operator ANY / SOME (subquery) |

	[NOT] expression comparison-operator ALL (subquery) |

	[NOT] expression IN (subquery) |

	[NOT] EXISTS (subquery)

GROUP BY group-by-expression

HAVING search-condition

Whether a subquery can be converted to a join depends on a number of factors, such as the type of operator and the structures of the query and of the subquery.

Comparison operators

A subquery that follows a comparison operator (=, <>, <, <=, >, >=) must satisfy certain conditions if it is to be converted into a join. Subqueries that follow comparison operators in general are valid only if they return exactly one value for each row of the main query. In addition to this criterion, a subquery is converted to a join only if the subquery

Example

Suppose the request "When were Suresh's products ordered, and by which sales representative?" were phrased as the subquery

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id = (
	SELECT id 
	FROM customer 
	WHERE fname = 'Suresh')

This query satisfies the criteria, and therefore, it would be converted to a query using a join:

SELECT order_date, sales_rep 
FROM sales_order, customer
WHERE cust_id=customer.id AND (lname = 'Clarke' OR fname = 'Suresh')

However, the request, "Find the products whose in-stock quantities are less than double the average ordered quantity" cannot be converted to a join, as the subquery contains the aggregate function avg:

SELECT name, description 
FROM product 
WHERE quantity <  2 * (
	SELECT avg(quantity) 
	FROM sales_order_items)

Quantified comparison test

A subquery that follows one of the keywords ALL, ANY and SOME IS CONVERTED INTO A JOIN ONLY IF IT SATISFIES CERTAIN CRITERIA.

The first four of these conditions are relatively straightforward.

Example

The request "When did Mrs. Clarke and Suresh place their orders, and by which sales representatives?" can be handled in subquery form:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id = ANY (
	SELECT id 
	FROM customer 
	WHERE lname = 'Clarke' OR fname = 'Suresh')

Alternately, it can be phrased in join form

SELECT fname, lname, order_date, sales_rep 
FROM sales_order, customer 
WHERE cust_id=customer.id AND (lname = 'Clarke' OR fname = 'Suresh')

However, request, "When did Mrs. Clarke, Suresh, and any employee who is also a customer, place their orders?" would be phrased as a union query, and thus cannot be converted to a join:

SELECT order_date, sales_rep 
FROM sales_order 
WHERE cust_id = ANY (
	SELECT id 
	FROM customer 
	WHERE lname = 'Clarke' OR fname = 'Suresh'
	UNION 
	SELECT id
	FROM employee)

Similarly, the request "Find the order ID's and customer ID's of those orders that were not placed after all products of order #2001 were shipped," is naturally expressed with a subquery

A subquery with the ALL operator that can be converted to a join

SELECT id, cust_id 
FROM sales_order 
WHERE NOT order_date > ALL (
	SELECT ship_date 
	FROM sales_order_items 
	WHERE id=2001)

It would be converted to the join:

SELECT sales_order.id, cust_id
FROM sales_order, sales_order_items
WHERE (sales_order_items.id=2001) and (order_date <= ship_date)

However, the request "Find the order ID's and customer ID's of those orders not shipped after the first shipping dates of all the products" would be phrased as the aggregate query

SELECT id, cust_id 
FROM sales_order 
WHERE NOT order_date > ALL (
	SELECT first (ship_date)
	FROM sales_order_items )

Therefore, it would not be converted to a join.

Negating subqueries with the ANY and ALL operators

The fifth criterion is a little more puzzling: queries of the form

SELECT select-list

FROM table

WHERE NOT expression comparison-operator ALL (subquery) 

are converted to joins, as are queries of the form

SELECT select-list

FROM tableWHERE expression comparison-operator ANY (subquery) 

but the queries

SELECT select-list

FROM tableWHERE expression comparison-operator ALL (subquery)

and

SELECT select-list

FROM table

WHERE NOT expression comparison-operator ANY (subquery)

are not.

Logical equivalence of ANY and ALL expressions

This is because the first two queries are in fact equivalent, as are the last two. Recall that the any operator is analogous to the OR operator, but with a variable number of arguments; and that the ALL operator is similarly analogous to the AND operator. Just as the expression

NOT ((X > A) AND (X > B))

is equivalent to the expression

(X <= A) OR (X <= B)

the expression

NOT order_date > ALL (
	SELECT first (ship_date) 
	FROM sales_order_items )

is equivalent to the expression

order_date <= ANY (
	SELECT first (ship_date)
	FROM sales_order_items )

Negating the ANY and ALL expressions

In general, the expression

NOT column-name operator ANY (subquery)

is equivalent to the expression

column-name inverse-operator ALL (subquery)

and the expression

NOT column-name operator ALL (subquery)

is equivalent to the expression

column-name inverse-operator ANY (subquery)

where inverse-operator is obtained by negating operator, as shown in the table:

Table of operators and their inverses

The following table lists the inverse of each operator.

Operator

inverse-operator

=

<>

<

=>

>

=<

=<

>

=>

<

<>

=

Set membership test

A query containing a subquery that follows the keyword IN IS CONVERTED INTO A JOIN ONLY IF:

Example

So, the request "Find the names of the employees who are also department heads", expressed by the query:

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id IN (
	SELECT dept_head_id 
	FROM department 
	WHERE (dept_name='Finance' or dept_name = 'Shipping'))

would be converted to a joined query, as it satisfies the conditions. However, the request, "Find the names of the employees who are either department heads or customers" would not be converted to a join if it were expressed by the UNION query

A UNION query following the IN operator can't be converted

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id IN (
	SELECT dept_head_id 
	FROM department 
	WHERE (dept_name='Finance' or dept_name = 'Shipping')
	UNION
	SELECT cust_id
	FROM sales_order)

Similarly, the request "Find the names of employees who are not department heads" is formulated as the negated subquery

SELECT emp_fname, emp_lname
FROM employee 
  WHERE NOT emp_id IN (
	SELECT dept_head_id 
	FROM department 
	WHERE (dept_name='Finance' OR dept_name = 'Shipping'))

and would not be converted.

The conditions that must be fulfilled for a subquery that follows the IN keyword and the ANY keyword to be converted to a join are identical. This is not a coincidence, and the reason for this is that the expression

A query with an IN operator can be converted to one with an ANY operator

WHERE column-name IN (subquery)

is logically equivalent to the expression

WHERE column-name = ANY (subquery)

So the query

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id IN (
	SELECT dept_head_id 
	FROM department 
	WHERE (dept_name='Finance' or dept_name = 'Shipping'))

is equivalent to the query

SELECT emp_fname, emp_lname
FROM employee 
WHERE emp_id = ANY (
	SELECT dept_head_id 
	FROM department 
	WHERE (dept_name='Finance' or dept_name = 'Shipping'))

Conceptually, ASA converts a query with the IN operator to one with an ANY operator, and decides accordingly whether to convert the subquery to a join.

Existence test

A subquery that follows the keyword EXISTS is converted to a join only if it satisfies the following two conditions:

Example

Therefore, the request, "Which customers placed orders after July 13, 1994?", which can be formulated by this query whose non-negated subquery contains the outer reference customer.id = sales_order.cust_id, could be converted to a join.

SELECT fname, lname
FROM customer
WHERE EXISTS (
	SELECT * 
	FROM sales_order 
	WHERE (order_date > '1994-07-13') AND (customer.id = sales_order.cust_id))

The EXISTS keyword essentially tells the database server to check for empty result sets. When using inner joins, the database server automatically displays only the rows where there is data from all of the tables in the FROM clause. So, this query returns the same rows as does the one with the subquery:

SELECT fname, lname
FROM customer, sales_order
WHERE (sales_order.order_date > '1994-07-13') AND (customer.id = sales_order.cust_id).


Nested subqueries [Table of Contents] Chapter 8:  Adding, Changing, and Deleting Data