Correlated subqueries

Correlated subqueries sometimes let to create the query very briefly, which can look more cumbersome when you use other means. We shall remind that correlated subquery contains reference to the query which it contains (we shall name it the main query) with the result that subquery executes for each string of the main query.

Example 8.9.1

It is required to determine the date and the flights of every passenger, committed by them at its last flight day

To say the other way, you have to find the maximum dateof the flight for each passenger and to find all of his flights. With determination of the maximum date there is no problems :

  1. SELECT id_psg, MAX(date)
  2. FROM pass_in_trip
  3. GROUP BY id_psg;

However, there is no flight. If we'll try to include flight in the list of output:

  1. SELECT id_psg, trip_no, MAX(date)
  2. FROM pass_in_trip
  3. GROUP BY id_psg;
we receive an error message, because number of flight is not used in aggregate function and also does not enter into the list of columns of the grouping. If we'll do the last one:

  1. SELECT id_psg, trip_no, MAX(date)
  2. FROM pass_in_trip
  3. GROUP BY id_psg, trip_no;
we receive last flight of the passenger by each flight which he flied. It is completely not that problem which we try to solve. Use of a orrelated subquery

  1. SELECT id_psg, trip_no, [date]
  2. FROM pass_in_trip pt_1
  3. WHERE [date] = (SELECT MAX([date])
  4. FROM pass_in_trip pt_2
  5. WHERE pt_1.id_psg = pt_2.id_psg
  6. );
gives that is necessary for us:

id_psg trip_no date
10 1187 2003-04-14 00:00:00.000
9 1182 2003-04-13 00:00:00.000
8 1187 2003-04-14 00:00:00.000
6 1123 2003-04-08 00:00:00.000
5 1145 2003-04-25 00:00:00.000
3 1145 2003-04-05 00:00:00.000
3 1123 2003-04-05 00:00:00.000
2 1124 2003-04-02 00:00:00.000
1 1100 2003-04-29 00:00:00.000

Here for each flight it is checked, whether it is accomplished last flight day of the given passenger. Thus if such flights there was a several, we receive all of them.

Seeming lack of the resulted solution is just also that the subquery should be calculated for every string of main query. To avoid it, it is possible to offer the alternative solution, which is using joining of the Pass_in_trip table with a subquery resulted at the beginning which calculates the maximal dates for each passenger:

  1. SELECT pt_1.id_psg, trip_no, [date]
  2. FROM pass_in_trip pt_1 JOIN
  3. (SELECT id_psg, MAX([date]) md
  4. FROM pass_in_trip
  5. GROUP BY id_psg
  6. ) pt_2 ON pt_1.id_psg = pt_2.id_psg AND
  7. [date] = md;

However, as it will seem strangely, optimizer of the  A database management system (DBMS) by Microsoft Corporation. SQL(Structured Query Language) is a database computer language designed for the retrieval and management of data in relational database management systems (RDBMS), database schema creation and modification, and database object access control management.SQL Server 2000 will choose for both queries the identical  execution plan at presence of a natural index (cluster) on a primary key. Accordingly, and time of execution of queries should not differ.


It should be reminded that examples given here can be run directly on the website by selecting the check box Without checking on the page with exercises.

Suggested exercises: 58

Bookmark and Share


aggregate functions Airport ALL AND AS keyword ASCII AVG Battles Bezhaev Bismarck C.J.Date calculated columns Cartesian product CASE cast CHAR CHARINDEX Chebykin check constraint classes COALESCE common table expressions comparison predicates Computer firm CONSTRAINT CONVERT correlated subqueries COUNT CROSS APPLY CTE data type conversion data types database schema DATEADD DATEDIFF DATENAME DATEPART DATETIME date_time functions DDL DEFAULT DEFAULT VALUES DELETE DISTINCT DML duplicates edge equi-join EXCEPT exercise (-2) More tags
The book was updated
several days ago
©SQL-EX,2008 [Evolution] [Feedback] [About] [Links] [Team]
All right reserved.