loading..
Русский    English
19:34

GROUP BY clause

The GROUP BY clause is used to define the row groups for each of the aggregate functions (COUNT, MIN, MAX, AVG, and SUM) that may be applied. When aggregate functions are used without a GROUP BY clause, all the columns with names mentioned in SELECT clause must be included in the aggregate functions. These functions are then applied to the total set of the rows that fit the query predicate. Otherwise, all columns in the SELECT list not included into the aggregate functions must be listed in the GROUP BY clause. As a result, all the returned query rows distributed into groups are characterized by the same combinations of these column values. Later on, aggregate functions are applied to each group. It is essential that NULL values are considered equal in this case, i.e. when grouping by the column including NULL values all rows will be combined in one group.

When a GROUP BY clause is used without any aggregate function in the SELECT clause, the query will simply return one row from each group. Beside the DISTINCT keyword, this opportunity may be used in eliminating the row duplicates from the result set.

Let us consider a simple example:

Console
Execute
  1. SELECT model, COUNT(model) AS Qty_model, AVG(price) AS Avg_price
  2. FROM PC
  3. GROUP BY model;

The number of computers and their average price are defined for each PC model in the query. All rows with the same model value are combined in a group with value count and the average price calculated for each group thereafter. Executing this query gives the following table:

model    Qty_model    Avg_price
1121    3    850
1232    4    425
1233    3    843,333333333333
1260    1    350

Should the SELECT clause include date column these characteristics may be calculated for each date specified. For that, the date should be added as grouping column with the aggregate functions be calculated for each combination of {model, date}.

There are some particular rules for executing aggregate functions:

  •  If none of the rows was returned by the query (or none of the rows for the given group), the source data for any aggregate function to be calculated is missing. In such case, the COUNT function returns zero, while other functions return NULL.
    This property can give somewhat unexpected results. Let's consider an example:
    Console
    Execute
    1. SELECT 1 a WHERE
    2. EXISTS(SELECT  MAX(price) FROM PC WHERE price<0)

    Subquery in EXISTS predicate returns a single row with NULL as the column value. In view of this fact, query in example returns 1 despite of absent of PCs having negative prices in database.
  •  The argument of the aggregate function cannot include aggregate functions itself (the function of function) i.e. no maximum of average values is obtainable.
  •  The result of the COUNT function is integer. Other aggregate functions inherit the types of processing data.
  •  An error occurs where the result is over the maximal value of the used data type while executing SUM function.

In so doing, if the query does not include GROUP BY clause, the aggregate functions in the SELECT clause process all the result rows of this query. If the query includes the GROUP BY clause, each row set with the same value in the column or the same combination of values in several columns given in the GROUP BY clause forms a group with the aggregate functions being calculated for each group separately.

Suggested exercises: 14, 15, 19, 20, 21, 22, 30, 32, 55, 56, 57, 59, 63, 66, 67, 68, 70, 72, 74, 76, 84, 86, 87, 89, 91, 102, 111, 112, 113, 114, 117, 119, 130

Bookmark and Share
Tags
aggregate functions Airport ALL AND AS keyword ASCII AVG Battles Bezhaev Bismarck C.J.Date calculated columns Cartesian product CASE cast CHAR CHARINDEX Chebykin check constraint classes COALESCE common table expressions comparison predicates Computer firm CONVERT correlated subqueries COUNT CROSS APPLY CTE data type conversion data types database schema date/time functions DATEADD DATEDIFF DATENAME DATEPART DATETIME DDL DEFAULT DEFAULT VALUES DELETE DISTINCT DML duplicates EXCEPT exercise (-2) exercise 19 exercise 23 exercise 32 More tags
The book was updated
month ago
©SQL-EX,2008 [Evolution] [Feedback] [About] [Links] [Team]
All right reserved.
Rambler's Top100