Exercise #30
In this task, data from two tables has to be combined in one result set, having fund receipts and payments for a point occurring on the same day on a single row.
Similar task (29) using the Income_o and Outcome_o tables usually doesn’t cause any difficulties. The following query illustrates the essence of the problem.
Solution 2.1.1
SELECT Income.point, Income.date, SUM(out), SUM(inc)
FROM Income
LEFT JOIN Outcome ON Income.point = Outcome.point
AND Income.date = Outcome.date
GROUP BY Income.point, Income.date
UNION
SELECT Outcome.point, Outcome.date, SUM(out), SUM(inc)
FROM Outcome
LEFT JOIN Income ON Income.point = Outcome.point
AND Income.date = Outcome.date
GROUP BY Outcome.point, Outcome.date;
[[ column ]] |
---|
NULL [[ value ]] |
The idea of this solution lies in joining the receipt registration table with the payment registration table based on the point and date columns. The Left join used here ensures the correct result in case a point only received funds on certain days without making any payments (out is NULL). After that, the result is unioned with an inverted query that uses a left outer join of the payment registration table with the receipt registration table. Thus, the case of a point having payments but no receipts on a certain day is taken into account. Duplicate records (which occur if there’re both receipts and payments) are automatically removed by the UNION operator.
The query 2.1.1 returns incorrect results when there are multiple receipts and payments occur for a point on a single day. Let’s have a An illustrative example for this case is March 24, 2001. Let`s execute a couple of queries:
SELECT *
FROM Income
WHERE date = '2001-03-24 00:00:00.000'
AND point = 1;
[[ column ]] |
---|
NULL [[ value ]] |
SELECT *
FROM Outcome
WHERE date = '2001-03-24 00:00:00.000'
AND point = 1;
[[ column ]] |
---|
NULL [[ value ]] |
We get the following result:
Income
code | point | date | inc |
---|---|---|---|
3 | 1 | 2001-03-24 00:00:00.000 | 3600 |
11 | 1 | 2001-03-24 00:00:00.000 | 3400 |
Outcome
code | point | date | out |
---|---|---|---|
2 | 1 | 2001-03-24 00:00:00.000 | 3663 |
13 | 1 | 2001-03-24 00:00:00.000 | 3500 |
In this case, with both fund receipts and payments present, the outer join is equivalent to an inner join; in other words, each row of one table is combined with each row of the other table with matching date and point. Therefore, we get the following set of rows before group by is executed (shown are the receipts and payments columns only):
inc | out |
---|---|
3600 | 3663 |
3600 | 3500 |
3400 | 3663 |
3400 | 3500 |
After the data is grouped and the sum is calculated, we get a double sum both for receipts and payments. If there were three receipts records, the payments sum would be three times as much, etc.
This has nothing to do with duplicates, since each query combined by the UNION operator returns the same result for this case, so that in the end, there remains just one row for each pair {point, date}.