Sunday, August 14, 2011

an interesting SQL query question

A very interesting SQL statement question. We have a 'payment' table:

payment table:

year    salary
2000    1000
2001    2000
2002    3000
2003    4000

The question is: Write a query on payment table so we get the result below:

Query Result:

year    salary
2000    1000
2001    3000
2002    6000
2003    10000

We can find that the result's salary is the sum of the salary of the year and the salary of last year. For example:
In Query Result, 2001's salary is 3000 = Table payment's year 2001's salary 2000 + Table payment's year 2000's salary 1000  

Answer 1:
select year, (select sum(b.salary) from payment b where b.year <= a.year) as salary from payment a;

To expalin this one, better imagin we have two tables payment a, and payment b

payment a:
Year    payment
2000    1000
2001    2000
2002    3000
2003    4000

payment b:
Year    payment
2000    1000
2001    2000
2002    3000
2003    4000

First, MySql tries to retrieve all the year data from table a. For the second column, MySql will select the rows from table b that b.year <= a.year, and then get the SUM of the payment of these rows. The process is like:
For a.2000: b.2000 <= a.2000; SUM(b.1000)
For a.2001: b.2000 <= a.2001; b.2001 <= a.2001; SUM(b.1000,b.2000)
For a.2002: b.2000, b.2001, b.2002 <= a.2002; SUM(b.1000,b.2000,b.3000)
...

Finally, we can get the result:
year    salary
2000    1000
2001    3000
2002    6000
2003    10000

Answer 2:
Answer 1 using sub query is straight forward and easier for people to understand. But when it comes to SQL, sub query is mostly not the best option. We can always write the easier sub query first, and then after we fully understand how the data should be retrieved, we can change it to use JOIN. That is how to use JOIN to do it:

select a.year, sum(b.salary) salary from payment b join payment a on b.year <= a.year group by a.year;

Answer 3 (Here is the real interesting answer):
If we still remember the math we learned at primary school(Well, at least that is being taught in primary school in China), we can find that the salary in payment table is actually an arithmetic sequence (http://en.wikipedia.org/wiki/Arithmetic_series)! For arithmetic sequence, we have this formula: Sn = (A1 + An) * n / 2. So, we can use this query:

select year, (1000+salary)*salary/2000 from payment;

It works perfectly for this specific question.

No comments: