I have to admit that I am totally hooked on the The Rozenshtein Method. My buddy, Richard Campbell showed it to me a year or two ago and I have been hooked ever since. I recently demoed it at TechED in Dallas, a recent WebCast, VSLive in New York and will be showing it off at TechEd in Malaysia next week. I have gotten lots of email and positive feedback so I decided to blog it here.
Here is how it works. You need a crosstab query. You have to move rows into columns. You also need ANSI 92 SQL that will run in any database.Well there are several ways to do this, but the most generic and one of the most powerful ways is called the Rozenshtein Method, which was developed by the Russian mathematician David Rozenshtein. This technique was taken from his book: Optimizing Transact-SQL : Advanced Programming Techniques.
First let’s look at the desired results. We want to take the orders data from Northwind and pivot the sales date (aggregated by month) as columns with the sum of the total sales in the row grouped by customer. It would look something like this:
CompanyName TotalAmount Jan Feb Mar…(etc)
Company1 100 25 33 10
Company2 467 76 62 87
(etc)
The TSQL query to do this is, go ahead and run it in Northwind in SQL Server:
SELECT CompanyName, SUM((UnitPrice*Quantity)) As TotalAmt,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-1)))) AS Jan,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-2)))) AS Feb,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-3)))) AS Mar,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-4)))) AS Apr,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-5)))) AS May,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-6)))) AS Jun,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-7)))) AS Jul,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-8)))) AS Aug,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-9)))) AS Sep,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-10)))) AS Oct,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-11)))) AS Nov,
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-12)))) AS Dec
FROM Customers INNER JOIN
Orders ON Customers.CustomerID = Orders.CustomerID INNER JOIN
[Order Details] ON Orders.OrderID = [Order Details].OrderID
Group By Customers.CompanyName
So how does this work?
This method uses Boolean aggregates, so that each column has a numeric expression that resolves each row as a zero or one and that value (0 or 1) is multiplied by your numeric expression (Like TotalSales or (UnitPrice*Quantity). That is all there is to it, quite simple. But wait, there’s more to explain:
We want to create columns for each Month in our data. To find a month use DatePart. But we need to subtract the DatePart value (1-12) from the amount you’re looking for (1 for Jan, 2 for Feb, etc) as shown here for January:
DatePart(mm,OrderDate)-1
So that true = zero, false > 0 or < 0. For example if the month you were looking for was January and the DatePart was 1 and you subtract 1 from that value you get 0, which is true. If you are looking for March you would get -2 and that would be false.
Next you have to compute the sign of the expression and get the absolute value like so:
ABS(SIGN(DatePart(mm,OrderDate)-1)))
This will give us a positive value. Remember 0 is still true. Now subtract the value computed from 1 in order to get a 0 or 1 from the value of your expression (the Boolean aggregate). The code is:
(1-ABS(SIGN(DatePart(mm,OrderDate)-1))))
For example if you had March return 3 from the Datepart, 3-1=2 and 1-2 =-1. The absolute value is 1. This will always return 0 or 1. If your expression was zero, the value is now one. If was one, the value is zero.
Last step. Taking the SUM of the Boolean values will give you a count of the values that qualify. So you can find out how many sales you made in Jan, Feb, etc. So now multiply the value by the price and quantity, but remember its now one = true. Take a look here:
SUM((UnitPrice*Quantity)*(1-ABS(SIGN(DatePart(mm,OrderDate)-1)))) AS Jan
If its zero, nothing gets added, if its one, you get the value of the sale. The sum of the total expression is the total of sales for the month. If you have a DatePart that is evaluated to 0 then ((UnitPrice*Quantity)*0) is 0 and those results are ignored in the SUM. If you have a month that matches your expression resolves to 1 and ((UnitPrice*Quantity)*1) is the value of the sale.
How easy!
But wait, there’s more! Suppose you wanted two values combined? Compute each value down to zero or one separately. Now you can use AND by multiplying, OR by adding (and reduce to 1 or 0 using SIGN).
Ok, have fun!!!