Far and away the number one Business Intelligence client in the world is Microsoft Excel. While there are tons of data visualization tools out there, Excel is hands down the leader since it is both familiar to users and very powerful. Developers have built tons of business intelligence (BI) apps in Excel using connections to data warehouses (or cubes) and letting the users go crazy with pivot tables and charts.
Things are about to get even better with Project Gemini, now officially named Microsoft SQL Server PowerPivot (we rebel and just call it PowerPivot). PowerPivot is an add-in for Excel 2010 or SharePoint. PowerPivot for Excel is a data analysis tool that allows you to connect to a database, download data and store it in a local data engine (VertiPaq) so you can slice and dice it to your heart’s content using familiar Excel tools such as pivot tables and charts. (You can then send it up to SharePoint if you like, but let’s just focus on Excel for now.) PowerPivot works in-memory using your local PC’s processing power and is optimized to handle millions of rows in memory on cheap commodity PCs. The VertiPaq OLAP engine that is part of PowerPivot compresses and manages those rows for you.
There are many awesome features of PowerPivot, but something I learned reading the team’s blog on data importing was that PowerPivot supports SQL Azure natively. This scenario is great since you can download your SQL Azure data, store it locally and slice and dice offline.
Let’s imagine you have this set up in SQL Azure:
- SQL Azure cloud-based online transaction processing (OLTP) database
- SQL Azure cloud-based online analytical processing (OLAP) data warehouse database
You don’t have a local server on which to set up some Analysis Services cubes, and SQL Azure doesn’t provide that capability. You decide on a hybrid solution and provide a “self-service” distributed OLAP system. Your users, using Excel and PowerPivot, download some of the OLAP data from SQL Azure and use their own hardware to do the number crunching. You may be thinking, “I heard that this VertiPaq engine is great, but how will a laptop handle all of this data/processing?” Remember, if you architect your OLAP database properly, a lot of the “pre-crunching” will already be done, avoiding many segmentations and rollups on the client. I tested about 100 million rows (of old Corzen data) on the “PDC laptop” with 2 GB of RAM and had near instantaneous results. While the cloud is real sexy right now, you can’t perform these types of operations against it without major latency.
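To make the “pre-crunching” idea a bit more concrete, here is a minimal sketch of a pre-aggregated view. It uses Python’s built-in sqlite3 module as a local stand-in for the warehouse database (it does not connect to SQL Azure), and the table, the view name FreightSummary, and the sample rows are all made up for illustration. The point is only that a summary view hands the client far fewer rows than the raw detail table.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders ("
    "OrderID INTEGER PRIMARY KEY, ShipCountry TEXT, ShipCity TEXT, "
    "EmployeeID INTEGER, Freight REAL)"
)

# Hypothetical detail data: 2 cities x 2 employees x 500 orders each.
cities = [("USA", "Seattle"), ("Spain", "Barcelona")]
rows = [
    (country, city, employee_id, 1.5)
    for country, city in cities
    for employee_id in (1, 2)
    for _ in range(500)
]
conn.executemany(
    "INSERT INTO Orders (ShipCountry, ShipCity, EmployeeID, Freight) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# The "pre-crunched" view: one row per country/city/employee.
conn.execute(
    "CREATE VIEW FreightSummary AS "
    "SELECT ShipCountry, ShipCity, EmployeeID, "
    "COUNT(*) AS OrderCount, SUM(Freight) AS TotalFreight "
    "FROM Orders GROUP BY ShipCountry, ShipCity, EmployeeID"
)

detail_rows = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
summary_rows = conn.execute("SELECT COUNT(*) FROM FreightSummary").fetchone()[0]
seattle_emp1 = conn.execute(
    "SELECT TotalFreight FROM FreightSummary "
    "WHERE ShipCity = 'Seattle' AND EmployeeID = 1"
).fetchone()[0]

print(detail_rows, summary_rows, seattle_emp1)
```

Importing the summary view instead of the detail table means the client only has to hold and pivot a handful of rows, at the cost of losing per-order detail, so which one you expose depends on how deep your users need to drill.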
Let’s walk through the basics, just getting some data from SQL Azure into Excel and PowerPivot. First you need Excel 2010 and PowerPivot. You can grab Excel via Microsoft’s web site for free (for now!) since it is in beta and PowerPivot from here. Make sure you download the proper version: 32-bit (x86) or 64-bit (x64).
Once installed, you will see a “PowerPivot” tab in Excel. You can click on the PowerPivot window icon to get started.
The PowerPivot tool is mostly for managing data and connections. You can import data from a variety of data sources, including databases, files, and services (including RESTful ones). To connect and download from SQL Azure you have to choose From Database|From Other Sources from the Home tab on the ribbon.
This will bring you to a list of available data sources; choose Microsoft SQL Azure.
Of course, you will need to log into SQL Azure.
Once logged in, you can download data either from a T-SQL query or just the raw tables and views. I chose to download just the tables from my Northwind database.
There are some cool features like the ability to select one table and then automatically select all related tables. You can also apply filters to your imports (good if you are segmenting the data by user). Once you are finished, PowerPivot will import all of your data for you.
Now that your data is in PowerPivot you can play with it very easily. Remember that the data is located on the client in-memory as part of your workbook. (It is compressed.) PowerPivot gives you a tab for each data table you imported and you can go in and, in typical Excel fashion, sort and filter, like I did for Barcelona (oh the fond memories of 5 TechEds in Barcelona...). All of this sorting and filtering is happening in-memory on your machine, so you can unplug your network cable if you like. You can also refresh your data as you need. PowerPivot gives the developer several ways to refresh data. Some techniques and guidelines are here.
Next you can create some PivotTables, charts, or a combination of the two. Just choose PivotTable from the Home menu and then choose a PivotTable type. I will just make a dirt simple PivotTable to give you an idea of some of the interactivity you can build pretty rapidly. (I am also using the OLTP version of Northwind for this walkthrough, so I did not create any cool queries to work with yet. I’ll do another blog post with some more sophisticated demos after the holidays.)
Here is a very simple, yet super powerful PivotTable that I built in 30 seconds using the PowerPivot tools. Of course I just used the raw Order table and my lookups (ShipVia and Employee) are showing their integer values, but work with me. I have a pivot table where I can view the total amount each customer spent on freight across all of our orders, by country/city, broken out by the employee who took the order. I can also dynamically filter the data set by the Country and further filter it by the ShipVia data set.
This creates a very interactive data analysis “report” for your users. (Or they can create it themselves if they are Excel savvy.) In a few seconds the users can see multiple versions of the data, changing the filters (and sorts) on the fly. All shippers in all cities for all employees. No problem. Shipper ID 2 in USA only. Sure. You get the drift. (What is cool is that the users can share this via PowerPivot for SharePoint if they like!)
Here is what I did:
I dragged ShipCity and CustomerID to the “Row Labels” box to use them as the main data elements in our rows. I dragged Freight into the Values box to sum freight for each CustomerID. (Customers have many orders with different freight costs, so freight is a good item to sum up.) I chose Country as a dynamic filter and broke out the Freight totals by EmployeeID (in the Column Labels box). Lastly, I added a Vertical Slicer that lists all shippers (ShipVia) so I can apply additional filters on top of the current data selections.
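If it helps to see what that PivotTable is computing, here is a rough sketch of the same aggregation in plain Python. This is not how PowerPivot or VertiPaq does it internally; the function name pivot_freight and the sample rows are made up, and I am assuming the Country filter maps to the ShipCountry column of the Northwind Orders table.

```python
from collections import defaultdict

# Hypothetical rows shaped like Northwind's Orders table (values are invented).
orders = [
    {"CustomerID": "ALFKI", "EmployeeID": 1, "ShipVia": 1,
     "ShipCity": "Berlin", "ShipCountry": "Germany", "Freight": 10.0},
    {"CustomerID": "ALFKI", "EmployeeID": 1, "ShipVia": 2,
     "ShipCity": "Berlin", "ShipCountry": "Germany", "Freight": 5.5},
    {"CustomerID": "ALFKI", "EmployeeID": 4, "ShipVia": 2,
     "ShipCity": "Berlin", "ShipCountry": "Germany", "Freight": 20.0},
    {"CustomerID": "GREAL", "EmployeeID": 1, "ShipVia": 2,
     "ShipCity": "Eugene", "ShipCountry": "USA", "Freight": 8.0},
    {"CustomerID": "GREAL", "EmployeeID": 3, "ShipVia": 2,
     "ShipCity": "Eugene", "ShipCountry": "USA", "Freight": 12.0},
    {"CustomerID": "GREAL", "EmployeeID": 3, "ShipVia": 3,
     "ShipCity": "Eugene", "ShipCountry": "USA", "Freight": 4.0},
]


def pivot_freight(rows, country=None, ship_via=None):
    """Sum Freight with (ShipCity, CustomerID) as row labels and
    EmployeeID as column labels.

    country  -- plays the role of the report filter (None means all countries)
    ship_via -- plays the role of the slicer: a set of shipper IDs (None means all)
    """
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if country is not None and row["ShipCountry"] != country:
            continue
        if ship_via is not None and row["ShipVia"] not in ship_via:
            continue
        row_label = (row["ShipCity"], row["CustomerID"])
        table[row_label][row["EmployeeID"]] += row["Freight"]
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {label: dict(columns) for label, columns in table.items()}


everything = pivot_freight(orders)
usa_shipper_2 = pivot_freight(orders, country="USA", ship_via={2})
print(everything)
print(usa_shipper_2)
```

Calling pivot_freight(orders) is the “all shippers in all cities for all employees” view, while passing country="USA" and ship_via={2} corresponds to the “Shipper ID 2 in USA only” selection.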
Pretty easy. Now combine this with some pre-built views on the server and your users can really go to town. All without a persistent connection back up to SQL Azure.