Microsoft Analysis Services & MDX - blogs by Ajit Singh

Business Intelligence domain players–2011 to 2013

2013-06-21T11:59:00.001+05:30

MDX Clients and Servers

2012-03-23T11:49:00.002+05:30

The following is a list of some familiar applications/clients and servers that support MDX in the marketplace.

MDX Clients	MDX Servers
arcplan Edge	Descisys Terasolve
arcplan Enterprise	IBM Cognos TM1
arcplan Excel Analytics	IBM InfoSphere
arcplan Mobile BI	Infor PM OLAP (formerly MIS Alea)

Bissantz DeltaMaster	Kognitio Pablo
IBM Cognos	Microsoft Analysis Services
Jaspersoft	Mondrian
Microsoft Excel	Oracle Database OLAP Option
Microsoft Proclarity	Oracle Essbase
Microsoft Reporting Services	Panoratio
Microstrategy	SAP BusinessObjects Planning and Consolidation
Oracle Interactive Reporting (formerly Brio)	SAP BusinessObjects Profitability and Cost Management
Panorama NovaView	SAP BusinessObjects Strategy Management
Response 42 Query & Report	SAP NetWeaver BW
SAP BusinessObjects Crystal Reports	SAP HANA
SAP BusinessObjects Voyager	SAS OLAP Server
SAP BusinessObjects Analysis, Edition for OLAP (next version of Voyager)	Teradata
Saiku (Formerly PentahoAnalysisTool)

Fact table design for "State Workflow Analysis": Analysis Services Dimensional modeling

2008-07-19T10:46:00.001+05:30

Note: This is a work in progress article and may be updated regularly based on feedback.

Many business analytics needs involve data in the form of a “State Workflow”. A state workflow consists of a set of states and a defined set of valid “State Transitions”, i.e. changes from one state to another. Each state transition is triggered by an event, and only a defined set of events can trigger a transition.

A typical business problem that fits this pattern is “Sales Pipeline” analysis. The typical questions that need to be answered are:

How long did it take to move from one state to another?
How many items moved from one state to another?
How much time was spent in each state during a given period?
What is the count of items in a given state for one period versus another period?
What is the aggregate of a related attribute for a given state?

One typical analysis, as presented in the graph below, shows for each day the "% of Count of All opportunities" in each stage of the sales pipeline.

Or analyze the data using additional dimension such as "by Industry"

Or by "Booked Revenue"

The question is how to design our OLAP system to support this kind of analysis. The trick lies in creating an intelligent fact table that captures many of the details during the ETL process itself, rather than importing the transaction system "AS-IS" and hoping that some OLAP magic wand will answer all of our queries.

Let us understand this with a simple example that needs a similar kind of analysis. With today’s rising gasoline costs, what better example than understanding how I drive my car?

If you think about operating a car, you will immediately think of several states that could be modeled in a workflow. Here are the states that I’ve chosen for this example:

Not Running: In this state, you are in the car but the engine is not running.
Idling: You’ve now started the engine, but you’re not moving.
Moving Forward: The car is moving forward.
Moving in Reverse: The car is moving in reverse.
Done with the Car: You are finished with the car.

There are many other states that you could model, but this list is enough to provide a substantial example. The next step is to identify the events that can occur while you are in each state.

The table below lists the events that are allowed for each state, along with the planned state transitions as each event is handled.

(Well, I still drive a manual gear car, for the sheer pleasure of it)

Let's look at our typical transactional system. Let's say I have an advanced tracking system installed in my car which records the states and events in a file. Typically, the recorded transactional data is very simple and has the following structure:

Now, the typical analytical questions to understand my driving habits are:

On a given day, month or year:

How many miles do I drive in “Moving Forward” + “Moving in Reverse”, or in an individual state?
How many times do I “Apply Brake” while in the “Moving Forward” state?
How many miles do I drive with the “Gear Four” event in the “Moving Forward” state? (the higher the gear, the better the mileage)
How much time is spent in the “Idling” state for the “Traffic Wait” event? (the more time, the more fuel is wasted)
How many times do I trigger the “Start the engine” event from the “Not Running” state?
How many times do I trigger the “Beep Horn” event? (Am I a compulsive honker?)
How many times do I change gears while in the “Moving Forward” state?
What is my average speed per trip?
What is my average speed per trip while in the “Moving Forward” state?
And many more that you can think of.

Answering the above analytical questions from the transactional data presented above is difficult for end users. If a cube is created directly on that transactional data, the MDX queries would be even more difficult. The point is, we want our end users to spend their effort on analysis rather than on arranging the data prior to analysis.

The trick is to set up an ETL system which produces a synthesized fact table for data analysis. Let's take the fact table below, which presents the same transactional data provided above:

From the above synthesized fact table, let's review whether we can get the answers to our typical analytical questions about my driving habits. I think it should answer all of them.

We can create an OLAP cube on top of this fact table and, using aggregation functions, answer the same analytical questions with MDX queries or expose the data to an OLAP client. Well, that is a topic for another day.
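To make the ETL idea concrete, here is a minimal Python sketch of how a raw state/event log could be turned into a synthesized fact table. The log layout (timestamp, odometer, state, event), the sample rows, the "Engage Gear" and "Stop the engine" events and the function names are all my assumptions for illustration; the real tracking file and fact table may well be laid out differently. Each fact row carries the time and miles accumulated in a state before its event fired, so the questions above reduce to simple filtered sums.

```python
from datetime import datetime

# Hypothetical raw log rows: (timestamp, odometer in miles, state, event).
# The event on each row fires while the car is in that row's state.
RAW_LOG = [
    ("2008-07-19 08:00:00", 100.0, "Not Running", "Start the engine"),
    ("2008-07-19 08:01:00", 100.0, "Idling", "Engage Gear"),
    ("2008-07-19 08:03:00", 100.5, "Moving Forward", "Apply Brake"),
    ("2008-07-19 08:10:00", 103.0, "Moving Forward", "Traffic Wait"),
    ("2008-07-19 08:12:00", 103.0, "Idling", "Stop the engine"),
]

TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def synthesize(log):
    """Turn raw state/event records into analysis-ready fact rows."""
    facts = []
    for i, (ts, odometer, state, event) in enumerate(log):
        when = datetime.strptime(ts, TS_FORMAT)
        if i == 0:
            # Nothing is known about the time before the first record.
            seconds, miles = 0.0, 0.0
        else:
            prev_ts, prev_odometer = log[i - 1][0], log[i - 1][1]
            seconds = (when - datetime.strptime(prev_ts, TS_FORMAT)).total_seconds()
            miles = odometer - prev_odometer
        # The state recorded on the following row is the result of this event.
        next_state = log[i + 1][2] if i + 1 < len(log) else None
        facts.append({
            "date": when.date().isoformat(),
            "state": state,
            "event": event,
            "next_state": next_state,
            "seconds": seconds,      # time spent in `state` before `event`
            "miles": miles,          # distance covered in `state` before `event`
            "event_count": 1,        # additive counter for "how many times" questions
        })
    return facts


def total(facts, measure, **filters):
    """Sum one measure over the fact rows matching all the given column filters."""
    return sum(f[measure] for f in facts
               if all(f[key] == value for key, value in filters.items()))


FACTS = synthesize(RAW_LOG)
print(total(FACTS, "miles", state="Moving Forward"))
print(total(FACTS, "seconds", state="Idling"))
print(total(FACTS, "event_count", state="Moving Forward", event="Apply Brake"))
```

The design choice here is to pre-compute durations, distances and the resulting state at load time, so that every measure in the fact table is additive and can be summed by state, event or date.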

Google specialized search for Analysis Services and MDX web resources integrated in my blog

2008-07-14T23:13:00.002+05:30

I have been tracking web pages related to Analysis Services and MDX since 2005. Over these few years, I have collected numerous links, articles, blogs on Analysis Services and Multi-Dimensional Expressions.

Normally, I tried to store the key articles on my computer disk, but there are many resources which I wished I could simply search. Finally, this Sunday afternoon, I extracted all the URLs and created my own Google custom search engine which searches the web sites indexed by me. It turned out to be very cool and most of the time gave very accurate results which were otherwise not possible to get from the generic Google search engine.

E.g., if you search for a very generic term called "Bucket", here is what you would get:

Well, try the same search in the customized search option in my blog and here is the result:

Here you get a bunch of links which deal with "Bucket Analysis" using Multi-Dimensional Expressions (MDX) queries. The result also listed lots of pages from one "Gary Low", whose blog on Analysis Services has the name "Bucket", so I needed to filter out "Gary Low" in my search keywords.

A lot of effort on my part has gone into setting up this custom search specifically for Analysis Services and MDX, and I hope that you too can take advantage of it.

Happy searching!!

NextAnalytics and MDX : Part 1 - Swap Cells with Row Labels

2008-07-12T20:39:00.001+05:30

The NextAnalytics blog post "Can a business intelligence product be used to answer analytic questions?" raised some valid questions about the complexity business users face while analyzing business data. It generated quite a few good responses on how MDX can provide a similar solution. I think the two approaches address different levels of need and both can co-exist.

I just started to look at their online demo site. Having been in the CPM (Corporate Performance Management) industry for about eight years, I can understand the difficulty functional users face while venturing outside the realm of predefined reports and designing their own.

I plan to write a multi-part series of articles on how NextAnalytics solutions can be replicated using MDX. I will use the "Adventure Works DW" OLAP database to replicate similar business cases.

I saw a very interesting case in NextAnalytics where the distinct values of the result grid cells can be swapped with either the rows or the columns as labels. The row or column members then appear in the cells against the swapped values. This is a very important and useful feature for performing basket analysis or finding outliers in a report.

The goal is not to prove which is superior or more complex to use, just the sheer pleasure of validating that MDX is equally capable of fulfilling a similar requirement.

Let me illustrate what is happening in NextAnalytics' "Swap Cells with Row Labels" feature:

1. This is the starting point. The sales data is presented in cross-tab format. (Only a limited portion of the data is displayed, hence the screens might not match.)

2. Now the variation of the sales data for each day and salesperson (column-wise) is calculated and displayed. I guess the formula would be something along the following lines:

Variation = (Sales Amount - Average) / (Standard Deviation for the day)

3. Using "Swap cells to Row Labels", the unique values of the variation are shifted to rows and the row members are loaded in the cells of the corresponding variation value.

I would say that the above feature is simply awesome. I can visually tell who the "Outstanding" sales reps are (variation of more than 3 standard deviations) and who the laggards are (in the above screen, a variation of -1 standard deviation), and whether they are consistent in their performance.
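The variation formula guessed above is essentially a z-score per day. As a rough Python sketch (the sales figures and rep names are invented, and using the sample standard deviation is my assumption, since the NextAnalytics demo does not say which flavour it uses):

```python
from statistics import mean, stdev

# Hypothetical sales amounts for one day, keyed by sales rep.
DAY_SALES = {"Amy": 10.0, "Bob": 20.0, "Cid": 30.0}


def variation(sales):
    """Return (amount - average) / standard deviation for each rep on one day."""
    amounts = list(sales.values())
    avg = mean(amounts)
    sd = stdev(amounts)  # sample standard deviation; an assumption, see above
    return {rep: (amount - avg) / sd for rep, amount in sales.items()}


for rep, value in variation(DAY_SALES).items():
    print(rep, value)
```

A rep whose value is well above zero sold more than the day's average; a negative value means below average.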

Challenge to replicate the same requirement using MDX:

1. Let's get some sample cross-tab data

2. Now, let's calculate the variation. I will convert the variation amounts into range baskets so that we have only a few distinct variation baskets in the cells.

3. Now, we need to do the "Cells to Row Labels" transformation on the above MDX. What I mean is to retain the dates on the columns, put 3, 2, 1, 0, -1, -2, -3 on rows and fill in the corresponding employees in the cells.

The result should be something like the grid below (I have filled in a couple of cells manually for illustration).

The MDX in item #1 and item #2 illustrates the concept and is not necessarily accurate in its calculation. The trick is to use the item #2 MDX to generate the item #3 output.
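Outside of MDX, the "Cells to Row Labels" transformation itself is easy to picture. The following Python sketch is only an illustration of the target output, not a replacement for the MDX challenge: the variation values, dates and employee names are made up, and the basket thresholds (greater than 3 maps to 3, greater than 2 maps to 2, and so on down to -3) are just one possible bucketing. Note that the row labels are derived from the distinct basket values found in the data.

```python
from collections import defaultdict

# Hypothetical variation values keyed by (date, employee).
VARIATION = {
    ("Jan", "Amy"): 1.4, ("Jan", "Bob"): -0.3, ("Jan", "Cid"): -1.1,
    ("Feb", "Amy"): 0.2, ("Feb", "Bob"): 0.9, ("Feb", "Cid"): -1.1,
}


def basket(value):
    """Map a variation to the largest threshold in 3..-3 that it exceeds."""
    for threshold in (3, 2, 1, 0, -1, -2, -3):
        if value > threshold:
            return threshold
    return None  # at or below -3: no basket in this sketch


def swap_cells_with_row_labels(cells):
    """Put distinct baskets on rows, keep dates on columns, list employees in cells."""
    grid = defaultdict(list)
    for (date, employee), value in cells.items():
        grid[(basket(value), date)].append(employee)
    # Distinct basket values found in the data, highest first.
    rows = sorted({b for b, _ in grid if b is not None}, reverse=True)
    # Dates in first-seen order.
    columns = list(dict.fromkeys(date for date, _ in cells))
    return {b: {d: ", ".join(grid.get((b, d), [])) for d in columns} for b in rows}


for label, row in swap_cells_with_row_labels(VARIATION).items():
    print(label, row)
```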

Is anybody up for the challenge to write the third part of the MDX query?

Mosha provided the first cut of the query as below:

WITH
SET Emp AS
    {
      [Employee].[Employee].&[290]
     ,[Employee].[Employee].&[289]
     ,[Employee].[Employee].&[284]
     ,[Employee].[Employee].&[291]
     ,[Employee].[Employee].&[283]
     ,[Employee].[Employee].&[288]
     ,[Employee].[Employee].&[282]
     ,[Employee].[Employee].&[296]
     ,[Employee].[Employee].&[281]
     ,[Employee].[Employee].&[286]
     ,[Employee].[Employee].&[295]
     ,[Employee].[Employee].&[292]
     ,[Employee].[Employee].&[287]
     ,[Employee].[Employee].&[272]
     ,[Employee].[Employee].&[294]
     ,[Employee].[Employee].&[293]
     ,[Employee].[Employee].&[285]
    }
MEMBER sales_avg AS
    Avg
    (
      [Date].[Date].CurrentMember * [Emp]
     ,[Reseller Sales Amount]
    )
MEMBER sales_StDev AS
    StDev
    (
      [Date].[Date].CurrentMember * [Emp]
     ,[Reseller Sales Amount]
    )
MEMBER Sales_Variation AS
    ([Reseller Sales Amount] - sales_avg) / sales_StDev
   ,format_string = "currency"
MEMBER Sales_Variation_Basket AS
    CASE
      WHEN
        Sales_Variation > 3
      THEN 3
      WHEN
        Sales_Variation > 2
      THEN 2
      WHEN
        Sales_Variation > 1
      THEN 1
      WHEN
        Sales_Variation > 0
      THEN 0
      WHEN
        Sales_Variation > -1
      THEN
        -1
      WHEN
        Sales_Variation > -2
      THEN
        -2
      WHEN
        Sales_Variation > -3
      THEN
        -3
    END
MEMBER Measures.[-2] AS
    Generate
    (
      Filter
      (
        Emp
       ,
        Sales_Variation_Basket = -2
      )
     ,
      Employee.Employee.CurrentMember.Name + ","
    )
MEMBER Measures.[-1] AS
    Generate
    (
      Filter
      (
        Emp
       ,
        Sales_Variation_Basket = -1
      )
     ,
      Employee.Employee.CurrentMember.Name + ","
    )
MEMBER Measures.[0] AS
    Generate
    (
      Filter
      (
        Emp
       ,
        Sales_Variation_Basket = 0
      )
     ,
      Employee.Employee.CurrentMember.Name + ","
    )
MEMBER Measures.[1] AS
    Generate
    (
      Filter
      (
        Emp
       ,
        Sales_Variation_Basket = 1
      )
     ,
      Employee.Employee.CurrentMember.Name + ","
    )
MEMBER Measures.[2] AS
    Generate
    (
      Filter
      (
        Emp
       ,
        Sales_Variation_Basket = 2
      )
     ,
      Employee.Employee.CurrentMember.Name + ","
    )
SELECT
{
    [Date].[Date].&[915]
   ,[Date].[Date].&[946]
   ,[Date].[Date].&[975]
   ,[Date].[Date].&[1006]
   ,[Date].[Date].&[1036]
   ,[Date].[Date].&[1067]
} ON COLUMNS
,{
    Measures.[-2]
   ,Measures.[-1]
   ,Measures.[0]
   ,Measures.[1]
   ,Measures.[2]
} ON ROWS
FROM [Adventure Works]

And the output is as desired (partial screenshot):

While the above meets the requirement, there is one significant limitation that we need to overcome.

In the solution, the distinct values of "Sales_Variation_Basket" are manually defined as individual calculated measures (Measures.[-2] through Measures.[2] in the query above; the full range would be Measures.[3] through Measures.[-3]).

We need to automate it. I mean using MDX to read the set of "Sales_Variation_Basket" values, extract and sort the distinct values, put those distinct values on rows, and then put the employees in the result cells.

I am sure we will have a solution soon.

2008-06-28T09:53:00.001+05:30

Design

Cube structure optimization for MDX query performance in Analysis Services 2005 SP2: Tips for Parent Child Hierarchies usage

Fact table design for “State Workflow Analysis”: Analysis Services Dimensional modeling

Handling inter-dimensional members dependency and reducing cube sparsity using reference dimensions in Analysis Services 2005 SP2 : Cube design tip

Identifying intra-dimensional members relationship and reducing cube sparsity in Analysis Services 2005 SP2 : Cube design tip

Leaves() : An example to understand it for both regular hierarchies as well as parent child hierarchies

Aggregation design: useful tips

Level based attribute hierarchy: MDX query performance woes in SQL Server 2005 SP2: Is it fixed in post SP2 hotfix?

Parent child hierarchy to level base hierarchy conversion: hiding placeholder dimension members in client application

Trouble / Troubleshooting

Aggregate(), Sum() functions using calculated members does not work in Analysis Services 2005 SP2 (9.00.3042.00 version) but works in Analysis Services 2000 SP4

Analysis Services 2005 migration tool: Custom member formula issues in migrated database

Cube Partitions: Fact table not listing in Business Intelligence Development Studio in partition wizard

Analysis Services 2005: Many-to-Many relationship does not support unary operators with parent-child dimension

MDX

NextAnalytics and MDX : Part 1 - Swap Cells with Row Labels

Selecting dimension's default member based on a member property

Sorting members on member codes / member properties

Time Dimension: How to set Default Member to Current Month

Setting dynamic default member in dimension X based on the current member of dimensions Y

ADOMD.NET

Code : utility code for converting cellset to a data table

Others

Google specialized search for Analysis Services and MDX web resources integrated in my blog

Art of reading MDX articles

MDX Expression Builder : Need for a tool making it easier for functional users to write MDX expressions, queries.

Art of reading MDX articles

2008-06-25T16:39:00.001+05:30

I learnt much of my MDX by reading articles by knowledgeable people in this industry, Mosha for example. These extremely well written articles are sprinkled with numerous MDX queries, each explaining the nuances of MDX. Earlier, I used to open two windows, one for reading the article and the other for executing MDX queries. Many a time, the continuous copying and pasting distracted me from the subject.

Of late, I have devised a new technique. I copy and paste the complete article into SQL Server Management Studio's MDX query editor and read it there. Whenever I need to execute a query, I just select the query text and execute it. Now I don't have to switch between windows or lose my focus, of course at the cost of text formatting. I wish I could do the same thing in Mosha's MDX Studio but it does not support text wrapping. (I have devised another technique where I copy and paste the article into my Notepad++ text editor, re-wrap the text to the desired width and then paste it.)

It works great for me to read text based articles with lots of MDX in it without getting distracted.

Aggregate(), Sum() functions using calculated members does not work in Analysis Services 2005 SP2 (9.00.3042.00 version) but works in Analysis Services 2000 SP4

2008-06-16T14:25:00.001+05:30

When an aggregate function such as Aggregate() or Sum() is used on a set of calculated members in Analysis Services 2005 SP2 (version 9.00.3042.00), it does not compute the value and returns null instead, whereas the same functions work correctly in Analysis Services 2000 SP4. This seems to be a bug, and I hope that a hotfix will be released soon.

The workaround is not to use these functions on calculated members and instead to calculate the values using arithmetic operators, e.g. instead of

'aggregate( {         [Organization].[Organizations].[Calc5],
                      [Organization].[Organizations].[Calc3],
                      [Organization].[Organizations].[Calc4],
                      [Organization].[Organizations].[Calc6],
                      [Organization].[Organizations].[Calc7]
} )'

define the formula as

'                    [Organization].[Organizations].[Calc5] +
                      [Organization].[Organizations].[Calc3] +
                      [Organization].[Organizations].[Calc4] +
                      [Organization].[Organizations].[Calc6] +
                      [Organization].[Organizations].[Calc7]
'

The issue is detailed in the section below:

The Adventure Works query below returns the correct result for the '[Organization].[Organizations].[SelectMembers]' member:

with member [Organization].[Organizations].[SelectMembers] as 'aggregate(
     {    [Organization].[Organizations].&[5],
                      [Organization].[Organizations].&[3],
                      [Organization].[Organizations].&[4],
                      [Organization].[Organizations].&[6],
                      [Organization].[Organizations].&[7]}
)'
SELECT          {
                      [Date].[Fiscal Year].&[2002],
                      [Date].[Fiscal Year].&[2003],
                      [Date].[Fiscal Year].&[2004]
                }
                *
                {    [Organization].[Organizations].&[5],
                      [Organization].[Organizations].&[3],
                      [Organization].[Organizations].&[4],
                      [Organization].[Organizations].&[6],
                      [Organization].[Organizations].&[7],
                     [Organization].[Organizations].[SelectMembers]}
ON ROWS ,

{[Account].[Accounts].&[101],[Account].[Accounts].&[52]
}
ON COLUMNS
FROM [Finance]

However, when the organization members are converted to calculated members and referenced in the aggregate() formula, the values returned are "Null" for the '[Organization].[Organizations].[SelectMembers]' member.

with
member [Organization].[Organizations].[Calc5] as '[Organization].[Organizations].&[5]'
member [Organization].[Organizations].[Calc3] as '[Organization].[Organizations].&[3]'
member [Organization].[Organizations].[Calc4] as '[Organization].[Organizations].&[4]'
member [Organization].[Organizations].[Calc6] as '[Organization].[Organizations].&[6]'
member [Organization].[Organizations].[Calc7] as '[Organization].[Organizations].&[7]'
member [Organization].[Organizations].[SelectMembers] as 'aggregate(
{                     [Organization].[Organizations].[Calc5],
                      [Organization].[Organizations].[Calc3],
                      [Organization].[Organizations].[Calc4],
                      [Organization].[Organizations].[Calc6],
                      [Organization].[Organizations].[Calc7]
} )'

SELECT          {
                      [Date].[Fiscal Year].&[2002],
                      [Date].[Fiscal Year].&[2003],
                      [Date].[Fiscal Year].&[2004]
                }
                *
                {    [Organization].[Organizations].[Calc5],
                      [Organization].[Organizations].[Calc3],
                      [Organization].[Organizations].[Calc4],
                      [Organization].[Organizations].[Calc6],
                      [Organization].[Organizations].[Calc7],
                      [Organization].[Organizations].[SelectMembers]
}
ON ROWS ,

{[Account].[Accounts].&[101],[Account].[Accounts].&[52]
}
ON COLUMNS
FROM [Finance]

However, a similar query against FoodMart 2000 returns the same result with both approaches for the '[Store].[SelectMembers]' member:

with member [Store].[SelectMembers] as 'Aggregate(
{
[Store].[All Stores].[USA].[CA],
[Store].[All Stores].[USA].[OR],
[Store].[All Stores].[USA].[WA]}
)'
select crossjoin({[Time].[1997],[Time].[1998]} ,
{
[Store].[All Stores].[USA].[CA],
[Store].[All Stores].[USA].[OR],
[Store].[All Stores].[USA].[WA],
[Store].[SelectMembers] }
)
on rows,
{[Account].[All Account].[Net Income].[Net Sales].[Gross Sales],[Account].[All Account].[Net Income].[Net Sales].[Cost of Goods Sold]} on columns
from budget

The query returns the same result for the '[Store].[SelectMembers]' member even though it refers to calculated members:

with
member [Store].[All Stores].[USA].[CalcCA] as '[Store].[All Stores].[USA].[CA]'
member [Store].[All Stores].[USA].[CalcOR] as '[Store].[All Stores].[USA].[OR]'
member [Store].[All Stores].[USA].[CalcWA] as '[Store].[All Stores].[USA].[WA]'
member [Store].[SelectMembers] as 'Aggregate(
{
[Store].[All Stores].[USA].[CalcCA],
[Store].[All Stores].[USA].[CalcOR],
[Store].[All Stores].[USA].[CalcWA]}
)'
select crossjoin({[Time].[1997],[Time].[1998]} ,
{
[Store].[All Stores].[USA].[CalcCA],
[Store].[All Stores].[USA].[CalcOR],
[Store].[All Stores].[USA].[CalcWA],
[Store].[SelectMembers] }
)
on rows,
{[Account].[All Account].[Net Income].[Net Sales].[Gross Sales],[Account].[All Account].[Net Income].[Net Sales].[Cost of Goods Sold]} on columns
from budget

Identifying intra-dimensional members relationship and reducing cube sparsity in Analysis Services 2005 SP2 : Cube design tip

2008-06-06T20:32:00.001+05:30

In my previous blog post, I discussed the approach for "Handling inter-dimensional members dependency and reducing cube sparsity using reference dimensions in Analysis Services 2005 SP2 : Cube design tip". [Note the difference between inter and intra.] That approach addresses the situation where members of one dimension are valid for only a few members of another dimension. A typical example is that for sales accounts, normally only the sales department is applicable, and sales data won't exist for, say, the "vehicle expenses department".

But what if a relation exists between the members of a "single" dimension? During the cube modeling phase, designers often ignore this aspect, and later many of the needed business reports cannot be generated by the MDX reporting clients, so complex MDX queries need to be hand coded.

Let us explain this with a simple business example of a consulting company. The typical chart of accounts of our fictitious consulting company is:

The data for two years is:

The resultant fact table structure is:

The desired report that needs to be generated is:

However, what seems like a simple report cannot be generated by client applications, since they don't understand the relationship that exists between the dimension members. The resulting MDX query would look something like this:

with member Measures.[USD/HR] as '
IIF(Account.currentmember is [Sales - Product 1],[Sales - Product 1]/[Revenue Hours - Product 1],
    IIF(Account.currentmember is [Sales - Product 2],[Sales - Product 2]/[Revenue Hours - Product 2],
        IIF(Account.currentmember is [Sales - Product 3],[Sales - Product 3]/[Revenue Hours - Product 3],
            IIF(Account.currentmember is [Sales - Product 4],[Sales - Product 4]/[Revenue Hours - Product 4],0))))'
select {
[Sales - Product 1],
[Sales - Product 2],
[Sales - Product 3],
[Sales - Product 4]
} on rows,
Crossjoin({[2007],[2008]},{Measures.[Amount],Measures.[USD/HR]}) on columns
from cube

Not a very elegant MDX, huh.

However, we know that the following relations exist among the dimension members:

The key is to relate these dimension members in the cube so that the MDX queries can take advantage of these relations and become both faster and simpler.

In the paragraphs below, we use the following terminology:

Attribute and parent dimension members: attribute dimension members are the dimension members which provide details for the parent dimension members to which they are related. E.g., in the above example, the "Revenue Hours - Product X" members (attribute dimension members) provide the number of hours for the "Sales - Product X" members (parent dimension members).

There are two possible approaches to define the relation between these dimension members:

Approach 1: Define the attribute dimension member as the "measure" of the parent dimension members

Approach 2 : Define the attribute dimension member as the "Attribute" of the parent dimension members

Approach 1: Define the attribute dimension member as the "measure" of the parent dimension members

If we know these relations beforehand, then we can modify the dimension members and load the value of the related dimension member as an additional "Measure" in the fact table at the time of data load.

We can load the same data as below:

The above structure lends itself better to analysis, and end users can now build the desired report using MDX client tools. A representative MDX query would look something like this:

with member Measures.[USD/HR] as 'Measures.[Amount]/Measures.[Hours]'
select {
[Sales - Product 1],
[Sales - Product 2],
[Sales - Product 3],
[Sales - Product 4]
} on rows,
Crossjoin({[2007],[2008]},{Measures.[Amount],Measures.[USD/HR]}) on columns
from cube
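To show what Approach 1 means for the data load, here is a small Python sketch of the reshaping step. The sample rows, the column names and the prefix-matching rule between "Sales - Product X" and "Revenue Hours - Product X" are assumptions made for illustration; a real ETL job would use whatever mapping the chart of accounts actually provides. For brevity the sketch keeps only the sales rows.

```python
# Hypothetical source rows: (year, account member, value).
RAW_FACTS = [
    (2007, "Sales - Product 1", 1000.0),
    (2007, "Revenue Hours - Product 1", 20.0),
    (2008, "Sales - Product 1", 1500.0),
    (2008, "Revenue Hours - Product 1", 25.0),
    (2007, "Sales - Product 2", 800.0),
    (2007, "Revenue Hours - Product 2", 10.0),
]

SALES_PREFIX = "Sales - "
HOURS_PREFIX = "Revenue Hours - "


def fold_hours_into_measures(rows):
    """Load each 'Revenue Hours' value as an extra measure on its 'Sales' row."""
    hours = {
        (year, account[len(HOURS_PREFIX):]): value
        for year, account, value in rows
        if account.startswith(HOURS_PREFIX)
    }
    facts = []
    for year, account, value in rows:
        if account.startswith(SALES_PREFIX):
            product = account[len(SALES_PREFIX):]
            facts.append({
                "year": year,
                "account": account,
                "amount": value,
                "hours": hours.get((year, product)),  # None when no hours were loaded
            })
    return facts


def usd_per_hour(fact):
    """Equivalent of Measures.[Amount] / Measures.[Hours] for one fact row."""
    return fact["amount"] / fact["hours"] if fact["hours"] else None


for fact in fold_hours_into_measures(RAW_FACTS):
    print(fact["year"], fact["account"], usd_per_hour(fact))
```

Once amount and hours sit side by side on the same row, the ratio becomes a plain measure-over-measure calculation, which is why the MDX above no longer needs nested IIF statements.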

Approach 2 : Define the attribute dimension member as the "Attribute" of the parent dimension members

This approach can be taken if the earlier approach of converting attribute dimension members to measures is not feasible. Approach #2 is much less efficient than Approach #1, but it is still more elegant than lots of nested IIF statements.

In this approach, we create a property called "Hours" on the account dimension, and for each parent dimension member we store a reference to its attribute dimension member.

E.g., suppose in this case we store the member keys of the related members as the Hours property of the "Sales - Product X" dimension members. Now the MDX query to generate the same report would be something like this:

with member Measures.[USD/HR] as 'Measures.[Amount]/(StrToMember("[Account].&[" + [Account].Currentmember.properties("Hours") + "]"), Measures.[Amount])'
select {
[Sales - Product 1],
[Sales - Product 2],
[Sales - Product 3],
[Sales - Product 4]
} on rows,
Crossjoin({[2007],[2008]},{Measures.[Amount],Measures.[USD/HR]}) on columns
from cube

Code : utility code for converting cellset to a data table

2008-05-25T23:22:00.001+05:30

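'Assumes Imports System.Data and Imports Microsoft.AnalysisServices,
'with a project reference to the ADOMD.NET client (Microsoft.AnalysisServices.AdomdClient).
Private Function getDataTable(ByVal cs As AdomdClient.CellSet) As DataTable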
        'design the datatable
        Dim dt As New DataTable
        Dim dc As DataColumn
        Dim dr As DataRow

        'add the columns
        dt.Columns.Add(New DataColumn("Description")) 'first column
        'get the other columns from axis
        Dim p As AdomdClient.Position
        Dim name As String
        Dim m As AdomdClient.Member
        For Each p In cs.Axes(0).Positions
            dc = New DataColumn
            name = ""
            For Each m In p.Members
                name = name + m.Caption + " "
            Next
            dc.ColumnName = name
            dt.Columns.Add(dc)
        Next

        'add each row, row label first, then data cells
        Dim y As Integer
        Dim py As AdomdClient.Position
        y = 0
        For Each py In cs.Axes(1).Positions
            dr = dt.NewRow 'create new row

            ' Do the row label
            name = ""
            For Each m In py.Members
                name = name & m.Caption & " "
            Next
            dr(0) = name 'first cell in the row

            ' Data cells
            Dim x As Integer
            For x = 0 To cs.Axes(0).Positions.Count - 1
                dr(x + 1) = cs(x, y).FormattedValue 'other cells in the row
            Next

            dt.Rows.Add(dr) 'add the row
            y = y + 1
        Next

        Return dt
End Function

Handling inter-dimensional members dependency and reducing cube sparsity using reference dimensions in Analysis Services 2005 SP2 : Cube design tip

2008-05-16T12:47:00.001+05:30

Traditionally, the cube structure is designed on either a star or a snowflake schema. The dimensions in the cube are totally independent of each other, and results can be obtained for any member of one dimension against any member of another. This is a classic illusion of an ideal world: even though the intention is good, performance suffers because the cube is very sparse.

High cube sparsity adversely affects MDX query performance. A very sparse cube can take longer to resolve calculated members and calculated cells, and MDX functions involving empty cells, such as CoalesceEmpty or NonEmptyCrossjoin, take slightly longer to process because of the large volume of empty cells that must be considered by such functions.

For example, the following diagram indicates three dimension hierarchies used to construct a cube for tracking orders.

The customer, product and time dimensions have 5 members each. In the above design, 125 cells are theoretically possible between customer, product and time (5 customer X 5 product X 5 time members).

However, in reality, the existence of a real measure for a given dimension often depends on other dimensions. E.g. a sales rep handles only a few territories, a given customer is handled by a few sales reps, a customer purchases a given set of products, and some products are not even sold in some territories. So, up front in the design, we are sure that there will be many dimension combinations for which real data does not exist.

Suppose the customer and product dimensions depend on each other and the following 11 valid combinations exist:

And suppose these 11 combinations have data for all 5 time periods; then the total number of valid combinations is 55 (11 customer-product X 5 time members).

It means that the actual density of the cube is only 44 % (55 real measures / 125 theoretical measures).

This cube is a theoretical example; in reality, many cubes are much more sparse than indicated in this example. E.g. we have assumed that the 11 customer-product combinations have purchase history for all 5 time periods, but if they have on average 2 purchases out of five, then the actual number of cells would drop to 22 (11 customer-product X 2 time periods) and the density would drop further to 17.6% (22 real measures / 125 theoretical measures).
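The density arithmetic above can be checked with a few lines of Python. This is just a worked calculation of the figures in the example (5 members per dimension, 11 valid customer-product pairs); the function name is my own.

```python
def density(populated_cells, *dimension_sizes):
    """Fraction of the theoretical cells (product of dimension sizes) that hold data."""
    theoretical = 1
    for size in dimension_sizes:
        theoretical *= size
    return populated_cells / theoretical


# 11 valid customer-product pairs with data in all 5 periods.
full_history = density(11 * 5, 5, 5, 5)
# The same 11 pairs with data in only 2 of the 5 periods on average.
partial_history = density(11 * 2, 5, 5, 5)

print(f"{full_history:.1%}")
print(f"{partial_history:.1%}")
```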

By themselves, the dimensions do not appear overly sparse; each dimension has members with relevant fact table data. If these dimensions are used together in a cube, however, the sparsity of the cube increases exponentially with each dimension, because the introduction of each dimension exponentially increases the number of cells within the cube. In the above diagram, the shaded cells on the customer-product face indicate the cells that actually contain data.

Dimensions with unrelated data can also greatly increase cube sparsity, especially if the dimensions are included as part of an associative relationship. For example, a business case is designed to compare the sales from retail customers with the sales from vendors, so a cube with three dimensions representing sales, customers, and vendors is created. The Customers dimension organizes retail customers by location, the Vendors dimension organizes vendors by sales region and vendor type, and the Orders dimension organizes order quantities by date. Both the Customers and Vendors dimension share elements with the Orders dimension, but not with each other. Because customers and vendors do not directly relate, from the viewpoint of the underlying data source, the result is a very sparse cube. In this case, it is easier to construct two cubes, one for vendors and one for customers, which share the Orders dimension.

Conversely, if beforehand, we know that dimensions are dependent on each other, then we can reduce the cube sparsity at the cube design level itself. Analysis Services 2000 did not support "Reference Dimensions" but it is something we can utilize in Analysis Services 2005.

We can design an intermediate dimension which contains the valid combinations of customer and product members at leaf level. This intermediate dimension joins with the fact table. The Customer and product dimensions are joined to fact table as a "Reference Dimension" utilizing the intermediate dimension.

The below diagram illustrates the conversion of above star schema to a "Reference-Intermediate" dimension structure. The shaded region is the valid combination of Customer-Product intermediate dimension and the time dimension, i.e. 55 cells and the sparsity is zero.

The intermediate dimension can be made invisible so that, to an external OLAP cube consumer, there is no difference in the cube browsing experience even though the underlying cube structure has changed drastically. MDX queries should also run faster, since the intermediate dimension drastically reduces cube sparsity.
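
To illustrate how the rows of such an intermediate dimension could be derived, here is a minimal Python sketch of an ETL-style step. It assumes the fact rows are available as tuples; the sample data, the function names and the surrogate key scheme are all hypothetical and only show the idea of keying facts by valid customer-product pairs.

```python
def build_intermediate_dimension(fact_rows):
    """Assign a surrogate key to each distinct (customer, product) pair."""
    pairs = sorted(
        {(customer, product) for customer, product, _period, _amount in fact_rows}
    )
    return {pair: key for key, pair in enumerate(pairs, start=1)}


def rekey_facts(fact_rows, intermediate):
    """Replace the customer and product keys with the intermediate key."""
    return [
        (intermediate[(customer, product)], period, amount)
        for customer, product, period, amount in fact_rows
    ]


# Hypothetical fact rows: (customer, product, time period, sales amount).
facts = [
    ("C1", "P1", "2008-01", 100.0),
    ("C1", "P1", "2008-02", 120.0),
    ("C1", "P2", "2008-01", 40.0),
    ("C2", "P3", "2008-02", 75.0),
]

intermediate = build_intermediate_dimension(facts)
rekeyed = rekey_facts(facts, intermediate)
```

Only pairs that actually occur in the fact data get a key, so the intermediate dimension never introduces a customer-product combination that has no data.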

Cube structure optimization for MDX query performance in Analysis Services 2005 SP2: Tips for Parent Child Hierarchies usage

2008-05-15T10:49:00.001+05:30

Note: The findings below were made on parent child (PC) hierarchies. They may or may not apply to level based hierarchies, since I have not tested those. I need to convert the same PC hierarchies to level based ones and retest to find out whether similar issues exist with them too.

The bottom line of this blog: Analysis Services 2005 does not handle rollup operators and custom members on Parent Child hierarchies properly, leading to significant MDX query performance bottlenecks.

There is ample material available on cube optimization to improve MDX query performance. The problem is that most of it gives almost equal weight to all the tips, so we cannot tell which tip would provide the biggest bang for query performance.

If you deal with parent child hierarchies extensively in your cube design, many of the query optimization tips don't even apply, since parent child dimensions are a special case.

Here, based on my personal experience, I have listed the factors in decreasing order of impact:

1. Parent child dimensions combined with rollup operators and custom members are the most lethal combination for query performance (or they kill the cube itself, so that no MDX queries run).

If there are multiple Parent Child hierarchies and we enable rollup operators and custom members on each of them, then even for a very small cube (5-6 PC hierarchies with a few hundred members and a fact table of 50k rows), the cube just hangs and MDX queries don't run. I have validated this in my previous blog. So, disable rollup operators and custom members as much as possible and find alternate ways of dealing with the requirement.

In the profiler, millions of lines are generated with frequent occurrences of "get data from calculation cache" or "query dimension", and I suspect there is some issue in the calculation algorithm in Analysis Services where Parent Child hierarchies with rollup operators and custom members are involved. When I disable all of the rollup operators and custom members, the queries run in a couple of seconds. Even when I retain the custom members and rollup operators on only one PC hierarchy and disable them on the rest of the PC hierarchies, the query performance is good.

2. In MDX query, provide the default members of the parent child hierarchies which are not used in the query.

Per best practices, we would set the "IsAggregatable" property to True for all the attributes. If a PC hierarchy is not needed in the query result, we would normally not specify any member for it in the MDX query, since it is supposed to resolve to its "All" member.

However, I found that if the other PC hierarchies contain lots of calculated members and we don't provide the default members for these unused PC hierarchies in the MDX query, Analysis Services seems to calculate the values of the other custom members even though they are not needed, and the query runs longer. Providing the default member for the other PC hierarchies significantly shortens the query time.

E.g. an MDX query that ran in just under 4 seconds with all the default members in the "where" clause ran seemingly forever when I did not specify the default members for the other PC hierarchies; I had to cancel it after 5 minutes, by which time it had taken all of the available RAM (2 GB) and CPU.
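
As a rough illustration of this tip, the Python sketch below generates a query whose "where" clause pins every unused hierarchy to its default member. The cube, measure and hierarchy names are hypothetical, and the helper functions are my own; only the .DefaultMember member expression is standard MDX.

```python
def slicer_with_defaults(unused_hierarchies):
    """Build a WHERE clause that pins each unused hierarchy to its default member."""
    if not unused_hierarchies:
        return ""
    members = [f"{hierarchy}.DefaultMember" for hierarchy in unused_hierarchies]
    return "WHERE (" + ", ".join(members) + ")"


def build_query(rows_set, measure, cube, unused_hierarchies):
    """Assemble a simple two-axis MDX query with an optional slicer."""
    lines = [
        f"SELECT {{{measure}}} ON COLUMNS,",
        f"{rows_set} ON ROWS",
        f"FROM [{cube}]",
    ]
    slicer = slicer_with_defaults(unused_hierarchies)
    if slicer:
        lines.append(slicer)
    return "\n".join(lines)


# Hypothetical names, for illustration only.
query = build_query(
    rows_set="[Account].[Accounts].Members",
    measure="[Measures].[Amount]",
    cube="Finance",
    unused_hierarchies=["[Department].[Departments]", "[Scenario].[Scenarios]"],
)
print(query)
```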

3. Rollup operators seem to degrade query performance much more than custom members on PC hierarchies.

This was a strange finding. I always thought rollup operators must be faster than custom members for similar calculations.

In a cube, I disabled rollup operators and enabled custom members on the PC hierarchies. The MDX query ran in under 4 seconds. Then I disabled custom members and enabled rollup operators on the same PC hierarchies. Now the same MDX query had not finished even after 30 minutes and I needed to cancel it. I experienced this multiple times, even after I restarted Analysis Services.

4. Conversion of PC hierarchy to level based hierarchy does not necessarily mean better query performance.

It was again a surprise. Many would suggest that level based hierarchies are better than PC hierarchies, but that advice needs to be taken with a pinch of salt. Level based hierarchies are generally better than PC hierarchies. However, certain hierarchies are better represented in a PC fashion only, e.g. a big employee organization hierarchy, or a financial account hierarchy containing the Income Statement, Balance Sheet etc. The members in these hierarchies do not reside in any specific levels. In fact, the hierarchy undergoes rapid changes whenever new members are attached or existing members are deleted or moved under other parents. When this happens, the number of levels changes dramatically and many members shift from one level to another. These hierarchies have many null members in many levels, since leaf members may end at any level, not necessarily the lowest one.

When hierarchies like the above are converted to level based ones and then used in the cube, query time is affected drastically. An MDX query that took about 9 seconds to execute under the PC hierarchy took about 1 min 33 seconds after conversion to a level based hierarchy! This effect was validated multiple times.

PC hierarchies where the levels can be identified easily, or where all the leaf level members have proper parents in the levels above, would improve query performance when converted to level based ones.

5. Maintain "Calculated Members" and delete unwanted ones

It has been observed that the mere presence of calculated members on dimensions affects query performance even if they are not referred to in the MDX query context. We tend to create a lot of junk calculated members, and they just hang around in the hierarchy without any use. Such members need to be removed from the hierarchy regularly.

6. Watch out for slow running MDX functions while creating calculated members

Functions like StrToMember are inherently slow. In words of Mosha, ".....I really dislike StrToMember function for many reasons. For the havoc it wrecks in the query optimizer, for the unpredictable caching guarantees, for the very dynamic binding by means of reparsing its input..."

Slow functions like this drastically amplify their inefficiency if they get evaluated in the context of each and every cell of the resultant cellset, e.g. if a calculated measure uses such a function, it gets evaluated for each and every cell when the calculated measure is on the page level.

Chris Webb, in one of his blog posts, explains how to handle StrToMember properly and avoid its evaluation for multiple cells.

7. Take advantage of "Disable Prefetch Facts=True; cache ratio=1" connection properties

Parent child hierarchies don't store aggregations for intermediate levels. When we execute an MDX query involving a parent child dimension, the Analysis Services formula engine uses a number of heuristics when formulating subcube requests to the storage engine, in an attempt to optimize overall performance. These include fetching data in one subcube request into cache for use by subsequent queries, and ordering the subcube requests for an individual query to optimize the use of cache (by retrieving as much information as necessary to answer a query using as few noncached subcube requests as possible). This may result in excessive, unnecessary subcube calculations. In my experience, when parent child dimensions are involved, using these connection string properties has yielded anything from minor to many-fold improvements in query time.
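
For reference, here is a small Python sketch of how the two properties from this tip could be appended to a client connection string. The server and database names are placeholders and the helper function is hypothetical; the property names and values are the ones quoted in the tip heading.

```python
def build_connection_string(server, database, extra_properties=None):
    """Join connection properties into a 'key=value;key=value' string."""
    properties = {
        "Provider": "MSOLAP",
        "Data Source": server,
        "Initial Catalog": database,
    }
    properties.update(extra_properties or {})
    return ";".join(f"{key}={value}" for key, value in properties.items())


# Properties named in the tip; server and database are placeholders.
tuning = {"Disable Prefetch Facts": "True", "Cache Ratio": "1"}
connection_string = build_connection_string("localhost", "MyOlapDb", tuning)
print(connection_string)
```

Keeping the tuning properties in a separate dictionary makes it easy to run the same query with and without them and compare the timings.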

8. Use effective partitioning strategy

There are many resources available to understand it.

9. Pre-calculate the formulas outside cube to avoid calculations inside the cube

The key requirement is to avoid simple calculations inside cube as much as possible so that cube can be freed up to do complex calculations.

Below are the two prime candidates which can be calculated outside the cube:

1. In the fact table itself as measure, e.g. using calculated column.

2. For calculated members along a dimension, if their values are resolved at leaf level and they then need to roll up like any other member, it is prudent to calculate their formula values outside the cube and load them into the fact table.
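
The following Python sketch shows why such a value belongs in the fact table. It assumes a made-up row-level formula (quantity times unit price) and made-up sample rows; once the result is stored per fact row, a plain sum rolls it up correctly, whereas applying the formula to already aggregated values does not.

```python
def extended_amount(quantity, unit_price):
    """The simple per-row formula to evaluate before loading the fact table."""
    return quantity * unit_price


# Hypothetical fact rows, for illustration only.
rows = [
    {"product": "P1", "quantity": 2, "unit_price": 10.0},
    {"product": "P2", "quantity": 3, "unit_price": 20.0},
]

# Pre-calculate at leaf level, as a calculated column would.
for row in rows:
    row["extended_amount"] = extended_amount(row["quantity"], row["unit_price"])

# The stored column now rolls up with a plain sum, like any other measure.
leaf_first_total = sum(row["extended_amount"] for row in rows)

# Applying the same formula to aggregated inputs gives a different number.
aggregate_first_total = extended_amount(
    sum(row["quantity"] for row in rows),
    sum(row["unit_price"] for row in rows),
)

print(leaf_first_total, aggregate_first_total)  # 80.0 150.0
```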

(The article is work in progress & would be regularly updated. If you have any tips, please share with me to post it here.)

Level based attribute hierarchy: MDX query performance woes in SQL Server 2005 SP2: Is it fixed in post SP2 hotfix?

2008-04-30T12:16:00.001+05:30

The other day, I was working on a level based dimension hierarchy and a fact table with very few records.

I had applied all the dimensional modeling recommendations such as,

1. for each attribute,

disable unary operator and custom rollup formula if not needed.
remove unwanted attributes
isaggregatable property = true
attributehierarchyenabled = false if not needed
attributehierarchyvisible = false if you need to access the attribute but not visible in dimension browser
define attribute relationship

2. In cube, all dimension to cube relationship is "Regular"

3. In cube partition, updated all the statistics (which is evident in above screenshot)

Testing process:

1. I took a data set of five rows at the leaf level of the fact table. In the profiler, the "Query Dimension" event went on and on

Finally the query aborted and a "fatal error" message came:

Analysis Services had taken entire available RAM for this small query:

It seems there is a post SP2 hotfix which sounds like it addresses the above problem. A similar issue was reported in the Analysis Services 2005 forum, from where I got the below information.

938077 (http://support.microsoft.com/kb/938077/)
FIX: The client application stops responding, and the Msmdsrv.exe process uses all the available memory after you perform the filtering operation and the browsing operation against an instance of SQL Server 2005 Analysis Services

In fact, a bunch of problems are addressed in the post SP2 hotfix.

Since these hotfixes are not yet fully regression tested by Microsoft, they are available only on request here.

Once I apply the hotfix, I will check whether the above gets resolved, and update this blog then.

MDX Expression Builder : Need for a tool making it easier for functional users to write MDX expressions, queries.

2008-04-25T11:30:00.001+05:30

MDX is great for business problem analysis. Moving averages, % of parent, YoY growth etc. are business needs and surely MDX can handle them. There is only one hitch though: most business users cannot write it. There are so many tools, including SS Management Studio, which allow me to write complex SQL queries, many a time using just my mouse, but hardly anything that lets me write MDX expressions as flexibly.

To make a start, at least a tool can be provided with the below functionality:

1. The interface would display the dimension hierarchies and an area to create an MDX expression. The area would have two panes: one shows the MDX expression as an AST hierarchy and the other shows the actual MDX script text.

2. The user can write an MDX expression directly in the MDX script text pane and, on toggling to the AST pane, the expression would be shown as an expression tree.

3. The user can drag and drop functions, operators or dimension members onto the AST pane and, on switching back to the MDX script text pane, the MDX query text would be regenerated.

4. Of course, some wizards and templates can be pre-stored for common business expressions, which give the functional user a starting point to use as is or to customize further.

In MDX Studio by Mosha, I can paste an MDX query and it generates a syntax tree, which I find extremely useful for analysing complex MDX statements. The problem is that it is only one way, i.e. MDX query text to tree and not the other way around. Long back, I had seen an evaluation version of a tool named Hungry Dog (if I remember correctly) which used to do this, but I cannot find it anymore.

E.g. in the below screen, the left side provides a tree representation of an MDX expression and the right side a text representation. The tree is easier to understand. Over time, it could be made much more user friendly.

Unless we have something like above, we can not service those super intelligent functional users who understand business need in depth but can not analyze it themselves fully due to lack of a good tool to define MDX expressions.

If anybody knows a good tool which works on Analysis Services, kindly let me know.

MDX parser and generator similar to Abstract Syntax Tree (AST)

2008-04-24T10:53:00.001+05:30

I was looking out for a utility / tool / class which can do the following in Analysis Services 2005:

1. Parse & validate an MDX expression/query and present it in a object hierarchy
2. Vice-versa, create an MDX query / expression from object hierarchy

The above functionality can be used in an application where the complexities of MDX need to be kept away from the users, and it would be helpful in programmatically managing MDX queries.

Per the Analysis Services book by Melomed, Gorbach et al., Analysis Services itself does this when it parses an MDX request. First, it analyzes the MDX statement, finds syntax errors, and produces a tree-based data structure called an abstract syntax tree (AST). An AST has operators, such as MDX functions, as inner nodes, and operands, such as function arguments, as leaf nodes. In the second parsing phase, Analysis Services traverses the AST, resolves the names of the objects referenced in the query, and validates the signatures of the functions. This phase produces an expression tree. Each node on the expression tree is an object that knows how to calculate itself. It also has a hook by which it can plug in to the calculation execution plan.

For example, if you send the below MDX query to Analysis Services, it will produce the AST shown:

A Simple MDX Query

SELECT
    Order(
        Filter(
            [Customer].[Customers].[Country].members,
            [Measures].[Unit Sales].Value > 1000
        ),
        [Measures].[Unit Sales],
        BDESC
    )
    ON COLUMNS
FROM [Warehouse and Sales]

In the first parsing phase, Analysis Services produces an AST.

For the sample query, Analysis Services generates the expression tree as below:

During the second parsing phase, Analysis Services performs the semantic analysis and produces an expression tree.

After creating the expression tree, Analysis Services is ready to move to the query resolution phase.

I was wondering if the above functionality can be exposed to outside applications, so that they can convert MDX to an abstract syntax tree and vice versa.
Mondrian seems to have such class hierarchies exposed which seems to be programmatically exploitable. (http://mondrian.pentaho.org/api/mondrian/mdx/MdxVisitor.html)
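
To make the round trip concrete, here is a toy sketch in Python. It is not the Analysis Services parser and not Mondrian's API; it only handles nested function calls, treats everything else (member references, comparisons, flags such as BDESC) as an opaque text leaf, and assumes member names contain no commas or brackets. All class and function names are my own.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Call:
    """Inner AST node: an MDX function name applied to its arguments."""
    name: str
    args: list = field(default_factory=list)


CALL_PATTERN = re.compile(r"^([A-Za-z_]\w*)\s*\((.*)\)$", re.DOTALL)


def is_balanced(text):
    """True if the bracket depth never goes negative and ends at zero."""
    depth = 0
    for ch in text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def split_arguments(text):
    """Split on commas that sit outside every (), [] and {} pair."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])
    return parts


def parse(text):
    """Turn MDX text into nested Call nodes; anything else stays a string leaf."""
    text = text.strip()
    match = CALL_PATTERN.match(text)
    if match is None or not is_balanced(match.group(2)):
        return text
    name, inner = match.groups()
    if not inner.strip():
        return Call(name, [])
    return Call(name, [parse(arg) for arg in split_arguments(inner)])


def to_mdx(node):
    """Regenerate MDX text from the tree (whitespace is normalized)."""
    if isinstance(node, Call):
        return f"{node.name}({', '.join(to_mdx(arg) for arg in node.args)})"
    return node


expression = """Order(
    Filter(
        [Customer].[Customers].[Country].members,
        [Measures].[Unit Sales].Value > 1000
    ),
    [Measures].[Unit Sales],
    BDESC
)"""
tree = parse(expression)
print(to_mdx(tree))
```

Parsing the regenerated text gives back an equal tree, which is the property a two-pane editor would rely on when toggling between the text view and the tree view.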

If anyone knows an approach for SQL 2005 or earlier, please let me know. If I come to know of something, I will post it here.

Leaves() : An example to understand it for both regular hierarchies as well as parent child hierarchies

2008-04-23T19:56:00.001+05:30

Leaves() Gotcha in Analysis Services 2005 SP2: Be aware of the number of tuples that the MDX script is going to affect.

The other day, I was trying to define a calculation for a member at the leaf level of all the dimensions. After defining the formula and deploying the MDX script, the results were not displayed in the query in SS Management Studio 2005.

When I ran the same query in MDX Sample Application (Analysis Services 2000 utility to run MDX queries), it displayed the error as below:

It seems that the MDX script that I had set up was affecting more than 4.3 billion tuples in the set. The cube I used had 8 dimensions with varying levels. I tried to search whether this is documented, but could not find anything.

Leaves() is an interesting MDX function introduced in SQL 2005. It is intended to be used only inside MDX Scripts to define the scope of the calculations (i.e. in the left-hand side of the assignments).

There are many situations wherein we wish to perform a calculation at the leaf level of a dimension and then aggregate or roll it up like any other member.

Mosha, in his blog explained the use of leaf level for achieving the above objective.

This article explains how we can implement Leaves() in both parent child as well as regular hierarchies using AdventureWorks cube.

Leaves() function use with parent child hierarchy:

My objective was to create a measure 'BudAmount' which is equal to 'Amount' at the leaf level and is then aggregated like the 'Amount' measure across any hierarchy in AdventureWorks' "Financial Reporting" measure group.

I read a post in MSDN forum which was similar to my need. In the post, it was needed to define a calculated measure at leaf level and then aggregate the calculated measure like any normal measure.

Chris Webb, in the answer to the post, suggested that:

"...It's certainly a fairly common problem, but Mosha has indeed blogged about the best way to handle this:

http://www.sqljunkies.com/WebLog/mosha/archive/2005/02/13/7784.aspx

Create a new real (ie not calculated) measure in your cube from a column that is always null in your fact table, and then assign the calculated value to the leaf level of all dimensions. Something like:

(Measures.CM, Leaves()) = iif(Measures.M>0.9, 1, null);"

Based on this newfound knowledge, I started to set up my BudAmount measure.

Below are the steps:

1. In the DSV, changed the FactFinance table to a named query and added a column named 'BudAmount' with value as 0 (zero).

2. In cube, added the measure 'BudAmount' alongside of 'Amount'

3. In mdx script, added the below script:

SCOPE (Leaves(),[Measures].[BudAmount]);
this = [Measures].[Amount];
END SCOPE;

4. One important thing which I had missed earlier, and which Leandro Tubia pointed out at the MSDN Analysis Services forum, is to set the AggregateFunction of BudAmount to be the same as that of Amount, i.e. "ByAccount". Otherwise, for balance sheet account types, the amounts would be summed up rather than taking the last month's values for higher levels of the Time dimension.

5. Full processed the cube (I am yet to understand the appropriate processing to be used, so full processing may not be needed, but just to be on safer side, using full processing)

6. Now, when you browse the cube, the value of BudAmount and Amount is same as desired.

Leaves() function use with regular hierarchies:

In AdventureWorks, I created a measure (per Chris Webb, it needs to be a real measure, not a calculated one) called 'SalesAmount1' on the AdventureWorks cube which provides the same data as the 'Sales Amount' measure. That is, if I query both measures, both should behave the same way and return the same amount at any level of any dimension.

Below are the steps:

1. In the DSV, in 'FactSalesSummary' query, add additional column as 'SalesAmount1' with value as 0.

2. Under 'Sales Summary' measure group, add the 'SalesAmount1' measure. (look for any white space in the name).

3. In each of Sales Summary partition, add the new 'SalesAmount1' column:

4. In the MDX script, define the calculation:

SCOPE (Leaves(),[MEASURES].[SalesAmount1]);
this = [Measures].[Sales Amount];
END SCOPE;

5. Check if the SalesAmount1 measure is same as [Sales Amount]. Yes, it is same.

SELECT
    NON EMPTY
    [Product].[Product Categories].[Category].MEMBERS
    ON ROWS ,
    {   [Measures].[Sales Amount], [Measures].[SalesAmount1]
    }
    ON COLUMNS
    FROM [Adventure Works]

6. Even when the cube is browsed in Management Studio, the values of both measures are the same. So, Leaves() works with regular hierarchies.

Parent Child Attribute performance woes in SQL Server 2005 SP2: A case study

2008-04-22T12:54:00.001+05:30

Objective:

To understand why Parent Child attributes perform so badly vis-a-vis level based hierarchies. Also, the parent child attribute in SQL Server 2005 SP2 performs much worse than in SQL Server 2000 SP4. The total number of fact table rows is very small (100 to be precise).

Note: the needed files to recreate the cubes and database are provided at the end of the blog.

Test Approach:

Three OLAP databases were setup on the same relational database as below:

1. AjitLevel - Level based dimension hierarchies in Analysis Services 2005 SP2 (AS2005SP2)

2. AjitPC - Parent Child attribute hierarchy in AS2005SP2

3. AjitPC2000 - Parent Child dimension in AS2000SP4

The relational database was AjitDB in Sql Server 2005 SP2. The FACT table contained just 100 data rows.

The MDX query, which just queries the 11 children of a parent, was run on all three OLAP databases and the run times were noted. The results of all 3 queries were the same.

The below MDX query was run:

select descendants([Account].&[110]) on rows,
{[Measures].[MTD]} on columns
from repcube3

Test Results:

OLAP database	Query Runtime	Trace Details
AjitLevel	Instantaneous	37 rows
AjitPC	15 minutes on server desktop, 40 minutes on my old homePC	3.8 million rows. The size of trace file was 870 mb!
AjitPC2000	8 seconds on server desktop	Trace not available for AS2000

Screenshots:

MDX query result of Parent Child attribute dimension (note the time as 38 min 50 seconds)

Profiler Trace on Parent Child cube: (3.8 million rows, 870 mb trace file size)

Level based cube: (3 seconds)

Level based cube trace file: (37 rows)

Parent child dimension cube in Analysis Services 2000: (8 seconds query time, same result)

Files needed to recreate the cubes and database:

CreateTables.sql : Script to create the dimension and fact tables

AjitDBData.rar : Script to populate the data in dimension and fact tables

AjitPC.xmla : Script to create AjitPC cube

AjitLevel.xmla : Script to create AjitLevel cube

AjitPC2000.CAB : Archive of AjitPC2000 cube to be restored in Analysis Services 2000

TestQuery.mdx : Simple MDX query used in testing

AjitPCTrace.rar : zipped trace file of the AjitPC cube trace recorded via SQL Server 2005 Profiler. It is 3 MB and becomes 870 MB upon unzipping.

AjitLevelTrace.trc : trace file of AjitLevel cube query execution

Member Properties: getting email address for each manager along with the sales amount

2008-04-16T19:53:00.001+05:30

In the Adventure Works database I would like to return information about the set of Managers at Level 04 (one level of the Parent-Child hierarchy) plus the sum of the sales for that Manager and her reports.

Along with the manager's name, I need to display the manager's email address too.

with member [Measures].[Email] as '[Employee].[Employees].currentmember.properties("email address")'
SELECT
{[Measures].[Email] ,[Reseller Sales Amount]} on 0 ,
NON EMPTY filter(([Employee].[Employees].[Employee Level 04].MEMBERS),[Reseller Sales Amount]>0) ON 1
FROM [Adventure Works];

Microsoft Certification : request for exclusive certification for Analysis Services

2008-04-16T16:33:00.001+05:30

As an OLAP professional, I eat, sleep and drink "Analysis Services". A certification would help in stamping my expertise in this area (and may be enhance my salary too :) )

Analysis Services is a subject area deep enough that it does not leave me any time to practice other Microsoft BI tools such as Reporting Services, Data Mining, Integration Services and so on.

However, if I need to get certified in Analysis Services, there seems to be only one exam available below:

Exam 70-445: TS: Microsoft SQL Server 2005 Business Intelligence—Implementation and Maintenance

The exam covers the following subject areas:

Implementing and Maintaining Microsoft SQL Server 2005 Analysis Services (includes Data Mining too)
Implementing and Maintaining Microsoft SQL Server 2005 Integration Services
Implementing and Maintaining Microsoft SQL Server 2005 Reporting Services

Can Microsoft come up with an exam exclusively focusing on Analysis Services where they test the in-depth knowledge of Analysis Services?

Analysis Services is a pretty in-depth technology that needs the following expertise:

Dimensional, cube, KPI, drill-through, Actions modeling
Business problem solving using analysis services
Underlying relational schema & data source view design
Aggregation and partition design
Write backs
MDX calculations and queries
Security
Performance improvements
.Net stored procedure extensions
Administration
Many more that I might have missed

MDX Troubleshooting: comparing large numbers of MDX resultsets

2008-04-15T09:01:00.001+05:30

Often I need to verify if the MDX results are correct after I fine tune the query or change the approach.

Usually, I copy and paste the cellset results in Excel and then compare them.

Recently, I got hold of a useful post from Chris Webb where he imported the cellset into a SQL table using SSIS and then used the tablediff tool to compare the values. If we could make it an application, that would be great, but until then the approach serves the purpose.

You can read the blog here.
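
As a lighter-weight alternative to the SSIS and tablediff route (and not what Chris Webb's post describes), the comparison can also be sketched in a few lines of Python. The sketch assumes each cellset has already been exported into a dictionary keyed by the member names on the axes; the sample members and values are made up.

```python
def diff_cellsets(before, after, tolerance=1e-9):
    """Return {coordinates: (before_value, after_value)} for every mismatch."""
    differences = {}
    for coords in sorted(set(before) | set(after)):
        old, new = before.get(coords), after.get(coords)
        if old is None or new is None:
            # A missing or empty cell only matches another empty cell.
            if old is not new:
                differences[coords] = (old, new)
        elif abs(old - new) > tolerance:
            differences[coords] = (old, new)
    return differences


# Hypothetical cellsets: (year, category) -> measure value.
before = {
    ("2007", "Bikes"): 80.0,
    ("2008", "Bikes"): 100.0,
    ("2008", "Clothing"): 50.0,
}
after = {
    ("2008", "Bikes"): 100.0,
    ("2008", "Clothing"): 55.0,
}

for coords, (old, new) in diff_cellsets(before, after).items():
    print(coords, old, new)
```

An empty result means the tuned query returns the same cells as the original one, within the given tolerance.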

Aggregation design: useful tips

2008-04-13T12:32:00.001+05:30

Many useful tips can be garnered from the below sources:

1. SSAS 2005 Performance Guide

2. BIDs Helper:

3. Many-to-many design aggregation design

Analysis Services Many-to-Many Dimensions: Query Performance Optimization Techniques

Selecting dimension's default member based on a member property

2008-04-11T09:15:00.001+05:30

Suppose, I need to write an MDX statement that selects a certain month of my Time dimension based on a Month level member property, FLAG_LAST_MONTH, below are the two approaches:

SELECT {StrToMember(Date.CurrentMember.Properties("FLAG_LAST_MONTH"))} on rows,

{} on columns

from MyCube

Based on the property name: FLAG_LAST_MONTH, I assume that it has a value like "1" only for one month. In that case, you could use Filter() to select the desired month member:

SELECT {} on columns,

Filter([Date].[Month].Members,

[Date].CurrentMember.Properties("FLAG_LAST_MONTH") = "1") on rows

from MyCube

If you try to use the below query, you get an error, since Date.CurrentMember.Properties("FLAG_LAST_MONTH") - doesn't return a member,

SELECT {Date.CurrentMember.Properties("FLAG_LAST_MONTH")} on rows,
{Date.Month.Members} on columns
from MyCube

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=911884&SiteID=1

Time Dimension: How to set Default Member to Current Month

2008-04-11T08:38:00.001+05:30

Question:

I want to set current Month as a Default member in my Time Dimension. So that every time i see my data it should display most current data.

Answer:

Here is an example using the Adventure Works cube. You can add this to the cube script for the Adventure Works cube and it will default the day to the current date using the Now() function. I had to use (Now() - 1000) to set the date back to 3/25/2004 due to the fact that the Adventure Works cube Date dimension ends at 8/31/2004, but I think you will get the idea. The other thing to note here is that the [Date].[Date] attribute has a "ValueColumn" defined that is of type "Date". This allows the filter statement to use a straight date vs. date comparison.

-- Now() = 12/19/2006

-- Now() - 1000 = 3/25/2004

ALTER CUBE CURRENTCUBE

UPDATE DIMENSION [Date],

DEFAULT_MEMBER = Tail(Filter([Date].[Date].Members,[Date].[Date].MemberValue < (Now() - 1000)),1)(0);

HTH,

Steve

Steve Pontello

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=1036989&SiteID=1

Setting dynamic default member in dimension X based on the current member of dimensions Y

2008-04-11T08:32:00.001+05:30

Is it possible to define a default member in dimension X based on the current member of dimension Y?

E.g. I have 2 dimensions Cust (Customer) and Ver (version).

Suppose I have the following facts:

Customer 1, version 1

Customer 1, version 2

Customer 2, version 1

Customer 2, version 2

Customer 2, version 3

The default member in the version dimension should be the highest version for this specific Customer.

Customer 1, version 2

Customer 2, version 3

I wrote the following MDX

(TAIL(NONEMPTY({[Ver].[Ver - Ver].children}, [Cust].[Cust - Cust].CurrentMember), 1)).Item(0)

However this returns the following error when I try to browse the cube from BIDS.

DefaultMember(Ver,Ver) (1, 46) The dimension '[Cust]' was not found in the cube when the string, [Cust].[Cust - Cust], was parsed.

When I connect from Excel, the default member is ignored and the ALL level is used.

Answer:

What I would do is to add a record into your version table called "Latest" or "Current" or something like that. Then I would setup this new member as the default member and add a script like the following to the cube.

SCOPE ([Ver].[Ver - Ver].[Current]);

this = Aggregate(EXISTING [Cust].[Cust - Cust].Members

, TAIL(NONEMPTY({[Ver].[Ver - Ver].children} ), 1).Item(0)

);

END SCOPE;

This script finds all of the customers currently in context and then finds the last version for each one and aggregates them all together.

The problem with using .CurrentMember in a default member declaration is that .CurrentMember returns the member currently in context for a given query. The default member is established before any queries take place, so there is no .CurrentMember.
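
The result the question is after (the highest version per customer) can be cross-checked outside the cube. The Python sketch below uses the five sample facts from the question; the function name is hypothetical and the code only verifies the expected outcome, it does not reproduce how the cube script evaluates.

```python
def latest_version_per_customer(facts):
    """Return {customer: highest version seen for that customer}."""
    latest = {}
    for customer, version in facts:
        if customer not in latest or version > latest[customer]:
            latest[customer] = version
    return latest


# (customer, version) pairs from the question.
facts = [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3)]

latest = latest_version_per_customer(facts)
print(latest)  # {1: 2, 2: 3}
```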

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2394139&SiteID=1

Parent child hierarchy to level base hierarchy conversion: hiding placeholder dimension members in client application

2008-04-06T07:18:00.001+05:30

When a parent child hierarchy, which is essentially an unbalanced hierarchy, is converted to a level based hierarchy, it becomes a ragged hierarchy.

In the hierarchy, for many members, the parent members are not present in the immediate above level and we need to put placeholder members as parent in those levels. The HideMemberIf property of a level in a hierarchy is set appropriately to hide these placeholder or missing members from end users.

However, in the client applications, these placeholder members do not show properly as below:

The hierarchy can be displayed properly in the client application by setting the MDX Compatibility property to 2 in the connection string to the instance of Analysis Services; this is required to display ragged hierarchies correctly.

The MDX Compatibility property determines how placeholder members in a ragged or unbalanced hierarchy are treated. If you set the MDX Compatibility property value to 1, you expose a placeholder member in a ragged hierarchy.

Now the same hierarchy is displayed correctly: