Friday, February 29, 2008

How to write a report on a dimensional model - part3

Today (tonight actually) I'll attempt to make some further clarifications about how crosstabs work.

For simplicity's sake in part2 I mentioned that along the axes of the crosstab we place ``dimensions''.

In fact these are member sets.

A member set is simply a set of members.
When you drag a dimension - Cogno calls it "regular dimension" - what really happens is that it will be coerced into its members. Search the Cognos documentation for Coersion Rules.
The point I'm trying to make is that you are not limited to working with dimensions. You can use 
  • expressions that return a set of members
  • a certain member (e.g. a given month in the Time hierarchy)
  • this member will be implicitly casted to a member set with one element only
  • memberset that is defined by you by picking any set of members you want
  • Search the Cognos documentation for Defining Member Sets
Another clarification about measures.
I think Cognos uses values for measures and these values are evaluated for every cell using tuple() function.
In other words you can have calculated measures by doing some arithmetic on tuples.
A usual scenario is a "smart" average calculation :
average_daily_something:= value(tuple(some_value_measure)) / value(tuple(some_days))

For example you want to get the average daily sales amount for a time period but the store is not open on weekends.

Sales - measure
Time - dimension

on weekends Sales will be equal to 0 but you don't want to count this into your daily average

you could do something like

Add a Workday measure that is 1 for workdays and 0 for weekends.
avg_daily_sales := value(tuple(Sales)) / value(tuple(Workday))

You'll come up with measure calculations that make sense in your business... the point is that in your forulas you can use value(tuple(...)) as operands and then use this "calculated measure" in the crosstab.



Wednesday, February 27, 2008

Thinking out louad about Cognos Framework Manager

I've been thinking about Cognos Framework Manager.
I think it's anoher overhyped thing.

Sure it can store meta-data that tells Cognos how to join tables, how to generate queries.
But wait a sec!

What are the options here?

  1. Relational model
    Do you really want to create a data warehouse from a relational model? Come on!

  2. DMR model
    This would make some sense... if it wasn't such a poorly implemented feature of Cognos that I can only recommend to avoid it if possible.
  3. OLAP cubes
    In this case the framework doesn't do much, just connects to the cube.

So what's the big deal then?

How to write a report on a dimensional model - part2

This is part 2 of my little essay about how to get the values that you want show up on the report where you want them.
The first part  talked about the tuple() function. If it wasn't clear enough, it will be... 
Crosstabs and tuple() are so related it's difficult to explain one without the other.

If you've used pivot tables in Excel then roughly that's what you can expect from a crosstab.
Crosstab is a table with values in the middle and headers along the edges. 
These headers can be 
  • labels for the numbers we see in the middle of the table
  • members of a dimension so that we give some context to those numbers.
Tables are the best way to "flatten" the dimensional data since most of us are not comfortable looking at 5 dimensional Rubik cubes :)

Crosstabs do provide a way for displaying multi ( > 2) dimensional data in a way that we can digest.
I'll provide examples going from the simplest case towards the more difficult.

A crosstab axes contains either a member set or a tuple() expression. (Even when it looks just like a simple measure.) Member sets are "expanded" during runtime - thereby creating multiple rows or columns. tuple() expressions are evaluated for every cell in the row or column.

I will use the same example as in part one:
Sales - measure
Branch, Product, Time - dimensions

1., one dimension - one measure

measure   |
----------+-------
dimension1| <1234>

The crosstab will work much like a simple list. When you think you need only a list use this instead. Lists are not meant for dimensional model.

You may notice that the horizontal axes is left empty. That's OK.

The upper left corner of the crosstab ddisplays the default measure. You can drag a measure there or into the "content" area.
Most people get confused by this and think that the crosstab can only display one measure.
It's not true. As you can see later a crosstab can display as many measures as you want.

The default measure is used to evaluate tuple() expressions where the measure is not specified.
I'll show examples of that later. Just keep it in mind.

e.g.

Sales |
------+-------
Branch| <1234>

"Sales" really becomes tuple(Sales)

Sales                    | 
-------------------------+---------
Richmond AutoMall        | 450,000
-------------------------+---------
North Vancouver AutoMall | 129,000
-------------------------+--------- 
Kingsway Nissan          |  39,000
-------------------------+---------

The dimension is expanded.
The tuple expression is evaluated in the context of the crosstab like this
tuple(Sales, currentMember(Branch), defaultMember(Time), defaultMember(product))

The defaultMember practically means Product and Time resulting in rolled-up values along these dimensions.

2., two dimensions - one measure

measure    | dimension1
-----------+-----------
dimension2 | <1234>

This is the "classical" crosstab.

e.g.

Sales | Time
------+------
Branch| <1234>

Note that "Sales" is still tuple(Sales)
that gets evaluated like
tuple(Sales, currentMember(Branch), currentMember(Time),  defaultMember(Product))

Both dimensions will be expanded resulting in something like this:

Sales                    | 1/1/2008 | ... | 1/31/2008
-------------------------+----------+ ... +----------
Richmond AutoMall        | 20,000            17,000
-------------------------+----------+ ... +----------
North Vancouver AutoMall | 23,000                 0
-------------------------+----------+ ... +----------
Kingsway Nissan          | 18,000            24,500


3., one dimension - many measures

The crosstab can have multiple measures. One of them is the default.

measure1   | measure1 | measure2  ...  measure_n
-----------+----------+---------+ ... +---------
dimension1 | <1234>   | <1234>  | ... | <1234>

In this example measure1 is the default measure.

Having a default measure we can do something like this:
fusion_sales := tuple([Product]->[Ford Fusion])

Sales  | Sales | fusion_sales
-------+-------+-------------
Branch | <1234>| <1234>

Note that the tuple() expression only specified the product dimension.
Cognos will figure out from the context that we are talking about Sales; because Sales is the default measure.
fusion_sales will be evaluated like this:
tuple(Sales, currentMember(Branch), defaultMember(Time), [Product]->[Ford Fusion])

It will give the total of Ford Fusion sales per branch.

The arrow ``->'' is Cognos's notation for specifying a member.
Try dragging in a member from a dimension... you'll see.  

(It's a powerful tool to create "custom measures" so to speak. So we won't have "ford Fusion Sales" as a separate metric in our cube..)

4., three or more dimensions - one measure

When we need more than 2 dimensions we have to nest them into one another.

measure                 | dimension3
------------------------+-----------
dimension1 | dimension2 | <1234>
           +------------+-----------
           | dimension2 | <1234>
-----------+------------+-----------

e.g.

Sales            | Time
-----------------+-------
Branch | Product | <1234>
       +---------+-------
       | Product | <1234>
-------+---------+-------

as the dimensions are expanded all combinations will be created

Sales                                    | 1/1/2008 ...
-----------------------------------------+---------
Richmond Autmoall        | Ford Fusion   | 
                         +---------------+---------
                         | Nissan Maxima |
                         +---------------+---------
                         | Honda Civic   |
-------------------------+---------------+---------
North Vancouver Automall | Ford Fusion   |    *
                         +---------------+---------
                         | Nissan Maxima |
                         +---------------+---------
                         | Honda Civic   |

* Sales of Ford Fusion at North Vancouver Automall on 1st Jan

Note how the nested dimension (Product) is repeatedly expanded for every member of the outer dimension (Branch).

5., many dimensions  - many measures

Using nesting you can go wild!

Measure1                | dimension1          |
                        +---------------------+
                        | measure1 | measure2 |
------------------------+----------------------
dimension2 | dimension3 | <1234>   | <1234>   |
           +------------+----------+-----------
           | dimension3 | <1234>   | <1234>   |
           +------------+----------+----------+                

In the next part I'll show some practical examples and talk about member sets.
Just to wet your apetite... think about not having the entire dimension along the axes, only some specific members you want...

Tuesday, February 5, 2008

Data level security filter - challenge

Cognos lets you put a security filter on the data.
This lets you limit who can see what.

How about a more complex case... when users can see aggregate data of higher levels but are only allowed to see certain members of a lower level.

e.g.
Sales is a measure that is associated with a Branch dimension.

Users from each branch are allowed to see the sales of their own branch and the aggregated total sales accross all branches - ``Branch(ALL)'' as Cognos calls it.
If we put a security filter on the Branch dimension or the Sales measure then we filter out all the data, so the aggregate value will only show one branch.
(At least that's what's gonna happen with a DMR model.)
What should we do?

I couldn't figure out a good answer for this problem other than not having data level security in the model.
The security can be implemented in the reports by enforcing some prompt based filter and limiting the selections based on the user's access etc.
This won't help much if you're willing to use tools like Analysis Studio where the user can directly interact with the model.

Friday, February 1, 2008

How to write a report on a dimensional model - part1

by using crosstabs and the tuple() function of course!

Let me show you how...

First of all: what is a tuple?
Tuple is basically the way to address a value in an n-dimensional array, the OLAP cube.
It's basic form is simple:
tuple(measure, dimension1, dimension2,...)

Tuple is really a coordinate.
To get the value of the cube at this coordinate we need to use the value() function.
value(tuple(measure, dimension1, dimension2,...)
Cognos does this automatically. Search for Coersion Rules in the Report Studio users guide.
It's good and bad... makes things easier, but causes confusion too... anyways.

In order to specify our coordinate we need to specify a member in each and every dimension that applies to the measure.

We don't need to write it all however. Cognos fills in the missing values but it is very important to understand how...
  • measure: defaults to the "default measure" - see upper left corner of crosstab
  • dimension: defaults to the current member of the dimension coming from the context - i.e. if the dimension appears on one of the axes in the crosstab, otherwise defaults to the default member
    The default member is the root "(All)" member. By defaulting to this we are in effect rolling up that dimension. In other words dimensions that are not in the context of the crosstab are rolled up. Makes perfect sense.
If you don't want this default behavior you can just edit the expression and specify exactly what you want. This is very cool, it's the most powerful tool a report writer has.

example
Sale is a measure that is associated with three dimensions:
- Time
- Product
- Branch

to fully specify a sale we need to supply a member of all three dimensions:
Sale(1/23/2008, 'ford fusion', 'Richmond Automall')

tuple(Sale) will add up all the sales
tuple(Sale, 1/23/2008) will add up all the sales on Jan 23 for all products at all branches 
tuple(Sale, 'ford fusion') will add up all the sales where the product was a Ford Fusion 
etc.

We need to understand that the tuple is executed in the context of a crosstab.
Dimensions that are not explicitly mentioned will be searched for among the axes to see if they have a current member.
This gives a whole new meaning to the examples above... 
Taking the last example for instance:

tuple(Sale, 'ford fusion') | Time
---------------------------+------
Branch                     | <1234>

This crosstab has Branch and Time along the axes which will suply the context for the tuple expression. This crosstab will display the Ford Fusion sales broken down per day and per branch.

If we don't want our expression to depend on the context of the crosstab we can use the completeTuple() function.
Every dimension that is not explicitly defined will default to its default member - meaning it will be rolled up.

In the next part I'll write about crosstabs. 
Using tuple() expressons in crosstabs you can do pretty much anything.

And here is the good news. Charts work exactly like crosstabs.
If you can get the data to show up in a crosstab you can create a chart plotting the same data by setting it to the same query.

By the way... forget about list. When you work with a dimensional model you don't need list.

Scope

Scope is a tricky term to understand and in my opinion is not properly done - at least not in DMR.

IMO scope is the lowest level of granularity in a hierarchy for a measure's value.
If we want a value for higher levels then we roll up.

Cool. What about lower levels?
Cognos does not give any value for lower levels.

I think it's wrong. They mix access with granularity. You do not have access to measures that are not in scope. What the heck? You are taking away my access? For what?
I think the value should just be repeated, propagated down to lower levels.

Let me explain by an example.

Let's say we have an Annual Sales Target that is defined for each year. Our Time Dimension has the usual levels: Year, Month, Date
It wouldn't make sense to associate a value of Sales Target with the Date or Month level.
  • every day or month would have the same value
  • it is an annual target, we need to compare it with a year's worth of sales anyways
  • how would we roll it up for a year?
so everything tells us that it should be set in scope for the Year level only.
Right.
Now what if I ask the question: What was the annual sales target on Jan 23, 2008?
Cognos won't supply a value.
Is this question meaningful? We need to roll-up the sales for an entire year before we can do anything meaningful, don't we? So we don't really need a value at the day level... do we?

How about this: for each day I want to calculate the rolling-total of sales for 365 days back in time; and then compare that with the annual sales target. So as the year progresses I'd have some idea if I'm on track. Now that would make sense!

Unfortunately Cognos doesn't work like this.

Moving the Annual Sales target to the lower levels in scope is a bad idea - remember our reasons for putting it at the year level?

Tip:
When you need a value at a lower level than its scope you can use a tuple(), parent() and currentMember() to get the value
e.g. this expression would work at the day level:
tuple([Annual Sales Target], parent(parent(currentMember([Time Dimension]))))

I like Python's philosophy: discourage bad behavior but do not prevent it.
Why are they preventing me from accessing the value from a higher level? Let me decide if it makes sense to use it!