The following script works in Oracle and creates a table - actually a view - to be used as a time dimension.
create or replace view timedimension
as
select
to_date('2009-01-01', 'YYYY-MM-DD') + level - 1 dt
from
dual
connect by
level <= trunc(current_date, 'DDD') - to_date('2009-01-01', 'YYYY-MM-DD') + 1;
/
The view contains one record per day, and the dates run from 2009-01-01 until the present, keeping it as small as possible - which comes in handy when joining it in queries.
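As a sanity check, here is the same one-record-per-day logic sketched in Python (purely illustrative - the real dimension is the Oracle view above, and the function name is mine):

```python
# The view's one-record-per-day logic, sketched in Python for illustration
# (the function name is mine - the real dimension is the Oracle view above).
from datetime import date, timedelta

def time_dimension(start=date(2009, 1, 1), end=None):
    """One date per day from `start` up to and including `end`."""
    end = end or date.today()
    return [start + timedelta(days=n) for n in range((end - start).days + 1)]

print(len(time_dimension(end=date(2009, 1, 5))))  # 5 rows
```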
Tuesday, January 27, 2009
Crosstab Column Headers
Previously I described a technique to add column headers to crosstabs and make them look like lists.
I found that this method sometimes leads to empty cells on the right-hand side of the crosstab - I believe due to a bug in Cognos.
Here I describe another way to do it that works fine.
- Add an empty first row to the crosstab by bringing in Crosstab Spaces: add one for each column on the left-hand side of the crosstab
- Set "Define Contents = Yes" for all the fact cells in this first row
- Unlock the editing (lock icon)
- Drag a Text Item into each member cell and set the text to be the column heading of each column in the top (horizontal) side of the crosstab
- Hide the topmost row of the crosstab by setting Box Type = None
For example, a crosstab that looks like this:
| measure1 | measure2
----------------------------------------+----------+---------
dimension member 1 | dimension member 2 | <1234> | <1234>
+--------------------+----------+---------
| dimension member 2 | <1234> | <1234>
+--------------------+----------+---------
will look like
                                           | measure1 | measure2
-------------------------------------------+----------+---------
    crosstab space    |   crosstab space   | textitem | textitem
      "member1"       |     "member2"      |"measure1"|"measure2"
----------------------+--------------------+----------+---------
  dimension member 1  | dimension member 2 |  <1234>  |  <1234>
                      +--------------------+----------+---------
                      | dimension member 2 |  <1234>  |  <1234>
                      +--------------------+----------+---------
Monday, January 26, 2009
From DMR to "Real" OLAP
Recently I've been working on migrating existing reports from a DMR model to "real" OLAP cubes.
I haven't posted much because I was gaining experience with Cognos PowerPlay Transformer - which I used as the OLAP server. In this article I will share some of my experiences.
Feeding a PowerCube with Data
Cognos offers several ways to set the data source for building a cube. Each is weirder than the last.
It supports CSV files... great. Who wants to do serious BI and build analytical cubes from comma-separated text files?
It supports reports as a data source... also great. I guess it gathers the queries defined in a report... very, very weird.
Anyway... it offers one method which I would call "normal", so let's just forget about the rest: IQD.
IQD stands for Impromptu Query Definition... Impromptu was the name of an earlier Cognos product.
The point is it's SQL with some metadata around it and perfectly readable in a text editor.
Suggestions about IQD
First I used Framework Manager to generate IQDs by "externalizing" query subjects.
I found it unreliable... and it's not needed. Just write the SQL in a database utility and copy-paste into the IQD file.
Look at one IQD file in a text editor, it's a very easy format.
I found that the name of the query doesn't matter.
The columns in the SQL's result set are written after the SQL.
Be careful not to end these lines with commas, as the comma becomes part of the column name - which is quite annoying.
In the long term I think it makes sense to have the cube update script create the IQDs. This way it can optimize the SQL that is run to generate the cube,
e.g. include a time filter in the query and build the cube incrementally.
Otherwise the same SQL would be run every time the cube is built... which means regenerating the cube from scratch. Even if the cube is smart enough (partitioned etc.) the query still returns way too many records.
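To sketch the idea, here is a hypothetical Python update step that writes an IQD with a time filter injected into the SQL. The file layout below (a section header, the SQL, then one line per column) is only a rough rendering of the structure described above - copy the real header from one of your own IQD files rather than trusting this sketch:

```python
# Hypothetical cube-update step that writes an IQD with an incremental
# time filter injected into the SQL. The file layout here is illustrative
# only - copy the real header/footer from one of your own IQD files.
from datetime import date

def write_iqd(path, name, sql, columns):
    """Write a minimal IQD-style file: the SQL, then one line per column."""
    lines = [f"[{name}]", sql]
    # One line per result-set column. No trailing commas - the comma
    # would become part of the column name.
    lines += columns
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# e.g. only rebuild from the start of the current month (incremental load)
since = date.today().replace(day=1)
sql = ("select SaleDate, BranchID, Sales from SalesFact "
       f"where SaleDate >= date '{since}'")
write_iqd("sales.iqd", "SalesQuery", sql, ["SaleDate", "BranchID", "Sales"])
```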
Differences in Dimensional Model between PowerCubes and DMR
In DMR there are multiple dimension hierarchies and conformed dimensions. None of this is present in PowerCubes.
I simply did not find a way to have multiple hierarchies in the same dimension. If someone knows that this is possible please send a comment. IMO it is not supported in PowerCubes.
The lack of conformed dimensions means that if, say, we have a Sales measure and an AdvertisingDollarsSpent measure and we store them in two different cubes, and both have a Branch dimension, then Cognos has no idea that "branch A" in one cube is the same thing as "branch A" in another cube. It doesn't even know that January 1st is the same day in both...
Time Dimension
Even though Transformer supports a time dimension and can automatically generate the members (days) in it, this doesn't always work.
It doesn't work for the topmost level (all days)... so if you are not using years, months and weeks - because, like me, you are using a custom calendar - then it won't work.
This sucks because it means you need a utility table in the database that contains a record for each and every day.
This can be nicely done by a well-crafted view, probably... if someone can share a script I'd appreciate it.
Dealing with Multiple Cubes from Reports
All this leads to the fact that writing reports is a lot harder. If you want to work with more than one cube, things get more tricky...
And why wouldn't you?
Of course you can always group all measures used in a report into a single cube and problem solved... but I think it's bad practice. I shouldn't have to do that. Why adjust the datasource to the report?
Also... the cube update process can be parallelized if there are multiple simpler cubes rather than one giant cube.
So when dealing with multiple cubes, the problem is the filter page. You need to build the prompt on one dimension, and that prompt won't work on another cube because it uses a different dimension (because, as I just said, dimensions are not conformed).
There is a way around it. If the dimensions look the same in all cubes you can just do some string replacement on the MUNs returned by the prompt macro to translate a MUN from one cube to another.
it looks something like:
#join(';',
  substitute('\[cube1\]', '[cube2]',
    split(';',
      promptmany('param1', 'MUN')
    )
  )
)#
- the value returned by promptmany() has to be split because otherwise substitute() would only replace the first occurrence
- then it has to be put together again with join()
- square brackets have to be escaped with a backslash in the call to substitute()
- promptmany() returning MUNs uses a semicolon as separator
So if you built the prompt using a dimension from cube1 you can still use the prompt in queries using cube2... all you need to do is use something like the above ugliness instead of a simple #promptmany()#.
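To see what the macro actually does to the member unique names, here is the same split / substitute / join pipeline mirrored in Python (the MUNs are made-up examples; note that Python's str.replace() doesn't need the bracket escaping that Cognos' substitute() requires):

```python
# Mirrors the macro above: split the promptmany() result on ';',
# swap the cube name in each MUN, then join the pieces back together.
# (Python's str.replace() doesn't need the bracket escaping that
# Cognos' substitute() requires.)
def translate_muns(prompt_value, old_cube, new_cube):
    muns = prompt_value.split(';')                           # split(';', ...)
    muns = [m.replace(old_cube, new_cube, 1) for m in muns]  # substitute(...)
    return ';'.join(muns)                                    # join(';', ...)

# made-up example MUNs
raw = ("[cube1].[Branches].[Branch]->:[PC].[@MEMBER].[B1];"
       "[cube1].[Branches].[Branch]->:[PC].[@MEMBER].[B2]")
print(translate_muns(raw, "[cube1]", "[cube2]"))
```

Splitting first matters: replacing on the joined string would only fix the first MUN, which is exactly the pitfall noted above.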
Attributes
Another difference from DMR is the rather poor support for attributes. There are only four attributes to set, called "short name", "long name" etc.
Actually the naming in Transformer does not match the naming in Report Studio... so you need to figure out the mapping once.
This also sets a limit on the number of attributes... especially since one is used for the ID and another for the caption... so we are down to two that are free for any purpose.
I also found that you want to click the "refresh" checkbox in the level properties, otherwise the attributes may not be set properly when the cube is built.
Migrating Queries from DMR to Transformer
Originally I thought I would take the queries used in the DMR model "as is". Then I found myself rewriting them... though it doesn't take much effort since there is already a starting point.
In DMR I did as much aggregation on the database as possible to improve performance - which is horrible in DMR anyways.
In Transformer I found this less important: since the cube persists the aggregation, the reports actually run much faster. I decided to keep the SQL populating the cubes simple instead. This will hopefully lead to fewer errors and is easier to verify.
One can set the aggregation function to deal with duplicates. This is pretty much the same as aggregating in the DB using "group by".
I guess this approach could be disputed... it may make sense to aggregate on the DB server if the size of the record set returned is an issue during cube update.
Number format
I ran into this one and it gave me some headache. For measures that are floating point numbers you have to set the precision to something other than the default zero... otherwise Cognos will round them to the nearest integer (so zero precision does not mean unlimited, it means zero decimals).
Hopefully these experiences will help you evaluate if you want to use PowerCubes and will save some headache while working with them... any questions let me know and I'll try my best to answer them in future posts.
Thursday, December 11, 2008
Data Level Security in Cubes Using Parameter Maps
In this article I will look at a real-world example of data level security filtering.
Cognos does provide some role based security filtering capability.
I think the problem with roles is that there is usually an attribute associated with the role... and that on the other hand is not supported out-of-the-box.
Consider the following example to understand what I mean by "attribute associated with the role".
Suppose we have the usual GO-Sales kind of data. We have a Sales measure and one of our dimensions is Branch.
We have two roles: CEO and Branch_Manager
Our requirement is that CEO can see everything but a Branch_Manager can only see Sales by his or her own branch.
This is the point where role-based security goes weak.
Somehow we need to store which Branch a Branch_Manager is allowed to see.
It's not sufficient to know that a user has Branch_Manager role... we need to know (and handle) a BranchID associated with that role.
This is what I call the "attribute associated with the role" - in our example the BranchID.
Usually this implies we need to store this somewhere, e.g. in the database or maybe in a directory server like LDAP.
To implement data level security I find the best option is to use Slicers all over the place where cubes are referenced - in queries used by crosstabs and charts on the reports, in queries used to populate prompts on the filter pages.
(Keep in mind, I firmly believe that lists should not be used with a dimensional datasource.)
The slicer will make sure that filtering is applied to the entire cube, regardless of the dimensions used for display.
To continue with the example, we need to write a slicer that:
- figures out who the current user is
- determines what role the current user has - is the user a Branch_Manager?
- looks up the attribute associated with that role - the value of BranchID
So what tools are in our arsenal to tackle this?
Session Parameters
Session parameters provide a way to get an identifier of the current user.
e.g.
account.userInfo
----------------
Tamas Simon
Session parameters can be used in macro expressions. The syntax is to prefix them with a ``$'' (dollar sign).
e.g.
#sq($account.userInfo)#
will result in
'Tamas Simon'
The exact details depend on how Cognos is configured...
Bottom line is that in every Cognos installation there should be some sort of identifier of the current user and it's accessible to report expressions using session parameters.
You can check the session parameters in Framework Manager by selecting Project / Session Parameters... from the menu.
Parameter Maps
Parameter maps are similar to session parameters in that they can be accessed from macro expressions.
They are name-value pairs, or rather "key-value" pairs - as Cognos calls it.
They are defined in the Framework Model.
They can be based on a query, just select a column for Key and another for Value.
Syntax is similar to session parameters:
``$'' (dollar sign) in front of the name
``{}'' (curly braces) surrounding the key
``''' (single quote) surrounding the key if it is a string literal
e.g.
#$pmUserAccessableBranches{'Tamas Simon'}#
Here are some important notes I'd like to make:
1. There is no way to include or exclude parameter maps from a package. In my understanding, when you re-publish a package all parameter maps are published with it.
2. The query subject that the parameter map is based on has to be published in the package. Otherwise you won't get any error or warning message, but the parameter map will be empty, always returning its default value if you have defined one.
I remember reading that the query is executed every time an expression references the parameter map - but I have not confirmed this.
3. Keys are unique. This is tricky... to continue our example, imagine that we want to be able to assign multiple branches to the same user.
In other words we want to control which branches a user is allowed to access (maybe only one, maybe more).
So in the parameter map we need a list of IDs associated with a Key, all within a single string (stored as the Value).
e.g.
database table:

UserName    | BranchID
------------+----------
Tamas Simon | B1
Tamas Simon | B2
Tamas Simon | B3
we would want the parameter map to look like
pmUserAccessableBranches:

Key         | Value
------------+----------
Tamas Simon | B1 B2 B3
This is tricky considering that we need to write a query to populate the parameter map... but it's doable.
Oracle, for example, does not provide a string aggregate function out-of-the-box, but you can write your own.
select UserName as Key, stragg(BranchID) as Value from UserAccessableBranches group by UserName
You can find a string aggregate function e.g. here: http://www.oracle-base.com/articles/10g/StringAggregationTechniques.php#user_defined_aggregate_function
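If you just want to see what the aggregation should produce, here is the same grouping mirrored in Python (names taken from the example tables above):

```python
# What the string aggregate should produce, mirrored in Python:
# one space-separated Value string per Key.
from collections import defaultdict

def string_aggregate(rows):
    """rows: (UserName, BranchID) pairs -> {UserName: 'B1 B2 B3'}"""
    grouped = defaultdict(list)
    for user, branch in rows:
        grouped[user].append(branch)
    return {user: ' '.join(ids) for user, ids in grouped.items()}

rows = [("Tamas Simon", "B1"), ("Tamas Simon", "B2"), ("Tamas Simon", "B3")]
print(string_aggregate(rows))  # {'Tamas Simon': 'B1 B2 B3'}
```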
With these three tools: session parameters, parameter maps based on DB query and string aggregation we can solve our data level filtering challenge.
We end up with slicers something like this:
filter( [Branches Dimension].[Branch], roleValue('_businessKey', [Branches Dimension].[Branch]) in ( #csv(split(' ', $pmUserAccessableBranches{$account.userInfo}))# ) )
from the inside out:
get the current user via session parameter: $account.userInfo
use this as a Key to look up the user's accessible branches from the parameter map: $pmUserAccessableBranches{$account.userInfo}
massage the macro until it returns a comma separated string: #csv(split(' ', $pmUserAccessableBranches{$account.userInfo}))#
use the whole thing to filter branches... resulting in a slicer member set.
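The csv(split(...)) step, mirrored in Python (this assumes csv()'s default single-quoting of values):

```python
# What csv(split(' ', ...)) does to the parameter map value: turn
# 'B1 B2 B3' into a quoted, comma-separated list for the in (...) clause.
def csv_of(value):
    return ','.join("'%s'" % v for v in value.split(' '))

print(csv_of("B1 B2 B3"))  # 'B1','B2','B3'
```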
We could just generate a set() expression using the prompt...
set( #join(',', substitute('^', '[Branches Dimension].[Branch]->:[PC].[@MEMBER].[', substitute('$', ']', split(' ', $pmUserAccessableBranches{$account.userInfo}))))# )
This would work if the IDs are in sync with the cube. Otherwise we might get an error from Cognos, saying basically that we are referring to a member that does not exist. I find it safer to use the filtering expression.
Tuesday, November 11, 2008
How to use the "Range Prompt" - some undocumented features
The TextField prompt has an interesting feature: it can be used to get two values instead of one by setting its ``Range'' property to YES.
This is handy when you want to filter by a range that has a lower and/or an upper value.
The extra that this prompt gives you is that you can have open intervals, i.e. cases where the user only specifies an upper or a lower value.
e.g. filter lines where some ratio is
- less than 80%
- between 60% and 80%
- greater than 80%
It saves you from having to do some sort of javascript voodoo with radio buttons etc.
The trick is that you want to set the prompt as optional (= not required), because otherwise it would be just two text fields, which is no better than using two text fields in the first place.
Now... since the prompt is optional the user does not have to input values.
If the user leaves both the lower and the upper values unspecified... the prompt does not return anything.
And this is a problem...
...because you are trying to use the prompt in an expression using the in_range operator that looks something like this
filter(my_dimension, my_measure in_range ?pRange?)
and this blows up!
After the prompt is evaluated it becomes the following expression:
filter(my_dimension, my_measure in_range )
...that's right, the prompt returned absolutely nothing.
To fix this you need to use the long form of the prompt macro and set the datatype to ``range''.
This is undocumented as of 8.3 ... but it works
To fix the expression, specify a default range that the prompt should return when the user leaves both the lower and upper values unspecified - e.g. an open range starting at 0:
filter(my_dimension, my_measure in_range #prompt('pRange', 'range', '{0:}')#)
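The open-interval semantics, mirrored in Python (the in_range function here is my own sketch, not a Cognos API): a missing bound simply isn't checked, and '{0:}' corresponds to a lower bound of 0 with no upper bound.

```python
# Open-interval range semantics, mirrored in Python: a missing bound
# simply isn't checked. '{0:}' ~ lower=0 with no upper bound.
def in_range(value, lower=None, upper=None):
    return (lower is None or value >= lower) and \
           (upper is None or value <= upper)

ratios = [55, 65, 75, 85]
print([r for r in ratios if in_range(r, lower=60, upper=80)])  # [65, 75]
print([r for r in ratios if in_range(r, lower=0)])             # all four
```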
Thursday, September 4, 2008
[updated] Relative Package Names
It is possible to have a report be built on a package that is specified with a relative (to the report) location.
It is not possible to set relative package names through the UI in ReportStudio but you can copy the XML source into an editor, make the changes and paste it back to RS.
The reference to the package is right at the beginning of the XML in the <modelPath> tag.
I tested with relative path on 8.3 and it worked.
This is great for keeping multiple environments on the same Cognos box.
The report itself can be the same. There is no need to relink the report with a different package. The package can be redeployed into different folders (a new feature in 8.3) with the same name. The packages would be practically the same, only they would use different datasources. Changing the datasource is easy... just run a sed script on the model's XML and publish.
If you need to maintain a lot of environments - e.g. QA, staging, production, support - on the same Cognos box then you will find this very useful!
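The "sed on the model's XML" step could equally be a tiny Python script. Note the <dataSourceRef> element name below is hypothetical - check your own model.xml for the element that actually carries the datasource name:

```python
# The "sed on the model's XML" idea as a tiny Python script. The
# <dataSourceRef> tag below is hypothetical - check your own model.xml
# for the element that actually carries the datasource name.
import re

def swap_datasource(xml_text, old, new):
    return re.sub(re.escape(old), new, xml_text)

model = '<project><dataSourceRef>[prodDS]</dataSourceRef></project>'
print(swap_datasource(model, '[prodDS]', '[qaDS]'))
```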
Update
Damn, it doesn't work. Report Studio accepts the relative path, but before saving the report it substitutes it with an absolute path.
The good news is that in 8.3 you can relink a report to another package through the portal by setting its properties... there's no need to open up the report in RS.
Monday, September 1, 2008
Crosstabs with Column Headers
aka How to make Crosstabs that look like lists
Suppose you have a crosstab like this:
| sales
------+--------------+---------
branch| sales person | <1234>
+--------------+---------
| sales person | <1234>
will look like
| sales
-----------+---------+--------
1st Avenue | Aaron A | 1,000
+---------+--------
| Betty B | 2,000
-----------+---------+--------
2nd Street | Clare C | 1,500
...
If it was a list it would have column headers and would look like this:
branch | sales person | sales
-----------+--------------+--------
1st Avenue | Aaron A | 1,000
+--------------+--------
| Betty B | 2,000
-----------+--------------+--------
2nd Street | Clare C | 1,500
...
To achieve this using a crosstab you need to add extra Text Items and hide some cells by setting their box type to ``none''
HIDE | "branch" | "sales person" | sales
---------------------+----------+----------------+------
branch| sales person | HIDE | HIDE | <1234>
+--------------+----------+----------------+-------
| sales person | HIDE | HIDE | <1234>
Hiding the top left corner will pull the extra Text Items to the left so that the "headers" will nicely align with their columns.
But adding these new text items still creates extra empty columns in the crosstab... these have to be hidden as well.
The resulting crosstab will look just like a list, displaying column headers.
You may need to play around with the styling to get everything in the right color.