Thursday, February 27, 2014

Self-service BI



Most BI solutions today require business users to look at standard pre-configured reports that have been put in place by the IT department. Self-service BI allows end users to generate their own customized reports without the intervention of the IT staff. This benefits both parties as it makes end users more independent while freeing up the time of the IT staff.

While gathering requirements, the IT staff may not completely understand how the data is required to be presented to the user. The self-service approach ensures that nothing is lost in translation.

SSBI solutions are intended to be used by users who do not have much technical knowledge. Two important factors in a self-service approach are:

  • User interface: The UI must be intuitive and easy to use.
  • Metadata dictionary: The IT staff needs to provide the users with a metadata dictionary for the users to understand how the data can be used.

Self-service BI sounds like the perfect solution as it empowers users to get things done without the involvement of the IT staff. However, there is a down side to the self-service approach.

Handing over the power to end users means having a lot more users accessing data and querying the system. Having multiple users firing queries and generating reports simultaneously could overload the system. Security is another concern as the system would need to ensure that access to sensitive data is controlled based upon a user’s position in the organization.

Another problem is that of organizing reports. Multiple analysts could each have multiple reports and analyses of the data. While making a decision it is important to have all the data in one place. Organizing the data could be an expensive and time-consuming process.

Self-service BI with its many privileges sounds promising and it does seem like the pros could outweigh the cons. There are several companies offering such solutions. For example, Business Objects' Crystal Reports and Microsoft’s project Gemini. However it is still early times and it isn't easy to forecast how SSBI could be implemented as a full-blown enterprise solution.

Refrences:

Thursday, February 13, 2014

Post #3: Dimensions They Are a-Changin’ (Slowly)

Dimensions change over time and there are well-established methods of handling these changes. In this blog post I will talk about some of the different methods of handling slowly changing dimensions (SCDs).


Type 0: Retain original

In this case we just retain the original value as the dimension’s value never changes. We use type 0 for things like credit score and date dimensions where we would never want to change the dimension. Type 0 is quite rare.

Type 1: Overwrite

With type 1, we just overwrite the original value of the dimension. It is used in cases where we don’t really need to record any history of the attribute’s value or in case we need to correct an error.

For example:
Before:
Supplier Key
Supplier Name
State
123142
Foo
AZ

After:
Supplier Key
Supplier Name
State
123142
Bar
AZ



Type 2: Add a new row

This is the most important type as it is encountered very frequently. With type 2, we record the history of changes. We need to add a 3 new columns to the table:
  1. row effective date or date/time stamp
  2. row expiration date or date/time stamp
  3. current row indicator.

So, if a type 2 attribute changes, we add a new row to the dimension table with the row effective date set to the current date and the row expiration date usually set to a far-off date like 9/9/9999. We also need to set the row expiration date of the affected row to the current date.

For example:
Before:
Supplier Key
Supplier Name
State
Row effective date
Row expiration date
Current row indicator
123142
Foo
AZ
1/7/2013
2/13/2014
Expired

After:
Supplier Key
Supplier Name
State
Row effective date
Row expiration date
Current row indicator
123149
Bar
AZ
2/13/2014
9/9/9999
Current


Type 3: Add a new attribute

For type 3, when changing the value of an attribute, we add a new column which will contain the new value. This approach is useful when we want to record changes but the frequency of change is low. It is not used as much as type 2.

For example:
Before:
Supplier Key
Supplier Name
State
123142
Foo
AZ

After:
Supplier Key
Old Supplier Name
State
Supplier Name
123142
Foo
AZ
Bar

Type 4: Add mini-dimension

This technique is used when a group of attributes in a dimension quickly changes. It is split off to a mini-dimension. The type 4 mini-dimension requires its own unique primary key; the primary keys of both the base dimension and mini-dimension are captured in the associated fact tables. The mini-dimension is the one that tracks all changes. The original dimension contains the current data.

For example:
Supplier Key
Supplier Name
State
123142
Foo
AZ


Supplier Key
Supplier Name
State
Row effective date
Row expiration date
123142
Bar
AZ
2/13/2013
2/13/2014
123142
Baz
AZ
2/13/2012
2/12/2014

Reference:

Ralph Kimball, M. R. The Datawarehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.

Thursday, February 6, 2014

Post #2: Cloud based BI

I have been working on a project that involves cloud based BI. So for this post, I will be writing about what I have learnt from my experience so far.

Quite like the entire software industry, the trend to move towards the cloud has hit the BI industry. A lot of startups like GoodData, Birst and BIME have cloud products as their primary sources of revenue. Major players like Oracle, Microsoft, Microstrategy and SAP are slowly trying to expand into this space.

There are two variations of cloud BI tools – SaaS BI solutions which run on the cloud where all you need is an internet-enabled device, and desktop BI tools with an online option which allow you to directly share reports on the cloud. The former is definitely more interesting.

Cloud solutions for BI have advantages that are quite similar to cloud computing in general: 
  • No installation required: Most solutions offered are SaaS. This means you don’t need to install the software on the client machines and no hardware compatibility issues. 
  • High Scalability / Increased flexibility: An organization can quickly increase/decrease the number of users without much of a hassle. It can adapt to changing business needs.
  •  Lower costs: There is also no major expenditure like traditional IT environments. As cloud based BI solutions are offered as SaaS, there is no installation cost and no maintenance cost. Upgrades are also free. Of course cost also depends on the licensing model used. There is some variation here, vendors currently offer licenses on a ‘per user’ (with or without platform fee) and hourly basis.
  •  Easy access: As all the data is on the cloud, it can be accessed from anywhere through any device. A lot of BI solutions also offer a mobile app to view reports. This also makes for easier sharing of data and facilitates self-service BI.


However, cloud based solutions come with certain caveats:
  •  Integration of the tool with data sources is still difficult. Often, connectors need to be designed for specific organizational needs.
  • Data visualizations aren’t as powerful as their on premise counterparts. Cloud BI tools are getting there, they just aren’t there yet.
  • Security of data is not within the control of the organization. Organizations must rely on the vendor (who may actually do a better job actually).


As stated in this article from Forbes, soon, it won’t be a question of “yes” or “no” rather a question of when and how much. 

References: