Thursday, January 29, 2009

Comparison of Bill Inmon and Ralph Kimball paradigm:

Comparison of Bill Inmon and Ralph Kimball paradigm:

In the data warehousing field, we often hear about discussions on where a person / organization's philosophy falls into Bill Inmon's camp or into Ralph Kimball's camp. We describe below the difference between the two.

Bill Inmon's paradigm: Data warehouse is one part of the overall business intelligence system. An enterprise has one data warehouse, and data marts source their information from the data warehouse. In the data warehouse, information is stored in 3rd normal form.

Bill Inmon: Endorses a Top-Down design
Independent data marts cannot comprise an effective EDW. Organizations must focus on building EDW.

"You can catch all the minnows in the ocean and stack them together and they still do not make a whale,"
Bill Inmon, January 8, 1998.


Ralph Kimball's paradigm: Data warehouse is the conglomerate of all data marts within the enterprise . Information is always stored in the dimensional model.
Kimball model also proposes the data warehouse bus architecture. This architecture is comprised of:
- A staging area (which can have an E/R or relationally designed 3NF design or flat file format), which cannot be accessed by an end-user of the data warehouse bus (Presentation Layer) .
-The Data Warehouse Bus itself which includes several atomic data marts, several aggregated data marts and a personal data mart but no single or centralized data warehouse component.
The Data Warehouse Bus:
- Is dimensional;
- Contains transaction and summary data;
- Includes data marts, which have single subject or fact tables; and
- Can consist of multiple data marts in a single data base.


There is no right or wrong between these two ideas, as they represent different data warehousing philosophies. In reality, the data warehouse in most enterprises are closer to Ralph Kimball's idea. This is because most data warehouses started out as a departmental effort, and hence they originated as a data mart. Only when more data marts are built later do they evolve into a data warehouse.

Ralph Kimball: Endorses a Bottom-Up design
EDW effectively grows up around many of the several independent data marts – such as for sales, inventory, or marketing

"...The data warehouse is nothing more than the union of all the data marts...,"
Ralph Kimball, December 29, 1997.

The main difference is Top-Down vs. Bottom-Up Approach.
Let us see the advantages & disadvantages:

Advantages of Top-Down Approach: [Bill Inmon Approach]:

  • A truly corporate effort, an enterprise view of data
  • Inherently architect ed- not a union of disparate DataMarts
  • Central rules and control
  • May be developed fast using iterative approach
Disadvantages of Top-Down [Bill Inmon Approach]:

  • Takes longer to build even with iterative method
  • High exposure/risk to failure
  • Needs high level of cross functional skills
  • High outlay without proof of concept
  • Difficult to sell this approach to senior management and sponsors
Advantages of Bottom-Up Approach [Ralph Kimball Approach]:
  • Faster and easier implementation of manageable pieces
  • Favorable ROI and proof of concept
  • Less risk of failure
  • Inherently incremental; can schedule important DataMarts first
  • Allows project team to learn and grow
Disadvantages of Bottom-Up Approach [Ralph Kimball Approach]:
  • Each DataMart has its own narrow view of data
  • Permeates redundant data in every DataMart
  • Difficult to integrate if the overall requirements are not considered in the beginning
Kimball encourages to have perspective of both “vertically” and “horizontally”
while gathering business requirements while developing/designing data marts using Bus Architecture to integrate in to Enterprise Data warehouse.

Vertical
  • Don’t just rely on the business data analyst to determine requirements
  • Inputs from senior managers about their vision, objectives, and challenges are critical
  • Ignoring this vertical span might cause failure in understanding the organization’s direction and likely future trends
Horizontal
  • Look horizontally across the departments before designing the Data Warehouse
  • Critical in establishing the enterprise view i.e Bus Architecture
  • Challenging to do if one particular department if funding the project. [Even one departments funds this project , but all the departments together become the company]
  • Ignoring horizontal span will create isolated, department-centric databases that are inconsistent and can’t be integrated while considering for Enterprise Data Warehouse
  • Complete coverage in a large organization is difficult , but CEO or CTO represents all of the departments. They should have the proper understanding of the full business and recommend what is good for whole company not just one department.
  • One rep. from each dept. interacting with the core development team can be of immense help
  • Creating a Business Decision Making group when there is a difference in the perspective from each department would help in resolve these kind of conflicts.
New Practical approach by Kimball
  • Plan and define requirements at the overall corporate level
  • Create a surrounding architecture for a complete warehouse
  • Conform and standardize the data content
  • Implement the Data Warehouse as a series of Supermarts, one at a time
[Notes: Soon I will have detailed steps & process to be followed by Kimball , please check back or subscribe to my blog]

SUPERMARTS

  • Totally monolithic approach vs. totally stovepipe approach
  • A step-by-step approach for building an Enterprise Data Warehouse from granular data
  • A Supermart s a data mart that has been carefully built with a disciplined architectural framework
  • A Supermart is naturally a complete subset of the Data Warehouse.
  • A Supermart is based on the most granular data that can possible be collected and stored
  • Conformed dimensions and standardized fact definitions
Data Warehouse to Data marts


Operational Source System (OSS) To Directly transforms data to Enterprise Data Warehouse Let us call -- ETL Process (1) .

Enterprise Data warehouse to Data Marts
Let us call -- ETL Process (2).






Enterprise Data Warehouse











[Further comparison will be added often ]

20 comments:

  1. Hello Srini,

    Kudos on an excellent in-depth comparison of Kimball and Inmon.

    I am a newcomer to this field and your blog is of great help. I have a question. I have seen jobs that require Ralph Kimball ETL Architecture Certification. How does one go by obtaining it?

    I live in NJ, by the way. Obviously local would be great but failing that, something online would also help.

    Again, great article and thanks in advance.

    Regards,
    Kumar

    ReplyDelete
  2. Thanks for sharing this blog with us and keep it going well work i love it
    Chicago warehouse

    ReplyDelete
  3. Thanks for sharing useful information. I learned something new from your bog. Its very interesting and informative. keep updating. If you are looking for any Data science related information, please visit our website Data science training institute in Bangalore

    ReplyDelete
  4. Amazing Article ! I would like to thank you for the efforts you had made for writing this awesome article. This article inspired me to read more. keep it up.
    Java Training in Chennai

    Java Training in Velachery

    Java Training inTambaram

    Java Training in Porur

    Java Training in Omr

    Java Training in Annanagar


    ReplyDelete
  5. I wish to show thanks to you just for bailing me out of this particular sap ps.As a result of checking through the net and meeting techniques that were not productive, I thought my life was done.

    sap ps training in bangalore

    ReplyDelete
  6. Theenterprise data warehouse is a collection of data marts, which are smaller data warehouses that exist for specific information needs. It is used to store and manage the data for a large organization, and is a very important tool for decision-making. The data warehouse is created using data mining to make the most of data that is stored in flat files, relational databases, or other data sources.

    ReplyDelete
  7. Very Informative blog thank you for sharing. Keep sharing.

    Best software training institute in Chennai. Make your career development the best by learning software courses.

    android app development training in chennai
    devops training in chennai
    azure training in chennai

    ReplyDelete