Where to put the Business Calculation Rules in Data Warehousing?

0
2403

There are 3 main places to put the Business Intelligence of Calculations

  1. Centralized in the database model pre-calculated using ETL (Extract Transform and Load) or even using Virtual Columns that evaluate at run time.
  2. At the semantic metadata layer (universe objects for business objects evaluated at run time for SQL generation)
  3. At the report level using Variables (post run time where the data is in the report cube)

There are pros and cons to each approach and using this guide of reward and drawbacks you can assess where is the best place to put the intelligence.

Database Model Pros

  • Enforce the integrity of the source data and relationships.  The impact is the calculation is built the way the data was modeled for all consumers
  • Ensure consistency and a single version of the truth
  • Calculate and stored once, retrieve many times
  • Change once and impacted everywhere

Database Model Cons

  • Requirements need to be communicated to IT and it that is complex requirement then use cases can be missed
  • Change is slow and IT resource intensive
  • True Impact Analysis only comes after rigorous Testing and Communication between build and testing teams.

Recommendation: If a calculation is performed frequently and has high resource overhead consider pushing it to the Database Model.  Green field builds allow this preferred approach compared to the semantic approach where physical data models are considered NON-MALLEABLE

Semantic Metadata Layer (universe objects for business objects) Pros

  • Build once – Use many times or by many users
  • Access to database functions
  • Enterprise Consistency across all Reports for all LOB’s, Change in 1 place and update all reports depend on that object
  • Updates automatically propagate when the reports are refreshed

Semantic Metadata Layer (universe objects for business objects) Cons

  • Dependent on IT resource of universe designer
  • Complex Requirements require more time and repression testing since changes impact all users of the universe

Recommendation: If a calculation is performed frequently and is highly shared by many LOB’s in many reports consider pushing it to the Universe Semantic Layer

Report Level using Variables Pros

  • User Driven and has reduced development times (no knowledge transfer of requirement, fast feedback loops and immediate testing)
  • Independent of SQL restrictions

Report Level using Variables Cons

  • Localized to that user report can lead to enterprise inconsistencies
  • User training for a minimum level of technical expertise
  • Data Volume in the report can create report drag on rendering

    Recommendation: If a calculation is performed as a one off and is  needed immediate consider pushing it to the report Layer