The Biggest Reason Why Data Science, Machine Learning and Business Intelligence Fail For Many Applications


Many projects spend considerable time, effort, and money collecting and processing business data. Developers spend 80% of the project time building the software, with little thought given to reporting and analytics. The real question is this: if the developer captures the data, why are reporting and analytics still failing? Why is it so hard to get the data back out and tell valuable business stories?

Using Domain-Driven Design (DDD) techniques, a good developer starts coding at the business services layer. With DDD, they think about the business process and how to wrap the code around that flow. The end-user sees the UX, so it also gets a lot of upfront attention. Users give immediate feedback on how easy the application is to use, and that instant feedback slants development attention toward the UX and the business process.

“BEGIN WITH THE END IN MIND” – BILL INMAN

Working code is only the first step in development, and stopping there is shortsighted. The end game is extendable and maintainable code, because maintenance is where we spend 60% of our time over the software lifespan. Yet minimal strategy is given to the persistent data model or the database model.

The end-user never sees the data store, so who cares how the data is stored, as long as the application can easily read and write to the database? The BI engine is then asked to report on what is, in most cases, a highly normalized and siloed data model.

The only way to fix this issue is to have a data architect, one who thinks about both development and BI, design the model that the developer must use.

The developer should read and write to a database designed in 3rd Normal Form. Reporting should run on a separate star schema or snowflake design built specifically for reporting, ad-hoc queries, and machine learning. Maintaining both models is a small price to pay for getting on the BI superhighway once the data becomes big data.
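
To make the split between the two models concrete, here is a minimal sketch using Python's built-in sqlite3 module. Every table and column name in it (customer, order_line, dim_product, fact_sales, and so on) is a hypothetical example, not something taken from a real system. The application writes to the normalized tables, a simple load step copies the data into a small star schema, and the report reads only from the star schema.

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")

# 1) Hypothetical 3rd Normal Form schema that the application reads and writes.
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product  (product_id INTEGER PRIMARY KEY, name TEXT,
                       category_id INTEGER REFERENCES category(category_id));
CREATE TABLE orders   (order_id INTEGER PRIMARY KEY,
                       customer_id INTEGER REFERENCES customer(customer_id),
                       order_date TEXT);
CREATE TABLE order_line (order_line_id INTEGER PRIMARY KEY,
                         order_id INTEGER REFERENCES orders(order_id),
                         product_id INTEGER REFERENCES product(product_id),
                         quantity INTEGER, unit_price REAL);

INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
INSERT INTO category VALUES (1, 'Hardware'), (2, 'Software');
INSERT INTO product  VALUES (1, 'Router', 1), (2, 'License', 2);
INSERT INTO orders   VALUES (1, 1, '2024-01-15'), (2, 2, '2024-01-20');
INSERT INTO order_line VALUES (1, 1, 1, 2, 100.0),
                              (2, 1, 2, 1, 50.0),
                              (3, 2, 2, 3, 50.0);
""")

# 2) Hypothetical star schema for reporting: one fact table, two dimensions.
#    The load step flattens category into dim_product and precomputes revenue.
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales   (customer_key INTEGER, product_key INTEGER,
                           order_date TEXT, quantity INTEGER, revenue REAL);

INSERT INTO dim_customer
SELECT customer_id, name, region FROM customer;

INSERT INTO dim_product
SELECT p.product_id, p.name, c.name
FROM product p JOIN category c ON c.category_id = p.category_id;

INSERT INTO fact_sales
SELECT o.customer_id, ol.product_id, o.order_date,
       ol.quantity, ol.quantity * ol.unit_price
FROM order_line ol JOIN orders o ON o.order_id = ol.order_id;
""")


def revenue_by_region_and_category(connection):
    """Report query: the fact table joined directly to each dimension."""
    return connection.execute("""
        SELECT dc.region, dp.category, SUM(f.revenue)
        FROM fact_sales f
        JOIN dim_customer dc ON dc.customer_key = f.customer_key
        JOIN dim_product  dp ON dp.product_key  = f.product_key
        GROUP BY dc.region, dp.category
        ORDER BY dc.region, dp.category
    """).fetchall()


report = revenue_by_region_and_category(conn)
for region, category, revenue in report:
    print(region, category, revenue)
```

In this sketch the same report against the normalized tables would have to walk through order_line, orders, customer, product, and category. The star schema does that work once at load time, so every report and ad-hoc query starts from the fact table and joins straight to the dimensions it needs. A real pipeline would also handle incremental loads, surrogate keys, and a date dimension, all of which are left out here.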

The Most Common Pitfalls When Designing Data Models

  1. We do not need an analytics data model; we will copy the source system tables.
  2. We do not have a data modeler, so we will do without and enjoy the cost savings.
  3. We built the data model already; we need reports now. (If the data model does not reflect the report specifications and the general business requirements, then you did not design a model for your reporting.)
  4. We will use stored procedures. This approach is extremely limited and brittle, leading to more significant costs over time.
  5. We can get a talented new hire who knows the technology tools. This rarely works unless that hire is also a data modeler and a business analyst.
