Seven stages of predictive analytics implementation

For the past few decades, business intelligence tools have been essential for companies wanting to stay ahead of the competition. Their use has become so widespread that a new approach was required and it came in the form of predictive analytics. The natural evolution of business intelligence, predictive analytics provides a deeper understanding of future trends, relying on historical data and statistical models. As with artificial intelligence, however, it is only as good as the input and the reasoning behind the algorithm used.

Implementing predictive analytics can help prevent fraud, predict customer churn, forecast cash flow and revenue, and improve marketing campaigns – but it must be properly executed.

01 Project definition

It is essential to be specific about what you hope to achieve by implementing a predictive analytics methodology. Before starting, set out the expected outcomes and clear deliverables, as well as the inputs that will be used. Confirm that all data sources are available, up to date and in the format the analysis expects.

02 Data collection

Since predictive analytics is all about using large volumes of data to gain insight into trends and stay ahead of the game, the data collection phase is crucial to the success of the initiative. Most likely this will include information from multiple sources, so there needs to be a unified approach to data. Sometimes information from several sources will be collated and cross-queried to give a comprehensive picture of the underlying phenomenon.

Most of the time, data will be collected into a data lake – not to be confused with a data warehouse, which has some significant structural differences. A data lake contains information in a raw state. This means it can range from structured (tables) to semi-structured (XML) or unstructured (social media comments). For the success of the project, it is essential to understand the differences and employ the right tools.
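To make the idea of a unified approach concrete, here is a minimal sketch in Python that reads a structured source (CSV), a semi-structured one (XML) and an unstructured one (free-text comments) into a single per-customer view. The field names, sample data and helper functions are hypothetical and chosen only for illustration; a real data lake would call for dedicated tooling.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical raw inputs, as they might sit side by side in a data lake.
CSV_ORDERS = """customer_id,amount
c1,120.50
c2,80.00
"""

XML_ORDERS = """<orders>
  <order customer="c1"><amount>35.25</amount></order>
  <order customer="c3"><amount>10.00</amount></order>
</orders>"""

COMMENTS = [
    ("c2", "Delivery was late again."),
    ("c3", "Great service!"),
]


def read_csv_orders(text):
    """Structured source: every row already has named columns."""
    return [
        {"customer_id": row["customer_id"], "amount": float(row["amount"])}
        for row in csv.DictReader(io.StringIO(text))
    ]


def read_xml_orders(text):
    """Semi-structured source: values live in attributes and child elements."""
    root = ET.fromstring(text)
    return [
        {"customer_id": order.get("customer"), "amount": float(order.findtext("amount"))}
        for order in root.findall("order")
    ]


def build_customer_view(orders, comments):
    """Combine all sources into one record per customer."""
    view = {}
    for order in orders:
        entry = view.setdefault(order["customer_id"], {"total_spent": 0.0, "comments": []})
        entry["total_spent"] += order["amount"]
    for customer_id, text in comments:
        entry = view.setdefault(customer_id, {"total_spent": 0.0, "comments": []})
        entry["comments"].append(text)
    return view


orders = read_csv_orders(CSV_ORDERS) + read_xml_orders(XML_ORDERS)
customer_view = build_customer_view(orders, COMMENTS)
for customer_id in sorted(customer_view):
    print(customer_id, customer_view[customer_id])
```

The point of the sketch is that each source keeps its own reader, while everything downstream works on one common record shape.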

03 Data analysis

Once you have all the data you need in place, it is time to dissect it. The investigation will hopefully reveal trends, help prevent fraud, reduce risks or optimize processes. Surprisingly, as much as 80 per cent of this stage is typically spent cleaning and structuring data, rather than modeling it. Once this is completed, results must be interpreted and actionable goals defined.
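As an illustration of what cleaning and structuring can involve, the following Python sketch normalises identifiers, converts text to proper types, rejects rows with unusable values and drops duplicates. The column names, sample rows and rules are assumptions made for the example, not a prescribed procedure.

```python
from datetime import date

# Hypothetical raw export; field names and values are made up for illustration.
RAW_ROWS = [
    {"customer_id": " C1 ", "signup_date": "2021-03-01", "monthly_spend": "49.90"},
    {"customer_id": "c1", "signup_date": "2021-03-01", "monthly_spend": "49.90"},  # duplicate of the first row
    {"customer_id": "c2", "signup_date": "", "monthly_spend": "n/a"},              # missing values
    {"customer_id": "c3", "signup_date": "2022-11-15", "monthly_spend": "1,200.00"},
]


def clean_row(row):
    """Return a typed, normalised row, or None if a required field is unusable."""
    customer_id = row["customer_id"].strip().lower()
    if not customer_id:
        return None
    try:
        signup_date = date.fromisoformat(row["signup_date"])
        monthly_spend = float(row["monthly_spend"].replace(",", ""))
    except ValueError:
        return None
    return {
        "customer_id": customer_id,
        "signup_date": signup_date,
        "monthly_spend": monthly_spend,
    }


def clean(rows):
    """Keep the first valid row per customer; count rejected and duplicate rows."""
    cleaned, seen = [], set()
    rejected = duplicates = 0
    for row in rows:
        result = clean_row(row)
        if result is None:
            rejected += 1
        elif result["customer_id"] in seen:
            duplicates += 1
        else:
            seen.add(result["customer_id"])
            cleaned.append(result)
    return cleaned, rejected, duplicates


cleaned_rows, rejected_count, duplicate_count = clean(RAW_ROWS)
print(f"kept {len(cleaned_rows)}, rejected {rejected_count}, duplicates {duplicate_count}")
```

Counting what was rejected, rather than silently discarding it, makes it easier to judge whether the cleaned data is still representative.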

04 Statistics

Statistics is just as important as big data when implementing predictive analytics, particularly when testing and validating assumptions. Very often, those in charge of the project will have a specific hypothesis about the behavior of consumers, the conditions that indicate fraud and so on. Statistical methods put these hypotheses to the test, so that decisions are based on numbers, not hunches.

Be ready to have your ideas challenged by data and accept that sometimes the obvious logical outcomes are not supported by reality.
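One simple way to put a hunch to the test is a permutation test, sketched below in Python. It asks how often a difference in averages at least as large as the observed one would appear if the group labels were assigned at random. The scenario (basket values with and without a discount), the figures and the function name are hypothetical; other tests may suit a given problem better.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns an approximate p-value: the share of random relabellings whose
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical basket values for customers with and without a discount.
with_discount = [52, 61, 58, 49, 63, 57, 60, 55]
without_discount = [50, 59, 56, 51, 62, 54, 58, 53]

p_value = permutation_test(with_discount, without_discount)
print(f"approximate p-value: {p_value:.3f}")
```

A small p-value suggests the difference is unlikely to be chance alone; a large one is exactly the kind of result that challenges an "obvious" assumption.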

05 Modeling

When it comes to modeling, it is often best to use existing tools. There are countless libraries built on open-source programming languages like Python and R. There is no time to reinvent the wheel; it is more important to know the available options and choose the best one for the job. The ultimate goal should be to democratize modeling and make it available to business analysts, as well as data scientists.

06 Deployment

Once data has gone through statistical analysis and the model has been calibrated, results need to be interpreted and integrated into daily routines.

Once the model has been built and judged good enough, it should inform daily decisions and govern processes across the organisation. Numbers that show what would be best for the company are not enough unless they translate into actionable steps and measurable results.

07 Monitoring

Reality is not static; neither is data. A model remains valid only for as long as external conditions do not change significantly. It is good practice to revisit the models periodically and test them with new data to make sure they have not lost their significance.

This is especially important for those models used for marketing campaigns. The preferences of the customers and trends in consumer markets sometimes change so fast that previous expectations quickly become yesterday’s news.
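A periodic check of this kind can be as simple as comparing the model's error on recent data with the error measured when it was first validated. The Python sketch below does so with the mean absolute error; the figures, the 25 per cent tolerance and the function names are assumptions for illustration, and the right metric and threshold depend on the model.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average size of the gap between what happened and what was predicted."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def needs_retraining(baseline_error, recent_error, tolerance=0.25):
    """Flag the model when recent error exceeds the baseline by more than
    the given tolerance (0.25 means 25 per cent worse)."""
    return recent_error > baseline_error * (1 + tolerance)


# Hypothetical figures: results at validation time versus the latest period.
validation_actual = [100, 120, 130, 150]
validation_predicted = [98, 123, 128, 149]
recent_actual = [160, 175, 190, 210]
recent_predicted = [150, 160, 171, 185]

baseline_error = mean_absolute_error(validation_actual, validation_predicted)
recent_error = mean_absolute_error(recent_actual, recent_predicted)

if needs_retraining(baseline_error, recent_error):
    print("Model has drifted: review and retrain with recent data.")
else:
    print("Model still performs within tolerance.")
```

Run on a schedule, a check like this turns "revisit the model periodically" into a routine that flags problems before the forecasts mislead anyone.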


There is a transition taking place from relying on reports and past data towards looking at the future and preparing for it. The extremely competitive marketplace is pushing companies to find new ways to get ahead of their peers and to rely more on data than on hunches or on simply carrying on business as usual. They must understand opportunities before these arise and be ready when they do.


Key takeaways
  1. Be clear about the scope of your project and define from the beginning the expected results and outcomes. Don’t just go with the flow.
  2. Make sure you have the right data. Store it in a data lake to be able to use it repeatedly for different purposes, in different environments.
  3. Perform the analysis on clean, organized data and try to interpret the results in actionable ways.
  4. Trust the numbers, not the hunches. Perform in-depth statistical analysis.
  5. Don’t strive to reinvent the wheel. Scan through existing free tools.
  6. Create fool-proof processes based on the results of the investigation.
  7. Revisit your models regularly and test them against new data.