‘Automated’ data migration dreams

There are many products in the marketplace that profess to deliver an automated data migration experience. But is this really possible? 

Data migration is the process of moving data from one, or typically multiple, data sources into a target solution. In my early career that involved numerous business users or ‘temps’ furiously typing data into a newly developed software solution, just in time for the business to cut over. The results were predictable.

Fast forward to today and the emphasis has been completely reversed: the ERP customer is now expected to provide the data, ready for loading in pre-defined load files, for a solution that they typically will not know in any detail.

In reality, to deliver this successfully, the migration process must contend with many competing elements, including source data extraction, mapping, validation, cleansing, transformation to load specifications, and loading into a target solution, all of which, incidentally, evolve throughout the whole programme. On top of this, data should be reconciled and validated at each stage.

This process, once built, needs to be repeated several times over the course of an implementation programme to support test cycles and to demonstrate that the various quality KPIs are met. This all leads to the point where cutover is performed and the new solution goes live.

But is automated data migration possible? The short answer is no. Well, in reality, not out of the box. I hate to debunk the myth, but there is no magic tool or wand that can be waved to make this happen and allow us all to sit back and pat ourselves on the back for one of the most essential stages of an ERP system upgrade. While several processes can be automated in delivering a repeatable data migration (DM) solution, a considerable amount of manual input is still required to make this achievable. It is this human input that requires careful planning and governance to ensure that the right stakeholders are involved, with the buy-in and accountability to deliver a quality outcome.

A perfect example of this is in the area of data mapping. Data is mapped from legacy to target, along with the definition of any transformation requirements. This needs inputs from the data migration team (having analysed the data set in question to define all the variants and scenarios), the business data owner (who understands their own data in its existing context), and finally the systems integrator, or SI (who understands the target solution). Given that the target solution evolves through design and testing phases, this process is revisited in several iterations.
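To make the mapping idea concrete, here is a minimal sketch of how an agreed mapping might be captured once the three parties have signed it off. Everything in it is an assumption for illustration: the legacy field names (`EMP_NO`, `SURNAME`, `START_DT`), the target field names and the transformation rules are all hypothetical, and a real ETL product would hold this in its own configuration rather than in code like this.

```python
from datetime import datetime


def to_iso_date(value: str) -> str:
    """Convert a legacy DD/MM/YYYY date string into ISO format (YYYY-MM-DD)."""
    return datetime.strptime(value, "%d/%m/%Y").date().isoformat()


# Hypothetical mapping specification: legacy field -> (target field, transformation).
# The field names and rules are illustrative assumptions, not a real ERP layout.
MAPPING = {
    "EMP_NO": ("employee_id", str.strip),
    "SURNAME": ("last_name", str.title),
    "START_DT": ("hire_date", to_iso_date),
}


def map_record(legacy: dict) -> dict:
    """Apply the agreed mapping and transformations to one legacy record."""
    target = {}
    for legacy_field, (target_field, transform) in MAPPING.items():
        target[target_field] = transform(legacy[legacy_field])
    return target
```

The point of keeping the mapping as data rather than burying it in logic is that it can be reviewed by the business data owner and the SI, and amended each time the target design moves.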

Another example of human input is in the scope definition which is agreed in the data migration strategy with inputs from the business. Although often desired, it is not always practical or viable to migrate all legacy data to the target solution. It is therefore a requirement that the business stakeholders define, by data set, the rules by which the data is extracted and filtered before any transformation takes place.
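A scope rule of this kind could be expressed along the following lines. This is only a sketch under stated assumptions: the `employees` data set, the `status` and `leave_date` fields, and the 1 January 2022 cut-off are all invented for the example. The actual rules are whatever the business stakeholders agree in the data migration strategy.

```python
from datetime import date

# Assumed cut-off date; in practice this is a business decision.
LEAVER_CUTOFF = date(2022, 1, 1)


def employee_in_scope(record: dict) -> bool:
    """Hypothetical rule: keep active employees, plus leavers on or after the cut-off."""
    if record["status"] == "ACTIVE":
        return True
    leave_date = record.get("leave_date")
    return leave_date is not None and leave_date >= LEAVER_CUTOFF


# One rule per data set, applied before any transformation takes place.
SCOPE_RULES = {"employees": employee_in_scope}


def filter_scope(data_set: str, records: list) -> tuple:
    """Split records into (in scope, out of scope) so the exclusions can be reported."""
    rule = SCOPE_RULES[data_set]
    kept = [r for r in records if rule(r)]
    dropped = [r for r in records if not rule(r)]
    return kept, dropped
```

Returning the excluded records as well as the retained ones is a deliberate choice here, since the business will usually want to see what was left behind and why.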

Key areas for automation

Focussing on the key areas where automation should be used, we will start with data validation and transformation. As mentioned previously, these processes will need to be repeated in a controlled environment. Ideally, as an output of this process, the business will require reporting to identify data validation and cleansing issues. There will be opportunities to cleanse or rationalise the data through the transformation process. However, there will always be a requirement to cleanse more complex and unique issues at source through the appropriate business channels.

There are several ETL (Extract, Transform and Load) products on the market to support these functions. Some are better than others and some are considerably more expensive than others. Unfortunately, none of these products can be installed out of the box and instantly transform and validate your data. They all require configuration for the inputs and outputs of each data set, as well as all the agreed mapping definitions and transformation rules. There will also be some tinkering required to set up all the boundary tests to refine the validation outputs. This requires a considerable amount of effort for most ERP programmes.
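As a rough illustration of what configuring validation rules and boundary tests involves, the sketch below runs a small set of checks over a data set and produces an issue list that could feed the business reporting. The fields (`employee_id`, `annual_salary`) and the salary boundary of 0 to 500,000 are assumptions made up for the example, and this is not how any particular ETL product works.

```python
from decimal import Decimal, InvalidOperation


def is_present(value) -> bool:
    """A mandatory field must be neither missing nor blank."""
    return value is not None and str(value).strip() != ""


def salary_in_range(value) -> bool:
    """Boundary test: salary must parse as a number in the assumed range (0, 500000]."""
    try:
        return Decimal("0") < Decimal(str(value)) <= Decimal("500000")
    except InvalidOperation:
        return False


# Hypothetical rule set: (field, check, message). Agreed per data set in practice.
RULES = [
    ("employee_id", is_present, "employee_id is missing"),
    ("annual_salary", salary_in_range, "annual_salary is outside the expected range"),
]


def validate(records: list) -> list:
    """Return one issue per failed rule, keyed by row number, for reporting."""
    issues = []
    for row, record in enumerate(records, start=1):
        for field, check, message in RULES:
            if not check(record.get(field)):
                issues.append({"row": row, "field": field, "issue": message})
    return issues
```

Each rule is cheap to write, but someone still has to decide what the boundaries should be, which is where the effort goes.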

Once all of the above has been completed, the process can be run over and over again. The output files can then be presented to the loading interface of the ERP target solution. There will obviously be maintenance throughout the project life cycle as the design of the target solution evolves, upgrades are applied, and variations appear in the source data as new extracts are processed through the toolsets, but this is to be expected.

Another area of consideration for automation is the reconciliation process. One of the typically more complex, critical, and resource-intensive areas of ERP to reconcile is payroll. It is obviously important to get every implementation 100 percent right. However, if we don’t make sure all employees are being paid correctly, this will likely attract a lot of attention at go-live and beyond.

Again, there are proprietary toolsets on the market that perform legacy payroll to target comparisons and report differences to the penny, prioritising and categorising differences. This is sometimes delivered by performing (typically) three months’ worth of payroll parallel running, post cutover. However, with the right approach, toolsets and testing strategies, it is possible to incorporate parallel payroll testing into the standard test cycles. I have seen this delivered successfully for some of the largest clients on ERP platforms to date.
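The core of a legacy-to-target payroll comparison can be sketched quite simply. The example below assumes net pay per employee has already been extracted from both systems into dictionaries keyed by a hypothetical employee ID. It uses `Decimal` rather than floating point so that differences really are exact to the penny. The proprietary toolsets do far more than this, including the prioritisation and categorisation mentioned above.

```python
from decimal import Decimal


def reconcile_payroll(legacy: dict, target: dict, tolerance: Decimal = Decimal("0.00")) -> list:
    """Compare net pay by employee and return (id, legacy, target, category) differences."""
    differences = []
    for emp_id in sorted(legacy.keys() | target.keys()):
        old = legacy.get(emp_id)
        new = target.get(emp_id)
        if old is None or new is None:
            # Employee paid in one system but absent from the other.
            differences.append((emp_id, old, new, "missing"))
        elif abs(old - new) > tolerance:
            differences.append((emp_id, old, new, "mismatch"))
    return differences
```

Run against every test cycle, a comparison like this turns payroll parallel running into a repeatable check rather than a one-off exercise after cutover.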

From a best practice perspective, the reconciliation and validation process should be exercised during every ETL run in order to highlight issues. Given the timing constraints of any normal programme, utilising toolsets in this space not only reduces the validation window but also makes quality a measurable outcome.
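One simple form of reconciliation that can run on every ETL cycle is a record count check at each stage: everything that went in should either come out or be accounted for as a rejection. The sketch below is an assumption about how that might look. The stage names and the idea of a single `rejected` count are illustrative, and real programmes usually add control totals on key values as well.

```python
def reconcile_stage(stage: str, count_in: int, count_out: int, count_rejected: int) -> dict:
    """Check that records entering a stage are all either passed on or rejected."""
    unexplained = count_in - count_out - count_rejected
    return {
        "stage": stage,
        "in": count_in,
        "out": count_out,
        "rejected": count_rejected,
        "unexplained": unexplained,
        "balanced": unexplained == 0,
    }
```

Any stage with a non-zero `unexplained` figure has lost or gained records somewhere, and that is worth knowing long before cutover.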

Hopefully, by the time cutover finally arrives, the ETL process will be both seamless and automated with very few exceptions or human interactions other than quality approvals.  

In summary, automation can only ever be as good as the human who controls and instructs it. Where there is a high level of repetition of a given task, automation will, more often than not, be the best solution. However, when strategic or business-impacting decisions are required, especially where there are multiple stakeholders, there is no substitute for human intuition.

So, whilst massive strides have been taken in improving the capability of automation in the data management space, the machines are not taking over (yet).