Testing Data Migration Step 2

In the last blog we looked at two major confusions that bedevil data migration testing – confusing building for quality with testing for defects and confusing data design issues with data migration faults. 

I have set aside a session at the next Data Migration Matters event (DMM8, 2nd June, London) exclusively to discuss the issue of Testing Data Migrations.  Check out the timetable at:


I hope to see as many of you there as possible.  Let’s see if we can’t get some consensus around testing.

Now back to the blog.

This blog was meant to look at reconciliation.  However, I have had a number of questions regarding Data Design, so, reacting in an agile fashion, I’m going to take a moment to look at this.  For ease of understanding I will set this problem up as a plain vanilla ERP implementation using a waterfall approach, with a supplier or systems integrator delivering a Commercial Off The Shelf (COTS) package to a manufacturing or service delivery company client.

The supplier is responsible for understanding the operations of the COTS package on the one hand, for analysing the operations of the client on the other, and then for bringing the two together in a perfect handshake.  Part of this fit has to be the data design.  Or does it?  Understanding the confusion surrounding this question goes some way, I believe, to explaining the confusion around testing.

So what is “Data Design”?

Well, what we are talking about here is the way the data structures built into a COTS package are used to support the data structures the client needs to carry out their business.

Let’s take something simple – Lead to Cash (L2C).  Commercial organisations exist to sell stuff, at a profit.  So they all need a process of getting a lead and turning it into a sale and then delivering the product and collecting the cash.  This L2C process is fundamental to capitalism.

Without dwelling on all the detail, this journey from lead to cash involves the establishment of certain master or framework data items.  We have customers (both actual and potential), and we have products, both physical and logical (as in my case, where I sell data migration consultancy).   If we concentrate on the products, all ERP packages will have some kind of Product-Master structure (please go with me on this one – I'm not going to dwell on the difference between a parts master and a product master).  Therefore all implementations of the lead to cash process will need a Product-Master established that is suitable for their physical and/or logical products.

But these Masters will not be the same in a house builder and a medical supplies manufacturer.

But who gets to design it?

So who is responsible for designing the use of the COTS Product-Master for our phantom client?  We need both the domain knowledge of the client and the COTS package knowledge of the supplier but it is always best practice to have one lead.  Are we target led or source led?

Well to my mind it must be the supplier.  They are the ones who know the target best and know how the Product-Master is related to accounts, product lifecycle management, supplier management etc. etc. within the target application.  The art of fitting the client’s business requirement to the structures on offer in the COTS Package requires knowledge of the COTS Package and the implication of design choices and this expertise is what the supplier brings to the party. 

Only the client has knowledge of the client product set that is the other half of the mix.  It is my contention that it should be the responsibility of the supplier to have the analysis skills that will reveal this knowledge in such a way that they can then perform the data design and deliver an operating platform that will enhance the activities of the client.  We need to bear in mind that most organisations do not replace the systems that support their businesses very often.  Therefore there is no reason why they should have developed the skills to analyse and articulate their business processes in a format ideally suited to a third party’s implementation requirements.  The supplier on the other hand has regular need for these items to be developed so it makes sense for them to have cultivated the skills needed to unearth these processes and the ancillary data.  They should have made these skills part of their own product set.

This is rarely contested but the level of detail in the data design often is.  It’s all very well placing the onus for data design of Product-Master on the shoulders of the supplier, but what about the detailed definition of items like the format of part numbers or the breakdown of a product set into discrete deliverables? 

There are two reasons for still saying that detailed Data Design belongs with the supplier.  Firstly, although the majority of the Data Design will stay as is – if you are a car manufacturer before the migration you will be one at the end, so you will have models and versions of models and so on – there are some data items that are only present because that is the way the client has traditionally done things.  It is for the supplier, if they are to add value, to challenge these vestigial elements and replace them with ones that will take advantage of the new system’s capabilities.  The risk otherwise is that the new system becomes a poor reproduction of the old, which undermines the value proposition of making a change in the first place.

The second reason is linked to the first.  In modern highly integrated COTS packages the setting of some values has impacts across the application.  It takes knowledgeable experts to understand the implication of something as apparently simple as the parts numbering system and its relationship to the part breakdown structure.  This means we need critically engaged target system experts to facilitate the result.

So to optimise our investment and to avoid cock-ups it should be the COTS package experts who are driving the bus.

However, all too often I find, on arrival at an in-flight project, that the subtle difference between creating a general structure for your Product-Master and taking this down to the level of detail that can actually run the business is a point of misunderstanding between the supplier and the client.  Often, because it is the Data Migration team who best know the legacy data, and because the issue seems to be one to do with data, the task of providing this metadata falls on them.  This is wrong.

As an aside, if the main supplier will not commit to delivering the detailed data designs but is waiting for the client to produce them, and given that the client may not have the skills to do so, then the client should look to sub-contract that element out.

Whoever is performing the data design, it still remains the case that this is not a data migration task.  If you are moving house, the removal men expect to be told where the bedrooms, lounge and kitchen in your new residence are.  (OK, they may be able to make a reasonable guess as to which room is the kitchen, but you get my point.)  They are not expecting to decide for you which cupboards you wish to use and certainly not to have to architect the dwelling.  This should be the same with us lifters and shifters of data.  Tell us where to stick the stuff and we will organise things so that is where it gets stuck.

If all of this has been a little dense then please allow me to recapitulate:

  • Data Migrations are always part of bigger programmes
  • There is usually an incumbent supplier or systems integrator contracted to implement the new Commercial Off The Shelf package
  • Data Design and Data Migration are not the same
  • Data Design is the alignment of the client metadata with the data structures available on the target system
  • Best practice is for the supplier to be responsible for detailed Data Design as well as detailed process design
  • If the supplier is not going to perform this task then, unless you have the skills in-house, get assistance from a supplier who can provide this service
  • In any case Data Design should be seen as a separate task from data migration and planned into the project from the beginning
  • Data migration is the finding, extraction, transformation and loading of data of the appropriate quality in the right place at the right time.  It is not responsible for defining where and what the right place is
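To make that separation concrete, here is a minimal sketch in Python. All the field names (“legacy_part_no”, “product_code” and so on) are invented for illustration and not taken from any real COTS package. The point is the division of labour: the mapping specification is a Data Design deliverable, agreed between client and supplier; the migration code merely applies it.

```python
# Data Design deliverable (hypothetical): where each legacy field lands in the
# target Product-Master, and the transformation rule agreed for it.
PRODUCT_MASTER_MAPPING = {
    "legacy_part_no": {"target": "product_code", "transform": str.upper},
    "legacy_desc":    {"target": "description",  "transform": str.strip},
}

def migrate_record(legacy_record: dict) -> dict:
    """Data migration task: apply the design, don't invent it."""
    target_record = {}
    for source_field, rule in PRODUCT_MASTER_MAPPING.items():
        value = legacy_record.get(source_field)
        if value is not None:
            target_record[rule["target"]] = rule["transform"](value)
    return target_record

print(migrate_record({"legacy_part_no": "abc-123", "legacy_desc": " Widget "}))
# {'product_code': 'ABC-123', 'description': 'Widget'}
```

If the mapping is wrong – the data lands somewhere the business cannot use – that is a Data Design fault; if the code fails to honour the mapping, that is a data migration fault. The two are testable separately, which is the whole point of keeping them separate.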

Back to reconciliation

First of all then, what is data migration reconciliation (AKA data migration audit)?  Well, put simply, it answers the business-side question “How will I know that everything I had in my old system, that I wanted moving to the new system, made it across?”   It does not completely answer the kindred question “....and how do I know it landed in the right place?” because that involves both data migration issues (did we move the data according to the specification?) and target design issues (did the programme perform the data design correctly, matching business processes, so that the locations to move the data into were available with appropriate behaviours?).
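As a sketch of what answering that first question can look like, here is a hedged Python example. The record shapes and the “in scope” rule are invented for illustration; a real reconciliation would run against extracts from the legacy and target systems rather than in-memory lists, but the logic is the same: compare the set of records you wanted moved with the set that actually arrived.

```python
def reconcile(legacy_records, target_records, in_scope, key):
    """Did everything I wanted moved actually make it across?"""
    wanted = {key(r) for r in legacy_records if in_scope(r)}  # in-scope legacy keys
    landed = {key(r) for r in target_records}                 # keys found in target
    return {
        "missing":    sorted(wanted - landed),   # wanted but never arrived
        "unexpected": sorted(landed - wanted),   # arrived but never wanted
    }

legacy = [{"id": "P1", "active": True},
          {"id": "P2", "active": False},   # out of scope: deliberately left behind
          {"id": "P3", "active": True}]
target = [{"id": "P1"}]

result = reconcile(legacy, target,
                   in_scope=lambda r: r["active"],
                   key=lambda r: r["id"])
print(result)  # {'missing': ['P3'], 'unexpected': []}
```

Note what this does and does not prove: it confirms presence and scope, not correctness of placement – P1 may have arrived with its fields mapped to the wrong places and this check would never know, which is exactly why the “right place” question needs the separate treatment described above.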

Next time out we really will look at these two linked questions.