Testing Data Migrations Step 1

Reminder: Data Migration Matters 8 early bird tickets are on sale until 17/4/15.

With the next Data Migration Matters event imminent, I intend to run this discussion up to the date of the event, and we have set aside a session in DMM8 to discuss the vexed question of Data Migration Testing, for which these blogs are the precursors.  So join the discussion online and then come along to DMM8 to make yourself heard - literally.

If there is one topic that generates more online chat than any other in the Data Migration space, it is the one about Testing Data Migrations.  Check out the various forums and you will see what I mean.  I am going to argue that at bottom this is due to a confusion about what the Data Migration project is about and therefore how to test that it has been successful.

However, before I go any further with this, let me make it plain that I am not a test analyst.  I have the utmost respect for their craft and I do not want to invade their space.  So anything I say here is not intended to be a lecture to hands far more skilled than mine in this area.  It relates specifically to the perceived issues of testing Data Migrations rather than testing in general.

It is also true that testing, just like every other aspect of IT it seems, has its own tribes.  And I certainly do not want to get involved in the internecine particulars of disputes on which, as I say, I am not really qualified to opine.  So if I talk about Test Scenarios or Test Scripts or Test Cases, please accept them as the words of an informed bystander, not with the very particular meanings that one school or another will ascribe to them.

This being the first blog of a series, I want to lay down some fundamentals.  Step 1, and the subject of this blog, is that you can’t test quality into a product.  It does not matter if that product is a motor car, a fine meal or a data migration.

Design and build quality in.  Test defects out.

A lil' more testing and I'm sure I'll be green

This may sound like a distinction without a difference, but think about it.  If you wanted to build a motor car that was green and economical, one that managed 100 plus miles per gallon (35km to the litre for our metric friends), you would not start with a Cadillac Eldorado and try to test the MPG into it.  (That is, if MPG is one of your quality exit criteria; otherwise the Eldorado is a perfect example of 1950s American flamboyant self-confidence and needs no improving.)

Yet this is often precisely the puzzle we are trying to solve in our data migration testing.  We are looking at the issue of quality from the wrong end of the project timeline.

But why is this?  Well, in part, modern procurement processes are causing an issue with data design.  I have written about this before in other contexts, but it is an issue that will continue to bedevil both procurers and suppliers of new enterprise apps until the purchasers amend their buying practices and suppliers react accordingly.

In brief, the move to fixed price contracts for new system delivery, and the premium on time to market and price, has meant that the suppliers have been forced to move data design down the timeline.  Depending on whom you employ, and the predilections of the buyer, we typically have a cascade of Discovery -> High Level Design -> Low Level Design -> Build -> Test -> Deliver as our phases.  The Low Level Design phase is a misnomer.  In the struggle to win business and keep timelines and costs down, the supplier is constrained to push the detailed work into the Build phase: deciding what particular fields, including custom fields, will be used for, and therefore the precise details of their go-live start-up values.  Of course we in the Data Migration end of things need these exact details to perform our data migration.  Instead, by the time the Build is complete, we get a cascade of data requirements with user acceptance testing, bulk load testing etc. already looming.

The temptation is to tacitly assume that we can sort it all out in the testing, and that assumption is just plain wrong.  The “Throw the data at the target and see what sticks” approach to data migration is sadly making a reappearance in the hurly burly of modern implementations.

On less well-ordered projects this cascade can also be incoherent and contradictory, and this is when the second confusion emerges...

Test the data migration not the data design

When time is pushed and the detailed data design arrives way down the timeline, it is easy to confuse genuine data migration defects (wrongly selected data, badly transformed data, incomplete data sets etc.) with data design defects (different format requirements in different parts of the solution, incorrectly designed functionality etc.).  On a well run complex programme, roles and responsibilities for outcomes are understood in advance.  In a later blog I will show some techniques for managing this, but in essence data migration testing should be about what it says on the label: testing the data migration, not the quality of the solution.
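To make the distinction concrete, here is a minimal sketch of one possible triage rule. It is not the technique promised for the later blog; the `Finding` class, its two flags and the `classify` function are all hypothetical names invented for illustration. The assumption is that there is an agreed source-to-target mapping specification: a record that breaks the specification is a migration defect, while a record that honours the specification but still fails in the target points back at the design.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One test observation about a migrated record (hypothetical structure)."""
    record_id: str
    matches_mapping_spec: bool  # was the data selected and transformed as the agreed spec says?
    accepted_by_target: bool    # did the target solution accept and handle the record?


def classify(finding: Finding) -> str:
    """Attribute a finding to the migration or to the design (assumed rule)."""
    if not finding.matches_mapping_spec:
        # Wrongly selected, badly transformed or incomplete data.
        return "data migration defect"
    if not finding.accepted_by_target:
        # The migration did what was specified, so the specification
        # or the solution itself is at fault.
        return "data design defect"
    return "no defect"


findings = [
    Finding("R1", matches_mapping_spec=False, accepted_by_target=False),
    Finding("R2", matches_mapping_spec=True, accepted_by_target=False),
    Finding("R3", matches_mapping_spec=True, accepted_by_target=True),
]
for f in findings:
    print(f.record_id, "->", classify(f))
```

The point of ordering the checks this way is that the migration is only ever judged against its own specification, never against how well the solution behaves.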

This is also true of the gaps in the solution that we in the data migration team are often the first to be aware of.  Having analysed the source data we may see that a particular object, say a working at height risk assessment, contains ten fields in the source but only six in the target.  How this is best managed I will also cover later, but for now I make the statement that it is a design issue, not a data migration issue.
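A gap of this kind is easy to surface mechanically once the mapping is written down. The sketch below assumes the mapping is held as a simple dictionary of source field to target field, with `None` marking a source field that has no home in the target. Every field name is made up for illustration and is not taken from any real system.

```python
from typing import Dict, List, Optional

# Hypothetical mapping for a working at height risk assessment:
# ten source fields, only six of which have a target field.
FIELD_MAPPING: Dict[str, Optional[str]] = {
    "assessment_id": "risk_id",
    "site_code": "site",
    "assessor_name": "assessor",
    "assessment_date": "assessed_on",
    "risk_rating": "rating",
    "control_measures": "controls",
    "working_height_m": None,
    "harness_required": None,
    "rescue_plan_ref": None,
    "weather_limits": None,
}


def unmapped_source_fields(mapping: Dict[str, Optional[str]]) -> List[str]:
    """Return the source fields with no target, sorted for a stable report."""
    return sorted(src for src, tgt in mapping.items() if tgt is None)


gaps = unmapped_source_fields(FIELD_MAPPING)
print(f"{len(gaps)} of {len(FIELD_MAPPING)} source fields have no target:")
for name in gaps:
    print(" -", name)
```

The output is a list to hand to the solution designers, not a list of migration defects.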

And one final point, if not of confusion then of distinction: I prefer to separate data migration testing from data migration reconciliation (or, as it is sometimes called, data migration auditing).

It may be that they answer the same question (did all the things I wanted moved from the source end up in the right place in the target?), but because they typically require different techniques, both in requirements analysis and in the migration build, it is less confusing if we separate them.
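As a rough illustration of what a reconciliation check looks like at its simplest, the sketch below compares the business keys extracted from the source with those found in the target. It assumes each record carries a unique key that survives the migration unchanged. The `reconcile` function and the sample keys are hypothetical, and a real reconciliation would normally also compare totals and field values.

```python
from typing import Dict, Iterable, List, Union


def reconcile(
    source_keys: Iterable[str], target_keys: Iterable[str]
) -> Dict[str, Union[int, List[str]]]:
    """Compare distinct record keys in the source and the target."""
    source, target = set(source_keys), set(target_keys)
    return {
        "source_count": len(source),  # distinct keys selected for migration
        "target_count": len(target),  # distinct keys that arrived
        "missing_from_target": sorted(source - target),
        "unexpected_in_target": sorted(target - source),
    }


report = reconcile(["A1", "A2", "A3"], ["A1", "A3", "A4"])
for name, value in report.items():
    print(f"{name}: {value}")
```

Note that the two counts match in this example even though one record is missing and another is unexpected, which is why a count comparison on its own is not enough.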

The next blog will be on reconciliation testing, followed by end-to-end, user acceptance, functional, mock load and soak tests.  Finally, when we are agreed that we have the set of data migration test types, we will look at how to manage the two confusions above.

I look forward to hearing from you, either over the normal media or in person at DMM8.