From an interview perspective, you should know and understand the following well-established design patterns and guidelines, which make data migration smoother, faster and more accurate, with the fewest surprises during go-live.
- Develop with Production Data: Use real data from the production system for tests during the development of the migration code.
- Migrate along Domain Partitions: Divide and conquer the migration effort by migrating largely independent parts of the domain model one after another.
- Measure Migration Quality: Implement code that collects and stores all sorts of information about the outcome of the migration during every run.
- Periodic Quality Reports: Generate detailed reports about the measured quality of the migrated data and make them available to all affected stakeholders.
- Robust Processing: To keep the migration process from halting on an unexpected failure, apply extensive exception handling to cope with all kinds of problematic input data.
- Data Cleansing: To prevent the new application from being swamped with useless data right from the start, enhance your transformation processes with data cleansing mechanisms.
- Incremental Transformation: Perform an initial data migration before the new application goes live. Then, immediately before the new application is launched, migrate only the data that has changed since that initial run.
- During migration (extraction and load), the processes should log the following items to facilitate the monitoring, debugging and verification process:
  - Global and per-entity start time
  - Global and per-entity end time
  - Number of entities to process
  - Entity
  - Source table(s) and schema
  - Destination schema
  - Number of records processed
  - Errors/Warnings encountered
- Sensitive data attributes should be anonymized before they are processed for testing purposes, and in particular before they reach the “load” systems. The anonymization process should use well-defined values that the test team is aware of. The following data elements should be anonymized:
  - Phone numbers
  - Passwords
  - IP addresses
- Build automated tools for analyzing the logs generated during migration. Logging and analyzing the logs has significant advantages:
  - Potential migration issues are discovered earlier and can be fixed sooner, thereby reducing cost and effort
  - Data inconsistency issues can be fixed sooner
  - Facilitates the coordination between factories and provides a common language for analyzing and debugging
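
To make the logging, robust processing and anonymization guidelines above more concrete, here is a minimal Python sketch of a per-entity migration loop. Everything in it is an assumption made for illustration: the names `migrate_entity`, `EntityStats` and `anonymize`, the field names `phone`, `password` and `ip_address`, and the placeholder values are all hypothetical and would need to be adapted to your own schema and tooling.

```python
import logging
import time
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("migration")

# Hypothetical, well-defined placeholder values that a test team can recognize.
ANONYMIZED = {
    "phone": "000-000-0000",
    "password": "Test-Password-1",
    "ip_address": "192.0.2.1",  # address from a range reserved for documentation
}


@dataclass
class EntityStats:
    """Per-entity run metrics, mirroring the logged items listed above."""
    entity: str
    source: str
    destination: str
    to_process: int = 0
    processed: int = 0
    errors: list = field(default_factory=list)
    started: float = 0.0
    ended: float = 0.0


def anonymize(record):
    """Return a copy of the record with sensitive fields replaced."""
    clean = dict(record)
    for key, value in ANONYMIZED.items():
        if key in clean:
            clean[key] = value
    return clean


def migrate_entity(entity, source, destination, records, load):
    """Migrate one entity; a bad record is logged and skipped, not fatal."""
    stats = EntityStats(entity, source, destination, to_process=len(records))
    stats.started = time.time()
    log.info("start %s: %s -> %s (%d records)",
             entity, source, destination, stats.to_process)
    for index, record in enumerate(records):
        try:
            load(anonymize(record))
            stats.processed += 1
        except Exception as exc:  # deliberately broad so the run keeps going
            stats.errors.append((index, repr(exc)))
            log.warning("%s record %d failed: %r", entity, index, exc)
    stats.ended = time.time()
    log.info("end %s: %d/%d processed, %d errors",
             entity, stats.processed, stats.to_process, len(stats.errors))
    return stats
```

The `load` argument is whatever callable writes one record to the destination. Because `migrate_entity` returns the collected `EntityStats` instead of only logging them, the same numbers can be stored after every run and fed into the periodic quality reports.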
What other strategies have you used? What has worked well for you in the past? What did not go well? The readers and I would love to hear your thoughts and experiences.

Thank you. This is great information. I am going through a data migration project and this helped a lot.