Four steps to transforming data

Learn the four essential steps to transform data effectively for better aggregation, matching, and reporting. Explore how data interpretation, quality checks, translation, and post-translation validation enhance data management. Unlock the value of your enterprise information asset with data transformation.


Another brief post this week on an area that we do not focus on very often: data transformation.

Data transformation is a relatively mundane yet fundamental data management capability – particularly when dealing with similar data from multiple sources.

Three simple examples of data transformation:

  • System A represents Male and Female as 0 and 1, while System B represents Male and Female as M and F
  • System A stores dates in the US format (MM/DD/YYYY) whilst System B uses the South African format (DD/MM/YYYY)
  • System A uses a comma as the decimal separator (123,33) while System B uses a point (123.33)

In each case, we need to transform the data into a common format for aggregating, matching and reporting.
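The three examples above can be sketched in a few lines of Python. This is a minimal illustration rather than a production approach: the function names, the mapping tables and the choice of "Male"/"Female", ISO dates and a point as the common format are all our own assumptions, as is reading 0 as Male and 1 as Female.

```python
from datetime import datetime
from decimal import Decimal

# Assumed mappings: the target labels and the 0 = Male, 1 = Female reading
# are illustrative choices, not something either system guarantees.
GENDER_MAP_A = {"0": "Male", "1": "Female"}
GENDER_MAP_B = {"M": "Male", "F": "Female"}


def normalise_gender(value, mapping):
    """Translate a system-specific gender code to the common label."""
    return mapping[value.strip().upper()]


def normalise_date(value, source_format):
    """Parse a date in the source system's format and return ISO 8601 text."""
    return datetime.strptime(value, source_format).date().isoformat()


def normalise_amount(value, decimal_separator):
    """Convert an amount written with the given decimal separator to Decimal."""
    return Decimal(value.strip().replace(decimal_separator, "."))


# System A: 0/1 codes, MM/DD/YYYY dates, comma as decimal separator
record_a = (
    normalise_gender("0", GENDER_MAP_A),
    normalise_date("03/04/2024", "%m/%d/%Y"),
    normalise_amount("123,33", ","),
)

# System B: M/F codes, DD/MM/YYYY dates, point as decimal separator
record_b = (
    normalise_gender("M", GENDER_MAP_B),
    normalise_date("04/03/2024", "%d/%m/%Y"),
    normalise_amount("123.33", "."),
)
```

Once both records are in the common format they compare as equal, which is what makes matching and aggregation across the two systems possible.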

In a recent blog post, our partner Precisely outlines the four steps inherent in any data transformation process. Precisely’s ETL (Extract, Transform and Load) and CDC (Change Data Capture) capabilities are rated particularly highly by analysts for their ability to transform data from diverse formats, for example when moving data from the mainframe to relational databases and Hadoop.

The four steps to data transformation

Step one: Data interpretation

Before we can begin any transformation process, we need to understand what our data looks like. This is most efficiently done using data profiling tools.
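To give a feel for what profiling produces, here is a rough sketch in Python. It is not how any particular profiling tool works; the function names and the pattern notation (9 for a digit, A for a letter) are assumptions made for illustration.

```python
from collections import Counter


def value_pattern(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    shape = []
    for ch in value:
        if ch.isdigit():
            shape.append("9")
        elif ch.isalpha():
            shape.append("A")
        else:
            shape.append(ch)
    return "".join(shape)


def profile_column(values):
    """Summarise a column: row count, missing values, distinct values, patterns."""
    non_empty = [v for v in values if v is not None and v.strip() != ""]
    return {
        "count": len(values),
        "missing": len(values) - len(non_empty),
        "distinct": len(set(non_empty)),
        "patterns": Counter(value_pattern(v) for v in non_empty),
    }


# A made-up amount column mixing comma and point decimal separators
profile = profile_column(["123,33", "45.10", "", None, "7,5"])
```

A pattern count like this quickly shows that a column mixes comma and point separators, which tells us what the translation step will have to handle.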

Step two: Pre-translation quality check

At this step, we validate data quality against known requirements and standards to identify any unexpected errors or issues that we must address before translating the data.
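A pre-translation check can be as simple as a set of rules applied to each row. The sketch below assumes hypothetical field names ("gender" and "date") and two example rules; real requirements would come from your own standards.

```python
from datetime import datetime


def check_row(row, allowed_genders, date_format):
    """Return a list of issues found in one source row (empty if none)."""
    issues = []
    if row.get("gender") not in allowed_genders:
        issues.append("gender: unexpected value %r" % row.get("gender"))
    try:
        # strptime raises ValueError for empty or wrongly formatted dates
        datetime.strptime(row.get("date") or "", date_format)
    except ValueError:
        issues.append("date: %r does not match %s" % (row.get("date"), date_format))
    return issues


# Made-up rows as they might arrive from a system using 0/1 and MM/DD/YYYY
rows = [
    {"gender": "0", "date": "03/04/2024"},
    {"gender": "X", "date": "13/25/2024"},
]
report = [check_row(r, {"0", "1"}, "%m/%d/%Y") for r in rows]
```

Rows with a non-empty issue list can be fixed or set aside before the translation step runs.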

Step three: Data translation

This is the physical process of changing data into the required format.

An interesting observation by Precisely is that we may change not only the content of a data set but also its format, e.g. converting a CSV file to a modern XML format.
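As an illustration of a format change, the following sketch converts CSV text to XML using only the Python standard library. The tag names "records" and "record" are our own choices, and the sketch assumes every CSV column header is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Convert CSV text with a header row into an XML string."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Assumes the column header is usable as an element name
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = csv_to_xml("id,gender\n1,M\n2,F\n")
```

Each CSV row becomes one record element, with one child element per column.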

Step four: Post-translation quality check

We repeat step two after running our translation to measure the improvement and test the outcome.
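One simple way to measure the improvement is to apply the same target rule to the data before and after translation and compare failure rates. The sketch below assumes ISO dates (YYYY-MM-DD) as the target format; the helper names and sample values are hypothetical.

```python
from datetime import datetime


def is_iso_date(value):
    """True if the value is a valid date written as YYYY-MM-DD."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def failure_rate(values, is_valid):
    """Share of values that fail the given rule (0.0 for an empty list)."""
    if not values:
        return 0.0
    failures = sum(1 for v in values if not is_valid(v))
    return failures / len(values)


# Made-up sample: the same two dates before and after translation
before = ["03/04/2024", "2024-03-05"]
after = ["2024-03-04", "2024-03-05"]

rate_before = failure_rate(before, is_iso_date)
rate_after = failure_rate(after, is_iso_date)
```

A falling failure rate shows the translation worked; any failures that remain point to values the translation rules did not cover.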

By planning for and following these four steps, you should quickly begin to improve the usefulness of your aggregated data.
