Data integration defined

Data integration is a common industry term referring to the requirement to combine data from multiple separate business systems into a single unified view, often called a single view of the truth. This unified view is typically stored in a central data repository known as a data warehouse.
For example, customer data integration involves the extraction of information about each individual customer from disparate business systems such as sales, accounts, and marketing, which is then combined into a single view of the customer to be used for customer service, reporting and analysis.
Are you ready to boost your business intelligence capabilities? Our guide on Business Intelligence Tools can help you get there.
ETL and ELT
There are a number of different data integration approaches which can be used to populate a data warehouse, two of the most common are:
- ETL – Extract, Transform and Load. Data is extracted from multiple source systems, transformed into a usable format, and then loaded into a data warehouse where users can query and report on the unified view of the data.
- ELT – Extract, Load and Transform. Data is extracted from source systems, loaded into a database staging area, and then transformed into a usable format within the database. This approach may be necessary to meet check-pointing, auditing, and security requirements.

The trade-off with ELT is that storage space and CPU usage within the database can be relatively expensive. Using ETL to perform the transformation step outside of the database reduces the storage and CPU load on the data warehouse, making ETL a more cost-effective approach for many applications.
Data Integration Criteria

Three major data integration criteria to consider when building a data warehouse are:
- Freshness – Since data integration processes are executed periodically, data freshness refers to the delay between when a change occurs on a source system to when the change appears in the data warehouse.
- Granularity – Data from source systems is often summarised or aggregated during the data integration process. Data granularity refers to the level of detail that will be stored in the data warehouse.
- Matching – The business rules that define how data from different systems should be matched can be quite complex. For a simple example – is “Bill Smith” the same individual as “William Smith”?
These criteria will affect the performance and storage requirements for the data warehouse, and hence the cost of the hardware and software required for the project.
—
Big data and business intelligence are changing the way businesses operate, allowing them to gain valuable insights into their customers, markets, and operations. Learn more about choosing the right tools in our guide
BI is increasingly dependent on streaming data flows, but these new(ish) capabilities are not without their challenges. We share 5 critical success factors for streaming data pipelines.

Leave a comment