Union Transformation in Informatica is an active transformation and is used for vertical merging of data from multiple sources either heterogeneous or homogeneous. This will helps us to merge data by row after row from multiple sets as from homogeneous or heterogeneous. This is as same as that of UNION ALL set operator in standard SQL i.e. It gives duplicate rows as well.
Limitations:
- We can create multiple input groups, but only one output group.
- All input groups and the output group must have matching ports. The precision, data type, and scale must be identical across all groups.
- The Union transformation does not remove duplicate rows.
- We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.
Learn how to create Union Transformation in Informatica
- Import source and target tables in to source analyzer and target designer with ODBC.
- Create mapping and drag & drop both source and target table in to the mapping workspace.
Mapping Scenario
Source | Transformation | Logic | Target |
Glb_Regions | SQ_Glb_Regions | N/A | STG_UN_REGIONS |
US_Regions | SQ_US_Regions | N/A | STG_UN_REGIONS |
Union transformation | Union all of two source tables | STG_UN_REGIONS |
- Go to transformation menu and create Union transformation.
- Now drag any one source table columns to Union.
- Double click on Union and go to Groups tab and create two groups
- Then drag second source columns to second group in Union.
- Drag all required columns to target from Union. And final mapping looks as below.
- Now create session, workflow and execute workflow.
How to test Union Transformation mapping
Code checking
Before testing or executing any mapping, we first need to check that weather we have given right links or not. And also need to check properties of transformation weather required properties defined or not.
Data Checking
Generally in ETL process what we do is we just extract data from source with Informatica nothing but a selecting data from source and we do transform data inside ETL then load to target.
we can test ETL process with a simple formula as below.
Target Data = Source data + Transformation logic inside Informatica
With the above formula we have to test two main test cases.
- No of rows from Target is equal to no of rows with source data plus Transformation logic inside Informatica.
For the above mapping, sql’s for this test case are
Select count(*) from Stg_Un_Regions.
Select count(*) from (SELECT GLB_REGIONS.POSTAL_CODE, GLB_REGIONS.COUNTRY_CODE,
GLB_REGIONS.COUNTRY_NAME, GLB_REGIONS.REGION, GLB_REGIONS.CREATION_DATE,
GLB_REGIONS.CREATED_BY, GLB_REGIONS.LAST_UPDATE_DATE,
GLB_REGIONS.LAST_UPDATED_BY FROM GLB_REGIONS
UNION ALL
SELECT US_REGIONS.POSTAL_CODE, US_REGIONS.COUNTRY_CODE, US_REGIONS.COUNTRY_NAME,
US_REGIONS.REGION, US_REGIONS.CREATION_DATE, US_REGIONS.CREATED_BY,
US_REGIONS.LAST_UPDATE_DATE, US_REGIONS.LAST_UPDATED_BY FROM US_REGIONS)
Now fire source query on source database and see count. In same way fire target query on target database and see the count then compare source count and target count.
- Data should match in target with source and Transformation logic inside Informatica.
SELECT GLB_REGIONS.POSTAL_CODE, GLB_REGIONS.COUNTRY_CODE, GLB_REGIONS.COUNTRY_NAME,
GLB_REGIONS.REGION, GLB_REGIONS.CREATION_DATE, GLB_REGIONS.CREATED_BY,
GLB_REGIONS.LAST_UPDATE_DATE, GLB_REGIONS.LAST_UPDATED_BY FROM GLB_REGIONS
UNION ALL
SELECT US_REGIONS.POSTAL_CODE, US_REGIONS.COUNTRY_CODE, US_REGIONS.COUNTRY_NAME,
US_REGIONS.REGION, US_REGIONS.CREATION_DATE, US_REGIONS.CREATED_BY,
US_REGIONS.LAST_UPDATE_DATE, US_REGIONS.LAST_UPDATED_BY FROM US_REGIONS;
SELECT POSTAL_CODE,COUNTRY_CODE,COUNTRY_NAME,REGION,CREATION_DATE,
CREATED_BY,LAST_UPDATE_DATE,LAST_UPDATED_BY FROM STG_UN_REGIONS;
If Possible try to test whole table data.
- If the source and target systems lies in same database under different users then you can use MINUS statement between source sql and target table to verify total records data is matching or not.
For example
SELECT GLB_REGIONS.POSTAL_CODE, GLB_REGIONS.COUNTRY_CODE,
GLB_REGIONS.COUNTRY_NAME, GLB_REGIONS.REGION, GLB_REGIONS.CREATION_DATE,
GLB_REGIONS.CREATED_BY, GLB_REGIONS.LAST_UPDATE_DATE,
GLB_REGIONS.LAST_UPDATED_BY FROM READONLY.GLB_REGIONS
UNION ALL
SELECT US_REGIONS.POSTAL_CODE, US_REGIONS.COUNTRY_CODE, US_REGIONS.COUNTRY_NAME,
US_REGIONS.REGION, US_REGIONS.CREATION_DATE, US_REGIONS.CREATED_BY, US_REGIONS.LAST_UPDATE_DATE,
US_REGIONS.LAST_UPDATED_BY FROM READONLY.US_REGIONS
MINUS
SELECT POSTAL_CODE,COUNTRY_CODE,COUNTRY_NAME,REGION,CREATION_DATE,
CREATED_BY,LAST_UPDATE_DATE,LAST_UPDATED_BY FROM STAGEDB.STG_UN_REGIONS;