Union Transformation in Informatica is an active transformation and is used for vertical merging of data from multiple sources either heterogeneous or homogeneous. This will helps us to merge data by row after row from multiple sets as from homogeneous or heterogeneous. This is as same as that of UNION ALL set operator in standard SQL i.e. It gives duplicate rows as well.

Limitations:

  1. We can create multiple input groups, but only one output group.
  2. All input groups and the output group must have matching ports. The precision, data type, and scale must be identical across all groups.
  3. The Union transformation does not remove duplicate rows.
  4. We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.

Learn how to create Union Transformation in Informatica

  • Import source and target tables in to source analyzer and target designer with ODBC.
  • Create mapping and drag & drop both source and target table in to the mapping workspace.

Mapping Scenario

SourceTransformationLogicTarget
Glb_RegionsSQ_Glb_RegionsN/ASTG_UN_REGIONS
US_RegionsSQ_US_RegionsN/ASTG_UN_REGIONS
Union transformationUnion all of two source tablesSTG_UN_REGIONS
  • Go to transformation menu and create Union transformation.
  • Now drag any one source table columns to Union.
Union Transformation in Informatica 1
  • Double click on Union and go to Groups tab and create two groups
Union Transformation in Informatica 1
  • Then drag second source columns to second group in Union.
Union Transformation in Informatica 1
  • Drag all required columns to target from Union. And final mapping looks as below.
Union Transformation in Informatica 1
  • Now create session, workflow and execute workflow.

How to test Union Transformation mapping

Code checking

Before testing or executing any mapping, we first need to check that weather we have given right links or not. And also need to check properties of transformation weather required properties defined or not.

Data Checking

Generally in ETL process what we do is we just extract data from source with Informatica nothing but a selecting data from source and we do transform data inside ETL then load to target.

we can test ETL process with a simple formula as below.

Target Data = Source data + Transformation logic inside Informatica

With the above formula we have to test two main test cases.

  • No of rows from Target is equal to no of rows with source data plus Transformation logic inside Informatica.

For the above mapping, sql’s for this test case are

</>
Copy
Select count(*) from  Stg_Un_Regions.
</>
Copy
Select count(*) from (SELECT GLB_REGIONS.POSTAL_CODE, GLB_REGIONS.COUNTRY_CODE, 
GLB_REGIONS.COUNTRY_NAME, GLB_REGIONS.REGION, GLB_REGIONS.CREATION_DATE, 
GLB_REGIONS.CREATED_BY, GLB_REGIONS.LAST_UPDATE_DATE, 
GLB_REGIONS.LAST_UPDATED_BY FROM GLB_REGIONS

UNION ALL

</>
Copy
SELECT US_REGIONS.POSTAL_CODE, US_REGIONS.COUNTRY_CODE, US_REGIONS.COUNTRY_NAME, 
US_REGIONS.REGION, US_REGIONS.CREATION_DATE, US_REGIONS.CREATED_BY,
 US_REGIONS.LAST_UPDATE_DATE, US_REGIONS.LAST_UPDATED_BY FROM US_REGIONS)

Now fire source query on source database and see count. In same way fire target query on target database and see the count then compare source count and target count.

  • Data should match in target with source and Transformation logic inside Informatica.
</>
Copy
SELECT GLB_REGIONS.POSTAL_CODE, GLB_REGIONS.COUNTRY_CODE, GLB_REGIONS.COUNTRY_NAME,
 GLB_REGIONS.REGION, GLB_REGIONS.CREATION_DATE, GLB_REGIONS.CREATED_BY,
 GLB_REGIONS.LAST_UPDATE_DATE, GLB_REGIONS.LAST_UPDATED_BY FROM GLB_REGIONS

UNION ALL

</>
Copy
SELECT US_REGIONS.POSTAL_CODE, US_REGIONS.COUNTRY_CODE, US_REGIONS.COUNTRY_NAME, 
US_REGIONS.REGION, US_REGIONS.CREATION_DATE, US_REGIONS.CREATED_BY, 
US_REGIONS.LAST_UPDATE_DATE, US_REGIONS.LAST_UPDATED_BY FROM US_REGIONS;
</>
Copy
SELECT POSTAL_CODE,COUNTRY_CODE,COUNTRY_NAME,REGION,CREATION_DATE,
CREATED_BY,LAST_UPDATE_DATE,LAST_UPDATED_BY FROM STG_UN_REGIONS;

If Possible try to test whole table data.

  • If the source and target systems lies in same database under different users then you can use MINUS statement between source sql and target table to verify total records data is matching or not.

For example

</>
Copy
SELECT GLB_REGIONS.POSTAL_CODE, GLB_REGIONS.COUNTRY_CODE, 
GLB_REGIONS.COUNTRY_NAME, GLB_REGIONS.REGION, GLB_REGIONS.CREATION_DATE,
GLB_REGIONS.CREATED_BY, GLB_REGIONS.LAST_UPDATE_DATE, 
GLB_REGIONS.LAST_UPDATED_BY FROM READONLY.GLB_REGIONS

UNION ALL

</>
Copy
SELECT US_REGIONS.POSTAL_CODE, US_REGIONS.COUNTRY_CODE, US_REGIONS.COUNTRY_NAME,
 US_REGIONS.REGION, US_REGIONS.CREATION_DATE, US_REGIONS.CREATED_BY, US_REGIONS.LAST_UPDATE_DATE,
 US_REGIONS.LAST_UPDATED_BY FROM READONLY.US_REGIONS

MINUS

</>
Copy
SELECT POSTAL_CODE,COUNTRY_CODE,COUNTRY_NAME,REGION,CREATION_DATE,
CREATED_BY,LAST_UPDATE_DATE,LAST_UPDATED_BY FROM STAGEDB.STG_UN_REGIONS;