Last Updated: Jun 26, 2021

Challenge

The use of B2B Data Transformation to process documents in any format can be optimized through an advanced technique called chaining. Data Transformation chaining involves the use of more than one Data Transformation component (Parser, Mapper, Serializer, etc.), or of one or more Data Transformation services, to achieve the desired objective.

Description

The Data Transformation chaining philosophy revolves around the use of one or more small logical components, each of which accomplishes a specific task, to achieve a more complex objective. This methodology encourages building reusable components and breaking work down into smaller chunks to enable easier development and maintenance.

As an example, organizations receive data from different sources and store it in warehouses to build a complete picture of a specific business function. Each source system may supply dates in its own format, so dates from the various systems must be converted to a standard format before they are stored in the warehouse. The same issue recurs across departments and projects. Using Data Transformation, an organization can build a reusable date-processing component that accepts a date in any format, analyzes that format, and converts the date into a standard format. Different processes can then use the reusable component in their projects (Parser, Mapper, or Serializer projects) and ensure consistency in the date formats that go into the warehouse. The process of using multiple Data Transformation segments in a project to achieve a complete file transformation is called Data Transformation Chaining.
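
As a conceptual illustration, a reusable date-normalization component might look like the following sketch. This is plain Python rather than Informatica IntelliScript, and all names and formats in it are hypothetical:

    from datetime import datetime

    # Candidate input formats the component understands (illustrative list).
    KNOWN_FORMATS = ["%m/%d/%Y", "%d-%b-%y", "%Y%m%d", "%d.%m.%Y"]

    def normalize_date(raw: str, target: str = "%Y-%m-%d") -> str:
        """Accept a date in any known format; return it in the standard format."""
        for fmt in KNOWN_FORMATS:
            try:
                return datetime.strptime(raw.strip(), fmt).strftime(target)
            except ValueError:
                continue  # not this format, try the next one
        raise ValueError(f"Unrecognized date format: {raw!r}")

    # Any parser, mapper, or serializer project can reuse the same component:
    print(normalize_date("07/04/2020"))  # -> 2020-07-04
    print(normalize_date("04-Jul-20"))   # -> 2020-07-04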

Benefits

The advantages of Data Transformation Chaining are summarized below:

  1. Chaining allows application logic to reside in separate segments (Parser, Mapper, etc.).
  2. Application logic can be broken into logical service objects, allowing reusable components to be built once and used many times.
  3. Performance improves because the number of steps required to convert a document can be reduced significantly.
  4. Maximum processing can be done at the Data Transformation level, so PowerCenter can be used primarily as an execution platform.
  5. Chaining allows a set of reusable components to be bundled and exposed to other applications. An application written in C# or Java can invoke the same service and obtain the same result, enabling sharing across development environments.
  6. Chaining Data Transformation services or components reduces interaction with the invoking application (such as PowerCenter), and code maintenance at the invocation layer is kept to a minimum.
  7. Run-time memory utilization can be reduced significantly, because the output of one Data Transformation service is passed directly as input to the next rather than returned to PowerCenter as a buffer.

There is also no need to worry about intermediate port sizes at the PowerCenter level, as Data Transformation handles any string-length limitations internally; only the input and output port sizes need to be defined.

Process

Chaining in a Data Transformation project is achieved through the use of any of the following actions (a conceptual sketch of how they compose follows the list):

  1. RunParser: Executes another parser within the same project. This allows a completely new source, or the output of an existing parser component, to be parsed, and the result to be used back in the main parser; for example, to create a dynamic lookup object from a completely different input file at run time.
  2. RunMapper: Executes another mapper within the same project. This allows an XML document to be converted from one format to another on the fly, typically when the XML documents have different grouping and ordering requirements.
  3. RunSerializer: Executes another serializer within the same project. This allows files in different formats to be created as part of the main component's processing, while the main component continues further processing.
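
The following sketch models the three actions as plain Python functions. It is a conceptual analogy only, not Informatica IntelliScript; every name and file path in it is hypothetical:

    # Each DT component is modeled as a plain function; the three Run*
    # actions then amount to ordinary calls made from the main component.

    def secondary_parser(lookup_path: str) -> dict:
        """Stand-in for RunParser: parses a completely different input
        file, here building a dynamic lookup table at run time."""
        with open(lookup_path) as f:
            # Expects "key,value" lines in the hypothetical lookup file.
            return dict(line.strip().split(",", 1) for line in f if line.strip())

    def secondary_mapper(doc: dict) -> dict:
        """Stand-in for RunMapper: regroups/reorders an intermediate
        document on the fly."""
        return dict(sorted(doc.items()))

    def secondary_serializer(doc: dict, out_path: str) -> None:
        """Stand-in for RunSerializer: writes a side output while the
        main component keeps running."""
        with open(out_path, "w") as f:
            f.write("\n".join(f"{k}={v}" for k, v in doc.items()))

    def main_parser(source: dict, lookup_path: str) -> dict:
        lookup = secondary_parser(lookup_path)       # RunParser
        enriched = {k: lookup.get(v, v) for k, v in source.items()}
        reshaped = secondary_mapper(enriched)        # RunMapper
        secondary_serializer(reshaped, "audit.txt")  # RunSerializer
        return reshaped                              # main parser continues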

Scope

The scope of data for parsing, mapping, or serialization is the data made available by the main component (parser, mapper, or serializer) to the secondary component being invoked. Scope can be categorized as follows:

  1. Implicit Scope: The secondary component uses the data holders available in the scope of the action. For example, if the action is placed within a Group, it runs on the output of that Group, as sketched below.
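
A conceptual Python analogy of this configuration (not IntelliScript; the names mirror those used in the explanation that follows):

    # Implicit scope: a RunSerializer action placed inside the repeating
    # group "MyRepeatGroup" implicitly receives that group's output (the
    # "$Inputs" data holder) as its source.

    def secondary_serializer(inputs: dict) -> str:
        """Stand-in for the 'Secondary Serializer'."""
        return ";".join(f"{k}={v}" for k, v in inputs.items())

    my_repeat_group = [                 # one entry per occurrence of the group
        {"id": "1", "date": "2020-07-04"},
        {"id": "2", "date": "2020-08-15"},
    ]

    for group_output in my_repeat_group:
        # The source is implicit: the output of the enclosing group.
        print(secondary_serializer(group_output))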

In the sketch above, the input for the “Secondary Serializer” is implicit. The source value in this case is the output of the repeating group “MyRepeatGroup”; hence the value stored in the complex variable “$Inputs” serves as the source. The secondary serializer is invoked repeatedly, once for each occurrence of the repeating group.

The same implicit behavior applies at the project level. If a “Full Serializer” is invoked at the top level of “MainParser” rather than inside a group, its implicit input is the output of “MainParser” itself: the whole XML content produced by the parser is fed as the source document for the serializer, as sketched below.
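
A conceptual Python analogy of this top-level case (again, not IntelliScript; all names are illustrative):

    # Implicit scope at the top level: a RunSerializer placed at the end
    # of "MainParser" implicitly receives the parser's entire output.

    def main_parser(raw: str) -> str:
        """Stand-in for 'MainParser': produces an XML-like document."""
        records = "".join(f"<rec>{r}</rec>" for r in raw.split())
        return f"<root>{records}</root>"

    def full_serializer(document: str) -> str:
        """Stand-in for 'Full Serializer': consumes the whole document."""
        return document.replace("><", ">\n<")

    parser_output = main_parser("a b c")
    print(full_serializer(parser_output))   # source = entire MainParser output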

  2. Explicit Scope: The secondary component provides an explicit reference to a data holder, which serves as the source for the secondary component. This method is generally used for parsing additional input files that are cross-referenced during the different stages of transforming a file from one format to another, as sketched below.
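
A conceptual Python analogy of explicit scope (not IntelliScript; the names mirror those in the explanation that follows):

    # Explicit scope: the RunParser action names an explicit data holder
    # ("v_Temp") whose contents serve as the source for "My_Parser".

    def my_parser(source: str) -> list:
        """Stand-in for 'My_Parser': parses whatever source it is handed."""
        return source.split(",")

    v_Temp = "red,green,blue"      # data holder filled earlier in the main flow

    parsed = my_parser(v_Temp)     # the variable is named explicitly as source
    print(parsed)                  # ['red', 'green', 'blue']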

In the sketch above, the “RunParser” action invokes “My_Parser” and uses the value contained in the variable “v_Temp” as its source.

    Note: A similar approach applies to the “RunMapper” and “RunSerializer” actions as well.

Best Practices for Data Transformation Chaining

  1. Informatica recommends using implicit data sources when invoking a secondary component. This reduces memory usage, because a single in-memory data holder is used. Copying the data to a variable and then referencing it explicitly duplicates the data unnecessarily.
  2. When parsing an additional input port, it is advisable to pass the data by reference to the invoked component and to invoke the secondary parser only at the point where the data from the secondary input port is actually required.
  3. Limit the number of invocations of the secondary component, thereby reducing the number of I/O operations.
  4. Clear variables once they are no longer needed, to reduce in-memory storage.
