There are many variables involved in identifying and rectifying performance bottlenecks. A methodical approach is needed to determine where bottlenecks exist in order to address performance issues and optimize the integration process.
The first step in performance tuning is to identify performance bottlenecks. Carefully consider the following five areas to determine where bottlenecks exist, using a process of elimination and investigating each area in the order indicated:
The thread statistics in the session logs are a good first place to begin when identifying source, target, or mapping (transformation) bottlenecks. By default, an Integration Service uses one reader, one transformation, and one target thread to process a session. Within each session log, the following thread statistics are available:
Busy percentage = (run time - idle time) / run time x 100
By analyzing the thread statistics found in an Integration Service session log, it is possible to determine which thread is being used the most.
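As a simple illustration, the following Python sketch (using hypothetical thread statistics, not values from a real session log) shows how the busy percentage is derived from the run time and idle time reported for each thread:

# Hypothetical thread statistics (in seconds), similar to those in a session log.
thread_stats = {
    "READER_1_1_1": {"run_time": 240.0, "idle_time": 228.0},
    "TRANSF_1_1_1": {"run_time": 240.0, "idle_time": 2.0},
    "WRITER_1_1_1": {"run_time": 240.0, "idle_time": 210.0},
}

for name, s in thread_stats.items():
    busy_pct = (s["run_time"] - s["idle_time"]) / s["run_time"] * 100
    print(f"{name}: {busy_pct:.1f}% busy")

# The thread with the highest busy percentage (here the transformation
# thread) is the most likely bottleneck.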
If a transformation thread is 100 percent busy and there are additional resources (e.g., CPU cycles and memory) available on the Integration Service server, add a partition point in the segment.
If the reader or writer thread is 100 percent busy, do the following:
Attempt to isolate performance problems by running test sessions. You should be able to compare the session's original performance with the tuned session's performance.
The swap method is very useful for determining the most common bottlenecks. It involves the following five steps:
The most common performance bottleneck occurs when the Integration Service writes to a target database. This type of bottleneck can easily be identified with the following procedure:
If session performance increases significantly when writing to a flat file, you have a write bottleneck. Consider performing the following tasks to improve performance:
If the session targets a flat file, you probably do not have a write bottleneck. If the session writes to a SAN or a non-local file system, performance may be slower than writing to a local file system. If possible, optimize the session by writing to a flat file target that is local to the Integration Service machine. If the local flat file is very large, you can optimize the write process by dividing it among several physical drives.
If the SAN or non-local file system is significantly slower than the local file system, work with the appropriate network/storage group to determine if there are configuration issues within the SAN.
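To quantify the difference, a quick throughput check can be run outside of PowerCenter. The following Python sketch (the paths are placeholders for your own local directory and SAN/NFS mount point) writes a fixed amount of data to each file system and reports throughput:

import os, time

def write_throughput_mb_per_sec(path, mb=256, block_size=1024 * 1024):
    # Write `mb` megabytes to `path`, force the data to disk, and return MB/s.
    data = b"\0" * block_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(mb):
            f.write(data)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    os.remove(path)
    return mb / elapsed

# Placeholder paths; substitute your own local and SAN/non-local locations.
# print(write_throughput_mb_per_sec("/tmp/throughput.test"))
# print(write_throughput_mb_per_sec("/san_mount/throughput.test"))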
If the session reads from a relational source, you can use a filter transformation, a read test mapping, or a database query to identify source bottlenecks.
Using a Filter Transformation.
Add a filter transformation in the mapping after each source qualifier. Set the filter condition to false so that no data is processed past the filter transformation. If the time it takes to run the new session remains about the same, then you have a source bottleneck.
Using a Read Test Session.
You can create a read test mapping to identify source bottlenecks. A read test mapping isolates the read query by removing any transformation logic from the mapping. Use the following steps to create a read test mapping:
You can also identify source bottlenecks by executing a read query directly against the source database. To do so, perform the following steps:
If there is a long delay between the two time measurements, you have a source bottleneck.
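The sketch below illustrates the two measurements. It uses Python with an in-memory SQLite table purely as a stand-in; in practice you would run the source qualifier SQL from the session log against the actual source database with its own query tool:

import sqlite3, time

# Build a stand-in source table (sqlite3 substitutes for the real source database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO src VALUES (?, ?)",
                 ((i, "x" * 50) for i in range(100000)))

cur = conn.cursor()
start = time.perf_counter()
cur.execute("SELECT id, val FROM src ORDER BY id")   # the read query
cur.fetchone()                                       # time to first row
first_row_time = time.perf_counter() - start
while cur.fetchmany(10000):                          # discard the remaining rows
    pass
total_time = time.perf_counter() - start

print(f"first row after {first_row_time:.2f}s, all rows after {total_time:.2f}s")
# A long delay between the two measurements points to a source bottleneck.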
If your session reads from a relational source and is constrained by a source bottleneck, review the following suggestions for improving performance:
Flat file sources
If your session reads from a flat file source, you probably do not have a read bottleneck. Tuning the line sequential buffer length so that it is large enough to hold approximately four to eight rows of data at a time may improve performance when reading flat file sources. Also, ensure the flat file source is local to the Integration Service.
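As a rough illustration, the following sketch estimates a line sequential buffer length from the average row width of the flat file (the file path, sample size, and rows-per-buffer value are assumptions to adjust for your own source):

def suggest_line_buffer_length(path, sample_rows=1000, rows_per_buffer=6):
    # Sample the first `sample_rows` lines and size the buffer to hold
    # roughly `rows_per_buffer` rows (within the four-to-eight-row guideline).
    widths = []
    with open(path, "rb") as f:
        for _ in range(sample_rows):
            line = f.readline()
            if not line:
                break
            widths.append(len(line))
    avg_width = sum(widths) / len(widths)
    return int(avg_width * rows_per_buffer)

# e.g. suggest_line_buffer_length("/data/src/customers.dat") might return 4800,
# which you would then set as the session's line sequential buffer length.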
Mapping Bottlenecks
If you have eliminated the reading and writing of data as bottlenecks, you may have a mapping bottleneck. Use the swap method to determine if the bottleneck is in the mapping.
Begin by adding a Filter transformation in the mapping immediately before each target definition. Set the filter condition to false so that no data is loaded into the target tables. If the time it takes to run the new session is the same as the original session, you have a mapping bottleneck. You can also use the performance details to identify mapping bottlenecks: high Rowsinlookupcache and high Errorrows counters indicate mapping bottlenecks.
Follow these steps to identify mapping bottlenecks:
Create a test mapping without transformations
Check for High Rowsinlookupcache counters
Multiple lookups can slow the session. You may improve session performance by locating the largest lookup tables and tuning those lookup expressions.
Check for High Errorrows counters
Transformation errors affect session performance. If a session has large numbers in any of the Transformation_errorrows counters, you may improve performance by eliminating the errors.
Session performance details can be used to flag other problem areas. Create performance details by selecting Collect Performance Data in the session properties before running the session.
View the performance details through the Workflow Monitor as the session runs, or view the resulting file. The performance details provide counters about each source qualifier, target definition, and individual transformation within the mapping to help you understand session and mapping efficiency.
To view the performance details during the session run:
To view the resulting performance data file, look for the file session_name.perf in the same directory as the session log and open the file in any text editor.
All transformations have basic counters that indicate the number of input rows, output rows, and error rows. Source qualifiers, normalizers, and targets have additional counters indicating the efficiency of data moving into and out of buffers. Some transformations have counters specific to their functionality. When reading performance details, the first column displays the transformation name as it appears in the mapping, the second column contains the counter name, and the third column holds the resulting number or efficiency percentage.
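A small script can make it easier to scan a performance details file for counters that warrant a closer look. The sketch below assumes the three-column, whitespace-separated layout described above; the exact layout may vary by PowerCenter version, so adjust the parsing as needed:

def flag_counters(perf_file):
    # Print counters that suggest further investigation: non-zero
    # readfromdisk/writetodisk values or non-zero error row counts.
    with open(perf_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 3:
                continue
            transformation, counter, value = parts[0], parts[1], parts[-1]
            interesting = counter.lower().endswith(
                ("readfromdisk", "writetodisk", "errorrows"))
            if interesting and value.isdigit() and int(value) > 0:
                print(f"{transformation}  {counter} = {value}")

# flag_counters("s_m_load_orders.perf")   # hypothetical session name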
Low buffer input and buffer output counters
If the BufferInput_efficiency and BufferOutput_efficiency counters are low for all sources and targets, increasing the session DTM buffer pool size may improve performance.
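One commonly cited sizing rule of thumb (an assumption to verify against the performance tuning guide for your PowerCenter version, not something stated here) is to provide roughly (number of sources + number of targets) x 2 buffer blocks, with about 90 percent of the DTM buffer available for blocks. A sketch of that calculation:

def suggest_dtm_buffer_size(num_sources, num_targets, buffer_block_size=64 * 1024):
    # Rule-of-thumb estimate only; confirm against your version's tuning guide.
    blocks_needed = (num_sources + num_targets) * 2
    return int(blocks_needed * buffer_block_size / 0.9)

print(suggest_dtm_buffer_size(2, 3))   # roughly 728,000 bytes for 2 sources and 3 targets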
Aggregator, Rank, and Joiner readfromdisk and writetodisk counters
If a session contains Aggregator, Rank, or Joiner transformations, examine each Transformation_readfromdisk and Transformation_writetodisk counter. If these counters display any number other than zero, you can improve session performance by increasing the index and data cache sizes.
If the session performs incremental aggregation, the Aggregator_readfromdisk and Aggregator_writetodisk counters display a number other than zero because the Integration Service reads historical aggregate data from the local disk during the session and writes to disk when saving historical data. Evaluate these counters during the session run. If they continue to show numbers other than zero while the session runs, you can increase performance by tuning the index and data cache sizes.
After tuning the source, target, mapping, and session, you may also consider tuning the system hosting the Integration Service.
The Integration Service uses system resources to process transformations, run sessions, and read and write data. It also uses system memory for other data operations such as creating aggregator, joiner, rank, and lookup table caches.
Use system performance monitoring tools to monitor the amount of system resources the Integration Service uses and to identify system bottlenecks. The following additional changes can be made to all sessions: