• Success
    Manage your Success Plans and Engagements, gain key insights into your implementation journey, and collaborate with your CSMs
    Success
    Accelerate your Purchase to Value engaging with Informatica Architects for Customer Success
  • Communities
    A collaborative platform to connect and grow with like-minded Informaticans across the globe
    Communities
    Connect and collaborate with Informatica experts and champions
    Have a question? Start a Discussion and get immediate answers you are looking for
    Customer-organized groups that meet online and in-person. Join today to network, share ideas, and get tips on how to get the most out of Informatica
  • Knowledge Center
    Troubleshooting documents, product guides, how to videos, best practices, and more
    Knowledge Center
    One-stop self-service portal for solutions, FAQs, Whitepapers, How Tos, Videos, and more
    Video channel for step-by-step instructions to use our products, best practices, troubleshooting tips, and much more
    Information library of the latest product documents
    Best practices and use cases from the Implementation team
  • Learn
    Rich resources to help you leverage full capabilities of our products
    Learn
    Role-based training programs for the best ROI
    Get certified on Informatica products. Free, Foundation, or Professional
    Free and unlimited modules based on your expertise level and journey
    Self-guided, intuitive experience platform for outcome-focused product capabilities and use cases
  • Resources
    Library of content to help you leverage the best of Informatica products
    Resources
    Most popular webinars on product architecture, best practices, and more
    Product Availability Matrix statements of Informatica products
    Monthly support newsletter
    Informatica Support Guide and Statements, Quick Start Guides, and Cloud Product Description Schedule
    End of Life statements of Informatica products
Last Updated Date Jan 26, 2021 |

Challenge

There are many variables involved in identifying and rectifying performance bottlenecks. A methodical approach is needed to determine where bottlenecks exist in order to address performance issues and optimize the integration process.

Description

The first step in performance tuning is to identify performance bottlenecks. Carefully consider the following five areas to determine where bottlenecks exist; using a process of elimination, investigating each area in the order indicated: 

  1. Target
  2. Source
  3. Mapping
  4. Session
  5. System

Best Practice Considerations

Use Thread Statistics to Identify Target, Source, and Mapping Bottlenecks

The thread statistics in the session logs are a good first place to begin when identifying source, target, or mapping (transformation) bottlenecks. By default, an Integration Service uses one reader, one transformation, and one target thread to process a session. Within each session log, the following thread statistics are available:

  • Run time Amount of time the thread was running
  • Idle time Amount of time the thread was idle due to other threads within application or Integration Service. This value does not include time the thread is blocked due to the operating system.
  • Busy Percentage of the overall run time the thread is not idle. This percentage is calculated using the following formula:

(run time idle time) / run time x 100

By analyzing the thread statistics found in an Integration Service session log, it is possible to determine which thread is being used the most.

If a transformation thread is 100 percent busy and there are additional resources (e.g., CPU cycles and memory) available on the Integration Service server, add a partition point in the segment.

If the reader or writer thread is 100 percent busy, do the following:

  1. Determine if there is an SQL Override Query in the Source Qualifier transformation in the ETL mapping. If so, confirm the amount of time it is taking for the SQL query to complete. Performance tune the SQL query by looking at the table size, joins and possible aggregations being performed.
  2. Perform analysis of the operating system statistics, such as CPU Statistics, Virtual Memory Statistics, Disk Statistics and Network Statistics to help ascertain the cause of a negative impact on the target system.
  3. Analyze the latency on the network to further determine the negative impact on table updates/deletes being performed by the ETL mapping

Use the Swap Method to Test Changes in Isolation

Attempt to isolate performance problems by running test sessions. You should be able to compare the sessions original performance with that of tuned sessions performance.

The swap method is very useful for determining the most common bottlenecks. It involves the following five steps:

  1. Make a temporary copy of the mapping, session and/or workflow that is to be tuned, then tune the copy before making changes to the original.
  2. Implement only one change at a time and test for any performance improvements to gauge which tuning methods work most effectively in the environment.
  3. Document the change made to the mapping, session and/or workflow and the performance metrics achieved as a result of the change. The actual execution time may be used as a performance metric.
  4. Delete the temporary mapping, session and/or workflow upon completion of performance tuning.
  5. Make appropriate tuning changes to mappings, sessions and/or workflows.

Evaluating the Five Areas of Consideration

Target Bottlenecks

Relational Targets

The most common performance bottleneck occurs when the Integration Service writes to a target database. This type of bottleneck can easily be identified with the following procedure:

  1. Make a copy of the original workflow
  2. Configure the session in the test workflow to write to a flat file and run the session.
  3. Read the thread statistics in session log

If session performance increases significantly when writing to a flat file, you have a write bottleneck. Consider performing the following tasks to improve performance:

  • Drop indexes and key constraints
  • Increase checkpoint intervals
  • Use bulk loading
  • Use external loading
  • Minimize deadlocks
  • Increase database network packet size
  • Optimize target databases

Flat file targets

If the session targets a flat file, you probably do not have a write bottleneck. If the session is writing to a SAN or a non-local file system, performance may be slower than writing to a local file system. If possible, a session can be optimized by writing to a flat file target local to the Integration Service. If the local flat file is very large, you can optimize the write process by dividing it among several physical drives.

If the SAN or non-local file system is significantly slower than the local file system, work with the appropriate network/storage group to determine if there are configuration issues within the SAN.

Source Bottlenecks

Relational sources

If the session reads from a relational source, you can use a filter transformation, a read test mapping, or a database query to identify source bottlenecks.

Using a Filter Transformation.

Add a filter transformation in the mapping after each source qualifier. Set the filter condition to false so that no data is processed past the filter transformation. If the time it takes to run the new session remains about the same, then you have a source bottleneck.

Using a Read Test Session.

You can create a read test mapping to identify source bottlenecks. A read test mapping isolates the read query by removing any transformation logic from the mapping. Use the following steps to create a read test mapping:

  1. Make a copy of the original mapping.
  2. In the copied mapping, retain only the sources, source qualifiers, and any custom joins or queries.
  3. Remove all transformations.
  4. Connect the source qualifiers to a file target.
Use the read test mapping in a test session. If the test session performance is similar to the original session, you have a source bottleneck.
 
Using a Database Query

You can also identify source bottlenecks by executing a read query directly against the source database. To do so, perform the following steps:

  • Copy the read query directly from the session log.
  • Run the query against the source database with a query tool such as SQL Plus.
  • Measure the query execution time and the time it takes for the query to return the first row.

If there is a long delay between the two-time measurements, you have a source bottleneck.

If your session reads from a relational source and is constrained by a source bottleneck, review the following suggestions for improving performance:

  • Optimize the query.
  • Create tempdb as in-memory database.
  • Use conditional filters.
  • Increase database network packet size.
  • Connect to Oracle databases using IPC protocol.

Flat file sources

If your session reads from a flat file source, you probably do not have a read bottleneck. Tuning the line sequential buffer length to a size large enough to hold approximately four to eight rows of data at a time (for flat files) may improve performance when reading flat file sources. Also, ensure the flat file source is local to the Integration Service.

Mapping Bottlenecks

If you have eliminated the reading and writing of data as bottlenecks, you may have a mapping bottleneck. Use the swap method to determine if the bottleneck is in the mapping.

Begin by adding a Filter transformation in the mapping immediately before each target definition. Set the filter condition to false so that no data is loaded into the target tables. If the time it takes to run the new session is the same as the original session, you have a mapping bottleneck. You can also use the performance details to identify mapping bottlenecks: high Rowsinlookupcache and High Errorrows counters indicate mapping bottlenecks.

Follow these steps to identify mapping bottlenecks:

Create a test mapping without transformations

  1. Make a copy of the original mapping.
  2. In the copied mapping, retain only the sources, source qualifiers, and any custom joins or queries.
  3. Remove all transformations.
  4. Connect the source qualifiers to the target.

Check for High Rowsinlookupcache counters

Multiple lookups can slow the session. You may improve session performance by locating the largest lookup tables and tuning those lookup expressions.

Check for High Error rows counters

Transformation errors affect session performance. If a session has large numbers in any of the Transformation_errorrows counters, you may improve performance by eliminating the errors.

Session Bottlenecks

Session performance details can be used to flag other problem areas. Create performance details by selecting Collect Performance Data in the session properties before running the session.

View the performance details through the Workflow Monitor as the session runs, or view the resulting file. The performance details provide counters about each source qualifier, target definition, and individual transformation within the mapping to help you understand session and mapping efficiency.

To view the performance details during the session run:

  • Right-click the session in the Workflow Monitor.
  • Choose Properties.
  • Click the Properties tab in the details dialog box.

To view the resulting performance data file, look for the file session_name.perf in the same directory as the session log and open the file in any text editor.

All transformations have basic counters that indicate the number of input row, output rows, and error rows. Source qualifiers, normalizers, and targets have additional counters indicating the efficiency of data moving into and out of buffers. Some transformations have counters specific to their functionality. When reading performance details, the first column displays the transformation name as it appears in the mapping, the second column contains the counter name, and the third column holds the resulting number or efficiency percentage.

Low buffer input and buffer output counters

If the BufferInput_efficiency and BufferOutput_efficiency counters are low for all sources and targets, increasing the session DTM buffer pool size may improve performance.

Aggregator, Rank, and Joiner readfromdisk and writetodisk counters

If a session contains Aggregator, Rank, or Joiner transformations, examine each Transformation_readfromdisk and Transformation_writetodisk counter. If these counters display any number other than zero, you can improve session performance by increasing the index and data cache sizes.

If the session performs incremental aggregation, the Aggregator_readtodisk and writetodisk counters display a number other than zero because the Integration Service reads historical aggregate data from the local disk during the session and writes to disk when saving historical data. Evaluate the incremental Aggregator_readtodisk and writetodisk counters during the session. If the counters show any numbers other than zero during the session run, you can increase performance by tuning the index and data cache sizes.

System Bottlenecks

After tuning the source, target, mapping, and session, you may also consider tuning the system hosting the Integration Service.

The Integration Service uses system resources to process transformations, session execution, and the reading and writing of data. The Integration Service also uses system memory for other data tasks such as creating aggregator, joiner, rank, and lookup table caches.

Use system performance monitoring tools to monitor the amount of system resources the Server uses and identify system bottlenecks. Below are additional changes to make to all sessions:

  • Improve network speed - Slow network connections can slow session performance. Have the system administrator determine if the network runs at an optimal speed. Decrease the number of network hops between the Integration Service process and databases.
  • Use multiple CPUs - Take advantage of multiple CPUs to run multiple sessions in parallel and run multiple pipeline partitions in parallel.
  • Reduce paging - When an operating system runs out of physical memory, it starts paging to disk to free physical memory. Configure the physical memory for the Integration Service process machine to minimize paging to disk.
  • Use processor binding - In a multi-processor UNIX environment, the Integration Service may use a large amount of system resources. Use processor binding to control processor usage by the Integration Service process. Also, if the source and target database are on the same machine, use processor binding to limit the resources used by the database.

Table of Contents

Success

Link Copied to Clipboard