In general, mapping-level optimization takes time to implement, but can significantly boost performance. Sometimes the mapping is the biggest bottleneck in the load process because business rules determine the number and complexity of transformations in a mapping.
Before deciding on the best route to optimize the mapping architecture some basic issues need to be resolved. Tuning mappings is a grouped approach. The first group can be of assistance almost universally, bringing about a performance increase in all scenarios. The second group of tuning processes may yield only small performance increase, or can be of significant value, depending on the situation.
Some factors to consider when choosing tuning processes at the mapping level include the specific environment, software/ hardware limitations, and the number of rows going through a mapping. This Best Practice offers some guidelines for tuning mappings.
In addition to analyzing mappings for tuning, remember to tune the target and source for peak performance. To optimize mappings by reducing the number of transformations required, reduce the number of rows of data that must be processed as early in the mapping as possible. In addition to deleting unnecessary links between transformations ensure that transformations are performed in an optimal order to reduce I/O requirements.
There are other techniques, typically implemented during the design and build of the mapping to optimize performance from the start. These are items that should be reviewed when performance tuning. Tuning involves isolating the cause of a bottleneck, and then correcting the issue, if possible. Be cautious regarding techniques that change the logic in the mapping, as this could have unintended consequences.
For transformations that use data cache (such as Aggregator, Joiner, Rank, and Lookup transformations), limit connected input/output or output ports. Doing so can reduce the amount of data the transformations store in the data cache. Having too many Lookups and Aggregators can encumber performance because each requires index cache and data cache. Since both are fighting for memory space, decreasing the number of these transformations in a mapping can help improve speed. Splitting them up into different mappings is another option, as is adding memory to the session or task and to the individual transformations, which generally is a big boost to performance.
Limit the number of Aggregators in a mapping. A high number of Aggregators can increase I/O activity on the cache directory. Unless the seek/access time is fast on the directory itself, having too many Aggregators can cause a bottleneck. Similarly, too many Lookups in a mapping causes contention of disk and memory, which can lead to thrashing, leaving insufficient memory to run a mapping efficiently. Be sure to use Sorters before the Aggregator or Joiner to improve the performance.
If several mappings use the same data source, consider a single-pass reading, where a single mapping is used to read the data rather than several mappings. If there are several sessions that use the same sources, consolidate the separate mappings with either a single Source Qualifier Transformation or one set of Source Qualifier Transformations as the data source for the separate data flows.
Similarly, if a function is used in several mappings, a single-pass reading reduces the number of times that function is called in the session. For example, to subtract percentage from the PRICE ports for both the Aggregator and Rank transformations, work can be minimized by subtracting the percentage before splitting the pipeline.
When SQL overrides are required in a Source Qualifier, Lookup Transformation, or in the Update Override of a target object, be sure the SQL statement is tuned. The extent to which and how SQL can be tuned depends on the underlying source or target database system.
PowerCenter Server automatically makes conversions between compatible datatypes. When these conversions are performed unnecessarily, performance slows. For example, if a mapping moves data from an integer port to a decimal port, then back to an integer port, the conversion may be unnecessary.
In some instances, datatype conversions can help improve performance. This is especially true when integer values are used in place of other data types for performing comparisons using Lookup and Filter transformations.
Large numbers of evaluation errors significantly slow performance of the PowerCenter Server. During transformation errors, the PowerCenter Server engine pauses to determine the cause of the error, removes the row causing the error from the data flow, and logs the error in the session log. Review the session log and correct the errors that are flagged.
Transformation errors can be caused by many things including: conversion errors, conflicting mapping logic, any condition that is specifically set up as an error, and so on. The session log can help point out the cause of these errors. If errors recur consistently for certain transformations, re-evaluate the constraints for these transformations. If a session is needed that generates many transformation errors, performance may be improved by setting a lower tracing level. However, this is not a long-term response to transformation errors. Any source of errors should be traced and eliminated.
The session log is a treasure trove of information for performance tuning. In addition to transformation errors, there is information on memory used for lookups, sorting, joining, aggregations, etc. If messages indicate 2-pass sorting (or more) or that the algorithms used for aggregating is not sort-merge, then review the logic in the mapping to add sorters and to add memory to the transformations or to the Auto Memory settings for the session overall.
There are a several ways to optimize lookup transformations that are set up in a mapping.
Cache small lookup tables. When caching is enabled, the PowerCenter Server caches the lookup table and queries the lookup cache during the session. When this option is not enabled, the PowerCenter Server queries the lookup table on a row-by-row basis.
Note: The tuning options in this Best Practice assume that memory and cache sizing for lookups are sufficient to ensure that caches will not page to disk.
A better rule of thumb than memory size, is to determine the size of the potential lookup cache by the number of rows expected to be processed. For example, consider the following example.
In Mapping X, the source and lookup contain the following number of records:
ITEMS (source): |
5000 records |
MANUFACTURER: |
200 records |
DIM_ITEMS: |
100000 records |
Number of Disk Reads
|
Cached Lookup |
Un-cached Lookup |
LKP_Manufacturer |
|
|
Build Cache |
200 |
0 |
Read Source Records |
5000 |
5000 |
Execute Lookup |
0 |
5000 |
Total # of Disk Reads |
5200 |
10000 |
LKP_DIM_ITEMS |
|
|
Build Cache |
100000 |
0 |
Read Source Records |
5000 |
5000 |
Execute Lookup |
0 |
5000 |
Total # of Disk Reads |
105000 |
10000 |
Consider the case where MANUFACTURER is the lookup table. If the lookup table is cached, it will take a total of 5200 disk reads to build the cache and execute the lookup. If the lookup table is not cached, then it will take a total of 10,000 total disk reads to execute the lookup. In this case, the number of records in the lookup table is small in comparison with the number of times the lookup is executed. This lookup should be cached. This is the more likely scenario.
Consider the case where DIM_ITEMS is the lookup table. If the lookup table is cached, it will result in 105,000 total disk reads to build and execute the lookup. If the lookup table is not cached, then the disk reads would total 10,000. In this case the number of records in the lookup table is not small in comparison with the number of times the lookup will be executed. Thus, the lookup should not be cached.
Use the following eight step method to determine if a lookup should be cached:
There are several methods for sharing lookup caches:
There is an option to use a SQL override in the creation of a lookup cache. Options can be added to the WHERE clause to reduce the set of records included in the resulting cache.
In the case where a lookup uses more than one lookup condition, set the conditions with an equal sign first in order to optimize lookup performance.
The PowerCenter Server must query, sort, and compare values in the lookup condition columns. As a result, indexes on the database table should include every column used in a lookup condition. This can improve performance for both cached and un-cached lookups.
If the lookup source does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The PowerCenter Server then saves and reuses cache files from session to session, eliminating the time required to read the lookup source.
Filtering data as early as possible in the data flow improves the efficiency of a mapping. Instead of using a Filter Transformation to remove a sizeable number of rows in the middle or end of a mapping, use a filter on the Source Qualifier or a Filter Transformation immediately after the source qualifier to improve performance.
Avoid complex expressions when creating the filter condition. Filter transformations are most effective when a simple integer or TRUE/FALSE expression is used in the filter condition.
Filters or routers should also be used to drop rejected rows from an Update Strategy transformation if rejected rows do not need to be saved.
Replace multiple filter transformations with a router transformation. This reduces the number of transformations in the mapping and makes the mapping easier to follow.
Aggregator Transformations often slow performance because they must group data before processing it.
Use simple columns in the group by condition to make the Aggregator Transformation more efficient. When possible, use numbers instead of strings or dates in the GROUP BY columns. Also avoid complex expressions in the Aggregator expressions, especially in GROUP BY ports.
Use the Sorted Input option in the Aggregator. This option requires that data sent to the Aggregator be sorted in the order in which the ports are used in the Aggregator's group by. The Sorted Input option decreases the use of aggregate caches. When it is used, the PowerCenter Server assumes all data is sorted by group and, as a group is passed through an Aggregator, calculations can be performed, and information passed on to the next transformation. Without sorted input, the server must wait for all rows of data before processing aggregate calculations. Use of the Sorted Inputs option is usually accompanied by a Source Qualifier which uses the Number of Sorted Ports option.
Use incremental aggregation when capturing changes from the source that change less than half the target. When using incremental aggregation, apply captured changes in the source to aggregate calculations in a session. The PowerCenter Server updates the target incrementally, rather than processing the entire source and recalculating the same calculations every time the session runs.
Data from the same source can be joined in the following ways:
There may be a need to join data from the same source in order to perform a calculation on part of the data and join the transformed data with the original data. When the data is joined using this method, the original data can be maintained and parts of that data can be transformed within one mapping.
When joining data from the same source, two branches of the pipeline can be created. When branching a pipeline, a transformation must be added between the Source Qualifier and the Joiner transformation in at least one branch of the pipeline. Sorted data must be joined and configure the Joiner transformation for sorted input.
To join unsorted data, create two instances of the same source and join the pipelines.
For example, for a source with the following ports:
To be able to view in the target table the employees who generated sales that were greater than the average sales for their respective departments, create a mapping with the following transformations:
Note: Data from output groups of the same transformation can also be joined such as the Custom transformation or XML Source Qualifier transformations. Place a Sorter transformation between each output group and the Joiner transformation and configure the Joiner transformation to receive sorted input.
Joining two branches can affect performance if the Joiner transformation receives data from one branch much later than the other branch. The Joiner transformation caches all the data from the first branch and writes the cache to disk if the cache fills. The Joiner transformation must then read the data from disk when it receives the data from the second branch. This can slow processing.
Same source data can also be joined by creating a second instance of the source. After the second source instance is created, join the pipelines from the two source instances.
Note: When joining data using this method, the PowerCenter Server reads the source data for each source instance, so performance can be slower than joining two branches of a pipeline.
Use the following guidelines when deciding whether to join branches of a pipeline or join two instances of a source:
Use the database to do the join when sourcing data from the same database schema. Database systems usually can perform the join more quickly than the PowerCenter Server, a join condition should be used when joining multiple tables from the same database schema. Note: Avoid using SQL Overrides, even though they can do some of the same joins, etc., as they hide metadata and might be difficult to maintain.
Use Normal joins whenever possible. Normal joins are faster than outer joins and the resulting set of data is also smaller.
Join sorted data when possible. Session performance can be improved by configuring the Joiner transformation to use sorted input. When configuring the Joiner transformation to use sorted data, the PowerCenter Server improves performance by minimizing disk input and output. The greatest performance improvement is seen when working with large data sets.
For the Joiner transformation, always designate as the master source the source with fewer rows. For optimal performance and disk storage, designate the master source as the source with the fewer rows. During a session, the Joiner transformation compares each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
For a sorted Joiner transformation, designate as the master source the source with fewer duplicate key values. For optimal performance and disk storage, designate the master source as the source with fewer duplicate key values. When the PowerCenter Server processes a sorted Joiner transformation, it caches rows for one hundred keys at a time. If the master source contains many rows with the same key value, the PowerCenter Server must cache more rows, and performance can be slowed.
Optimizing sorted joiner transformations with partitions. When using partitions with a sorted Joiner transformation, optimize performance by grouping data and using n:n partitions.
Add a hash auto-keys partition upstream of the sort origin. To obtain expected results and get best performance when partitioning a sorted Joiner transformation, data must be grouped and sorted. To group data, ensure that rows with the same key value are routed to the same partition. The best way to ensure that data is grouped and distributed evenly among partitions is to add a hash auto-keys or key-range partition point before the sort origin. Placing the partition point before sorting the data ensures that grouping is maintained and that data is sorted within each group.
Use n:n partitions. Performance may be improved for a sorted Joiner transformation by using n:n partitions. When n:n partitions are used, the Joiner transformation reads master and detail rows concurrently and does not need to cache all the master data. This reduces memory usage and speeds processing. When 1:n partitions are used, the Joiner transformation caches all the data from the master pipeline and writes the cache to disk if the memory cache fills. When the Joiner transformation receives the data from the detail pipeline, it must then read the data from disk to compare the master and detail pipelines.
Optimize Sequence Generator Transformations. Sequence Generator transformations need to determine the next available sequence number; thus, increasing the “Number of Cached Values” property can increase performance. This property determines the number of values the PowerCenter Server caches at one time. If it is set to cache no values, then the PowerCenter Server must query the repository each time to determine the next number to be used. Consider configuring the Number of Cached Values to a value greater than 1000. Note that any cached values not used in a session are lost since the sequence generator value in the repository is set when it is called next time to give the next set of cache values. While this seems trivial, not setting the value high enough for the needs of the mapping can significantly slow performance.
As a final step in the tuning process, expressions used in transformations can be tuned. When examining expressions, focus on complex expressions and try to simplify them when possible.
To help isolate slow expressions, do the following:
Processing field level transformations takes time. If the transformation expressions are complex, then processing is even slower. Likely candidates for optimization are the fields with the most complex expressions. Keep in mind that there may be more than one field causing performance problems.
Factoring out common logic can reduce the number of times a mapping performs the same logic. If a mapping performs the same logic multiple times, moving the task upstream in the mapping may allow the logic to be performed just once. For example, a mapping has five target tables. Each target requires a Social Security Number lookup. Instead of performing the lookup right before each target, move the lookup to a position before the data flow splits.
Similarly, within an Expression transformation, use a variable to hold the results of expression logic that is used multiple times in the expression, rather than computing the results each time.
Anytime a function is called it takes resources to process. There are several common examples where function calls can be reduced or eliminated.
Aggregate function calls can sometime be reduced. In the case of each aggregate function call, the PowerCenter Server must search and group the data. Thus, the following expression:
SUM(Column A) + SUM(Column B)
Can be optimized to:
SUM(Column A + Column B)
In general, operators are faster than functions, so operators should be used whenever possible. For example, for an expression that involves a CONCAT function such as:
CONCAT(CONCAT(FIRST_NAME, ), LAST_NAME)
It can be optimized to:
FIRST_NAME || LAST_NAME
Remember that IIF() is a function that returns a value, not just a logical test. This allows many logical statements to be written in a more compact fashion. For example:
IIF(FLG_A=Y and FLG_B=Y and FLG_C= Y, VAL_A+VAL_B+VAL_C,
IIF(FLG_A=Y and FLG_B=Y and FLG_C= N, VAL_A+VAL_B,
IIF(FLG_A=Y and FLG_B=N and FLG_C= Y, VAL_A+VAL_C,
IIF(FLG_A=Y and FLG_B=N and FLG_C= N, VAL_A,
IIF(FLG_A=N and FLG_B=Y and FLG_C= Y, VAL_B+VAL_C,
IIF(FLG_A=N and FLG_B=Y and FLG_C= N, VAL_B,
IIF(FLG_A=N and FLG_B=N and FLG_C= Y, VAL_C,
IIF(FLG_A=N and FLG_B=N and FLG_C= N, 0.0))))))))
Can be optimized to:
IIF(FLG_A=Y, VAL_A, 0.0) + IIF(FLG_B=Y, VAL_B, 0.0) + IIF(FLG_C= Y, VAL_C, 0.0)'
The original expression had 8 IIFs, 16 ANDs and 24 comparisons. The optimized expression results in three IIFs, three comparisons, and two additions.
Be creative in making expressions more efficient. The following is an example of rework of an expression that eliminates three comparisons down to one:
IIF(X=1 OR X=5 OR X=9, 'yes', 'no')
Can be optimized to:
IIF(MOD(X, 4) = 1, 'yes', 'no')
Use of the DECODE function is more efficient than the IIF function, as the IIF analyzes the results of every part of the statement, even if the logic shows a result. The DECODE stops as soon as a condition is satisfied.
Avoid calculating or testing the same value multiple times. If the same sub-expression is used several times in a transformation, consider making the sub-expression a local variable. The local variable can be used only within the transformation in which it was created. Calculating the variable only once and then referencing the variable in following sub-expressions improves performance.
The PowerCenter Server processes numeric operations faster than string operations. For example, if a lookup is performed on a large amount of data on two columns, EMPLOYEE_NAME and EMPLOYEE_ID, configuring the lookup around EMPLOYEE_ID improves performance.
When the PowerCenter Server performs comparisons between CHAR and VARCHAR columns, it slows each time it finds trailing blank spaces in the row. To resolve this, treat CHAR as the CHAR On Read option in the PowerCenter Server setup so that the server does not trim trailing spaces from the end of CHAR source fields.
Because there is always overhead involved in moving data among transformations, try, whenever possible, to reduce the number of transformations. Also, resolve unnecessary links between transformations to minimize the amount of data moved. This is especially important with data being pulled from the Source Qualifier Transformation.
Pre- and post-session SQL commands can be specified in the Properties tab of the Source Qualifier transformation and in the Properties tab of the target instance in a mapping. To increase the load speed, use these commands to drop indexes on the target before the session runs, then recreate them when the session completes.
Apply the following guidelines when using SQL statements:
For relational databases, SQL commands can be executed in the database environment when connecting to the database. Use this for source, target, lookup, and stored procedure connections. For instance, isolation levels can be set on the source and target systems to avoid deadlocks. Follow the guidelines listed above for using the SQL statements.
Use local variables in Aggregator, Expression, and Rank transformations.
Rather than parsing and validating the same expression each time, define these components as variables. This also allows simplification of complex expressions. For example, the following expressions:
AVG( SALARY, ( ( JOB_STATUS = 'Full-time' ) AND (OFFICE_ID = 1000 ) ) )
SUM( SALARY, ( ( JOB_STATUS = 'Full-time' ) AND (OFFICE_ID = 1000 ) ) )
can use variables to simplify complex expressions and temporarily store data:
Variables can be used to store data from prior rows. This can help with procedural calculations. To compare the previous state to the state just read:
IIF( PREVIOUS_STATE = STATE, STATE_COUNTER + 1, 1 )
Variables also provide a way to capture multiple columns of return values from stored procedures.