Last Updated: May 25, 2021

Challenge

Successfully identify the need for and scope of reusability. Create inventories of reusable objects within a folder, shortcuts across folders (local shortcuts), or shortcuts across repositories (global shortcuts).

Successfully identify and create inventories of mappings based on business rules.

Description

Reusable Objects

Prior to creating an inventory of reusable objects or shortcut objects, be sure to review the business requirements and look for any common routines and/or modules that may appear in more than one data movement. These common routines are excellent candidates for reusable objects or shortcut objects. In PowerCenter, these objects can be created as:

  • single transformations (e.g., lookups, filters)
  • a reusable mapping component (a group of transformations, i.e., a mapplet)
  • single tasks in Workflow Manager (e.g., command, email, or session)
  • a reusable workflow component (a group of tasks in Workflow Manager, i.e., a worklet)

Please note that shortcuts are not supported for workflow level objects (Tasks).

Identify the need for reusable objects based on the following criteria:

  • Is there enough usage and complexity to warrant the development of a common object?
  • Are the data types of the information passing through the reusable object the same from case to case, or is it merely the same high-level steps with different fields and data?

Identify the Scope based on the following criteria:

  • Do these objects need to be shared only within the same folder? If so, create reusable objects within the folder.
  • Do these objects need to be shared across several other folders in the same PowerCenter repository? If so, create local shortcuts.
  • Do these objects need to be shared across repositories? If so, create a global repository and maintain the reusable objects there. Create global shortcuts to these reusable objects from the local repositories.


PowerCenter Designer Objects

Creating and testing common objects does not always save development time or facilitate future maintenance. For example, if a simple calculation, such as subtracting a current rate from a budget rate, is to be used in two different mappings, carefully consider whether the effort to create, test, and document the common object is worthwhile. Often it is simpler to add the calculation to both mappings. However, if the calculation were performed in a number of mappings, were very difficult, and all occurrences had to be updated after any change or fix, then it would be an ideal case for a reusable object. When adding instances of a reusable transformation to mappings, be careful that changes to the reusable object do not invalidate the mappings or generate unexpected data. The Designer stores each reusable transformation as metadata, separate from any mapping that uses the transformation.

 

The second criterion for a reusable object concerns the data that will pass through it. Developers often encounter situations where they perform the same type of high-level process (e.g., a filter, expression, or update strategy) in two or more mappings. For example, if several fact tables require a series of dimension keys, a mapplet containing a series of lookup transformations can find each dimension key; the mapplet can then be used in each fact table mapping rather than recreating the same lookup logic in each mapping. This seems like a great candidate for a mapplet. However, after performing half of the mapplet work, the developers may realize that the actual data or ports passing through the high-level logic are totally different from case to case, making a mapplet impractical. Consider whether there is a practical way to generalize the common logic so that it can be applied successfully to multiple cases. Remember that a reusable object is replicated in one or many mappings; thus, in each mapping that uses the mapplet or reusable transformation, the same number and size of ports must pass into and out of the reusable object.

 

Document the list of reusable objects that pass these criteria, providing a high-level description of what each object will accomplish. The detailed design will occur in a future subtask, but at this point the intent is to identify the number and functionality of reusable objects that will be built for the project. Keep in mind that it will be impossible to identify one hundred percent of the reusable objects at this point; the goal here is to create an inventory of as many as possible, and ideally the most difficult ones. The remainder will be discovered while building the data integration processes.

PowerCenter Workflow Manager Objects

In some cases, data must be read from different sources, passed through the same transformation logic, and written to one or more destination databases. Depending on source availability, these loads may also have to be scheduled at different times. This is the ideal case for creating a reusable session and applying session overrides at the session-instance level for database connections, pre-session commands, and post-session commands.

 

Logging load statistics, failure criteria, and success criteria are usually common pieces of logic executed for multiple loads in most projects. Some of these common tasks include:

  • Notification when the number of rows loaded is less than expected
  • Notification when there are any rejected rows, using email tasks and link conditions
  • Successful-completion notification based on success criteria, such as the number of rows loaded, using email tasks and link conditions
  • Failing the load based on failure criteria, such as load statistics or the status of a critical session, using a control task
  • Stopping or aborting a workflow based on failure criteria, using a control task
  • Calculating, from previous session completion times, how long a downstream session must wait before starting, using worklet variables, a timer task, and an assignment task

Reusable worklets can be developed to encapsulate the above tasks and used in multiple loads. By passing workflow variable values to the worklets and assigning them to worklet variables, one can easily encapsulate common workflow logic.

Mappings

A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. In a simple world, a single source table would populate a single target table. However, in practice, this is usually not the case. Sometimes multiple sources of data need to be combined to create a target table, and sometimes a single source of data creates many target tables. The latter is especially true for mainframe data sources where COBOL OCCURS statements litter the landscape. In a typical warehouse or data mart model, each OCCURS statement decomposes to a separate table.

The goal here is to create an inventory of the mappings needed for the project. For this exercise, the challenge is to think in individual components of data movement. While the business may consider a fact table and its three related dimensions as a single object in the data mart or warehouse, five mappings may be needed to populate the corresponding star schema with data (i.e., one for each of the dimension tables and two for the fact table, each from a different source system).

Typically, when creating an inventory of mappings, the focus is on the target tables, with the assumption that each target table has its own mapping (or sometimes multiple mappings). This assumption often holds, but when a single source of data populates multiple tables, the target-focused approach yields multiple mappings where one might suffice. Efficiencies can sometimes be realized by loading multiple tables from a single source; by focusing only on the target tables, these efficiencies can be overlooked.

A more comprehensive approach to creating the inventory of mappings is to create a spreadsheet listing all of the target tables. Create a column with a number next to each target table. For each of the target tables, in another column, list the source file or table that will be used to populate the table. In the case of multiple source tables per target, create two rows for the target, each with the same number, and list the additional source(s) of data.

The table would look similar to the following:

Number   Target Table    Source
1        Customers       Cust_File
2        Products        Items
3        Customer_Type   Cust_File
4        Orders_Item     Tickets
4        Orders_Item     Ticket_Items

When completed, the spreadsheet can be sorted either by target table or source table. Sorting by source table can help determine potential mappings that create multiple targets.
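As a sketch, this sort-and-group step can be expressed in a few lines of Python. The rows mirror the illustrative table above; the table and file names are examples, not real metadata.

```python
from collections import defaultdict

# Inventory rows as (number, target_table, source), mirroring the spreadsheet.
rows = [
    (1, "Customers", "Cust_File"),
    (2, "Products", "Items"),
    (3, "Customer_Type", "Cust_File"),
    (4, "Orders_Item", "Tickets"),
    (4, "Orders_Item", "Ticket_Items"),
]

# Sort by source table, then group: any source that feeds more than one
# target is a candidate for a single multi-target mapping.
targets_by_source = defaultdict(list)
for number, target, source in sorted(rows, key=lambda r: r[2]):
    targets_by_source[source].append(target)

candidates = {src: tgts for src, tgts in targets_by_source.items() if len(tgts) > 1}
print(candidates)  # Cust_File feeds both Customers and Customer_Type
```

The same grouping can be done with a spreadsheet sort; the script simply makes the "multiple targets per source" test explicit.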

 

When using a single source to populate multiple tables at once for efficiency, be sure to keep restartability and reloadability in mind. The mapping will always load two or more target tables from the source, so there will be no easy way to rerun a single table. In this example, the Customers and Customer_Type tables could potentially be loaded in the same mapping.

 

When merging targets into one mapping in this manner, give both targets the same number. Then, re-sort the spreadsheet by number. For the mappings with multiple sources or targets, merge the data back into a single row to generate the inventory of mappings, with each number representing a separate mapping.
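The merge step above can be sketched the same way: rows sharing a number are collapsed into one mapping entry holding the union of its targets and sources. The renumbered rows below are illustrative, following the example tables.

```python
# Rows after renumbering: Customers and Customer_Type now share number 1
# because they load from the same source in one mapping.
rows = [
    (1, "Customers", "Cust_File"),
    (1, "Customer_Type", "Cust_File"),
    (2, "Products", "Items"),
    (4, "Orders_Item", "Tickets"),
    (4, "Orders_Item", "Ticket_Items"),
]

# Merge rows by number: each number becomes one mapping with the
# combined list of targets and sources.
mappings = {}
for number, target, source in rows:
    entry = mappings.setdefault(number, {"targets": [], "sources": []})
    if target not in entry["targets"]:
        entry["targets"].append(target)
    if source not in entry["sources"]:
        entry["sources"].append(source)

for number, entry in sorted(mappings.items()):
    print(number, entry["targets"], entry["sources"])
```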

 

The resulting inventory would look similar to the following:

 

Number   Target Table                Source
1        Customers, Customer_Type    Cust_File
2        Products                    Items
4        Orders_Item                 Tickets, Ticket_Items

At this point, it is often helpful to record some additional information about each mapping to help with planning and maintenance.

First, give each mapping a name, applying the project's naming standards. These names distinguish the mappings from one another and can be placed on the project plan as individual tasks.

Next, determine project thresholds for high, medium, and low numbers of target rows. For example, in a warehouse where dimension tables are likely to number in the thousands and fact tables in the hundreds of thousands, the following thresholds might apply:

  • Low: 1 to 10,000 rows
  • Medium: 10,000 to 100,000 rows
  • High: more than 100,000 rows

Assign a likely row volume (high, medium, or low) to each mapping based on the expected volume of data passing through it. These high-level estimates help determine which mappings are high volume; those mappings will be the first candidates for performance tuning.
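The classification step can be sketched as a small function applying the thresholds above. The mapping names and row-count estimates here are hypothetical.

```python
def volume_category(estimated_rows):
    """Classify a mapping by expected row volume using the project thresholds."""
    if estimated_rows > 100_000:
        return "High"
    if estimated_rows > 10_000:
        return "Medium"
    return "Low"

# Hypothetical row-count estimates per mapping.
estimates = {
    "m_Customers": 8_000,
    "m_Products": 45_000,
    "m_Orders_Item": 1_200_000,
}

inventory = {name: volume_category(rows) for name, rows in estimates.items()}

# High-volume mappings are the first candidates for performance tuning.
tuning_first = [name for name, cat in inventory.items() if cat == "High"]
print(inventory, tuning_first)
```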

Add any other columns of information that might be useful to capture about each mapping, such as a high-level description of the mapping functionality, resource (developer) assigned, initial estimate, actual completion time, or complexity rating.
