Last Updated Date: May 25, 2021

Challenge

Informatica's Data Processor (DP) transformation addresses a gap in Data Engineering Integration, Data Quality, and PowerCenter: extracting data from, or transforming, unstructured data. The requirement to process or transform unstructured or file-based data is either known at the start of a new project or discovered after a platform or architecture has been set up. This article provides best practices and considerations for configuring Data Processor on a Unix platform.

Description

When configuring DP on new or existing hardware (either in conjunction with Data Engineering Integration, Data Quality, or PowerCenter, or co-existing with other host applications on the same application server), consider the following questions to determine what type of hardware to use for Data Processor:

  • If the hardware already exists:
    • Is the processor or operating system supported by the Data Engineering Integration Platform?
    • Have the necessary operating system patches been applied?
    • How many CPUs does the machine currently have? Can the CPU capacity be expanded?
    • How much memory does the machine have?
  • If the hardware does not already exist:
    • Has the organization standardized on a hardware or operating system vendor?
    • What type of operating system is preferred and supported?
  • License requirements:
    • Data Engineering or Data Quality transformation license
    • Unlimited data processor transformation in MRS
    • Unstructured Data Option (Mandatory)
    • Data Transformation xMap license option (Mandatory)

Regardless of the hardware vendor chosen, the hardware must be configured and sized appropriately to support the Data Engineering or Data Quality platform and the complex Data Processor transformation requirements. The hardware requirements for the Data Processor environment depend on the data volumes, the number of concurrent users, and the application server and operating system used, among other factors. For exact sizing recommendations, contact Informatica Professional Services for a Data Processor Sizing and Baseline Architecture engagement.

Planning for the Data Transformation Installation

Host Software Environment

There are several variations of the hosting environment from which Data Processor services will be called. The choice affects how Data Processor is installed and configured. The most common configurations are:

  • Data Processor transformation used in conjunction with Data Engineering or Data Quality.
  • Data Processor transformation exported as a mapplet and used in PowerCenter.
  • Data Processor transformation mapplet used with webMethods.

Installation varies depending on which host option is chosen.

  • Installation of the Data Processor transformation with the Data Engineering or Data Quality platform: Obtain the necessary licenses and the additional plug-in to operate Data Engineering or Data Quality. Refer to the appropriate installation guide or contact Informatica support for details on installing Data Engineering or Data Quality.
  • Installation of the Data Processor transformation for a PowerCenter environment: When using Data Processor transformation services in a PowerCenter environment, the Data Processor transformation is expected to be built first in Data Engineering or Data Quality and then exported as a mapplet for PowerCenter.
  • Use outside of Data Engineering, Data Quality, or PowerCenter: Export the Data Processor transformation as a web service.

Other Decision Points

Where Will the Data Processor Transformation Repository be Located?

The choices for the location of the service repository are: (1) a path on the local file system on the server or (2) use of a shared network drive. The typical justification for using a shared network drive is to simplify service deployment by sharing data processor transformation objects.

What Are Multi-user Considerations?

Security Considerations

For the Model Repository Service that hosts the Data Processor transformation, grant the necessary permissions on the relevant projects and folders. The identity associated with the caller of the Data Transformation services also needs permission to execute the mapplets or exposed web service endpoints that correspond to the Data Processor transformation.

Give special consideration to scenarios (e.g., web services) where the user that runs the Data Processor transformation service is different from the user associated with the calling application.

Log File and Tracing Locations

Log files and tracing options should be configured with appropriate recycling policies. The calling application must have permission to read, write, and delete files in the path that is set for storing these files.
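The permission requirement above can be checked before go-live with a small probe. The following is a minimal sketch, not an Informatica utility: the function name `probe_log_directory` and the `dp_probe_` file prefix are hypothetical, and the script must be run as the same operating system user as the calling application for the result to be meaningful.

```python
import os
import tempfile


def probe_log_directory(path):
    """Try to create, read back, and delete a file in `path` as the current user."""
    result = {"write": False, "read": False, "delete": False}
    try:
        # Hypothetical probe file name; nothing here is Informatica-specific.
        fd, probe = tempfile.mkstemp(prefix="dp_probe_", dir=path)
    except OSError:
        return result  # directory is missing or not writable
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("probe")
        result["write"] = True
        with open(probe) as handle:
            result["read"] = handle.read() == "probe"
    except OSError:
        pass
    finally:
        try:
            os.remove(probe)
            result["delete"] = True
        except OSError:
            pass
    return result


if __name__ == "__main__":
    # Replace with the directory configured for logs and tracing.
    print(probe_log_directory(tempfile.gettempdir()))
```

Any `False` value in the returned dictionary indicates a permission that should be granted on the log or tracing path before the service is called.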

Data Processor Transformation Pre-Install Checklist

The Data Processor Transformation has client and server components. Only the server (or engine) component is installed on UNIX platforms. The client or developer tool is only supported on the Windows platform. Reviewing the environment and recording the information in a detailed checklist facilitates the Data Processor transformation install.

Verify that the minimum requirements for Operating System, Disk Space, Processor Speed and RAM are met and record them in the checklist.
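One way to record these values consistently across servers is to collect them with a script. The sketch below is illustrative only: `collect_host_facts` and `check_minimums` are hypothetical helper names, and the thresholds in the usage section are placeholders, not Informatica requirements. Take the real minimums from the Product Availability Matrix and the installation guide. The memory lookup relies on `sysconf` names that are available on Linux and some other Unix systems; where they are not, the value is left as `None`.

```python
import os
import platform
import shutil


def collect_host_facts(install_path="/"):
    """Gather basic host facts to record in a pre-install checklist."""
    facts = {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "cpu_count": os.cpu_count(),
        "disk_free_gb": round(shutil.disk_usage(install_path).free / 1024**3, 1),
    }
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
        facts["ram_gb"] = round(pages * page_size / 1024**3, 1)
    except (ValueError, OSError, AttributeError):
        facts["ram_gb"] = None  # not available on this platform
    return facts


def check_minimums(facts, minimums):
    """Return (item, actual, required) for every item below its minimum."""
    shortfalls = []
    for key, required in minimums.items():
        actual = facts.get(key)
        if actual is None or actual < required:
            shortfalls.append((key, actual, required))
    return shortfalls


if __name__ == "__main__":
    facts = collect_host_facts("/")
    # Placeholder thresholds for illustration only.
    example_minimums = {"cpu_count": 4, "ram_gb": 16, "disk_free_gb": 50}
    for key, value in facts.items():
        print(f"{key}: {value}")
    for key, actual, required in check_minimums(facts, example_minimums):
        print(f"BELOW MINIMUM {key}: have {actual}, need {required}")
```

Items that cannot be measured are reported as shortfalls so that they are verified manually rather than silently skipped.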

Data Engineering or Data Quality Requirements

For new Data Engineering or Data Quality installations, the Data Integration Service and Model Repository Service are bundled. The Data Processor Transformation is available with Data Engineering and Data Quality installations.

For existing Data Engineering or Data Quality installations, enable the licenses needed for Data Processor Transformation.

To integrate with existing or new PowerCenter installations, export the Data Processor transformation as a mapplet from Data Engineering or Data Quality.

Confirm the following:

  • Data Engineering or Data Quality version 10.4.x is in use (required).
  • The Developer tool is installed on the client PC or workstation. If PowerCenter is required, the PowerCenter client tools are also installed on the client PC.
  • The licenses needed to use the Data Processor transformation have been acquired.

For more information, refer to the Product Availability Matrix.

Non-Data Engineering, Data Quality or PowerCenter Integration Requirements

  • Export the Data Engineering or Data Quality mapplet and enable a web services end point.
  • Export the mapplet into PowerCenter and enable the web services hub service.

Data Engineering (DEI) or Data Quality (DEQ) Installation and Configuration

The Data Processor configuration with DEI or DEQ involves:

  • Installing the DEI or DEQ domain components on a Unix server
  • Installing the Developer tool on client workstations.
  • Enabling the licenses needed for the Data Processor Transformation

Before installing DEI or DEQ and configuring the Data Processor transformation, complete the following steps:

  1. Verify that the hardware meets the minimum system requirements for DEI / DEQ. Ensure that the combination of hardware and operating system is supported by DEI / DEQ. Ensure that sufficient space has been allocated to DEI / DEQ (Domain, MRS, Content Management, Warehouse, Profile Repositories).
  2. Apply all necessary patches to the operating system.
  3. Ensure that the Data Processor transformation license file has been obtained from technical support.
  4. Ensure that the installation user ID has administrative privileges. For *nix systems, ensure that read, write, and execute privileges have been granted on the installation directory.
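The directory privileges in step 4 can be verified without modifying the file system. This is a minimal sketch that assumes it is run as the installation user; `missing_privileges` is a hypothetical helper name, and the path in the usage section is a placeholder for the actual installation directory.

```python
import os


def missing_privileges(directory):
    """Return which of read/write/execute the current user lacks on `directory`."""
    checks = {"read": os.R_OK, "write": os.W_OK, "execute": os.X_OK}
    if not os.path.isdir(directory):
        # A missing directory grants nothing.
        return sorted(checks)
    return [name for name, flag in checks.items() if not os.access(directory, flag)]


if __name__ == "__main__":
    # Placeholder path; substitute the planned installation directory.
    target = os.path.expanduser("~")
    lacking = missing_privileges(target)
    print("OK" if not lacking else f"Missing on {target}: {', '.join(lacking)}")
```

An empty list means the current user holds all three privileges on the directory.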

Follow this sequence of steps to configure the Data Processor transformation:

  1. Install the Developer tool client on a Windows PC or workstation. The client should be installed by a user with administrative privileges.
  2. Complete the configuration of the Developer tool, connect to the DEI / DEQ Model repository (MRS), complete the pre-install checklist, and apply the license keys.

Data Processor Transformation Configuration Components

  • Install the DEI / DEQ server components
  • Start the domain services
  • Create and enable the Model Repository Service (MRS), Data Integration Service, Content Management Service, and monitoring MRS
  • Install the client

The table below provides a description of each component:

| Component | Applicable Platform | Description |
| --- | --- | --- |
| Domain server platform components | Unix and Windows | DEI or DEQ platform service. |
| Model Repository Service | Unix and Windows | The Model Repository Service is an application service that manages the Model repository. The Model repository stores metadata created by Informatica clients and application services in a relational database to enable collaboration among the clients and services. |
| Data Integration Service | Unix and Windows | The Data Integration Service is an application service that performs data integration jobs for the Analyst tool, the Developer tool, and external clients. |
| Content Management Service | Unix and Windows | The Content Management Service is an application service that manages reference data. A reference data object contains a set of data values that you can search while performing data quality operations on source data. The Content Management Service also compiles rule specifications into mapplets. A rule specification object describes the data requirements of a business rule in logical terms. |
| Monitoring MRS | Unix and Windows | The monitoring Model Repository Service is a Model Repository Service that monitors statistics for Data Integration Service jobs. You configure the monitoring Model Repository Service in the domain properties. |
| PowerCenter Repository Service (existing PowerCenter installations only) | Unix and Windows | The PowerCenter Repository Service manages the PowerCenter repository. It receives requests from Informatica clients and application services to store or access metadata in the PowerCenter repository. |
| PowerCenter Integration Service (existing PowerCenter installations only) | Unix and Windows | The PowerCenter Integration Service receives requests from PowerCenter client tools to run data integration jobs. It writes results to different databases, and it writes run-time metadata to the PowerCenter repository. When you create the service, you need to associate another application service with it. |
| Developer tool | Windows only | Client tool required for DEI or DEQ. The Data Processor transformation is configured with the Developer tool. |
| PowerCenter Client (existing PowerCenter installations only) | Windows only | Client tool required for PowerCenter installations. |
