Database Ingestion and Replication
Database Ingestion and Replication is part of the Intelligent Data Management Cloud (IDMC) Cloud Data Ingestion and Replication service. It allows large-scale data ingestion from common relational databases to various targets, including cloud-based and big-data targets. This feature requires a separate license and offers a user-friendly interface for setting up, deploying, running, and monitoring ingestion jobs.
There are three types of load operations that Database Ingestion and Replication can perform:
- Initial load: Transfers data from a source to a target at a single point in time. This is useful for migrating data to a cloud-based system, materializing targets for incremental updates, or adding data to data lakes or warehouses.
- Incremental load: Continuously updates the target with data changes since the last run or from a specified start point. This is ideal for keeping reporting, analytics, and online machine learning systems current.
- Initial and incremental load: Begins with an initial load and then switches to continuous incremental updates of the same source tables, giving you the flexibility to perform both load types in a single job (see the sketch after this list).
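The combined load type can be pictured as a snapshot followed by a change stream that starts from a position noted before the snapshot, so changes made during the copy are not lost. Below is a minimal, illustrative Python sketch of that handoff using in-memory stand-ins; the service manages checkpoints, ordering, and restarts for you, and none of these objects reflect its internal APIs.

```python
class Source:
    """In-memory stand-in for a source table plus its change log (hypothetical)."""
    def __init__(self):
        self.rows = {1: "alpha", 2: "beta"}   # current table contents
        self.log = []                          # ordered change log: (op, key, value)

    def snapshot(self):
        return dict(self.rows)

    def changes_since(self, position):
        return self.log[position:]

source = Source()
target = {}

# 1. Note the capture start position *before* the snapshot so nothing is missed.
start = len(source.log)

# 2. Initial load: point-in-time bulk copy of all rows.
target.update(source.snapshot())

# Changes keep arriving while or after the copy runs...
source.rows[2] = "beta-v2"; source.log.append(("U", 2, "beta-v2"))
del source.rows[1];         source.log.append(("D", 1, None))

# 3. Incremental load: continuously apply changes captured since the start position.
for op, key, value in source.changes_since(start):
    if op == "D":
        target.pop(key, None)
    else:
        target[key] = value

print(target)  # {2: 'beta-v2'} -- the target now matches the source
```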
The service automatically maps source tables and fields to target tables and fields based on name matching, with options to customize target table names through user-defined rules.
- Ingestion and replication from database and mainframe sources with no footprint on the source database
- A simple, wizard-driven interface for building and managing ingestion and replication pipelines
- Pipelines do not break when the source schema changes
Easy:
- An easy 4-step wizard for data engineers to replicate and ingest bulk and CDC data from databases and mainframes.
- Out-of-the-box connectivity to relational databases and mainframes, including Db2 for z/OS.
Efficient:
- High-performance bulk load from various sources, including relational databases and mainframes.
- Efficient, non-intrusive CDC capture from relational databases and mainframes.
- High-performance ingestion and application of CDC data to cloud data warehouses, cloud data lakes, and messaging systems.
Cost-Effective:
- Ingest tens of thousands of databases and applications, along with CDC, in minutes.
- Automatic schema drift handling for source schema changes.
Database Ingestion and Replication can be used to address multiple use cases. Some of the most common ones include:
- Offline reporting: Move user reporting activity from a mission-critical production database system to a separate reporting system to avoid degrading database performance.
- Machine Learning and Generative AI: Help build data warehouses and data lakes by transferring data from multiple databases, including on-premises databases. Keep them current by continuously replicating data after the initial load. This data warehouse or data lake can then be used to power enterprise Machine Learning and Generative AI flows.
- Migration to cloud-based systems: Migrate data from on-premises database systems to cloud-based systems.
Database Ingestion and Replication operates on a Secure Agent, which must be installed on a Linux or Windows machine. Once the Secure Agent is started for the first time, the Database Ingestion and Replication agent and packages are installed locally, allowing for the configuration, deployment, and monitoring of database ingestion tasks and jobs through the Intelligent Data Management Cloud (IDMC) Web-based interface.
Deploying a task creates an executable job on the Secure Agent system. When a database ingestion job is run, task metadata is sent from the IDMC Cloud instance to the Secure Agent, which processes the data accordingly.
Change data capture (CDC) allows users to detect and manage incremental changes at the data source. Data consumers can absorb changes in real time with minimal impact on the data source or the transport system between the data source and the consumer. CDC captures changes from database transaction logs, which are then published to a destination such as a cloud data lake, cloud data warehouse, or message hub. The benefits of CDC include:
- Greater efficiency: With CDC, only data that has changed is synchronized, which saves time and improves the accuracy of data and analytics.
- Lower impact on production databases: CDC has minimal impact on the source. This facilitates high-volume data transfers to the analytics target.
- Improved time to value and lower TCO: CDC helps build data pipelines faster, saving time for data engineers and architects thereby reducing total cost of ownership (TCO).
Types of CDC
- Timestamp-based CDC: Leverages a table timestamp column and retrieves only the rows that have changed since the data was last extracted. This is the simplest method of extracting incremental data with CDC, but it can degrade production performance by consuming source CPU cycles (see the sketch after this list).
- Trigger-based CDC: Defines triggers that fire before or after INSERT, UPDATE, or DELETE statements and uses them to write change records to a change table. This adds processing overhead and slows down source production operations.
- Log-based CDC: Transactional databases store all changes in a transaction log that helps the database recover in the event of a crash. With log-based CDC, new database transactions (inserts, updates, and deletes) are read from the source database's transaction logs without making application-level changes and without having to scan operational tables.
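To make the timestamp-based approach concrete, here is a minimal, illustrative polling sketch in Python. It assumes a hypothetical orders table with a last_modified column and a DB-API driver such as psycopg2; it is not how Database Ingestion and Replication captures changes, and it also hints at the method's limits (extra query load on the source, and hard deletes leaving no timestamp behind to find).

```python
from datetime import datetime

def poll_changes(conn, last_extracted: datetime):
    """Fetch rows modified since the previous extraction and return a new watermark.

    Illustrative only: table and column names are hypothetical, and deletes are
    invisible to this method because a deleted row has no timestamp left to query.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT order_id, status, last_modified "
        "FROM orders WHERE last_modified > %s "
        "ORDER BY last_modified",
        (last_extracted,),
    )
    rows = cur.fetchall()
    new_watermark = rows[-1][2] if rows else last_extracted
    return rows, new_watermark

# Usage sketch (assuming an open psycopg2 connection named conn):
#   changes, watermark = poll_changes(conn, datetime(2024, 1, 1))
#   ... load `changes` into the target, persist `watermark` for the next poll ...
```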
Benefits of Log-based CDC:
- It is generally the preferred and fastest CDC method
- It is non-intrusive and least disruptive to production database sources
- It adds minimal overhead to database server performance
Database ingestion supports a wide variety of sources and targets. These include database sources such as SQL Server and SAP HANA, as well as mainframe sources such as Db2. The supported targets span a wide range of data warehouses, data lakes, and event queues, including Amazon Redshift, Amazon S3, Apache Kafka, Microsoft SQL Server, Databricks Delta, Google BigQuery, Microsoft Fabric OneLake, and Snowflake, among others. See the full list and the load types supported for each target below:
Before configuring a Database Ingestion task for initial, incremental, or combined initial and incremental operations, you need to prepare the source databases and targets to ensure they are ready for the Database Ingestion and Replication task and to avoid any unexpected results. Informatica Database Ingestion supports a wide variety of sources and targets. Detailed steps to prepare these sources and targets for a database ingestion job can be found below.
Apply Modes
For incremental load and combined initial and incremental load jobs, Apply Mode indicates how source DML changes (inserts, updates, and deletes) are applied to the target. Apply Mode options include:
- Standard: Accumulates the changes in a single apply cycle and intelligently merges them into fewer SQL statements before applying them to the target. For example, if an update followed by a delete occurs on the source row, no updated row is applied to the target. If multiple updates occur on the same column or field, only the last update is applied to the target. If multiple updates occur on different columns or fields, the updates are merged into a single update record before being applied to the target (a simplified sketch of this merge logic appears below this list).
- Soft Deletes: This apply mode marks a deleted row as deleted without removing it from the database. It writes a "D" in the INFA_OPERATION_TYPE column for the deleted record. Do not update the primary key in a source table when using soft deletes, or data corruption can occur on the target.
- Audit: This applies an audit trail of every DML operation made on the source tables to the target. A row for each DML change on a source table is written to the generated target table along with the audit columns you select under the Advanced section. The audit columns contain metadata about the change such as DML operation type, time, owner, transaction ID, generated ascending sequence number, and before image.
Please note: Audit mode is not supported for query-based CDC.
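The following Python sketch models the Standard-mode merge rules in a simplified way: the changes captured in one apply cycle are collapsed to at most one net operation per primary key, with later updates winning per column and a delete superseding earlier changes to the same row. It is an illustration of the merge idea only, not the SQL the service actually generates.

```python
def merge_cycle(changes):
    """changes: ordered list of (pk, op, values) with op in {"I", "U", "D"}."""
    net = {}
    for pk, op, values in changes:
        if op == "I":
            net[pk] = ("I", dict(values))
        elif op == "U":
            prev = net.get(pk)
            if prev and prev[0] in ("I", "U"):
                net[pk] = (prev[0], {**prev[1], **values})  # later updates win per column
            else:
                net[pk] = ("U", dict(values))
        elif op == "D":
            if pk in net and net[pk][0] == "I":
                del net[pk]                 # inserted and deleted in the same cycle: nothing to apply
            else:
                net[pk] = ("D", {})         # a delete supersedes earlier changes to the row
    return net

# Two updates to different columns on row 7 merge into one update;
# the update to row 9 is superseded by its delete.
print(merge_cycle([
    (7, "U", {"status": "shipped"}),
    (7, "U", {"qty": 3}),
    (9, "U", {"status": "pending"}),
    (9, "D", {}),
]))
# {7: ('U', {'status': 'shipped', 'qty': 3}), 9: ('D', {})}
```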
Configuration of a Database Ingestion task requires the following steps:
- Preliminary Checks
- Configuring Basic Task Information
- Configuring Source Information
- Configuring Target Information
- Configuring the Runtime Options
Click the resources below for detailed information about how to configure a Database Ingestion task for your choice of source and target.
The methods for accessing Oracle redo logs for Change Data Capture (CDC) processing in Database ingestion and replication jobs, specifically incremental load and combined initial and incremental load jobs, can vary based on your specific environment and requirements. Here are some alternative ways:
Direct Log Access: Jobs can directly access the physical Oracle redo logs on the on-premises source system to read change data. This method can provide the best performance if you store the logs on a solid-state disk (SSD).
NFS-Mounted Logs: Jobs can access Oracle database logs from a shared disk using a Network File Sharing (NFS) mount or another method such as Network Attached Storage (NAS) or clustered storage.
ASM-Managed Logs: Jobs can access Oracle redo logs that are stored in an Oracle Automatic Storage Management (ASM) system. To read change data from the ASM-managed redo logs, the ASM user must have SYSASM or SYSDBA authority on the ASM instance.
ASM-Managed Logs with a Staging Directory: Jobs can access ASM-managed redo logs from a staging directory in the ASM environment. This method can provide faster access to the log files and reduce I/O on the ASM system.
BFILE Access to Logs in the Oracle Server File System by Using Directory Objects: On an on-premises Oracle source system, you can configure Database Ingestion and Replication to read online and archived redo logs from the local Oracle server file system by using Oracle directory objects with BFILE locators.
Please note that the specific method to use depends on your environment and requirements. A minimal sketch of the BFILE directory-object setup follows, and the reference videos below walk through the approaches:
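As an illustration of the BFILE approach, the sketch below creates Oracle directory objects for the online and archived redo log locations and grants read access to a capture user, using the python-oracledb driver. All names, paths, and credentials are placeholders, and the exact directory objects and privileges that Database Ingestion and Replication expects are described in the product documentation; treat this as the shape of the setup rather than the setup itself.

```python
import oracledb

# Placeholder connection details; a DBA (SYSDBA) session is assumed.
conn = oracledb.connect(
    user="sys",
    password="change_me",
    dsn="dbhost:1521/ORCLPDB1",
    mode=oracledb.AUTH_MODE_SYSDBA,
)

with conn.cursor() as cur:
    # Directory objects pointing at the redo log locations (paths are placeholders).
    cur.execute("CREATE OR REPLACE DIRECTORY online_log_dir "
                "AS '/u01/app/oracle/oradata/ORCL/onlinelog'")
    cur.execute("CREATE OR REPLACE DIRECTORY archive_log_dir "
                "AS '/u01/app/oracle/fast_recovery_area/ORCL/archivelog'")
    # Let the hypothetical capture user read the logs through BFILE locators.
    cur.execute("GRANT READ ON DIRECTORY online_log_dir TO cdc_capture_user")
    cur.execute("GRANT READ ON DIRECTORY archive_log_dir TO cdc_capture_user")

conn.close()
```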
When working with relational sources and targets such as Amazon Redshift, Databricks, Google BigQuery, Microsoft Azure Synapse Analytics, Microsoft SQL Server, Oracle, PostgreSQL, and Snowflake, data-type mappings play a crucial role. By default, Database Ingestion and Replication automatically applies standard mappings to create target tables seamlessly.
Key Highlights:
Default Mappings: Predefined data-type mappings ensure smooth generation of target tables.
Customization: You can adjust the default mappings by configuring custom data-type rules to better suit your needs (a toy example follows these highlights).
Unsupported Data Types: If a source data type isn’t listed, it may not be extractable or applicable to a target.
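The toy example below shows the idea of default mappings plus a custom override rule, using made-up Oracle-to-Snowflake entries; the real default mappings and the rule configuration are defined in the product documentation and differ per source/target pair.

```python
# Default mappings (illustrative entries only, not the product's actual defaults).
DEFAULT_MAPPINGS = {
    ("oracle", "snowflake"): {
        "NUMBER":   "NUMBER",
        "VARCHAR2": "VARCHAR",
        "DATE":     "TIMESTAMP_NTZ",
    },
}

# A custom rule overriding one default (hypothetical).
CUSTOM_RULES = {
    ("oracle", "snowflake"): {
        "DATE": "DATE",
    },
}

def target_type(source_db: str, target_db: str, source_type: str) -> str:
    """Resolve the target data type: custom rules win over the defaults."""
    pair = (source_db, target_db)
    custom = CUSTOM_RULES.get(pair, {})
    return custom.get(source_type, DEFAULT_MAPPINGS[pair][source_type])

print(target_type("oracle", "snowflake", "DATE"))      # DATE (custom rule applied)
print(target_type("oracle", "snowflake", "VARCHAR2"))  # VARCHAR (default mapping)
```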
Next Steps:
For a deeper dive, explore how to customize data-type mappings to optimize your configurations by referring to the KB below:
Schema drift occurs when the source schema changes during data ingestion and replication and those changes need to be detected and handled automatically on the target. With Database Ingestion and Replication, you can configure how these changes are managed to maintain seamless data flows.
Key Highlights:
Supported Schema Changes: Database Ingestion and Replication can automatically detect and handle these changes:
- Adding a column
- Modifying a column
- Dropping a column
Configurable Options: During task setup, you can decide how to manage schema drift. Options include ignoring the changes, replicating them, or stopping processing when a change is detected (a simplified sketch of these options follows the pro tips below).
Default Behavior: Schema change handling may vary depending on the target type, ensuring compatibility and stability.
Top Considerations for Schema Drift Management:
- If a schema change is unsupported on the target, the replication job ends with an error.
- Changes to primary or unique key constraints are not replicated. In such cases, the target tables must be resynchronized.
- For Oracle or SQL Server sources, new columns are assumed to be added at the end of the table. If added elsewhere, the system might misinterpret the change, leading to alerts.
Pro Tips:
- Always apply schema changes one by one, followed by at least one DML operation, to ensure accurate detection.
- For detailed configurations, explore Configuring Schedule and Runtime Options in the product documentation.
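As a toy illustration of the options above, the sketch below diffs a source and target column list and dispatches on a schema drift setting. The option names mirror the task settings described in this section, but the detection logic is simplified; the service itself works from captured DDL rather than snapshot diffs.

```python
def detect_drift(source_cols: dict, target_cols: dict):
    """Compare {column: data_type} dicts and report added, dropped, and modified columns."""
    added    = [c for c in source_cols if c not in target_cols]
    dropped  = [c for c in target_cols if c not in source_cols]
    modified = [c for c in source_cols
                if c in target_cols and source_cols[c] != target_cols[c]]
    return added, dropped, modified

def handle_drift(option: str, added, dropped, modified):
    if not (added or dropped or modified):
        return "no drift detected"
    if option == "Ignore":
        return "drift ignored; target schema left unchanged"
    if option == "Replicate":
        return f"replicating DDL: add {added}, drop {dropped}, modify {modified}"
    if option in ("Stop Table", "Stop Job"):
        return f"{option}: processing stops until the tables are resynchronized"
    raise ValueError(f"unknown schema drift option: {option}")

added, dropped, modified = detect_drift(
    {"id": "NUMBER", "name": "VARCHAR(100)", "created_at": "TIMESTAMP"},  # source after DDL
    {"id": "NUMBER", "name": "VARCHAR(50)"},                              # current target
)
print(handle_drift("Replicate", added, dropped, modified))
```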
There are some best practices to follow when handling schema drift in Database Ingestion jobs:
Overriding schema drift options when resuming a database ingestion and replication job: You can override the schema drift options when you resume a database ingestion and replication job that is in the Stopped, Aborted, or Failed state. The overrides affect only those tables that are currently in the Error state because of the Stop Table or Stop Job Schema Drift option. Use the overrides to correct or resolve these errors. You can override schema drift options and resume an incremental load job or a combined initial and incremental load job from the All Jobs tab on the Data Ingestion and Replication page in Operational Insights.
Schema Drift Options: There are several schema drift options you can select when resuming a job with an override. Each option has a different impact on how the job handles schema changes, so choose the one that best suits your needs.
Here are the different schema drift options and their impacts when resuming a database ingestion and replication job:
Ignore: This option does not replicate DDL changes that occur on the source database to the target.
Stop Table: This option stops processing the source table on which the DDL change occurred. The job cannot retrieve the data changes that occurred on the source table after the job stopped processing it. Consequently, data loss might occur on the target. To avoid data loss, you will need to resynchronize the source and target objects that the job stopped processing.
Resync: This option resynchronizes the target tables with the latest source table definitions, including any DDL changes that the schema drift ignored. Use this option for tables that the job stopped processing because of the Stop Table setting for a Schema Drift option. This option is available only for combined initial and incremental load jobs.
Resync (refresh): For database ingestion and replication combined load jobs that have an Oracle or SQL Server source, use this option to resynchronize the target tables with the latest source table definitions, including any DDL changes that schema drift ignored. After the target tables are refreshed, the structure of the source and target tables match. This option mimics the behavior of the Resync option.
Resync (retain): For database ingestion and replication combined load jobs that have an Oracle or SQL Server source, use this option to resynchronize the same columns that have been processed for CDC, retaining the current structure of the source and target tables. No checks for changes to the source or target table definitions are performed. If source DDL changes affect the source table structure, those changes are not processed.
Replicate: This option allows the database ingestion and replication job to replicate the DDL change to the target. If you specify the Replicate option for Rename Column operations on Microsoft Azure Synapse Analytics targets, the job will end with an error.
In the process of Database Ingestion and Replication, the Apply Mode is a crucial setting that determines how changes from the source are applied to the target. Here's an explanation of the three Apply Modes:
Standard Mode: Think of this as a smart mode that optimizes how changes are applied to the target. It collects all changes in one cycle and merges them into fewer SQL statements. For instance, if a row in the source is updated and then deleted, no row is applied to the target. If there are multiple updates on the same field, only the last update is applied to the target.
Soft Deletes Mode: This mode doesn't remove deleted rows from the database. Instead, it marks them as deleted by placing a "D" in the INFA_OPERATION_TYPE column against the deleted record. However, it's important to remember not to update the primary key in a source table when using this mode, as it can lead to data corruption on the target.
Audit Mode: This mode creates an audit trail of every change made on the source tables. Each change on a source table is written to the target table along with selected audit columns. These columns contain metadata about the change like the type of operation, time, owner, transaction ID, and more. However, Audit modes are not supported for Query-based CDC.
You set the Apply Mode on the Target page of the task wizard when configuring the database ingestion and replication task. For more detailed steps, refer to the Informatica documentation.
Remember, these modes are designed to give you control over how changes are replicated, so choose the one that best suits your needs.
Deploying a Database Ingestion task
After defining a database ingestion task and saving it, you can deploy the task to create an executable job instance on the on-premises system that contains the Secure Agent and the Database Ingestion agent service and packages. You must deploy the task before you can run the job. The deployment process also validates the task definition. If the deployment process fails, the job status switches to Failed. You can then Undeploy the job and review the error logs from the Operational Insights page. After you resolve the problem, deploy the task again from the database ingestion and replication task wizard.
Running a Database Ingestion task
You can run a deployed database ingestion and replication job from one of the monitoring interfaces. Alternatively, when you create a database ingestion and replication initial load task, you can specify a schedule for running the job instances associated with the task. Please refer to the articles below:
For step-by-step guidance on the most common replication scenarios, choose one of the links below that best fits your needs:
1) Implementing Oracle to Snowflake Synchronization Using Cloud Mass Ingestion Databases.
2) Ingesting Db2 for I Change Data into Snowflake with Cloud Mass Ingestion Databases.
3) Ingesting Db2 for z/OS Change Data into Snowflake with Cloud Mass Ingestion Databases.
4) Ingesting SQL Server data and CDC changes into Amazon Redshift with Mass Ingestion Databases.
Superpipe
For users with Snowflake Data Cloud targets, moving large-scale data involves computational challenges and can impact performance for high-scale, low-latency processing. This creates complexity in connecting data sources for analytics in Snowflake Data Cloud. Superpipe is a joint innovation between Informatica and Snowflake that enables customers to replicate and stream both initial and incremental data changes up to 3.5x faster than standard CDC approaches.
Superpipe leverages Snowflake's Snowpipe Streaming and a deferred merge for high-performance, real-time ingestion into Snowflake Data Cloud. A near-real-time view of the data is always available regardless of the deferred merge interval that is set: the change table, together with a view created over the final target table, gives users a near-real-time picture of the data (a simplified sketch of this view follows). Because the deferred merge applies changes periodically rather than at transactional boundaries, it can reduce Snowflake credit consumption by up to 40%.
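The sketch below illustrates the deferred-merge idea with Snowflake SQL held in a Python string: streamed changes land in a change table immediately, a merge into the final table runs only periodically, and a view over both gives a near-real-time picture in between. The table, view, and column names (ORDERS, ORDERS__LOG, CHANGE_SEQ, OP_TYPE) are placeholders, not the objects Superpipe actually generates.

```python
# Near-real-time view over the base table plus unmerged changes (illustrative only).
NEAR_REAL_TIME_VIEW = """
CREATE OR REPLACE VIEW ORDERS_RT AS
WITH latest AS (
    SELECT *
    FROM ORDERS__LOG
    QUALIFY ROW_NUMBER() OVER (PARTITION BY ORDER_ID ORDER BY CHANGE_SEQ DESC) = 1
)
SELECT ORDER_ID, STATUS, AMOUNT                       -- newest unmerged version of changed rows
FROM latest
WHERE OP_TYPE <> 'D'
UNION ALL
SELECT o.ORDER_ID, o.STATUS, o.AMOUNT                 -- rows with no unmerged changes
FROM ORDERS o
WHERE NOT EXISTS (SELECT 1 FROM latest l WHERE l.ORDER_ID = o.ORDER_ID);
"""

if __name__ == "__main__":
    # Run the statement with your Snowflake client of choice; printed here for inspection.
    print(NEAR_REAL_TIME_VIEW)
```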
1) Diagnostic tool usage: Database Ingestion Log Collection Tool in Windows.
2) Troubleshooting Database Ingestion Service Failures.
3) CDIR Job Failures with Supplemental Logging Errors.
4) How to Troubleshoot CDIR Job Failure with the “OCI path not set” Error.
5) How to Troubleshoot CDIR Job Failure with the “SCN not in the valid SCN range” Error.
6) How to apply Custom properties at the global level in DBMI.