Last Updated: Jun 09, 2021

Challenge

In today’s world, business decisions are dependent on the availability of quality data from a variety of applications and data sources. Data latency is just one aspect of quality data. Data latency becomes especially important in an environment where changes in the data records are captured and replicated to other data sources and applications. In this context, a simple definition of data latency is the amount of time between when the change in the data record was committed and when the changed data record was replicated to the target.

Description 

To help measure data latency, organizations implement Cloud Mass Ingestion (CMI) or Database Mass Ingestion (DBMI) with the incremental load process and use service level agreements (SLAs) to formally set their expectations for overall data latency. Determining data latency can be difficult because the frequency of data changes within a given table or set of tables is not predictable. When both the source and target are relational, a Heartbeat application can assist in determining data latency.

At a high level, a Heartbeat application is rather simple. A process inserts a record into the source Heartbeat table at a predetermined time interval, for example every 15 minutes. Once the insert is committed, the CMI or DBMI incremental load process automatically replicates the inserted data change record into the target Heartbeat table.

The target Heartbeat table is a historical table, so a set of queries or reports can be generated to report the source-to-target Heartbeat data latency over a specified timeframe. Since the Heartbeat application uses the same source database, CMI or DBMI incremental load process, and target database as the other tables, its data latency will be very similar to the data latency experienced by those tables. In addition, this data latency information can help determine realistic SLAs and can alert the operations staff when the replication process is not performing as expected. An example alert might notify the operations staff when the target Heartbeat table has not received any new data change records in the past 45 to 60 minutes.
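The staleness alert described above can be sketched as a simple query against the target Heartbeat table. The snippet below is an illustrative Python example that uses an in-memory SQLite database as a stand-in for the real target; the table and column names follow the DDL later in this article, while the connection, the `ORDERSDB`/`RETAIL` values, and the `heartbeat_is_stale` helper are assumptions for the sketch.

```python
import sqlite3
from datetime import datetime, timedelta

# Illustrative stand-in for the target database; in practice this would be
# a connection to the actual target (Oracle, SQL Server, etc.).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE HEARTBEAT_TABLE (
        REC_TYPE TEXT, SRC_DBNAME TEXT, LINE_OF_BUSINESS TEXT,
        SRC_TIMESTAMP TEXT, SRC_DBTIMESTAMP TEXT, TGT_DBTIMESTAMP TEXT)
""")

def heartbeat_is_stale(conn, src_dbname, threshold_minutes=45):
    """Return True when no Heartbeat record for the given source has arrived
    at the target within the threshold, i.e. an alert condition."""
    row = conn.execute(
        "SELECT MAX(TGT_DBTIMESTAMP) FROM HEARTBEAT_TABLE "
        "WHERE REC_TYPE = 'HEARTBEAT' AND SRC_DBNAME = ?",
        (src_dbname,)).fetchone()
    latest = row[0]
    if latest is None:          # no records at all -> alert
        return True
    latest_ts = datetime.fromisoformat(latest)
    return datetime.now() - latest_ts > timedelta(minutes=threshold_minutes)

# A record that arrived 90 minutes ago should trigger the 45-minute alert.
old = (datetime.now() - timedelta(minutes=90)).isoformat(sep=" ")
conn.execute("INSERT INTO HEARTBEAT_TABLE VALUES "
             "('HEARTBEAT', 'ORDERSDB', 'RETAIL', ?, ?, ?)", (old, old, old))
print(heartbeat_is_stale(conn, "ORDERSDB"))   # True -> raise an alert
```

In production, this check would typically run on a schedule and feed an existing monitoring or paging system rather than printing to the console.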

To implement a simple Heartbeat application, the following activities or tasks need to be completed.

  • Each of the source databases using CMI or DBMI incremental load should have a Heartbeat table that will be used to help determine data latency. Note that the primary key for this table is a composite primary key. Records will be INSERTED into this source Heartbeat table at a predefined time interval, for example every 15 minutes, by another process.

Description of Source Heartbeat Table Columns

  • REC_TYPE: Identifies the record as a Heartbeat record.
  • SRC_DBNAME: An acronym that identifies the source database.
  • LINE_OF_BUSINESS: An acronym that identifies the line of business associated with the source database.
  • SRC_TIMESTAMP: The system timestamp when the process that inserted this record was executed.
  • SRC_DBTIMESTAMP: The source database timestamp when the Heartbeat record was inserted into the source database.
  • TGT_DBTIMESTAMP: The database timestamp when the Heartbeat record was inserted into the target database.


Sample Source Table DDL

CREATE TABLE schema.HEARTBEAT_TABLE (
        REC_TYPE VARCHAR2 (9) NOT NULL,
        SRC_DBNAME VARCHAR2 (10) NOT NULL,
        LINE_OF_BUSINESS VARCHAR2 (10) NOT NULL,
        SRC_TIMESTAMP TIMESTAMP NOT NULL,
        SRC_DBTIMESTAMP TIMESTAMP DEFAULT SYSTIMESTAMP NOT NULL,
        TGT_DBTIMESTAMP TIMESTAMP,
CONSTRAINT HEARTBEAT_PK
            PRIMARY KEY (REC_TYPE, SRC_DBNAME, LINE_OF_BUSINESS, SRC_TIMESTAMP));
COMMIT;

  • The target Heartbeat table will be created automatically; however, it needs a slight change. The constraint on the TGT_DBTIMESTAMP column needs to be changed so that the column is set to the current timestamp on the target. If the target is a relational database, the syntax for the ALTER TABLE statement will vary by database. If the target is a flat file, the TGT_DBTIMESTAMP field will need to be populated by another process.

Sample Alter Target Table DDL

ALTER TABLE schema.HEARTBEAT_TABLE
      MODIFY TGT_DBTIMESTAMP TIMESTAMP DEFAULT SYSTIMESTAMP NOT NULL;
COMMIT;

  • A process needs to be created that inserts a record into the source Heartbeat table at a predefined time interval, for example every 15 minutes. This ensures that at least one data change record flows through the CMI or DBMI incremental load process every 15 minutes.

Below are the column values that this process needs to set.

  • REC_TYPE: Set the value to “HEARTBEAT”, which identifies this as a Heartbeat record.
  • SRC_DBNAME: Set the value to the acronym that identifies the source database.
  • LINE_OF_BUSINESS: Set the value to the acronym that identifies the line of business.
  • SRC_TIMESTAMP: Set the value to the system timestamp when the process that inserted this record was executed.
  • SRC_DBTIMESTAMP: Set this column to null. By default, it will be set to the current timestamp of the source database.
  • TGT_DBTIMESTAMP: Set this column to null.
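The periodic insert process above can be sketched in a few lines. The following is an illustrative Python example using an in-memory SQLite database as a stand-in for the source database; in production this would connect to the actual source and be driven by a scheduler such as cron. The `insert_heartbeat` helper and the `ORDERSDB`/`RETAIL` values are assumptions for the sketch.

```python
import sqlite3
from datetime import datetime

# Stand-in for the source database; column defaults mirror the sample
# source table DDL shown earlier in this article.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE HEARTBEAT_TABLE (
        REC_TYPE TEXT NOT NULL,
        SRC_DBNAME TEXT NOT NULL,
        LINE_OF_BUSINESS TEXT NOT NULL,
        SRC_TIMESTAMP TEXT NOT NULL,
        SRC_DBTIMESTAMP TEXT DEFAULT CURRENT_TIMESTAMP NOT NULL,
        TGT_DBTIMESTAMP TEXT,
        PRIMARY KEY (REC_TYPE, SRC_DBNAME, LINE_OF_BUSINESS, SRC_TIMESTAMP))
""")

def insert_heartbeat(conn, src_dbname, line_of_business):
    """Insert one Heartbeat record: REC_TYPE fixed to 'HEARTBEAT',
    SRC_DBTIMESTAMP left to the column default, TGT_DBTIMESTAMP null."""
    conn.execute(
        "INSERT INTO HEARTBEAT_TABLE "
        "(REC_TYPE, SRC_DBNAME, LINE_OF_BUSINESS, SRC_TIMESTAMP, TGT_DBTIMESTAMP) "
        "VALUES ('HEARTBEAT', ?, ?, ?, NULL)",
        (src_dbname, line_of_business, datetime.now().isoformat(sep=" ")))
    conn.commit()

insert_heartbeat(conn, "ORDERSDB", "RETAIL")   # scheduled every 15 minutes
```

The commit at the end matters: the incremental load process picks up the change record only after the insert is committed.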
  • Since mass ingestion is used to replicate data change records for this table, it automatically generates the mapping that consumes those data change records.
  • To facilitate the reporting on the target Heartbeat table, a view should be created that includes the calculation of data latency between the various timestamp columns.

Description of the Target Heartbeat Reporting View Columns

  • REC_TYPE: Identifies the record as a Heartbeat record.
  • SRC_DBNAME: An acronym that identifies the source database.
  • LINE_OF_BUSINESS: An acronym that identifies the line of business associated with the source database.
  • SRC_TIMESTAMP: The system timestamp when the process that inserted this record was executed.
  • SRC_DBTIMESTAMP: The source database timestamp when the source record was inserted into the source database.
  • TGT_DBTIMESTAMP: The database timestamp when the Heartbeat record was inserted into the target database.
  • LATENCY_BTW_TGT_AND_SRC: The difference between TGT_DBTIMESTAMP and SRC_TIMESTAMP, formatted as mmmm.ss (minutes and seconds).


Sample Target Table View DDL for Reporting

CREATE VIEW HEARTBEAT_VIEW (
        REC_TYPE,
        SRC_DBNAME,
        LINE_OF_BUSINESS,
        SRC_TIMESTAMP,
        SRC_DBTIMESTAMP,
        TGT_DBTIMESTAMP,
        LATENCY_BTW_TGT_AND_SRC)
AS SELECT REC_TYPE,
          SRC_DBNAME,
          LINE_OF_BUSINESS,
          SRC_TIMESTAMP,
          SRC_DBTIMESTAMP,
          TGT_DBTIMESTAMP,
          TGT_DBTIMESTAMP - SRC_TIMESTAMP
   FROM schema.HEARTBEAT_TABLE
   ORDER BY REC_TYPE, SRC_DBNAME, LINE_OF_BUSINESS, SRC_TIMESTAMP;

  • A set of queries or reports will need to be created showing the data latency detail records or averages over a specified time period.
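As an example of such a report, the snippet below computes the average source-to-target latency per source database over a time window. It is an illustrative Python sketch using SQLite, where the latency is derived with `julianday()`; against Oracle, the reporting view above already exposes the latency as a timestamp difference. The sample rows and the window boundary are made up for the sketch.

```python
import sqlite3

# Stand-in target database with two sample Heartbeat records
# (latencies of 2 and 4 minutes respectively).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE HEARTBEAT_TABLE (
        REC_TYPE TEXT, SRC_DBNAME TEXT, LINE_OF_BUSINESS TEXT,
        SRC_TIMESTAMP TEXT, SRC_DBTIMESTAMP TEXT, TGT_DBTIMESTAMP TEXT)
""")
rows = [
    ("HEARTBEAT", "ORDERSDB", "RETAIL",
     "2021-06-09 10:00:00", "2021-06-09 10:00:00", "2021-06-09 10:02:00"),
    ("HEARTBEAT", "ORDERSDB", "RETAIL",
     "2021-06-09 10:15:00", "2021-06-09 10:15:00", "2021-06-09 10:19:00"),
]
conn.executemany("INSERT INTO HEARTBEAT_TABLE VALUES (?, ?, ?, ?, ?, ?)", rows)

# Average latency in seconds per source over the reporting window;
# julianday() returns fractional days, so multiply by 86400 for seconds.
avg_latency = conn.execute("""
    SELECT SRC_DBNAME,
           AVG((julianday(TGT_DBTIMESTAMP) - julianday(SRC_TIMESTAMP)) * 86400)
    FROM HEARTBEAT_TABLE
    WHERE REC_TYPE = 'HEARTBEAT'
      AND SRC_TIMESTAMP >= '2021-06-09 00:00:00'
    GROUP BY SRC_DBNAME
""").fetchall()
print(avg_latency)   # average of the 2- and 4-minute latencies, in seconds
```

Grouping by SRC_DBNAME (and optionally LINE_OF_BUSINESS) gives one latency figure per replication path, which is a natural basis for SLA reporting.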
  • A process will need to be created that will periodically purge obsolete records from the source Heartbeat table based on a specific date range. Setting the delete date range needs to be done with caution. When a record is deleted from the source Heartbeat table, the incremental load process will also remove the replicated records from the target Heartbeat table.
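The purge process above can be sketched as a single parameterized DELETE. The following illustrative Python example uses SQLite as a stand-in for the source database; the `purge_heartbeats` helper and the 90-day retention default are assumptions for the sketch. Note the caution from the text: because the incremental load process replicates these deletes, the matching target records are removed as well.

```python
import sqlite3
from datetime import datetime, timedelta

# Stand-in source database with one obsolete and one recent Heartbeat record.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE HEARTBEAT_TABLE (
        REC_TYPE TEXT, SRC_DBNAME TEXT, LINE_OF_BUSINESS TEXT,
        SRC_TIMESTAMP TEXT, SRC_DBTIMESTAMP TEXT, TGT_DBTIMESTAMP TEXT)
""")

def purge_heartbeats(conn, retention_days=90):
    """Delete source Heartbeat records older than the retention window and
    return how many were removed. The replicated deletes will also purge
    the corresponding target records."""
    cutoff = (datetime.now() - timedelta(days=retention_days)).isoformat(sep=" ")
    cur = conn.execute(
        "DELETE FROM HEARTBEAT_TABLE "
        "WHERE REC_TYPE = 'HEARTBEAT' AND SRC_TIMESTAMP < ?", (cutoff,))
    conn.commit()
    return cur.rowcount

old = (datetime.now() - timedelta(days=120)).isoformat(sep=" ")
new = datetime.now().isoformat(sep=" ")
conn.execute("INSERT INTO HEARTBEAT_TABLE VALUES "
             "('HEARTBEAT', 'ORDERSDB', 'RETAIL', ?, ?, ?)", (old, old, old))
conn.execute("INSERT INTO HEARTBEAT_TABLE VALUES "
             "('HEARTBEAT', 'ORDERSDB', 'RETAIL', ?, ?, ?)", (new, new, new))
print(purge_heartbeats(conn))   # removes the 120-day-old record only
```

Choosing the retention window is a trade-off: it must be long enough to preserve the history needed for latency reporting, yet short enough to keep the Heartbeat tables small.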
