Last Updated Date: Aug 08, 2024

Challenge

Data profiling and discovery are essential to the success of a new MDM program. IDMC's Cloud Data Governance and Catalog (CDGC) makes profiling and discovery systematic and comprehensive. Identified data sources can be scanned in whole, profiled, and documented against enterprise glossary terms, and data quality rules can be run against them to quantify the existing state of data quality.

Adding CDGC to the MDM process supports MDM development. Tables are profiled in a common setting, fields can be classified with out-of-the-box sensitivity rules, and custom metadata tags can be added to identify matching fields. Additionally, users can view data lineage for all sources and transformations, both before and after MDM development.

Description

Identify MDM Domains, Tables, and Fields

While gathering requirements for the MDM domain, build a glossary hierarchy that corresponds to those requirements. Glossary terms can be documented and tagged as part of the MDM data model.

With CDGC workflows, these domains and fields can be socialized with stakeholders, comments can be captured in the workflow ticket, and approvals or disagreements can be stored in one location. There is no more searching through emails, meeting minutes, or spreadsheets to find the decisions and historical revisions behind the approved definitions and metamodel.

Identify the Critical Data Elements (e.g., Client Name, Address, Phone Number)

With the data glossary in progress, critical data elements can be tagged. These elements are central to building your MDM ingress, match, and merge rules. Using CDGC lineage functionality, you can analyze and determine where each element originates, which supports the definition of survivorship rules.
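
As a rough illustration of what tagging critical data elements amounts to, the sketch below models glossary terms in plain Python. It does not use any CDGC API; the `GlossaryTerm` class, the `critical` flag, and the field names such as `crm.customer.full_name` are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryTerm:
    """Hypothetical stand-in for a glossary term; not a CDGC object."""
    name: str
    domain: str
    critical: bool = False  # True marks a critical data element
    # Technical fields linked to the term, written as "system.table.column"
    source_fields: list = field(default_factory=list)


glossary = [
    GlossaryTerm("Client Name", "Customer", critical=True,
                 source_fields=["crm.customer.full_name",
                                "salesforce.client.name"]),
    GlossaryTerm("Phone Number", "Customer", critical=True,
                 source_fields=["crm.customer.phone"]),
    GlossaryTerm("Preferred Color", "Customer",
                 source_fields=["crm.customer.fav_color"]),
]


def critical_fields(terms):
    """Return {term name: linked source fields} for critical terms only."""
    return {t.name: t.source_fields for t in terms if t.critical}


cdes = critical_fields(glossary)
for term, fields in cdes.items():
    print(term, "->", fields)
```

The resulting mapping is the kind of shortlist an MDM developer might start from when deciding which source fields feed ingress and match rules.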

Find and certify the Tables that will support MDM

Once domain metadata has been scanned in CDGC, the tables that are part of the ingress process can be reviewed. A table can be certified in CDGC and highlighted as certified, indicating that it has been reviewed and deemed a valid, clean, and critical table.

Build Systems and Data Sets identifying ingress fields

With scanned metadata in CDGC, you can build representations of database systems and tables. Each MDM table becomes a data set built from a selection of fields from the table. These data sets can run through the CDGC workflow approval process to capture decisions and approvals on each of the ingress tables.

Building data sets and systems in CDGC will provide a graphic view of the data model and the lineage of the data. MDM developers have a quick search function to find information about tables to ingress, and a quick profile view of fields that will expedite development.

Analyze Data Profiling Results for Data Quality and Standardization Opportunities

Once the MDM domains are established and their metadata is scanned in CDGC, profiling runs as part of the metadata scan. This profiling provides the following information:

  • Inferred Data Types
  • Pattern
  • Value Frequency

Using the CDGC profile and the identified key data elements, you can narrow the selection of tables for full data profiling. This eliminates the extra work of profiling tables that will not be useful in the MDM project.
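
To make the three profiling outputs concrete, here is a minimal sketch in plain Python of how inferred data type, pattern, and value frequency could be computed for one column. It is not how CDGC implements profiling; the function names, the `9`/`X` pattern notation, and the sample phone values are assumptions chosen for illustration.

```python
import re
from collections import Counter


def infer_type(values):
    """Guess a column type from its non-null string values."""
    non_null = [v for v in values if v not in (None, "")]
    if not non_null:
        return "unknown"
    if all(re.fullmatch(r"[+-]?\d+", v) for v in non_null):
        return "integer"
    if all(re.fullmatch(r"[+-]?\d+(\.\d+)?", v) for v in non_null):
        return "decimal"
    return "string"


def pattern(value):
    """Replace digits with 9 and letters with X; keep other characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("X")
        else:
            out.append(ch)
    return "".join(out)


def profile(values):
    """Summarize one column: type, pattern counts, value counts, nulls."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "inferred_type": infer_type(values),
        "patterns": Counter(pattern(v) for v in non_null),
        "value_frequency": Counter(non_null),
        "null_count": len(values) - len(non_null),
    }


# Hypothetical sample column
phones = ["555-0100", "555-0101", "5550102", None, "555-0100"]
result = profile(phones)
print(result["inferred_type"])
print(result["patterns"].most_common())
```

In this sample, two distinct patterns appear for the same field, which is the kind of finding that points to a standardization opportunity before match and merge.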

Build Data Quality Rule Specifications

From the data profiling results, data quality rules can be identified and built to specify reference tables, standardization rules, and null value checks. The rule specifications can be added to the data profile to give a quick look at the data quality. These data quality rules also help identify the data quality issues that may have to be resolved before match and merge testing starts.

Using the data quality rule results, reference tables can be identified that will help data standardization on ingress.
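
The sketch below shows, in plain Python, the general shape of two such rules: a null check and a reference table check, each scored as the percentage of passing values. It is an illustration only and does not reflect the rule specification format of any Informatica product; `US_STATES`, the function names, and the sample values are hypothetical.

```python
# Hypothetical reference table of valid state codes
US_STATES = {"CA", "NY", "TX"}


def not_null(value):
    """Pass when the value is present and not blank."""
    return value is not None and value.strip() != ""


def in_reference(reference):
    """Build a rule that passes when the value is in the reference set."""
    def rule(value):
        return value is not None and value.strip().upper() in reference
    return rule


def score(values, rule):
    """Percentage of values that pass the rule (0.0 for empty input)."""
    if not values:
        return 0.0
    passed = sum(1 for v in values if rule(v))
    return round(100.0 * passed / len(values), 1)


# Hypothetical sample column
states = ["CA", "ny", "Texas", None, "TX"]
valid_state = in_reference(US_STATES)

completeness = score(states, not_null)
validity = score(states, valid_state)
# Populated values that fail the reference check are standardization candidates
failures = [v for v in states if not_null(v) and not valid_state(v)]

print(completeness, validity, failures)
```

Values that are populated but fail the reference check, such as a spelled-out state name, indicate where a reference table could standardize data on ingress.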

Add Data Quality Rule Templates and Occurrences to CDGC

Data quality rules that are built for the critical tables can be applied to all the metadata that is scanned in CDGC. Each rule is associated with a particular data glossary item. Each instance of this glossary item associated with a technical field produces a unique data quality occurrence. Each time the database is scanned, the data quality rules run and a history of the scores is captured in CDGC. This allows the success of a data cleanup to be tracked and verified.
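
The relationship between rule templates, glossary items, and occurrences can be sketched as follows. This is a simplified model in plain Python, not CDGC's data model or API; the rule names, field names, dates, and scores are all made up for the example.

```python
from collections import defaultdict

# Hypothetical rule templates, each tied to one glossary item
rule_templates = [
    {"rule": "Phone Not Null", "term": "Phone Number"},
    {"rule": "Phone Format", "term": "Phone Number"},
    {"rule": "Name Not Null", "term": "Client Name"},
]

# Hypothetical links from glossary items to scanned technical fields
term_fields = {
    "Phone Number": ["crm.customer.phone", "salesforce.client.phone"],
    "Client Name": ["crm.customer.full_name"],
}


def build_occurrences(templates, links):
    """One occurrence per (rule, field) pair sharing a glossary item."""
    return [
        (t["rule"], f)
        for t in templates
        for f in links.get(t["term"], [])
    ]


occurrences = build_occurrences(rule_templates, term_fields)

# Score history per occurrence, appended to on every scan
history = defaultdict(list)


def record_scan(scan_date, scores):
    """Store the score of each occurrence for one scan."""
    for occurrence, value in scores.items():
        history[occurrence].append((scan_date, value))


occ = ("Phone Format", "crm.customer.phone")
record_scan("2024-07-01", {occ: 71.5})  # made-up score before cleanup
record_scan("2024-08-01", {occ: 93.0})  # made-up score after cleanup
improvement = history[occ][-1][1] - history[occ][0][1]
print(len(occurrences), improvement)
```

Comparing the first and latest score of an occurrence is one simple way to confirm that a cleanup effort had the intended effect.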

Data quality rule results in CDGC are aggregated both at the table level and for each glossary item. For example, you can review the data quality across all fields in the table crm.customer as well as in the table salesforce.client. This can help with survivorship and golden record requirements.
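
A small sketch of the two aggregation views, and of how they might inform survivorship, is shown below. It assumes a simple average of occurrence scores, which may differ from how CDGC actually aggregates; the scores and the `best_source` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical occurrence scores: (table, field, glossary item, score %)
scores = [
    ("crm.customer", "full_name", "Client Name", 98.0),
    ("crm.customer", "phone", "Phone Number", 70.0),
    ("salesforce.client", "name", "Client Name", 90.0),
    ("salesforce.client", "phone", "Phone Number", 96.0),
]


def aggregate(rows, key_index):
    """Average the scores grouped by the column at key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[3])
    return {key: round(mean(vals), 1) for key, vals in groups.items()}


by_table = aggregate(scores, 0)  # table-level view
by_term = aggregate(scores, 2)   # glossary-item view


def best_source(rows, term):
    """Table with the highest score for one glossary item."""
    candidates = [r for r in rows if r[2] == term]
    return max(candidates, key=lambda r: r[3])[0]


print(by_table)
print(by_term)
print(best_source(scores, "Phone Number"))
```

In this made-up data, the better source differs by glossary item, which is the kind of evidence that can justify field-level rather than table-level survivorship rules.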

Leveraging Cloud Data Governance and Catalog while building MDM requirements provides guidance on what data is available, where the data comes from, and how the data could be used.
