Last Updated Date: Jul 21, 2021

Challenge

When defining a Data Governance glossary, questions often arise about how to set up the Glossary when different entities share the same type of data.

For example, a Person has a Name, Address, Phone, and Email, but so does an Organization you may do business with.

The question is how Person information can be segregated from Organization information without repeating the information they have in common.

Description

This best practice explains how to establish an optimized and efficient design that avoids repeating definitions in separate hierarchies while still allowing domain-specific details to exist. The goal is to minimize the overall number of glossary items and avoid “Glossary Bloat”. This best practice applies primarily to the Glossary facet, but the concepts can be applied to any Data Governance facet where multiple relationships are allowed.

General Recommendations

The Problem:

In a scenario where information on both individual persons and organizations is managed in systems, repositories, and reports, there can be a lot of overlap in the individual and organization details, such as names, addresses, phone numbers, and emails.

Two separate hierarchies could be created in the Glossary facet of the Data Governance tool, and these details could be replicated in each hierarchy.

gloss-design1

This typically makes people happy since they can see everything there is to know about that domain in one neat hierarchy.

However, the definitions of the repeated glossary items may be the same. There are then two places to maintain each definition, and more glossary items have been created than are needed, leading to “glossary bloat”.

In addition, the repeated Glossary item now has to be “fully qualified” with either its Parent Glossary Name or its Glossary ID, so that the Data Governance tool will be able to differentiate between the duplicate names.

Many customers turn to prefixes or suffixes to make the names unique, even though the definition may be the same. This technique is discouraged for the same reason as above: it leads to two objects with almost identical definitions, and it also leads to “glossary bloat”.
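The duplication problem can be illustrated with a minimal Python sketch. This is not the Data Governance tool's API; the dictionary layout and the helper names `fully_qualified` and `repeated_items` are hypothetical and exist only to show how repeated items force fully qualified names and inflate the item count.

```python
from collections import Counter

# Hypothetical duplicated design: each domain repeats the shared items.
duplicated = {
    "Person": ["Name", "Address", "Phone", "Email"],
    "Organization": ["Name", "Address", "Phone", "Email"],
}


def fully_qualified(hierarchies):
    """Return 'Parent.Item' names, the form duplicates force users to adopt."""
    return [f"{parent}.{item}"
            for parent, items in hierarchies.items()
            for item in items]


def repeated_items(hierarchies):
    """Return item names that appear under more than one parent."""
    counts = Counter(item for items in hierarchies.values() for item in items)
    return sorted(name for name, n in counts.items() if n > 1)


print(len(fully_qualified(duplicated)))  # 8 child items to maintain
print(repeated_items(duplicated))        # ['Address', 'Email', 'Name', 'Phone']
```

Every name returned by `repeated_items` is a definition that has to be maintained in two places.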

The Solution:

This is a sample of a design that was implemented for a major European bank.

Concept 1:

  • Create a generic Glossary domain called Party along with the Person domain and Organization domain.
  • Everything that is generic across all types of parties (e.g., name, address, phone, email) goes into the Party hierarchy as parent/child relationships. (The diagram uses solid blue lines because parent/child relationships show up in blue text in the Data Governance tool.)
gloss-design2

Concept 2:

  • Then if there are specific nuances just for an Individual vs. an Organization, those glossary items go into a separate hierarchy just for the Individual or Organization as parent/child relationships (again Blue solid lines in the diagram).
  • But first, question whether it is really necessary to distinguish between a Personal Phone and a Business Phone. The format is the same, they are dialed in the same way, etc. Is it really worth the extra effort to maintain two different kinds of phone numbers?
gloss-design3
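Concepts 1 and 2 can be sketched in Python as three small hierarchies. This is an illustrative model only, not the tool's data structure: the `GlossaryItem` class and `count_items` helper are hypothetical, and LegalName and DBAName are used as example domain-specific items.

```python
class GlossaryItem:
    """Hypothetical glossary item that has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def qualified_name(self):
        """Build 'Parent.Child' style names by walking up the hierarchy."""
        if self.parent is None:
            return self.name
        return f"{self.parent.qualified_name()}.{self.name}"


def count_items(roots):
    """Count every item in the given hierarchies, roots included."""
    return sum(1 + count_items(root.children) for root in roots)


# Concept 1: shared items live once, under the generic Party domain.
party = GlossaryItem("Party")
for shared in ("Name", "Address", "Phone", "Email"):
    GlossaryItem(shared, parent=party)

# Concept 2: only the domain-specific nuances go under each domain.
individual = GlossaryItem("Individual")
GlossaryItem("LegalName", parent=individual)
organization = GlossaryItem("Organization")
GlossaryItem("DBAName", parent=organization)

print(count_items([party, individual, organization]))  # 9
print(party.children[0].qualified_name())              # Party.Name
```

Each shared definition exists exactly once, so there is a single place to maintain it.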

Concept 3:

  • Some users want to see the “world of names”, that is, all the types of names. In that case, a Glossary x Glossary relationship is created between Party.Name and Individual.LegalName, or between Party.Name and Organization.DBAName, etc. (Orange dotted lines in the diagram, since these relationships show up in orange in the Glossary Hierarchy in the Data Governance tool.)
  • When it’s preferable for the Individual domain and the Organization domain to also be associated with the things they have in common in the Party domain, a Glossary x Glossary relationship is created between those objects.
  • Again, determine whether the extra work of creating and then maintaining these additional relationships provides extra value to business users. Is it worth it? Remember, the Data Governance tool is not a data modeling tool, so if this is difficult for business users to understand, the best approach may be to leave the glossary items in the simpler format depicted above.
  • A Glossary x Glossary relationship is many-to-many, whereas a parent/child relationship is only one-to-many (parents can have many children, but children can have only one parent).
gloss-design4
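The difference between the two relationship types can be sketched as follows. The `Glossary` class below is a hypothetical in-memory model, not the tool's API, and it assumes that Glossary x Glossary relationships can be read in both directions.

```python
from collections import defaultdict


class Glossary:
    """Hypothetical store: one parent per item, any number of related items."""

    def __init__(self):
        self.parent = {}                 # child -> its single parent
        self.related = defaultdict(set)  # item -> related items (assumed symmetric)

    def set_parent(self, child, parent):
        # Parent/child is one-to-many: a child cannot gain a second parent.
        if child in self.parent and self.parent[child] != parent:
            raise ValueError(f"{child} already has parent {self.parent[child]}")
        self.parent[child] = parent

    def relate(self, a, b):
        # Glossary x Glossary is many-to-many: no limit on either side.
        self.related[a].add(b)
        self.related[b].add(a)


g = Glossary()
g.set_parent("Party.Name", "Party")
g.set_parent("Individual.LegalName", "Individual")
g.set_parent("Organization.DBAName", "Organization")

# The "world of names": relate the generic name to each specific name.
g.relate("Party.Name", "Individual.LegalName")
g.relate("Party.Name", "Organization.DBAName")

print(sorted(g.related["Party.Name"]))
# ['Individual.LegalName', 'Organization.DBAName']
```

Calling `g.set_parent("Party.Name", "Individual")` at this point raises `ValueError`, which mirrors the one-parent restriction; `relate` has no such limit.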

Concept 4:

  • Another possible key to determining whether to create a separate Glossary item is whether the Policies associated with the Glossary item would differ between an Individual and an Organization.
  • Because of privacy regulations, Person information is subject to far more security, masking/obscuring, and consent policies than Organization information, which is generally more public.
  • In this case, there might be a Privacy Policy associated with a Person Name and a Business Sensitivity Policy associated with an Organization Name, among other policies, each using a Glossary x Policy relationship.
gloss-design5
  • Or, a simpler approach with fewer relationships would be to keep the single Glossary item, attach both personal and organizational policies to it, and let a later process or person figure out which one applies in their usage scenario. Ask your business users if this provides the value they are looking for.
gloss-design6
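The two policy designs in Concept 4 can be compared with a small Python sketch. The dictionaries, the `POLICY_SCOPE` mapping, and the `applicable_policies` helper are hypothetical; in particular, the idea that a later consumer filters policies by party type is an assumption used to illustrate "figuring out which one applies".

```python
# Design A: separate glossary items, each with its own policy.
separate_items = {
    "Person Name": {"Privacy Policy"},
    "Organization Name": {"Business Sensitivity Policy"},
}

# Design B: a single shared glossary item carrying both policies.
shared_item = {
    "Name": {"Privacy Policy", "Business Sensitivity Policy"},
}

# Hypothetical lookup a later process might use to pick the relevant policy.
POLICY_SCOPE = {
    "Privacy Policy": "Person",
    "Business Sensitivity Policy": "Organization",
}


def applicable_policies(item_policies, item, party_type):
    """Return the policies on an item that apply to the given party type."""
    return sorted(p for p in item_policies[item]
                  if POLICY_SCOPE[p] == party_type)


print(len(separate_items), len(shared_item))               # 2 1
print(applicable_policies(shared_item, "Name", "Person"))  # ['Privacy Policy']
```

Design B keeps one glossary item instead of two, at the cost of pushing the "which policy applies" decision to whoever uses the item.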

Concept 5:

  • If it is necessary to go deeper and identify the differences between an Employee vs. Customer, follow the same process as above, adding the Glossary items in common to the Individual domain.
gloss-design7
  • Then add the specific Glossary items for the Employee domain as children, and the specific Glossary items for the Customer domain as children.
  • Then relate the Employee and Customer domains to the Individual domain as Glossary x Glossary relationships.
gloss-design8

Summary:

  • These concepts hold for any Facet x Facet relationship: relationships are many-to-many, except for some Facets that are restricted to just one relationship.
    • For example, a Data Set can have one, and only one, Glossary,
      • but a Data Set can have many Product relationships
      • or a Data Set can have many Process relationships, etc.
  • However, just like ice cream, too much of a good thing can be bad. So always weigh the pros and cons of the extra work of creating and maintaining more glossary items and more relationships against the business value gained.
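The cardinality rule in the summary can be sketched as a simple check. This is an illustrative model, not the tool's behavior or API: `MAX_RELATIONSHIPS`, `RelationshipStore`, and the example names ("Customer Master", "Loans", "Deposits") are all hypothetical.

```python
# Hypothetical cardinality limits; facet pairs not listed are unlimited.
MAX_RELATIONSHIPS = {("Data Set", "Glossary"): 1}


class RelationshipStore:
    """Hypothetical store of Facet x Facet relationships."""

    def __init__(self):
        # (source facet, source name, target facet) -> set of target names
        self.links = {}

    def relate(self, source_facet, source_name, target_facet, target_name):
        key = (source_facet, source_name, target_facet)
        targets = self.links.setdefault(key, set())
        limit = MAX_RELATIONSHIPS.get((source_facet, target_facet))
        if (limit is not None and target_name not in targets
                and len(targets) >= limit):
            raise ValueError(
                f"{source_facet} '{source_name}' already has "
                f"{limit} {target_facet} relationship(s)")
        targets.add(target_name)

    def targets(self, source_facet, source_name, target_facet):
        key = (source_facet, source_name, target_facet)
        return sorted(self.links.get(key, set()))


store = RelationshipStore()
store.relate("Data Set", "Customer Master", "Glossary", "Party.Name")
store.relate("Data Set", "Customer Master", "Product", "Loans")
store.relate("Data Set", "Customer Master", "Product", "Deposits")

try:
    # A second Glossary for the same Data Set violates the one-only rule.
    store.relate("Data Set", "Customer Master", "Glossary", "Party.Address")
except ValueError as err:
    print(err)
```

The Product relationships accumulate freely, while the second Glossary relationship is rejected.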

 
