Following a rigorous methodology is key to delivering customer satisfaction and expanding analytics use cases across the business.
When defining a Data Governance glossary, questions often arise about how to set up the Glossary when different entities share the same type of data.
For example, a Person has a Name, Address, Phone, and Email, but so does an Organization you may do business with.
The question is how Person information can be segregated from Organization information without repeating the information they have in common.
This best practice explains how to establish an efficient design that does not repeat definitions in separate hierarchies yet still allows domain-specific details to exist. The goal is to minimize the overall number of glossary items and avoid “Glossary Bloat”. This best practice applies primarily to the Glossary facet, but the concepts can be applied to any Data Governance facet where multiple relationships are allowed.
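As an illustration only, the idea of defining a common item once and relating it to more than one domain can be sketched in Python. This is not the API of any particular Data Governance tool: the `GlossaryItem` class, the `parents` field, the helper functions, and the domain-specific items "Date of Birth" and "Registration Number" are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GlossaryItem:
    """Hypothetical glossary item: one definition, any number of parents."""
    name: str
    definition: str
    parents: set = field(default_factory=set)


def build_shared_glossary():
    """Define each common item once and relate it to every domain using it."""
    glossary = {}

    def add(name, definition, *parents):
        glossary[name] = GlossaryItem(name, definition, set(parents))

    # Domain roots.
    add("Person", "An individual the business deals with.")
    add("Organization", "A company or institution the business deals with.")

    # Common items: a single definition related to both domains.
    for name in ("Name", "Address", "Phone", "Email"):
        add(name, f"The {name.lower()} of a party.", "Person", "Organization")

    # Domain-specific items (assumed examples) keep a single parent.
    add("Date of Birth", "The date a person was born.", "Person")
    add("Registration Number", "The official registration identifier.",
        "Organization")
    return glossary


def items_for(glossary, domain):
    """List the names of all items related to the given domain."""
    return sorted(i.name for i in glossary.values() if domain in i.parents)


if __name__ == "__main__":
    g = build_shared_glossary()
    print(items_for(g, "Person"))
    print(items_for(g, "Organization"))
    print(len(g), "glossary items in total")
```

In this sketch each domain still shows everything related to it, while the four common items exist only once and have a single definition to maintain.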
In a scenario where information on both individual persons and organizations is managed in systems, repositories and reports, there can be a lot of overlap in the individual and organization details such as names, addresses, phone numbers, emails, etc.
Two separate hierarchies could be created in the Data Governance tool Glossary facet and these details could be replicated in each hierarchy.
This typically makes people happy since they can see everything there is to know about that domain in one neat hierarchy.
However, the definitions of the repeated glossary items may be identical. That leaves two places to maintain the same definition and creates more glossary items than are needed, leading to “glossary bloat”.
In addition, the repeated Glossary item now has to be “fully qualified” with either its Parent Glossary Name or its Glossary ID, so that the Data Governance tool will be able to differentiate between the duplicate names.
Many customers turn to prefixes or suffixes to make the names unique, even though the definition may be the same. This technique is discouraged for the same reason as above: it produces two objects with almost identical definitions and also leads to “glossary bloat”.
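The cost of replicating items across hierarchies can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not a feature of any Data Governance tool: the `REPLICATED` list, the function names, and the `Parent.Name` qualification format are hypothetical, and the bloat count assumes that same-named items really do share one definition.

```python
from collections import defaultdict

# Hypothetical replicated design: (parent, item name) pairs in which
# each hierarchy carries its own copy of the common items.
REPLICATED = [
    ("Person", "Name"), ("Person", "Address"),
    ("Person", "Phone"), ("Person", "Email"),
    ("Organization", "Name"), ("Organization", "Address"),
    ("Organization", "Phone"), ("Organization", "Email"),
]


def duplicated_names(items):
    """Return names that occur under more than one parent, with the parents."""
    parents_by_name = defaultdict(list)
    for parent, name in items:
        parents_by_name[name].append(parent)
    return {n: p for n, p in parents_by_name.items() if len(p) > 1}


def qualified_name(parent, name):
    """Build a parent-qualified name; the '.' separator is an assumption."""
    return f"{parent}.{name}"


def bloat(items):
    """Count items that would disappear if each name were defined once."""
    return len(items) - len({name for _, name in items})


if __name__ == "__main__":
    for name, parents in duplicated_names(REPLICATED).items():
        # Every duplicate can only be referenced unambiguously when qualified.
        print(name, "->", [qualified_name(p, name) for p in parents])
    print("Redundant glossary items:", bloat(REPLICATED))
```

Prefixing or suffixing the names would hide the duplicates from a check like `duplicated_names`, but the redundant definitions would still be there to maintain.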
This is a sample of a design that was implemented for a major European bank.