Following a rigorous methodology is key to delivering customer satisfaction and expanding analytics use cases across the business.
There are many reasons that organizations choose to improve Data Quality. These reasons may include any or all of the following.
A data quality (DQ) project usually begins with a specific use case in mind. Regardless of the specific need, planning for the data quality project should be treated as an iterative process. Because data and business requirements keep changing, data quality should never be considered an absolute.
An organization must keep the continuing nature of data quality in mind whenever it undertakes a project to improve it. The goal of this Best Practice is to set forth principles that outline the iterative nature of data quality and the steps that should be considered when planning a data quality initiative. Experience has shown that applying these principles and steps maximizes the potential for ongoing success in data quality projects.
Data quality should be treated as an iterative process for two core reasons. First, the level of sophistication around data quality will continue to improve as a DQ process is implemented. Specifically, as results are disseminated throughout the organization, it becomes easier to decide which rules and standards should be implemented, because everyone is working from a single view of the truth. Although not everyone may agree on how data is being entered or identified, the baseline analysis will identify the standards (or lack thereof) currently in place. Once the initial data quality process is implemented, the iterative cycle begins. Users become more familiar with the data as they review the results of the data quality plans built to standardize, cleanse and de-duplicate it. In each iterative cycle, the data stewards should determine whether the business rules and reference dictionaries need to be modified to address any new issues that arise.
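As a rough illustration of one such cycle, the sketch below standardizes records against a reference dictionary, de-duplicates them, and reports the values the dictionary did not recognize so that a data steward can decide whether to extend it. The record layout, the `STATE_REFERENCE` dictionary, and the function names are all hypothetical and are not taken from any particular data quality product.

```python
# Hypothetical reference dictionary: observed variants -> standard value.
STATE_REFERENCE = {
    "calif.": "CA", "california": "CA", "ca": "CA",
    "n.y.": "NY", "new york": "NY", "ny": "NY",
}

def standardize(record, reference):
    """Return a cleansed copy: collapsed whitespace, title-cased name, standard state."""
    raw_state = record["state"].strip()
    return {
        "name": " ".join(record["name"].split()).title(),
        # Fall back to the trimmed input when the dictionary has no match.
        "state": reference.get(raw_state.lower(), raw_state),
    }

def run_cycle(records, reference):
    """Standardize and de-duplicate; also list state values the dictionary missed."""
    seen, cleansed, unmatched = set(), [], set()
    for record in records:
        clean = standardize(record, reference)
        if record["state"].strip().lower() not in reference:
            unmatched.add(clean["state"])
        key = (clean["name"], clean["state"])
        if key not in seen:  # keep only the first occurrence of each record
            seen.add(key)
            cleansed.append(clean)
    return cleansed, sorted(unmatched)

# Illustrative input only.
records = [
    {"name": "  jane  doe", "state": "Calif."},
    {"name": "Jane Doe", "state": "CA"},
    {"name": "John Roe", "state": "Nueva York"},
]

# Cycle 1: the dictionary does not yet know "Nueva York".
cleansed_1, unmatched_1 = run_cycle(records, STATE_REFERENCE)

# The steward reviews the unmatched values and extends the dictionary.
updated_reference = {**STATE_REFERENCE, "nueva york": "NY"}

# Cycle 2: the same data is processed with the revised dictionary.
cleansed_2, unmatched_2 = run_cycle(records, updated_reference)
```

The point of the sketch is the feedback loop rather than the matching logic: the output of one cycle (the unmatched values) drives the change to the reference dictionary that the next cycle uses.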
The second reason that data quality continues to evolve is that the data will not remain static. Although a baseline set of data quality rules will eventually be agreed upon, the assumption is that legacy data will change again soon after it has been cleansed, standardized and de-duplicated. The change could come from a user updating a record or from a new data source that needs to become part of the master data. In either case, additional iterative cycles on the updated records and/or new sources should be considered. The frequency of these cycles will vary and is driven by the processes for data entry and transformation within the organization. The resulting processes can range from cleansing data in real time to running nightly or weekly batch processes with notifications. Data stewards should monitor profiles or other applicable metrics on a regular basis to determine whether the business rules initially implemented continue to meet the data quality needs of the organization or have to be modified.
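The monitoring described above can be sketched as a comparison between a profile metric from the baseline analysis and the same metric on a later batch. This is a minimal example under stated assumptions: completeness of a single string field stands in for whatever metrics the organization actually tracks, and the function names, sample records, and 5-point tolerance are hypothetical.

```python
def completeness(records, field):
    """Share of records whose string field is present and non-blank (1.0 if no records)."""
    if not records:
        return 1.0
    filled = sum(1 for r in records if str(r.get(field) or "").strip())
    return filled / len(records)

def needs_review(baseline, current, tolerance=0.05):
    """True when the metric has dropped more than `tolerance` below the baseline."""
    return (baseline - current) > tolerance

# Illustrative data only: a baseline profile and a later nightly batch.
baseline_batch = [
    {"email": "a@example.com"},
    {"email": "b@example.com"},
    {"email": "c@example.com"},
    {"email": ""},
]
nightly_batch = [
    {"email": "a@example.com"},
    {"email": None},
    {"email": " "},
    {},
]

baseline_score = completeness(baseline_batch, "email")  # 3 of 4 filled
nightly_score = completeness(nightly_batch, "email")    # 1 of 4 filled

# Flag the batch for steward review if quality has slipped past the tolerance.
flagged = needs_review(baseline_score, nightly_score)
```

A flagged batch does not say what went wrong. It only signals that the data stewards should check whether the existing business rules still fit the incoming data.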
The questions that should be considered when evaluating the continuing and iterative nature of data quality include:
The answers to these questions will provide a framework to measure the current level of success achieved in implementing an iterative data quality initiative. These questions should be reflected upon frequently to determine if changes are needed to the data quality implementation or to the underlying business rules within a specific DQ process.
Although the reasons to iterate through the data may vary, the following steps will be present in each iterative cycle:
As noted in the above diagram, the iterative data quality process will continue to be leveraged within an organization as new master data is introduced. Defining processes up front enhances the organization’s ability to leverage the data quality solution effectively. The departments charged with implementing and monitoring data quality will do so within the confines of the organization’s enterprise-wide rules and procedures.
The following points should be considered as an expansion to the five steps noted above: