Last Updated Date: May 25, 2021

Business Goal

Over the past few decades, most organizations have implemented at least one Data Warehouse, with the common purpose of providing enterprise-wide reporting and data analysis based on a single version of the truth. A typical data warehouse implementation extracts data from a variety of sources using an on-premises tool or code set in order to cleanse, relate, and ultimately store a copy of key data elements. The storage medium is often a large on-premises database designed with a star schema model for storage and reporting.

 

Over the years, Data Warehouses have provided many benefits for organizations, but a number of limitations have companies seeking ways to improve their Data Warehouse infrastructure:

 

Storage Enhancements: Data growth and the expansion of unstructured data sources can strain the capability and cost of current storage technologies. More cost-effective and scalable storage solutions have been introduced (often exposed in the cloud) that relieve the IT team of the burden of managing, upgrading, backing up, and expanding the storage technology for new data volumes. Modern storage technology also enhances the ability to manage and process more data of wider varieties.

 

Analytics: Real-time and Artificial Intelligence capabilities are often key components in modernized data warehouses. These features may require new data structures and tools to provide end users with faster and more advanced data analysis. These new capabilities increase the value end users receive from the data, while reducing the delay and burden on IT to publish new metrics and analytics through the use of self-service data preparation.

 

Cost Control: Cloud-based technologies can provide a more predictable cost model and potential cost savings in a number of ways. On the hardware side, cost is incurred as needed by leveraging compute cycles on demand, rather than paying for a large technology footprint that is fully used only on rare heavy-volume occasions. On the software side, upgrade and maintenance costs are reduced, as these services are often provided by the cloud vendor.
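The hardware-side argument can be illustrated with simple arithmetic. The following is a minimal sketch only: the usage profile, the rates, and the function names are hypothetical and are not taken from any vendor's pricing. It compares paying for peak capacity around the clock with paying only for the compute units actually used, even when the on-demand rate is assumed to be higher.

```python
from typing import Sequence


def fixed_capacity_cost(peak_units: float, unit_hour_rate: float, hours: int) -> float:
    """Cost of provisioning for the peak load during the whole period."""
    return peak_units * unit_hour_rate * hours


def on_demand_cost(hourly_usage: Sequence[float], unit_hour_rate: float) -> float:
    """Cost when each hour is billed only for the compute units actually used."""
    return sum(hourly_usage) * unit_hour_rate


# Hypothetical day: 2 units for 22 hours, plus a 2-hour nightly load needing 10 units.
usage = [2.0] * 22 + [10.0] * 2

# Hypothetical rates: on-demand capacity is assumed to cost 50% more per unit-hour.
fixed = fixed_capacity_cost(peak_units=max(usage), unit_hour_rate=1.0, hours=len(usage))
elastic = on_demand_cost(usage, unit_hour_rate=1.5)

print(f"fixed footprint: {fixed:.2f}, on demand: {elastic:.2f}")
```

Whether on-demand compute is actually cheaper depends on how spiky the workload is. With a flat usage profile, the fixed footprint in this sketch would be the cheaper option.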

 

While the Data Warehouse continues to be the cornerstone of the standardized analytics, dashboards, and reporting that drive key business decisions, today’s technology can provide much more value, potentially at a lower overall cost. Organizations can attain a number of tangible business benefits by reviewing their current architecture and determining how best to leverage new technologies and capabilities to renew and modernize their environment. Informatica’s Data Warehouse Modernization solution is a roadmap to help make this transition while a current Data Warehouse is already in place. A well-planned modernization approach minimizes disruption and loss of value while a modernized architecture is put in place for the next generation of data analytics.

Getting Started with a Data Warehouse Modernization

Let’s explore the key steps in implementing a modern Data Warehouse, assuming a move from an on-premises architecture for data management and data storage to a cloud-based solution:

 

1. Select and Configure a Modern Data Warehouse Architecture

There are a number of cloud-hosted and born-in-the-cloud storage technologies that provide scalable compute and storage cycles and a variety of new data access capabilities for the modern Data Warehouse. These tools generally need less IT support, as the cloud vendor is responsible for uptime SLAs, data backup, and general system maintenance. The Modern Data Warehouse typically has some or all of its architecture hosted in the cloud. The architecture of the data structures also needs to be evaluated and potentially redesigned or realigned to a structure that best leverages these modern storage technologies. The new modernized architecture will be set up and configured to run in parallel with the existing Data Warehouse through the duration of the migration.

 

2. Determine Data Integration Technology

In conjunction with a move to the cloud for storage and access, companies often find this an optimal time to modernize their data integration technology to a cloud-based solution as well. Cloud-based integration removes the need to manage integration-only hardware and servers and provides quicker, more seamless upgrades. For Informatica customers this often means a migration from Informatica PowerCenter to Informatica Intelligent Cloud Services. It is possible to continue using a technology like Informatica PowerCenter to load new cloud storage technologies. However, because the integration loads into the new storage technology will need to be tested in any case, this is a good time to pivot to Informatica’s Cloud Integration Platform. Modern cloud integration tools provide a host of long-term benefits, such as new cloud-optimized integration features, reduced infrastructure maintenance, and elastic compute options. If multiple integration technologies are being used for the current Data Warehouse, this is also an opportunity to consolidate them onto a single platform. Database scripts, hand code, and other one-off integration technologies that are part of the current Data Warehouse are candidates for moving into a single integration platform for better management and control.

 

3. Migrate Data Integration Infrastructure

Three required components of migration include:

  • Migrate Schema – This is the first step to creating the metadata in the modern data warehouse. This includes migrating table structures, table specifications, indices, etc., and these may either be in a mirror format to the original or modernized to take advantage of the new storage technology.
  • Migrate Data Integration Processes – The current processes may need to be modified to optimize for the new platform or to change the target database load type. Depending on the level of change to the data warehouse schema, in some cases it may be optimal to recreate the jobs in the new architecture, or to automate the migration and configure the remaining changes.
  • Migrate Data Integration Workflows/Jobs – The schedule of data updates and refreshes may change for the new technology. Informatica recommends reviewing and updating current load patterns to account for potentially more elastic cloud compute cycles (to run more jobs in parallel).
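The schema step above can be sketched as a small translation routine. This is an illustration only, not an Informatica utility: the `Column` class, the `TYPE_MAP` entries, and the table and column names are all hypothetical, and a real mapping depends on the specific source and target technologies.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Column:
    name: str
    source_type: str  # type as declared in the current warehouse, e.g. "VARCHAR2(100)"
    nullable: bool = True


# Hypothetical source-to-target type mapping; adjust for the actual platforms.
TYPE_MAP = {
    "NUMBER": "NUMERIC",
    "VARCHAR2": "VARCHAR",
    "DATE": "TIMESTAMP",
}


def translate_type(source_type: str) -> str:
    """Map the base type and keep any length/precision arguments unchanged."""
    base, sep, args = source_type.partition("(")
    key = base.strip().upper()
    if key not in TYPE_MAP:
        # Fail loudly so unmapped types are reviewed rather than silently guessed.
        raise ValueError(f"No target type mapping for {source_type!r}")
    return TYPE_MAP[key] + sep + args


def build_create_table(table: str, columns: Sequence[Column]) -> str:
    """Render a CREATE TABLE statement for the target warehouse."""
    lines = []
    for col in columns:
        null_clause = "" if col.nullable else " NOT NULL"
        lines.append(f"    {col.name} {translate_type(col.source_type)}{null_clause}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


ddl = build_create_table(
    "sales_fact",
    [
        Column("sale_id", "NUMBER(10)", nullable=False),
        Column("customer_name", "VARCHAR2(100)"),
        Column("sold_on", "DATE"),
    ],
)
print(ddl)
```

A mirror-format migration would use a mapping like this as is. A modernized schema would additionally change the column list or table layout before rendering.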

 

There are going to be inherent changes to loading data into a new modern data storage technology. Even if the schema is not altered, the methods, transports, and capabilities are likely different across technologies. These patterns will need to be migrated into the modern data integration platform.
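When reviewing load patterns for elastic compute, one possible approach is to group jobs into "waves" in which every job can run in parallel once the previous wave has finished. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The job names and dependencies are hypothetical, and the sketch does not represent how any Informatica scheduler works.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def plan_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group jobs into waves; jobs in the same wave have no unmet dependencies.

    `dependencies` maps each job to the set of jobs that must finish first.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for a stable, readable order
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical star-schema load: dimensions first, then the fact table, then aggregates.
jobs = {
    "dim_customer": set(),
    "dim_product": set(),
    "fact_sales": {"dim_customer", "dim_product"},
    "agg_sales": {"fact_sales"},
}

for number, wave in enumerate(plan_waves(jobs), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

The width of each wave indicates how much parallel compute a load window could use, which can help when sizing elastic capacity.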

 

Informatica provides a conversion tool to assist customers in moving from PowerCenter to Informatica Cloud Data Integration. However, even this process will require some manual intervention, as these data integrations often contain objects that are tightly coupled to the previous storage technology and need to be updated. Informatica also recommends moving non-Informatica data integration components into the Informatica Cloud Platform at this time. The advantage of making this change now is that a working version exists to use as a model, so it can be replicated quickly in the modern technology and validated against the current Data Warehouse processes and results.

 

Organizations may also choose to leverage real-time capabilities to populate the Modern Data Warehouse more frequently as part of this migration, or they may focus on the standard refresh cycle now and defer real-time loads to a future phase of modernization. Implementing real-time capabilities may require additional time in project planning.

 

4. Perform One Time Data Migration

There are cases where the new storage technology can be reloaded by simply pulling all historical data from the source systems. For many organizations, however, this is impractical because the current Data Warehouse holds more history than the current source systems do, including a history of changes over time that the operational systems do not store. For this reason, the best solution is often to copy the current Data Warehouse data into the new Modern Data Warehouse using a number of one-time loads. Informatica Cloud Data Integration can be leveraged to automate the build of these jobs and quickly port the current snapshot of the existing Data Warehouse into the Modern Data Warehouse. This is often a network-intensive, and sometimes time-intensive, one-time task. Informatica recommends executing this step while still running both warehouses in parallel, rather than waiting for the cutover to the new technology. This allows time to resolve any issues with the model and configurations in the modernized storage technology.
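The general shape of such a one-time load is a batched copy from the old warehouse to the new one. The sketch below is a generic illustration using two in-memory SQLite databases as stand-ins for the two warehouses; it is not how Informatica Cloud Data Integration builds its jobs, and the table name, columns, and batch size are hypothetical.

```python
import sqlite3
from typing import Sequence


def copy_table_in_batches(source: sqlite3.Connection, target: sqlite3.Connection,
                          table: str, columns: Sequence[str],
                          batch_size: int = 1000) -> int:
    """Copy all rows of `table` from source to target, committing once per batch."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"
    cursor = source.execute(f"SELECT {col_list} FROM {table}")
    copied = 0
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        target.executemany(insert_sql, rows)
        target.commit()  # per-batch commit keeps each transaction small
        copied += len(rows)
    return copied


# Stand-ins for the existing and the modern warehouse.
old_dw = sqlite3.connect(":memory:")
new_dw = sqlite3.connect(":memory:")
schema = "CREATE TABLE sales_history (id INTEGER PRIMARY KEY, amount REAL, loaded_on TEXT)"
old_dw.execute(schema)
new_dw.execute(schema)
old_dw.executemany(
    "INSERT INTO sales_history (id, amount, loaded_on) VALUES (?, ?, ?)",
    [(i, i * 1.5, "2021-05-25") for i in range(1, 2501)],
)
old_dw.commit()

copied = copy_table_in_batches(old_dw, new_dw, "sales_history",
                               ["id", "amount", "loaded_on"])
print(f"copied {copied} rows")
```

Batching bounds memory use and transaction size. For very large tables, a real migration would typically also use the target platform's bulk-load path rather than row inserts.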

 

5. Operationalize Data Ingestion to Modern Data Warehouse

After the one-time load into the Modern Data Warehouse is completed, the incremental loads will be turned on to ingest ongoing changes into the warehouse. The old and new loads will run in parallel for a time, while testing is performed to ensure that the Modern Data Warehouse results match those of the old Data Warehouse. Any data anomalies will be researched and addressed. This parallel run may increase stress on the source systems because data is extracted twice, so its duration should be minimized. It is nevertheless an important period for validating that the modernized warehouse provides the same or better accuracy than the current data warehouse. Informatica provides validation tools to accelerate the validation process.
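The kind of check performed during the parallel run can be sketched as a row-count and checksum comparison per table. This is a generic illustration and not a representation of Informatica's validation tools; the SQLite connections stand in for the two warehouses, and the table and column names are hypothetical.

```python
import hashlib
import sqlite3
from typing import Tuple


def table_fingerprint(conn: sqlite3.Connection, table: str,
                      key_column: str) -> Tuple[int, str]:
    """Return (row_count, checksum) for a table, reading rows in key order."""
    digest = hashlib.sha256()
    count = 0
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def compare_tables(old_conn: sqlite3.Connection, new_conn: sqlite3.Connection,
                   table: str, key_column: str) -> dict:
    """Compare one table across both warehouses and report whether it matches."""
    old_count, old_hash = table_fingerprint(old_conn, table, key_column)
    new_count, new_hash = table_fingerprint(new_conn, table, key_column)
    return {
        "table": table,
        "old_rows": old_count,
        "new_rows": new_count,
        "match": old_count == new_count and old_hash == new_hash,
    }


# Stand-ins for the old and the modern warehouse, loaded with identical data.
old_dw = sqlite3.connect(":memory:")
new_dw = sqlite3.connect(":memory:")
for conn in (old_dw, new_dw):
    conn.execute("CREATE TABLE daily_sales (day TEXT PRIMARY KEY, total REAL)")
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?)",
                     [("2021-05-24", 100.0), ("2021-05-25", 250.5)])

before = compare_tables(old_dw, new_dw, "daily_sales", "day")

# Simulate an anomaly in the new warehouse: one total is loaded incorrectly.
new_dw.execute("UPDATE daily_sales SET total = 999.0 WHERE day = '2021-05-25'")
after = compare_tables(old_dw, new_dw, "daily_sales", "day")

print(before["match"], after["match"])
```

A checksum only reports that a table differs. Locating the offending rows requires a key-by-key comparison, and column types that are represented differently on the two platforms would need to be normalized before hashing.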

 

6. Reporting and Analytics Redeployment

The reporting and analytics environment supported by the current Data Warehouse will need to be repointed and rereleased using the Modern Data Warehouse as the new source, potentially with new visualization tools as well. The effort to migrate this infrastructure may be simple and seamless, or it may be significant, depending on how different the new storage technology is from the previous one and on which new capabilities are being used, such as expanded data preparation and ad hoc analysis. The previous technology cannot be fully decommissioned until the reporting, analytics, and extract infrastructure that depends on the current Data Warehouse is repointed and users are migrated to the new tools and warehouse.

 

7. Decommission Old Data Warehouse and Related Infrastructure

The old Data Warehouse can now be decommissioned, freeing up servers, database licensing, and IT support costs. The Modern Data Warehouse is now the center of the analytics infrastructure, providing the benefits of cloud-based systems such as scalability, ease of upgrades, and reduced support costs. Depending on industry requirements, some organizations may choose to archive the previous warehouse for legal or other reasons.

 

8. Evaluate and Improve

From here, an organization may move into the next phase of modernizing the Data Warehouse, such as continuing to evolve batch loads into more real-time loads where possible and expanding the data analysis and mining capabilities provided by the new Modern Data Warehouse technology. Additional data sets that were previously too big or too complex to be managed by older storage technology may now be leveraged in the modernized, scalable technology. This may open new areas for insights and end-user value that lead to better business outcomes.
