Last Updated Date: May 25, 2021

Challenge

End-to-end (E2E) data mapping is often the critical path on business intelligence, data migration, data warehouse, or data synchronization projects. This best practice defines E2E mapping as the combination of target definition and requirements, source system data discovery and analysis, data profiling, mapping rules definition, and development and testing. In layman's terms, “getting the data right” is what ultimately determines the project timeline.
 

In waterfall methodologies, the tasks associated with E2E mapping may be assigned to five or more teams, with each team communicating via detailed deliverables (often in unstructured formats such as Excel or Microsoft Word) and with formal processes for document review, sign-off, and change control. A waterfall methodology works well for a Data Integration (DI) project where the requirements are clear, well-articulated, and stable; the source data is well understood and of high quality; and the transformation rules are simple and unambiguous. When these conditions do not hold, however, E2E analysis, mapping, and development become iterative, because data issues and needs are discovered and refined along the way.
 

The combination of imperfect knowledge and evolving requirements creates several challenges:
 

  • When multiple teams operate as silos with formal (slow) interactions between them, each iteration is slow, and what should take hours or days can take weeks or months.
  • Each team has an incomplete picture of the E2E DI details and, without appropriate and comprehensive documentation, makes assumptions that result in wasted effort and rework.
  • Team members spend a significant amount of time documenting and communicating information to other teams and reviewing and validating documentation from other teams. Much of this documentation exists solely for communication between the teams and would not be needed if the staff worked as one team on an E2E data flow.
  • By the time the project is delivered, the business has progressed and the requirements may have changed to the point where the delivered solution no longer provides the level of business value originally envisioned.
  • Planning everything at the beginning of a project, followed by design and then development and testing, assumes perfect knowledge of the E2E integration. This makes projects inflexible, which conflicts with the reality that market conditions, business requirements, and customer preferences change constantly. The complexity of IT projects also makes it difficult to predict all requirements, schedules, and resources at the outset.
     

Lean and Agile DI best practices address these challenges.

Description

Preconditions and Constraints for Agile Data Integration

Adopting Lean and Agile DI requires using an Agile project methodology and organizing project staff into work-cells or high-performance teams. If an organization is strongly entrenched in using a waterfall methodology and organizing staff into functional silos with formal documentation and governance, then this best practice will not work.
 

Only apply Agile Data Integration in contexts where Lean and Agile practices have already been established or where the operational and cultural transformation for a Lean and Agile approach is an explicitly planned and managed aspect of the project.

What Is Agile Data Integration

Agile DI increases the chance of success, delivers projects faster, and reduces defects. Applying Lean principles within the organization can help ease the transition to Agile DI. First and foremost, Lean recommends an organizational focus on eliminating waste and optimizing the DI process from the customers’ perspective.
 

Agile Data Integration maximizes the business value of projects (e.g., Agile Business Intelligence (BI), Data Warehousing (DW), Big Data Analytics, and Data Migration) by delivering exactly what the business needs. Agile breaks big projects into smaller, more manageable deliverables in order to deliver value to the business incrementally. Agile DI also recommends the following:
 

  • Organize Cross-Functional, Collaborative Teams rather than Silos – A term that is sometimes used is High Performance Team: a tight-knit group of people focused on a common goal. Within a high-performance team, people are highly skilled and able to interchange roles. Leadership is not vested in a single individual. Instead, the team is largely self-organizing, and different members take the lead according to the need at that point in time. The team has shared norms and values, feels a strong sense of accountability for achieving its goals, and displays a high level of mutual trust.
  • Use Empirical Process Control – Do not try to plan everything up front (fixed date, fixed cost, fixed price). DI projects, by nature, contain many unpredictable and unknown elements, although this may not be obvious in advance. The illusion of predictability that a precise plan offers is reassuring and may not be easy to give up, but it removes flexibility from the project. Because change is constant, Agile DI relies on empirical process control, inspecting results frequently and adapting, rather than trying to plan everything first.
  • Facilitate Collaboration – Frequent and effective collaboration between core team members is required. On ongoing projects such as DW/BI, teams struggle with changing requirements and complex technologies. Agile requires much more collaboration between management, the customer, and development teams than more traditional project approaches. Requirements can emerge as the product matures or the business environment changes. DW/BI projects, like many system development projects, are unpredictable in nature. Empirical process control is best suited for managing such complex processes, and it requires frequent inspection and adaptive response.
  • Remove Impediments – The project leader or Scrum Master is tasked with removing impediments and making sure the project team members are as productive as possible. Provide team members with the best tools to complete their tasks efficiently and effectively. Remove non-value-added tasks from their daily routine and continuously optimize the process to meet the needs of the customer.
  • Assess Projects Frequently – Agile gives management direct control of the project through frequent assessment of value, cost, duration, and so on. The team needs tools to understand and measure progress, quality, and what is left to complete. Inspect, assess, and adjust frequently, for example in a Daily Standup Meeting. Demonstrate working user functionality on the selected technology (e.g., the Sprint goal in Scrum).
  • Adapt to Change – Learn from each iteration and quickly adapt to changes. Business requirements will change, and new data sources may emerge. Plan for change, and implement a flexible data architecture, built on adaptive data services, to hide underlying data complexities, insulate projects and applications from change, maximize reuse, and ensure business rules and policies are consistently applied across projects and the enterprise.
  • Balance Design and Production – It is important to balance design and production so that the project continuously evolves to meet requirements and incrementally delivers value to the business. Too often there is a lot of design with little construction or too much construction without enough design. This separation of design and production results in too many tasks that do not contribute to creating software (e.g., endless meetings, back-and-forth between stage gates, rework, rubber-stamp code reviews, etc.).

Agile Data Integration Tools and Techniques

Agile DI can be performed to some degree with virtually any integration tool. The Informatica 9.1 platform facilitates the process by providing features that explicitly support Agile DI with best practices that encourage good data governance, facilitate business-IT collaboration, promote reuse and flexibility through data virtualization, and enable rapid prototyping and test-driven development. Ideally, organizations that want to adopt Agile DI successfully should standardize on the following practices and leverage Informatica 9.1 to streamline the DI process, improve data governance, and provide a flexible data virtualization architecture.
 

  • The business and IT work efficiently and effectively to translate requirements and specifications into data services – Informatica enables analysts and developers to collaborate on source-to-target ETL mapping specifications using role-based tools and a central metadata repository. This is also referred to as self-service DI because it empowers the business to be autonomous and collaborate more efficiently with IT in order to quickly receive the necessary information while IT retains control of the process for governance and compliance. Self-service DI streamlines the communication between the business and IT and eliminates errors so that IT can focus on more value-added tasks. For example, the DI Analyst option enables analysts to quickly translate business requirements to mapping specifications and validate DI logic without the help of IT. Analysts can easily share these specifications with developers and automatically generate mappings to increase efficiency and deliver relevant, trustworthy, and authoritative information to business users for their Agile DI projects.
  • Rapid prototyping and validation of mapping specifications and business rules – Rapid prototyping and iterative development are key Agile best practices. For example, by using the Informatica 9.1 platform, analysts can directly validate DI specifications and data quality rules within the Informatica Analyst browser-based user interface. Developers can rapidly prototype data services, starting with a logical data object connected to physical data sources through what are called read-maps. The logical data object can be profiled and validated by all stakeholders, including developers and analysts.
  • Ensure the information is consistent, complete and accurate – It is important to have a data governance program in place for Agile DI to provide information transparency, integrity, and confidence. This requires the ability to create a business glossary of terms linked to an information catalog containing metadata about data sources and targets, source-to-target ETL mappings, and other DI objects. This enables analysts to quickly find data by searching on business terms and browsing the data lineage.
  • Data profiling is used continuously throughout the design and development phases – Column-level profiling and comparative profiling help analysts, stewards, and developers measure progress and quality frequently throughout the Agile DI implementation lifecycle. This enables continuous, test-driven development that avoids mistakes downstream. Use model profiling to infer both primary and foreign key relationships directly in the developer tool, confirm that join conditions will work, identify orphan rows, and find redundant values in other tables.
  • Data quality is built into the DI process to deliver trusted data – With a unified development environment, it is easy to find, reuse, and include data quality rules in the data processing pipeline. Browser-based dashboards provide complete visibility of data quality for the most important data for all Agile BI and DW project stakeholders at each and every iteration. In this way, all stakeholders can easily monitor data quality and quickly respond to trends.
  • Insulate applications from underlying data changes – One technique for insulating front-end reporting or analytics systems from changes to the underlying systems is to use a canonical (common) data model for information exchanges. The Informatica 9.1 platform facilitates this technique through data virtualization, built on a model-driven data services architecture. Informatica’s adaptive data services enable Agile BI organizations to build a data virtualization architecture that can access and federate any data source, implement data governance policies (e.g., access, quality, retention, privacy, and latency), and help agile teams manage data assets by hiding the underlying data complexities from consuming applications, which insulates them from underlying data changes.
  • Minimize delivery time and maximize reuse – Reusing integration components is another technique for minimizing delivery time. The Informatica 9.1 platform facilitates reuse via adaptive data services, which provide flexible deployment options. The team designs and builds data services once and deploys them as needed through ODBC/JDBC, web services, or a batch ETL process. This maximizes reuse across Agile DI projects and enables the business to receive the necessary data quickly.
  • Automate repetitive processes such as testing – Agile teams should look for opportunities to build scripts or tools that automate routine activities, or to leverage capabilities in off-the-shelf software. For example, Informatica provides several tools that help decrease the time it takes to perform unit testing and system testing for Agile DI projects. Comparative profiling enables developers to quickly compare profiles before and after transformations. The Data Validation option enables automated tests that compare actual target results with the expected results of the ETL process, thereby reducing test cycle time by up to twenty times. Data subsetting enables the creation of realistic datasets based on production data, which increases test coverage, lowers costs, and reduces defects. Data masking protects sensitive data, minimizing the risk of a security breach.
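
Two of the checks mentioned in the list above, identifying orphan rows during profiling and comparing actual target results with expected results, can be illustrated with a minimal, tool-independent sketch. This is not Informatica's implementation or API; the function names (`find_orphan_rows`, `diff_target`), the column names, and the sample data are all hypothetical and chosen only for illustration.

```python
from typing import Any

Row = dict[str, Any]


def find_orphan_rows(child: list[Row], parent: list[Row], fk: str, pk: str) -> list[Row]:
    """Return child rows whose foreign key value has no matching parent key."""
    parent_keys = {row[pk] for row in parent}
    return [row for row in child if row[fk] not in parent_keys]


def diff_target(expected: list[Row], actual: list[Row], key: str) -> dict[str, list]:
    """Compare actual target rows with expected rows, matched on a key column."""
    exp = {row[key]: row for row in expected}
    act = {row[key]: row for row in actual}
    return {
        "missing": sorted(exp.keys() - act.keys()),        # expected but not loaded
        "unexpected": sorted(act.keys() - exp.keys()),     # loaded but not expected
        "mismatched": sorted(k for k in exp.keys() & act.keys() if exp[k] != act[k]),
    }


# Hypothetical sample data (assumption: small in-memory tables stand in for real sources).
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # references a customer that does not exist
]
orphans = find_orphan_rows(orders, customers, fk="customer_id", pk="customer_id")

expected = [{"order_id": 10, "total": 100}, {"order_id": 11, "total": 50}]
actual = [
    {"order_id": 10, "total": 100},
    {"order_id": 11, "total": 55},  # value differs from the expectation
    {"order_id": 12, "total": 5},   # row that was not expected at all
]
report = diff_target(expected, actual, key="order_id")
```

Checks of this kind are cheap to rerun at every iteration, which is the point of automating them: a non-empty `orphans` list or a non-empty entry in `report` signals a data issue before it reaches downstream consumers.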

Agile Data Integration Architecture


A flexible and agile DI architecture is critical for supporting the needs of an agile organization. Informatica provides comprehensive data services, role-based tools, unified administration, and universal connectivity as the technical foundation for Agile best practices.
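
The canonical (common) data model technique described earlier is one way such an architecture insulates consumers from source changes. The following is a minimal sketch of the idea only, not of any Informatica feature: the `Customer` record, the two source layouts, and the mapper functions are hypothetical. Consuming code depends solely on the canonical record, so a change in a source layout is absorbed by that source's mapper.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Customer:
    """Canonical customer record seen by consuming applications."""
    customer_id: str
    full_name: str
    country: str


# One mapping function per source system (hypothetical source layouts).
def from_crm(row: dict[str, Any]) -> Customer:
    return Customer(str(row["CUST_NO"]), f'{row["FIRST"]} {row["LAST"]}', row["CTRY"])


def from_billing(row: dict[str, Any]) -> Customer:
    return Customer(str(row["id"]), row["name"], row["country_code"])


MAPPERS: dict[str, Callable[[dict[str, Any]], Customer]] = {
    "crm": from_crm,
    "billing": from_billing,
}


def to_canonical(source: str, row: dict[str, Any]) -> Customer:
    """Translate a source-specific row into the canonical model."""
    return MAPPERS[source](row)


# The same customer arrives in two different source shapes.
a = to_canonical("crm", {"CUST_NO": 42, "FIRST": "Ada", "LAST": "Lovelace", "CTRY": "GB"})
b = to_canonical("billing", {"id": "42", "name": "Ada Lovelace", "country_code": "GB"})
```

Adding a new source in this sketch means registering one more mapper; reports and applications that read `Customer` records do not change.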

How to Get Started with Agile Data Integration

To get started, identify an executive or senior manager to sponsor an early-win project that proves the value of Agile DI. First and foremost, remember the Lean principle of knowing the customer, eliminating waste, and optimizing the process from the customers’ perspective. This is essential to the success of the first project. Identify change agents and people who are passionate about delivering maximum business value to the customer. Create an “as-is” and a “to-be” value stream map of the DI process. Informatica Professional Services offers a Lean Integration Value Stream Mapping Workshop to give participants experience in applying Lean principles and optimizing DI processes. Over time, continue to standardize on and leverage the full capabilities of the Informatica platform to streamline the DI process, facilitate business-IT collaboration, improve data governance, and provide a flexible data virtualization architecture.
