Data Engineering Integration (Big Data Management) delivers high-throughput data ingestion and data integration processing so business analysts can get the data they need quickly. Hundreds of prebuilt, high-performance connectors, data integration transformations, and parsers enable virtually any type of data to be quickly ingested and processed on big data infrastructures, such as Apache Hadoop, NoSQL, and MPP appliances. Beyond prebuilt components and automation, Informatica provides dynamic mappings, dynamic schema support, and parameterization for programmatic and templatized automation of data integration processes.
The Beginner level of the learning path helps you understand DEI fundamentals. It consists of videos, webinars, and other documents covering an introduction to DEI, the DEI architecture, the Blaze architecture, use cases on Azure, Amazon cloud, and other ecosystems, integration with AWS, and more.
After you have successfully completed all three levels of DEI product learning, you will earn an Informatica Badge for Data Engineering Integration (Big Data Management). Start your product learning right away and earn your badge!
This module covered an introduction to DEI, which consisted of a short overview of Big Data Management and getting started with DEI and discussed various functionalities of Big Data.
This module also discussed how to redeploy PowerCenter applications and mappings into the Big Data world, how to set up and configure the Spark engine, Blaze configuration, how to collect logs and where the Blaze logs are located, and common issues and how to troubleshoot them. It also covered advanced Spark functionality and the solution overview, themes, drivers, and new use cases for BDS 10.2.2, including stream processing and analytics.
You also explored how to run DEI on Microsoft Azure, Amazon Cloud, Cloudera Altus, and MapR. You also gained insight into Operational Insights and its goals, workspaces, Informatica Azure Databricks support, how to create a ClusterConfigObject (CCO), cluster provisioning configuration, the Databricks connection, and DEI's capabilities to integrate with the AWS ecosystem.
Now move on to the Intermediate level of your DEI onboarding to get to know more about the product.
This video provides a brief introduction to and overview of Informatica Data Engineering Integration (Big Data Management).
Watch the video to learn more about Hadoop and next-generation data integration on the Hadoop platform and its features.
Big Data: What you need to know
If you are a beginner in DEI products, go through the following primer to get brief information on each product and the services provided.
You can also find information on the tools, documentation, and resources associated with each product. The primer covers the following products:
- Data Engineering Integration (Big Data Management)
- Data Engineering Quality (Big Data Quality)
- Enterprise Data Lake
- Data Engineering Integration Streaming
- Enterprise Data Catalog
A cluster configuration is an object in the domain that contains configuration information about the Hadoop cluster. The cluster configuration enables the Data Integration Service to push mapping logic to the Hadoop environment. Creating a cluster configuration object (CCO) helps beginners in DEI products create Hive, HDFS, Hadoop, and HBase connections.
Metadata Access Service (MAS) is required in Data Engineering Integration to simplify installation and configuration of the Developer Client Tool (DxT), so that end users do not need to install and configure adapter-related binaries with every DxT installation. This video introduces MAS and explains the adapters enabled through it.
This video discusses the capabilities of Avro and Parquet. It also describes the benefits of using Parquet and explains how to transform from Avro to Parquet.
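As a rough illustration of why a columnar format such as Parquet can help analytic reads, the sketch below pivots row-oriented records (the way a row-based format such as Avro stores them) into a column-oriented layout. It uses plain Python structures only and does not read or write real Avro or Parquet files; the function name `rows_to_columns` and the sample fields are hypothetical.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into a column-oriented layout."""
    # Collect every field name, preserving first-seen order.
    field_names: list[str] = []
    for row in rows:
        for name in row:
            if name not in field_names:
                field_names.append(name)
    # Build one list per column; a field missing from a row becomes None.
    return {name: [row.get(name) for row in rows] for name in field_names}


# Hypothetical sample records, stored one record at a time (row layout).
rows = [
    {"id": 1, "country": "US", "amount": 10.0},
    {"id": 2, "country": "DE", "amount": 7.5},
    {"id": 3, "country": "US", "amount": 3.25},
]

columns = rows_to_columns(rows)

# An aggregate over one field now touches only that column's values.
total_amount = sum(columns["amount"])
```

In the column layout, a query that needs only `amount` never has to look at `id` or `country`, which is the general idea behind columnar storage.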
Informatica Data Engineering Streaming (Big Data Streaming) provides real-time stream processing of unbounded data. Go through this video to get an introduction to its functionalities, the Stream Data Management Reference Architecture, and the use cases that Data Engineering can help you with.
This video gives an overview of Big Data Streaming (BDS) and its key concepts. It also describes Kafka Architecture and license options for Informatica BDS.
This video explores streaming mappings and the sources and targets that they support. It also explains how to create a BDS mapping with a Kafka source in Informatica Developer.
This video provides information on Confluent Kafka in Data Engineering Integration v10.4. It also explains how to create a Confluent Kafka connection using infacmd.
This video discusses how to enable column projection for a Kafka topic using Informatica Big Data Streaming.
This video explains how to create an Amazon Kinesis connection in the Informatica Developer tool, in the admin console, and from the command line.
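Conceptually, column projection means reading only the fields you need from each message instead of carrying the whole payload through the pipeline. The sketch below illustrates that idea in plain Python for JSON-encoded messages; it is not Informatica's implementation and does not connect to Kafka. The helper `project_columns` and the sample payloads are hypothetical.

```python
import json
from typing import Any


def project_columns(message: bytes, columns: list[str]) -> dict[str, Any]:
    """Parse one JSON message and keep only the requested fields."""
    record = json.loads(message)
    # Fields absent from the message are returned as None.
    return {name: record.get(name) for name in columns}


# Hypothetical payloads, standing in for messages read from a topic.
raw_messages = [
    b'{"device_id": "d-1", "temp": 21.5, "firmware": "1.0.3"}',
    b'{"device_id": "d-2", "temp": 19.0, "firmware": "1.0.4"}',
]

# Keep only the two fields that downstream processing needs.
projected = [project_columns(m, ["device_id", "temp"]) for m in raw_messages]
```

Here the `firmware` field is dropped from every record, so later steps work with a narrower schema.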
This video demonstrates how easily you can take existing PowerCenter applications and mappings and redeploy them into the Big Data world.
The video discusses what the PowerCenter Reuse report identifies. It includes a quick demo that explains how to reuse existing PowerCenter applications for big data workloads, generate a report to assess the effort involved in the journey to the big data world, and use the import utility to import PowerCenter mappings seamlessly.
Learn more about Blaze - one of the execution engines that Informatica uses. The video gives you an overview of the following features in Blaze:
- Blaze configuration
- How to collect logs
- Locations of the Blaze logs
- Common issues and how to troubleshoot them
- Tips and tricks while you are using Blaze
The video explains how to set up and configure the Spark engine in Informatica Data Engineering Integration. It also includes a demonstration that guides you through the entire setup process and provides important information for using the Spark engine in real-world scenarios.
Learn how DEI integrates seamlessly with Cloudera Altus. The video gives an overview of Cloudera Altus, a typical scenario in Informatica's customer base, followed by a demo.
This video explains what Azure Databricks is and how it is integrated with Informatica Data Engineering Integration.
The video also takes you through workspaces, Informatica Azure Databricks support, how to create a ClusterConfigObject (CCO), cluster provisioning configuration, and the Databricks connection, followed by a demo.
Learn how to implement end-to-end big data solutions in the Amazon Ecosystem using Data Engineering Integration (Big Data Management) 10.2.1.
The video explains the DEI 10.2.1 capabilities to integrate with the AWS ecosystem.
Go through this video to learn how to integrate Informatica BDM with Azure Databricks Delta.
Watch this video to learn how DEI helps resolve the "Data Lake on Cloud" use case in the Azure ecosystem, followed by a demo.
If you are moving your on-premises data to the cloud and processing from on-premises to Azure, this video helps you understand how DEI can support your journey to the cloud. The video also discusses different features and functionalities in the Azure ecosystem and a use case.
Watch another video to learn how DEI helps resolve the "Data Lake on Cloud" use case in the Amazon ecosystem, followed by a demo.
If you are moving away from an on-premises data warehouse and toward cloud and data lake scenarios, this video helps you understand how DEI can support your journey to the cloud. This video also explains the additional Informatica features on Amazon and a data lake use case supported by a demo.
The video takes you through DEI 10.2.2 and elaborates on its strategic themes: being an enterprise-class product in the big data ecosystems, offering advanced Spark functionality, and being available across clouds and connectivity.
The video takes you through the Enterprise Data Streaming (EDS) Management solution overview, along with the themes, drivers, and new use cases for EDS 10.2.2. It also covers stream processing and analytics, streaming data integration, Spark Structured Streaming support and how it can help you, CLAIRE integration, and much more.
Operational Insights is a cloud-based application that provides visibility into the performance and operational efficiency of Informatica assets across the enterprise. This webinar introduces a new product offering, Operational Insights for Informatica Data Engineering Integration (Big Data Management). With it, you can better understand the big data cluster resource utilization of Informatica jobs, analyze mapping and workflow executions, manage capacity, and troubleshoot issues in minutes. The webinar includes product demos.
Webinar: Introduction to Operational Insights for Informatica Data Engineering Integration (Big Data Management)
Watch this video to learn how DEI integrates with the MapR ecosystem, one of the Hadoop vendors.
This video also explains how to solve the "Data warehousing offloading" use case with the help of a demo.