Overview

Lake system scanners such as Amazon S3, ADLS Gen2, and Google Cloud Storage support profiling of the following file types:

  • Avro
  • Parquet
  • CSV

Data profiling in CDGC involves evaluating the quality of metadata extracted from the respective source system. Running a profile on Avro or Parquet files in CDGC requires an advanced cluster configuration due to the complexity of these file formats. Avro and Parquet are optimized for big data processing, with Avro being row-based and Parquet being columnar. 
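
The row-based versus columnar distinction can be illustrated with a minimal Python sketch. It uses plain lists and dictionaries rather than real Avro or Parquet libraries, and the sample records and helper names (`to_columnar`, `null_count`) are hypothetical, chosen only for illustration.

```python
# Illustrative only: plain Python structures standing in for the two layouts.
# No Avro or Parquet library is used; the records below are made-up sample data.

# Row-based layout (Avro-like): each record is stored together.
rows = [
    {"id": 1, "country": "US", "amount": 10.5},
    {"id": 2, "country": None, "amount": 7.0},
    {"id": 3, "country": "DE", "amount": None},
]


def to_columnar(records):
    """Pivot row records into a column-oriented layout (Parquet-like)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def null_count(values):
    """Count missing values in a single column."""
    return sum(1 for v in values if v is None)


columns = to_columnar(rows)

# In the columnar layout a per-column statistic reads one list at a time;
# in the row layout every record must be visited to reach a single field.
null_counts = {name: null_count(values) for name, values in columns.items()}
print(null_counts)
```

A profile is essentially a set of per-column statistics like these null counts, which is why the physical layout of the file affects how the data has to be read.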

An advanced cluster is a Kubernetes cluster that provides the distributed processing needed to profile complex file types (Avro and Parquet) in CDGC. It distributes tasks across multiple nodes so that large-scale data can be processed efficiently, which makes it suited to the complexity and size of Avro and Parquet files.
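
The idea of distributing profiling work and combining the results can be sketched in a few lines of Python. This is a conceptual illustration only, not how CDGC or the advanced cluster is implemented: a local thread pool stands in for cluster nodes, and the partition data and function names (`profile_partition`, `merge_stats`) are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def profile_partition(values):
    """Compute partial statistics for one partition of a column."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def merge_stats(partials):
    """Combine per-partition statistics into one column profile."""
    mins = [p["min"] for p in partials if p["min"] is not None]
    maxs = [p["max"] for p in partials if p["max"] is not None]
    return {
        "count": sum(p["count"] for p in partials),
        "nulls": sum(p["nulls"] for p in partials),
        "min": min(mins) if mins else None,
        "max": max(maxs) if maxs else None,
    }


# Hypothetical column split into three partitions, one per worker.
partitions = [[4, 8, None], [15, 16], [None, 23, 42]]

# Each worker profiles its own partition independently (threads stand in for nodes).
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(profile_partition, partitions))

profile = merge_stats(partials)
print(profile)
```

Counts, null counts, minimum, and maximum can each be merged from partial results, which is what lets this kind of work be spread across workers without any one of them seeing the whole file.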

Pre-Requisites
Self-Service Resources
Goals
  • Learn how to set up an advanced cluster to profile Avro and Parquet files in CDGC based on business requirements.
  • Learn how to configure data profiling for complex file types.
Outcome
  • Gain a comprehensive understanding of setting up an advanced cluster and using it to profile complex file types such as Avro and Parquet in CDGC.
Required Roles/Personas
Engagement Details
Catalog Type: Ask An Expert
Engagement Category: Feature Clarity
Products: Cloud Data Governance and Catalog
Engagement Type: Ask An Expert
Adoption Stage: Configure, Implement
Focus Area: Adoption - Technical, Functional
Engagement ID: AAE-CDGC-029

Disclaimer

  • All the topics covered in the Success Accelerators/Ask An Expert sessions are intended for guidance and advisory only. This is implicit and it will not be called out under the scope of each engagement.
  • Customers need to include their relevant technical/business team members highlighted in each engagement topic to derive the best out of each engagement.
  • Customers need to perform any hands-on work by themselves leveraging the guidance from these engagements.
  • Customers need to work with Informatica Global Customer Support for any product bugs/issues and troubleshooting.
