Polyglot Data Manager
Managing data from diverse sources
Last updated
Invariant Polyglot Data Manager (PDM) integrates with the Invariant data platform and other data sources, providing SQL query support for both interactive and batch workloads. PDM is designed to scale with dataset size to meet the demands of enterprise environments. It applies file system caching to data held in object storage and, together with its connectors, reduces latency and avoids repeated retrieval of the same data. Caching frequently accessed data on local storage devices also takes load off distributed file systems (e.g., HDFS), which shortens query execution times and improves overall system performance. Because of this caching, ad-hoc activities can run on the same data lake without disrupting ongoing workloads.
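The caching idea described above can be illustrated with a minimal read-through cache. This is only a sketch of the general technique, not PDM's implementation: the `ReadThroughCache` class, the `fetch_from_object_store` function, and the in-memory `FAKE_OBJECT_STORE` dictionary are all hypothetical stand-ins introduced for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


class ReadThroughCache:
    """Hypothetical local-disk cache placed in front of a slower remote store."""

    def __init__(self, fetch_remote, cache_dir):
        self.fetch_remote = fetch_remote      # callable: key -> bytes
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.remote_reads = 0                 # counts trips to the remote store

    def _local_path(self, key):
        # Hash the key so any object name maps to a safe local file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def read(self, key):
        path = self._local_path(key)
        if path.exists():
            # Cache hit: serve from local storage, no remote call.
            return path.read_bytes()
        # Cache miss: fetch once from the remote store, then keep a local copy.
        data = self.fetch_remote(key)
        self.remote_reads += 1
        path.write_bytes(data)
        return data


# Assumption: a dictionary stands in for object storage in this sketch.
FAKE_OBJECT_STORE = {"sales/part-0000": b"example bytes"}


def fetch_from_object_store(key):
    return FAKE_OBJECT_STORE[key]


cache = ReadThroughCache(fetch_from_object_store, tempfile.mkdtemp())
first = cache.read("sales/part-0000")   # miss: goes to the remote store
second = cache.read("sales/part-0000")  # hit: served from local disk
```

After both reads, `cache.remote_reads` is 1, which is the effect the paragraph describes: repeated access to the same data does not add load on the underlying storage.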
At its core, PDM features a distributed query engine that enables parallel data processing across multiple servers. It allows users to query across various data sources without the need for complex ETL processes to centralize the data. This empowers analysts to efficiently run both ad-hoc and batch workloads, conducting SQL-based analysis on large, distributed datasets. PDM specializes in data analytics, excelling at sampling vast datasets to uncover patterns and trends, facilitating quick, data-driven decision-making. Once these patterns are identified and validated, resulting models can be scaled using existing data lake resources, leveraging predefined ETL and ELT pipelines.
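To make the idea of querying across sources without first centralizing the data more concrete, here is a small self-contained sketch. It uses Python's built-in `sqlite3` module and two attached in-memory databases as stand-ins for separate data sources; it does not use PDM, and the schema names `crm` and `lake`, the tables, and the rows are all invented for illustration.

```python
import sqlite3

# One connection plays the role of the query engine.
conn = sqlite3.connect(":memory:")

# Each ATTACH creates a separate in-memory database, standing in for
# an independent data source (assumed names: "crm" and "lake").
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS lake")

conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.execute("CREATE TABLE lake.orders (customer_id INTEGER, amount REAL)")

conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "APAC")],
)
conn.executemany(
    "INSERT INTO lake.orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)

# A single SQL statement joins tables that live in different sources;
# no data is copied into a central store beforehand.
rows = conn.execute(
    """
    SELECT c.region, SUM(o.amount) AS total
    FROM crm.customers AS c
    JOIN lake.orders AS o ON o.customer_id = c.id
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()

print(rows)  # [('APAC', 50.0), ('EMEA', 200.0)]
```

The point of the sketch is the shape of the query: each table is addressed by a source-qualified name and joined in one statement. A distributed engine would additionally split the work across multiple servers, which this single-process example does not attempt to show.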
The PDM driver is a crucial client component that enables seamless communication between applications and the PDM Server Engine. It facilitates interaction with external systems, allowing users to run queries on data stored across diverse environments, such as databases, data lakes, and cloud storage. Clients can easily integrate with the PDM engine through a variety of platforms, including desktop applications, web interfaces, and modern BI tools like Tableau, providing smooth access to data for comprehensive analysis and reporting.
PDM works with both cloud and on-premises data sources. You can use PDM for:
Serving as the SQL query engine behind business intelligence tools.
Providing a fast, interactive SQL querying experience for big data.
Federating queries across multiple data sources.
Querying data stored in distributed data lakes such as Amazon S3, Google Cloud Storage, or Hadoop HDFS.