
Roadmap

Engine

  • Update platform to add support for Hadoop 2.8. This will bring in critical fixes for HDFS and YARN.

  • Update Elasticsearch to version 7.9

  • XML SerDe improvements and large XML support

  • JSON SerDe improvements

  • Improved fault tolerance with support for multiple standby name nodes

  • Scalable Timeline Server to improve the reliability of the timeline service

  • Workload management - resource pooling

  • Improved ACID transaction support

  • Spark Engine for data pipeline and analytics

Inventory Manager

  • Auto schema and version management to handle source system schema changes

  • Recon-Merge tooling improvements. Additional state management for improved error handling and restarts.

  • Simplify dependencies: remove the additional web service dependency to streamline configuration generation and deployment.

  • Support for a centralized Schema Registry for use with ingestion and to improve configuration management.

New Pipeline Source DB Connectors

  • Oracle

  • Postgres

  • MySQL

Discovery Pipeline

  • Kafka 2.x support

  • Pipeline error log capture to HDFS to display mapping and transformation errors from Hive.

  • Schema Registry and Avro payload support

Miscellaneous

  • Improved alerting and monitoring integration with third-party tools

  • Tooling for Kafka lag monitoring and alerting for Operations

  • nmon collection agents for performance metrics

OS Support

  • RHEL 8.x

  • CentOS 8.x

Container Support

  • Docker, Podman

  • Kubernetes

  • OpenShift

Last updated 4 years ago