Documents
  • Invariant Documents
  • Platform
    • Data Platform
      • Install Overview
      • System Requirement
      • Software Requirement
      • Prepare the Environment
      • Installing Ambari Server
      • Setup Ambari Server
      • Start Ambari Server
      • Single Node Install
      • Multi-Node Cluster Install
      • Cluster Install from Ambari
      • Run and monitor HDFS
    • Apache Hadoop
      • Compatible Hadoop Versions
      • HDFS
        • HDFS Architecture
        • Name Node
        • Data Node
        • File Organization
        • Storage Format
          • ORC
          • Parquet
        • Schema Design
      • Hive
        • Data Organization
        • Data Types
        • Data Definition
        • Data Manipulation
          • CRUD Statement
            • Views, Indexes, Temporary Tables
        • Cost-based SQL Optimization
        • Subqueries
        • Common Table Expression
        • Transactions
        • SerDe
          • XML
          • JSON
        • UDF
      • Oozie
      • Sqoop
        • Commands
        • Import
      • YARN
        • Overview
        • Accessing YARN Logs
    • Apache Kafka
      • Compatible Kafka Versions
      • Installation
    • Elasticsearch
      • Compatible Elasticsearch Versions
      • Installation
  • Discovery
    • Introduction
      • Release Notes
    • Methodology
    • Discovery Pipeline
      • Installation
      • DB Event Listener
      • Pipeline Configuration
      • Error Handling
      • Security
    • Inventory Manager
      • Installation
      • Metadata Management
      • Column Mapping
      • Service Configuration
      • Metadata Configuration
      • Metadata Changes and Versioning
        • Generating Artifacts
      • Reconciliation, Merging Current View
        • Running daily reconciliation and merge
      • Data Inventory Reports
    • Schema Registry
  • Process Insight
    • Process Insight
      • Overview
    • Process Pipeline
      • Data Ingestion
      • Data Storage
    • Process Dashboards
      • Panels
      • Templating
      • Alerts
        • Rules
        • Notifications
  • Content Insight
    • Content Insight
      • Release Notes
      • Configuration
      • Content Indexing Pipeline
    • Management API
    • Query DSL
    • Configuration
  • Document Flow
    • Overview
  • Polyglot Data Manager
    • Polyglot Data Manager
      • Release Notes
    • Data Store
      • Concepts
      • Sharding
    • Shippers
      • Filerelay Container
    • Processors
    • Search
    • User Interface
  • Operational Insight
    • Operational Insight
      • Release Notes
    • Data Store
      • Concepts
      • Sharding
    • Shippers
      • Filerelay Container
    • Processors
    • Search
    • User Interface
  • Data Science
    • Data Science Notebook
      • Setup JupyterLab
      • Configuration
        • Configuration Settings
        • Libraries
    • Spark DataHub
      • Concepts
      • Cluster Setup
      • Spark with YARN
      • PySpark Setup
        • DataFrame API
      • Reference
  • Product Roadmap
    • Roadmap
  • TIPS
    • Service Troubleshooting
    • Service Startup Errors
    • Debugging YARN Applications
      • YARN CLI
    • Hadoop Credentials
    • Sqoop Troubleshooting
    • Log4j Vulnerability Fix
Powered by GitBook