Documents
  • Invariant Documents
  • Platform
    • Data Platform
      • Install Overview
      • System Requirement
      • Software Requirement
      • Prepare the Environment
      • Installing Ambari Server
      • Setup Ambari Server
      • Start Ambari Server
      • Single Node Install
      • Multi-Node Cluster Install
      • Cluster Install from Ambari
      • Run and monitor HDFS
    • Apache Hadoop
      • Compatible Hadoop Versions
      • HDFS
        • HDFS Architecture
        • Name Node
        • Data Node
        • File Organization
        • Storage Format
          • ORC
          • Parquet
        • Schema Design
      • Hive
        • Data Organization
        • Data Types
        • Data Definition
        • Data Manipulation
          • CRUD Statement
            • Views, Indexes, Temporary Tables
        • Cost-based SQL Optimization
        • Subqueries
        • Common Table Expression
        • Transactions
        • SerDe
          • XML
          • JSON
        • UDF
      • Oozie
      • Sqoop
        • Commands
        • Import
      • YARN
        • Overview
        • Accessing YARN Logs
    • Apache Kafka
      • Compatible Kafka Versions
      • Installation
    • Elasticsearch
      • Compatible Elasticsearch Versions
      • Installation
  • Discovery
    • Introduction
      • Release Notes
    • Methodology
    • Discovery Pipeline
      • Installation
      • DB Event Listener
      • Pipeline Configuration
      • Error Handling
      • Security
    • Inventory Manager
      • Installation
      • Metadata Management
      • Column Mapping
      • Service Configuration
      • Metadata Configuration
      • Metadata Changes and Versioning
        • Generating Artifacts
      • Reconciliation, Merging Current View
        • Running daily reconciliation and merge
      • Data Inventory Reports
    • Schema Registry
  • Process Insight
    • Process Insight
      • Overview
    • Process Pipeline
      • Data Ingestion
      • Data Storage
    • Process Dashboards
      • Panels
      • Templating
      • Alerts
        • Rules
        • Notifications
  • Content Insight
    • Content Insight
      • Release Notes
      • Configuration
      • Content Indexing Pipeline
    • Management API
    • Query DSL
    • Configuration
  • Document Flow
    • Overview
  • Polyglot Data Manager
    • Polyglot Data Manager
      • Release Notes
    • Data Store
      • Concepts
      • Sharding
    • Shippers
      • Filerelay Container
    • Processors
    • Search
    • User Interface
  • Operational Insight
    • Operational Insight
      • Release Notes
    • Data Store
      • Concepts
      • Sharding
    • Shippers
      • Filerelay Container
    • Processors
    • Search
    • User Interface
  • Data Science
    • Data Science Notebook
      • Setup JupyterLab
      • Configuration
        • Configuration Settings
        • Libraries
    • Spark DataHub
      • Concepts
      • Cluster Setup
      • Spark with YARN
      • PySpark Setup
        • DataFrame API
      • Reference
  • Product Roadmap
    • Roadmap
  • TIPS
    • Service Troubleshooting
    • Service Startup Errors
    • Debugging YARN Applications
      • YARN CLI
    • Hadoop Credentials
    • Sqoop Troubleshooting
    • Log4j Vulnerability Fix
Powered by GitBook
On this page
  1. Platform
  2. Apache Hadoop
  3. Hive

Subqueries

Subqueries in WHERE Clauses

A subquery is a SQL expression that returns a set of rows. The subquery is evaluated and its query result set used to evaluate the parent query, the outer query that contains the subquery. Hive supports subqueries to include WHERE clauses, as shown in the following example:

SELECT state, net_payments FROM transfer_payments WHERE transfer_payments.year 
  IN (SELECT year FROM us_census);

There are some restrictions for the use of subqueries in WHERE clauses.

  • Subqueries in WHERE SQL clauses have the following limitations:

  • Subqueries must appear on the right hand side of an expression.

  • Nested subqueries are not supported.

  • Only one subquery expression is allowed for a single query.

  • Subquery predicates must appear as top level conjuncts.

  • Subqueries support four logical operators in query predicates: IN, NOT IN, EXISTS, and NOT EXISTS.

  • The IN and NOT IN logical operators may select only one column in a WHERE clause subquery.

  • The EXISTS and NOT EXISTS operators must have at least one correlated predicate.

  • The left side of a subquery must qualify all references to table columns.

  • References to columns in the parent query are allowed only in the WHERE clause of the subquery.

  • Subquery predicates that reference a column in a parent query must use the Equals To ( =) predicate operator.

  • Subquery predicates may not refer only to columns in the parent query.

  • Correlated subqueries with an implied GROUP BY statement may return only one row.

  • All unqualified references to columns in a subquery must resolve to tables in the subquery.

  • Correlated subqueries cannot contain windowing clauses.

PreviousCost-based SQL OptimizationNextCommon Table Expression

Last updated 6 years ago