Transactions
Support for transactions in enables SQL atomicity of operations at the row level rather than at the level of a table or partition. This allows a Hive client to read from a partition at the same time that another Hive client is adding rows to the same partition. In addition, transactions provide a mechanism for streaming clients to rapidly update Hive tables and partitions.
Hive transactions differ from RDBMS transactions in that each transaction has an identifier, and multiple transactions are grouped into a single transaction batch. A streaming client requests a set of transaction IDs after connecting to Hive and subsequently uses these transaction IDs one at a time during the initialization of new transaction batches. Clients write one or more records for each transaction and either commit or abort a transaction before moving to the next transaction.
Last updated