Parquet

Parquet is a popular columnar storage format, which provides compressed, efficient columnar data representation that can be used in the Hadoop ecosystem. Parquet provides efficient compression and encoding schemes. It is implemented using the record-shredding and assembly algorithm and can support complex data structures.

Read more about Parquet spec at https://parquet.apache.org/

Last updated