Parquet
Parquet is a columnar file format designed to store large amounts of data compactly and to make that data efficient to read. Because values are stored column by column, a reader can load only the columns it needs instead of scanning every row.
This makes Parquet a popular choice for big data and analytics applications that work with large, complex datasets.
Some of the key benefits of Parquet files include:
- Smaller file size: Columnar storage and compression typically make Parquet files smaller than row-oriented text formats such as CSV or JSON.
- Faster performance: Data is organized by column, so the specific values you need can be found and read quickly.
- Cross-platform: Parquet files can be read by many different tools and frameworks, so your data is easy to share.
- Metadata support: Each file stores its schema and other information about the data structure, so readers do not need a separate schema definition.
Parquet parameters
Required parameters are in red and Optional parameters are in blue.
- directory, the folder location to write Parquet files to
- compression, the method used to compress data (default: GZIP; allowed: GZIP, SNAPPY)
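The parameter rules above can be sketched as a small validation helper. This is only an illustration of the documented defaults and allowed values: `ParquetSyncerConfig` and `ALLOWED_COMPRESSION` are hypothetical names, not part of CS Tools, and the case-insensitive handling of `compression` is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path

# Allowed values as listed in the parameter description above.
ALLOWED_COMPRESSION = ("GZIP", "SNAPPY")


@dataclass(frozen=True)
class ParquetSyncerConfig:
    """Hypothetical container for the two documented parameters (not a CS Tools class)."""

    directory: Path
    compression: str = "GZIP"  # documented default

    def __post_init__(self) -> None:
        # Assumption: treat "snappy" and "SNAPPY" as the same value.
        normalised = self.compression.upper()
        if normalised not in ALLOWED_COMPRESSION:
            raise ValueError(
                f"compression must be one of {ALLOWED_COMPRESSION}, "
                f"got {self.compression!r}"
            )
        # A frozen dataclass needs object.__setattr__ to store normalised values.
        object.__setattr__(self, "compression", normalised)
        object.__setattr__(self, "directory", Path(self.directory))
```

For example, `ParquetSyncerConfig(directory=".")` falls back to GZIP, while passing `compression="LZ4"` raises a `ValueError` because that value is not in the documented list.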
How do I use the Parquet syncer in commands?
cs_tools tools searchable bi-server --syncer parquet://directory=.
- or -
cs_tools tools searchable bi-server --syncer parquet://definition.toml
Definition TOML Example
definition.toml