Posted to commits@parquet.apache.org by sh...@apache.org on 2022/03/16 18:58:12 UTC
[parquet-site] 38/39: Merge branch 'apache:asf-site' into asf-site
This is an automated email from the ASF dual-hosted git repository.
shangxinli pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/parquet-site.git
commit 3e713ee44fa57dc02b444d55e0f8c03f1957061e
Merge: 3f0917d 5cabac3
Author: Vegard Stikbakke <ve...@gmail.com>
AuthorDate: Wed Mar 9 09:03:50 2022 +0100
Merge branch 'apache:asf-site' into asf-site
.asf.yaml | 6 ++-
Gemfile.lock | 4 +-
output/documentation/latest/index.html | 92 ++++++++++++++++----------------
source/documentation/latest.html.md | 96 +++++++++++++++++-----------------
4 files changed, 100 insertions(+), 98 deletions(-)
diff --cc source/documentation/latest.html.md
index 5307955,95d4163..b713a3d
--- a/source/documentation/latest.html.md
+++ b/source/documentation/latest.html.md
@@@ -135,25 -135,24 +135,25 @@@ path for the column are defined. Repet
in the path has the value repeated. The max definition and repetition levels can
be computed from the schema (i.e. how much nesting there is). This defines the
maximum number of bits required to store the levels (levels are defined for all
- values in the column).
+ values in the column).
-Two encodings for the levels are supported BIT_PACKED and RLE. Only RLE is now used as it supersedes BIT_PACKED.
+Two encodings for the levels are supported: BIT_PACKED and RLE. Only RLE is now used as it supersedes BIT_PACKED.
## Nulls
- Nullity is encoded in the definition levels (which is run-length encoded). NULL values
- are not encoded in the data. For example, in a non-nested schema, a column with 1000 NULLs
+ Nullity is encoded in the definition levels (which is run-length encoded). NULL values
+ are not encoded in the data. For example, in a non-nested schema, a column with 1000 NULLs
would be encoded with run-length encoding (0, 1000 times) for the definition levels and
- nothing else.
+ nothing else.
## Data Pages
For data pages, the 3 pieces of information are encoded back to back, after the page
- header. We have the
+ header. We have the
- - definition levels data,
- - repetition levels data,
+ - definition levels data,
+ - repetition levels data,
- encoded values.
-The size of specified in the header is for all 3 pieces combined.
+
+The size specified in the header is for all 3 pieces combined.
The data for the data page is always required. The definition and repetition levels
are optional, based on the schema definition. If the column is not nested (i.e.