You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/20 08:52:09 UTC

[GitHub] [incubator-doris] gaodayue opened a new pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

gaodayue opened a new pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953
 
 
   Fixes #2892 
   
   IMPORTANT NOTICE: this CL makes incompatible changes to V2 storage format, developers need to create new tables for test.
   
   This CL refactors the metadata and page format for segment_v2 in order to
   * make it easy to extend existing page type
   * make it easy to add new page type while not sacrificing code reuse
   * make it possible to use SIMD to speed up page decoding
   
   Here we summary the main code changes
   * Page and index metadata is redesigned, please see `segment_v2.proto`
   * `PageCompressor` and `PageDecompressor` is removed. `PageIO` is the single place for reading and writing pages. Lots of duplicated code removed.
   * Value ordinal now uses 64-bits `ordinal_t` instead of `rowid_t`, this affects ordinal index as well.
   * Column's ordinal index is now implemented by IndexPage, the same with IndexedColumn.
   * Zone map index is now implemented by IndexedColumn
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r383675174
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/parsed_page.h
 ##########
 @@ -17,47 +17,88 @@
 
 #pragma once
 
-#include "olap/rowset/segment_v2/page_decoder.h" // for PagePointer
-#include "util/rle_encoding.h" // for RleDecoder
+#include <memory>
+
+#include "common/status.h"
+#include "gen_cpp/segment_v2.pb.h"
+#include "olap/rowset/segment_v2/common.h"
+#include "olap/rowset/segment_v2/encoding_info.h"
+#include "olap/rowset/segment_v2/options.h"
+#include "olap/rowset/segment_v2/page_decoder.h"
+#include "olap/rowset/segment_v2/page_handle.h"
+#include "util/rle_encoding.h"
 
 namespace doris {
 namespace segment_v2 {
 
-class PageHandle;	
-struct PagePointer;
-
 // This contains information when one page is loaded, and ready for read
 // This struct can be reused, client should call reset first before reusing
 // this object
 struct ParsedPage {
-    ParsedPage() { }
+
+    static Status create(PageHandle handle,
+                         const Slice& body,
+                         const DataPageFooterPB& footer,
+                         const EncodingInfo* encoding,
+                         const PagePointer& page_pointer,
+                         uint32_t page_index,
+                         std::unique_ptr<ParsedPage>* result) {
+        std::unique_ptr<ParsedPage> page(new ParsedPage);
+        page->page_handle = std::move(handle);
+
+        auto null_size = footer.nullmap_size();
+        page->has_null = null_size > 0;
+        page->null_bitmap = Slice(body.data + body.size - null_size, null_size);
+
+        if (page->has_null) {
+            page->null_decoder = RleDecoder<bool>(
+                    (const uint8_t*) page->null_bitmap.data, null_size, 1);
+        }
+
+        Slice data_slice(body.data, body.size - null_size);
+        PageDecoderOptions opts;
+        RETURN_IF_ERROR(encoding->create_page_decoder(data_slice, opts, &page->data_decoder));
+        RETURN_IF_ERROR(page->data_decoder->init());
+
+        page->first_rowid = footer.first_ordinal();
+        page->num_rows = footer.num_values();
+        page->page_pointer = page_pointer;
+        page->page_index = page_index;
+
+        *result = std::move(page);
+        return Status::OK();
+    }
+
     ~ParsedPage() {
         delete data_decoder;
     }
 
-    PagePointer page_pointer;
     PageHandle page_handle;
 
+    bool has_null;
     Slice null_bitmap;
     RleDecoder<bool> null_decoder;
     PageDecoder* data_decoder = nullptr;
 
     // first rowid for this page
-    rowid_t first_rowid = 0;
-
+    ordinal_t first_rowid = 0;
 
 Review comment:
   If this type is changed to ordinal_t, is it better to change name to `first_oid`?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590019622
 
 
   2. I think ZoneMap and OrdinalIndex Read/Writer logic remaining the same may be better. 
   Firstly ZoneMap and OrdinalIndex is simple, may not need to used IndexedColumnWriter/Reader complicated logic.
   Secondly IndexedColumnWriter will also contain all index writer, if we add BTree index in the futher.
   Thirdly if use the above optimization, the ZoneMap and OrdinalIndex also not suitable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590641429
 
 
   1. The indention it not to reduce storage size, it's to reduce I/O operation for store bitmap pointer like reverted index in Lucene.
   2. ZoneMap/Ordinal Index is so easy, I think it's may be better not integrated into so complicated indexed column.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r383189896
 
 

 ##########
 File path: gensrc/proto/segment_v2.proto
 ##########
 @@ -205,22 +191,52 @@ message IndexedColumnMetaPB {
     optional uint64 size = 7;
 }
 
-message BitmapIndexColumnPB {
+// -------------------------------------------------------------
+// Column Index Metadata
+// -------------------------------------------------------------
+
+enum ColumnIndexTypePB {
+    ORDINAL_INDEX = 0;
+    ZONE_MAP_INDEX = 1;
+    BITMAP_INDEX = 2;
+    BLOOM_FILTER_INDEX = 3;
+}
+
+message ColumnIndexMetaPB {
+    optional ColumnIndexTypePB type = 1;
+    optional OrdinalIndexPB ordinal_index = 7;
 
 Review comment:
   Ordinal 2 to 6 is reserved for other common fields that may be added in the future. `PageFooterPB` uses the same method

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r384446070
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/column_zone_map.h
 ##########
 @@ -17,8 +17,9 @@
 
 
 Review comment:
   OK

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590641429
 
 
   1. The indention it not to reduce storage size, it's to reduce I/O operation for store bitmap pointer.
   2. ZoneMap/Ordinal Index is so easy, I think it's may not integrated into so complicated indexed column.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r383676486
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/page_io.cpp
 ##########
 @@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/rowset/segment_v2/page_io.h"
+
+#include <cstring>
+#include <string>
+
+#include "common/logging.h"
+#include "env/env.h"
+#include "gutil/strings/substitute.h"
+#include "olap/page_cache.h"
+#include "util/block_compression.h"
+#include "util/coding.h"
+#include "util/crc32c.h"
+#include "util/faststring.h"
+#include "util/runtime_profile.h"
+
+namespace doris {
+namespace segment_v2 {
+
+using strings::Substitute;
+
+Status PageIO::compress_page_body(const BlockCompressionCodec* codec,
+                                  double min_space_saving,
+                                  const std::vector<Slice>& body,
+                                  OwnedSlice* compressed_body) {
+    if (codec != nullptr) {
+        size_t uncompressed_size = Slice::compute_total_size(body);
+        size_t max_compressed_size = codec->max_compressed_len(uncompressed_size);
+        faststring buf;
+        buf.resize(max_compressed_size);
+        Slice compressed_slice(buf);
+        RETURN_IF_ERROR(codec->compress(body, &compressed_slice));
+        buf.resize(compressed_slice.get_size());
+
+        double space_saving = 1.0 - static_cast<double>(buf.size()) / uncompressed_size;
 
 Review comment:
   Better to handle case when uncompressed_size is 0.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r382045727
 
 

 ##########
 File path: gensrc/proto/segment_v2.proto
 ##########
 @@ -205,22 +191,52 @@ message IndexedColumnMetaPB {
     optional uint64 size = 7;
 }
 
-message BitmapIndexColumnPB {
+// -------------------------------------------------------------
+// Column Index Metadata
+// -------------------------------------------------------------
+
+enum ColumnIndexTypePB {
+    ORDINAL_INDEX = 0;
+    ZONE_MAP_INDEX = 1;
+    BITMAP_INDEX = 2;
+    BLOOM_FILTER_INDEX = 3;
+}
+
+message ColumnIndexMetaPB {
+    optional ColumnIndexTypePB type = 1;
+    optional OrdinalIndexPB ordinal_index = 7;
 
 Review comment:
   Why not use 2,3,4,5

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590273942
 
 
   @chaoyli answers inline
   
   > Bitmap index have save dict_column and bitmap with four page, every read have four I/O, this may be consuming. I think dict_column store a bitmap PagePointer will be better. Like this:
   DictColumnValueIndex : val -> dict_column_page_pointer
   DictColumnPage : dict_column|bitmap_page_pointer
   BitmapPage : RoaringBitMap
   which can save one I/O operation
   
   Storing bitmap_page_pointer in DictColumnPage has several drawbacks
   1. Binary search inside DictColumnPage is more complicated because we now need to separate `dict_column` and `bitmap_page_pointer`
   2. Storage size is not necessarily reduced because now we need to store a bitmap page pointer for each dictionary item. Previously we only store a bitmap page pointer for each bitmap page.
   3. The implementation is more complicated because now we tightly couples DictColumn with BitmapColumn. We lose the benefits of IndexedColumn abstraction.
   
   > I think ZoneMap and OrdinalIndex Read/Writer logic remaining the same may be better.
   Firstly ZoneMap and OrdinalIndex is simple, may not need to used IndexedColumnWriter/Reader complicated logic.
   Secondly IndexedColumnWriter will also contain all index writer, if we add BTree index in the futher.
   Thirdly if use the above optimization, the ZoneMap and OrdinalIndex also not suitable.
   
   The problems with implementing all kinds of indexes from scratch instead of reusing existing abstractions is lower code reusability and higher long term maintenance cost. The nice thing about `IndexedColumn` is that it can be used as the building blocks for all kinds of data and indexes, leading to a more layered system. Considering you worries about the cost of using BTree index for ZoneMap, I think IndexedColumn can support both single-level and multiple-level index in in the future.
   
   > I found the default decoding of VARCHAR and CHAR is dictionary encoding without policy, this may be consuming space when cardinality is high. And if we want to change it, we should rebuild all of the data.
   
   Actually the current implementation of `BinaryDictPageBuilder` will fallback to plain encoding automatically when it found the cardinality is high and the size of dictionary page is too big.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r384446139
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/page_io.cpp
 ##########
 @@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/rowset/segment_v2/page_io.h"
+
+#include <cstring>
+#include <string>
+
+#include "common/logging.h"
+#include "env/env.h"
+#include "gutil/strings/substitute.h"
+#include "olap/page_cache.h"
+#include "util/block_compression.h"
+#include "util/coding.h"
+#include "util/crc32c.h"
+#include "util/faststring.h"
+#include "util/runtime_profile.h"
+
+namespace doris {
+namespace segment_v2 {
+
+using strings::Substitute;
+
+Status PageIO::compress_page_body(const BlockCompressionCodec* codec,
+                                  double min_space_saving,
+                                  const std::vector<Slice>& body,
+                                  OwnedSlice* compressed_body) {
+    size_t uncompressed_size = Slice::compute_total_size(body);
+    if (codec != nullptr && uncompressed_size > 0) {
+        size_t max_compressed_size = codec->max_compressed_len(uncompressed_size);
+        faststring buf;
+        buf.resize(max_compressed_size);
+        Slice compressed_slice(buf);
+        RETURN_IF_ERROR(codec->compress(body, &compressed_slice));
+        buf.resize(compressed_slice.get_size());
+
+        double space_saving = 1.0 - static_cast<double>(buf.size()) / uncompressed_size;
+        // return compressed body only when it saves more than min_space_saving
+        if (space_saving > 0 && space_saving >= min_space_saving) {
+            *compressed_body = buf.build();
+            return Status::OK();
+        }
+    }
+    // otherwise, do not compress
+    OwnedSlice empty;
+    *compressed_body = std::move(empty);
+    return Status::OK();
+}
+
+Status PageIO::write_page(WritableFile* file,
+                          const std::vector<Slice>& body,
+                          const PageFooterPB& footer,
+                          PagePointer* result) {
+    // sanity check of page footer
+    CHECK(footer.has_type()) << "type must be set";
+    CHECK(footer.has_uncompressed_size()) << "uncompressed_size must be set";
+    switch (footer.type()) {
+    case DATA_PAGE:
+        CHECK(footer.has_data_page_footer());
+        break;
+    case INDEX_PAGE:
+        CHECK(footer.has_index_page_footer());
+        break;
+    case DICTIONARY_PAGE:
+        CHECK(footer.has_dict_page_footer());
+        break;
+    case SHORT_KEY_PAGE:
+        CHECK(footer.has_short_key_page_footer());
+        break;
 
 Review comment:
   OK

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r383185514
 
 

 ##########
 File path: gensrc/proto/segment_v2.proto
 ##########
 @@ -205,22 +191,52 @@ message IndexedColumnMetaPB {
     optional uint64 size = 7;
 }
 
-message BitmapIndexColumnPB {
+// -------------------------------------------------------------
+// Column Index Metadata
+// -------------------------------------------------------------
+
+enum ColumnIndexTypePB {
+    ORDINAL_INDEX = 0;
 
 Review comment:
   OK, I'll also reserve 0 for PageTypePB 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r383714780
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/parsed_page.h
 ##########
 @@ -17,47 +17,88 @@
 
 #pragma once
 
-#include "olap/rowset/segment_v2/page_decoder.h" // for PagePointer
-#include "util/rle_encoding.h" // for RleDecoder
+#include <memory>
+
+#include "common/status.h"
+#include "gen_cpp/segment_v2.pb.h"
+#include "olap/rowset/segment_v2/common.h"
+#include "olap/rowset/segment_v2/encoding_info.h"
+#include "olap/rowset/segment_v2/options.h"
+#include "olap/rowset/segment_v2/page_decoder.h"
+#include "olap/rowset/segment_v2/page_handle.h"
+#include "util/rle_encoding.h"
 
 namespace doris {
 namespace segment_v2 {
 
-class PageHandle;	
-struct PagePointer;
-
 // This contains information when one page is loaded, and ready for read
 // This struct can be reused, client should call reset first before reusing
 // this object
 struct ParsedPage {
-    ParsedPage() { }
+
+    static Status create(PageHandle handle,
+                         const Slice& body,
+                         const DataPageFooterPB& footer,
+                         const EncodingInfo* encoding,
+                         const PagePointer& page_pointer,
+                         uint32_t page_index,
+                         std::unique_ptr<ParsedPage>* result) {
+        std::unique_ptr<ParsedPage> page(new ParsedPage);
+        page->page_handle = std::move(handle);
+
+        auto null_size = footer.nullmap_size();
+        page->has_null = null_size > 0;
+        page->null_bitmap = Slice(body.data + body.size - null_size, null_size);
+
+        if (page->has_null) {
+            page->null_decoder = RleDecoder<bool>(
+                    (const uint8_t*) page->null_bitmap.data, null_size, 1);
+        }
+
+        Slice data_slice(body.data, body.size - null_size);
+        PageDecoderOptions opts;
+        RETURN_IF_ERROR(encoding->create_page_decoder(data_slice, opts, &page->data_decoder));
+        RETURN_IF_ERROR(page->data_decoder->init());
+
+        page->first_rowid = footer.first_ordinal();
+        page->num_rows = footer.num_values();
+        page->page_pointer = page_pointer;
+        page->page_index = page_index;
+
+        *result = std::move(page);
+        return Status::OK();
+    }
+
     ~ParsedPage() {
         delete data_decoder;
     }
 
-    PagePointer page_pointer;
     PageHandle page_handle;
 
+    bool has_null;
     Slice null_bitmap;
     RleDecoder<bool> null_decoder;
     PageDecoder* data_decoder = nullptr;
 
     // first rowid for this page
-    rowid_t first_rowid = 0;
-
+    ordinal_t first_rowid = 0;
 
 Review comment:
   OK

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] imay merged pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
imay merged pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590641429
 
 
   1. The indention it not to reduce storage size, it's to reduce I/O operation for store bitmap pointer like reverted index in Lucene. I think is also have no BinarySearch.
   2. ZoneMap/Ordinal Index is so easy, I think it's may be better not integrated into so complicated indexed column.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590018446
 
 
   I have two question:
   1. Bitmap index have save dict_column and bitmap with four page, every read have four I/O, this may be consuming. I think dict_column store a bitmap PagePointer will be better. Like this:
   **DictColumnValueIndex** : val -> dict_column_page_pointer
   **DictColumnPage** : dict_column|bitmap_page_pointer
   **BitmapPage** : RoaringBitMap
   which can save one I/O operation
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590641429
 
 
   1. The indention it not to reduce storage size, it's to reduce I/O operation for store bitmap pointer.
   2. ZoneMap/Ordinal Index is so easy, I think it's may be better not integrated into so complicated indexed column.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-589467268
 
 
   The failed UT `GenericPoolTest` is un-related to this PR

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590018446
 
 
   I have two question:

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590273942
 
 
   @chaoyli answers inline
   
   > Bitmap index have save dict_column and bitmap with four page, every read have four I/O, this may be consuming. I think dict_column store a bitmap PagePointer will be better. Like this:
   DictColumnValueIndex : val -> dict_column_page_pointer
   DictColumnPage : dict_column|bitmap_page_pointer
   BitmapPage : RoaringBitMap
   which can save one I/O operation
   
   Storing bitmap_page_pointer in DictColumnPage has several drawbacks
   1. Binary search inside DictColumnPage is more complicated because we now need to separate `dict_column` and `bitmap_page_pointer`
   2. Storage size is not necessarily reduced because now we need to store a bitmap page pointer for each dictionary item. Previously we only store a bitmap page pointer for each bitmap page.
   3. The implementation is more complicated because now we tightly couples DictColumn with BitmapColumn. We lose the benefits of IndexedColumn abstraction.
   
   > I think ZoneMap and OrdinalIndex Read/Writer logic remaining the same may be better.
   Firstly ZoneMap and OrdinalIndex is simple, may not need to used IndexedColumnWriter/Reader complicated logic.
   Secondly IndexedColumnWriter will also contain all index writer, if we add BTree index in the futher.
   Thirdly if use the above optimization, the ZoneMap and OrdinalIndex also not suitable.
   
   The problems with implementing all kinds of indexes from scratch instead of reusing existing abstractions is *lower code reusability* and *higher long-term maintenance cost*. The nice thing about `IndexedColumn` is that it can be used as the building blocks for all kinds of data and indexes, leading to a more **layered system**. Considering you worries about the cost of using BTree index for ZoneMap, I think IndexedColumn can support both single-level and multiple-level index in in the future.
   
   > I found the default decoding of VARCHAR and CHAR is dictionary encoding without policy, this may be consuming space when cardinality is high. And if we want to change it, we should rebuild all of the data.
   
   Actually the current implementation of `BinaryDictPageBuilder` will fallback to plain encoding automatically when it found the cardinality is high and the size of dictionary page is too big.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] kangpinghuang commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
kangpinghuang commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r384440137
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/column_zone_map.h
 ##########
 @@ -17,8 +17,9 @@
 
 
 Review comment:
   rename this file name to zone_map_index.h

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r383715510
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/page_io.cpp
 ##########
 @@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/rowset/segment_v2/page_io.h"
+
+#include <cstring>
+#include <string>
+
+#include "common/logging.h"
+#include "env/env.h"
+#include "gutil/strings/substitute.h"
+#include "olap/page_cache.h"
+#include "util/block_compression.h"
+#include "util/coding.h"
+#include "util/crc32c.h"
+#include "util/faststring.h"
+#include "util/runtime_profile.h"
+
+namespace doris {
+namespace segment_v2 {
+
+using strings::Substitute;
+
+Status PageIO::compress_page_body(const BlockCompressionCodec* codec,
+                                  double min_space_saving,
+                                  const std::vector<Slice>& body,
+                                  OwnedSlice* compressed_body) {
+    if (codec != nullptr) {
+        size_t uncompressed_size = Slice::compute_total_size(body);
+        size_t max_compressed_size = codec->max_compressed_len(uncompressed_size);
+        faststring buf;
+        buf.resize(max_compressed_size);
+        Slice compressed_slice(buf);
+        RETURN_IF_ERROR(codec->compress(body, &compressed_slice));
+        buf.resize(compressed_slice.get_size());
+
+        double space_saving = 1.0 - static_cast<double>(buf.size()) / uncompressed_size;
 
 Review comment:
   OK

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli edited a comment on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590018446
 
 
   I have three question:
   1. Bitmap index have save dict_column and bitmap with four page, every read have four I/O, this may be consuming. I think dict_column store a bitmap PagePointer will be better. Like this:
   **DictColumnValueIndex** : val -> dict_column_page_pointer
   **DictColumnPage** : dict_column|bitmap_page_pointer
   **BitmapPage** : RoaringBitMap
   which can save one I/O operation
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
imay commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r382045474
 
 

 ##########
 File path: gensrc/proto/segment_v2.proto
 ##########
 @@ -205,22 +191,52 @@ message IndexedColumnMetaPB {
     optional uint64 size = 7;
 }
 
-message BitmapIndexColumnPB {
+// -------------------------------------------------------------
+// Column Index Metadata
+// -------------------------------------------------------------
+
+enum ColumnIndexTypePB {
+    ORDINAL_INDEX = 0;
 
 Review comment:
   Better reserve 0, and start from 1

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] gaodayue commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
gaodayue commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590273942
 
 
   @chaoyli answers inline
   
   > Bitmap index have save dict_column and bitmap with four page, every read have four I/O, this may be consuming. I think dict_column store a bitmap PagePointer will be better. Like this:
   DictColumnValueIndex : val -> dict_column_page_pointer
   DictColumnPage : dict_column|bitmap_page_pointer
   BitmapPage : RoaringBitMap
   which can save one I/O operation
   
   Storing bitmap_page_pointer has several drawbacks
   1. Binary search inside DictColumnPage is more complicated because we now need to separate `dict_column` and `bitmap_page_pointer`
   2. Storage size is not necessarily reduced because now we need to store a bitmap page pointer for each dictionary item. Previously we only store a bitmap page pointer for each bitmap page.
   3. The implementation is more complicated because now we tightly couples DictColumn with BitmapColumn. We lose the benefits of IndexedColumn abstraction.
   
   > I think ZoneMap and OrdinalIndex Read/Writer logic remaining the same may be better.
   Firstly ZoneMap and OrdinalIndex is simple, may not need to used IndexedColumnWriter/Reader complicated logic.
   Secondly IndexedColumnWriter will also contain all index writer, if we add BTree index in the futher.
   Thirdly if use the above optimization, the ZoneMap and OrdinalIndex also not suitable.
   
   The problems with implementing all kinds of indexes from scratch instead of reusing existing abstractions is lower code reusability and higher long term maintenance cost. The nice thing about `IndexedColumn` is that it can be used as the building blocks for all kinds of data and indexes, leading to a more layered system. Considering you worries about the cost of using BTree index for ZoneMap, I think IndexedColumn can support both single-level and multiple-level index in in the future.
   
   > I found the default decoding of VARCHAR and CHAR is dictionary encoding without policy, this may be consuming space when cardinality is high. And if we want to change it, we should rebuild all of the data.
   
   Actually the current implementation of `BinaryDictPageBuilder` will fallback to plain encoding automatically when it found the cardinality is high and the size of dictionary page is too big.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] kangpinghuang commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
kangpinghuang commented on a change in pull request #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#discussion_r384422577
 
 

 ##########
 File path: be/src/olap/rowset/segment_v2/page_io.cpp
 ##########
 @@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "olap/rowset/segment_v2/page_io.h"
+
+#include <cstring>
+#include <string>
+
+#include "common/logging.h"
+#include "env/env.h"
+#include "gutil/strings/substitute.h"
+#include "olap/page_cache.h"
+#include "util/block_compression.h"
+#include "util/coding.h"
+#include "util/crc32c.h"
+#include "util/faststring.h"
+#include "util/runtime_profile.h"
+
+namespace doris {
+namespace segment_v2 {
+
+using strings::Substitute;
+
+Status PageIO::compress_page_body(const BlockCompressionCodec* codec,
+                                  double min_space_saving,
+                                  const std::vector<Slice>& body,
+                                  OwnedSlice* compressed_body) {
+    size_t uncompressed_size = Slice::compute_total_size(body);
+    if (codec != nullptr && uncompressed_size > 0) {
+        size_t max_compressed_size = codec->max_compressed_len(uncompressed_size);
+        faststring buf;
+        buf.resize(max_compressed_size);
+        Slice compressed_slice(buf);
+        RETURN_IF_ERROR(codec->compress(body, &compressed_slice));
+        buf.resize(compressed_slice.get_size());
+
+        double space_saving = 1.0 - static_cast<double>(buf.size()) / uncompressed_size;
+        // return compressed body only when it saves more than min_space_saving
+        if (space_saving > 0 && space_saving >= min_space_saving) {
+            *compressed_body = buf.build();
+            return Status::OK();
+        }
+    }
+    // otherwise, do not compress
+    OwnedSlice empty;
+    *compressed_body = std::move(empty);
+    return Status::OK();
+}
+
+Status PageIO::write_page(WritableFile* file,
+                          const std::vector<Slice>& body,
+                          const PageFooterPB& footer,
+                          PagePointer* result) {
+    // sanity check of page footer
+    CHECK(footer.has_type()) << "type must be set";
+    CHECK(footer.has_uncompressed_size()) << "uncompressed_size must be set";
+    switch (footer.type()) {
+    case DATA_PAGE:
+        CHECK(footer.has_data_page_footer());
+        break;
+    case INDEX_PAGE:
+        CHECK(footer.has_index_page_footer());
+        break;
+    case DICTIONARY_PAGE:
+        CHECK(footer.has_dict_page_footer());
+        break;
+    case SHORT_KEY_PAGE:
+        CHECK(footer.has_short_key_page_footer());
+        break;
 
 Review comment:
   add a default and check

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format

Posted by GitBox <gi...@apache.org>.
chaoyli commented on issue #2953: [segment_v2] Switch to Unified and Extensible Page Format
URL: https://github.com/apache/incubator-doris/pull/2953#issuecomment-590019930
 
 
   3. I found the default decoding of VARCHAR and CHAR is dictionary encoding without policy, this may be consuming space when cardinality is high. And if we want to change it, we should rebuild all of the data.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org