You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@orc.apache.org by wgtmac <gi...@git.apache.org> on 2017/04/27 05:44:10 UTC

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

GitHub user wgtmac opened a pull request:

    https://github.com/apache/orc/pull/113

    ORC-176: Refactor common classes for writer and reader

    This is mainly a refactoring change for ORC-176, including:
    1. Extracted common classes and functions into Common.hh and Common.cc;
    2. Put InputStream interface and its implementations from Compression.hh to Stream.hh;
    3. Refactored ColumnStatistic classes to reuse code as many as possible and added setters for writer;
    4. Added more functions in Int128 class for decimal operations;
    5. Added buildTypeFromString() method in Type class to construct ORC type.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/wgtmac/orc ORC-176

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/113.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #113
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113785522
  
    --- Diff: c++/src/Common.cc ---
    @@ -0,0 +1,107 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#include "orc/Common.hh"
    +
    +#include <sstream>
    +
    +namespace orc {
    +
    +  std::string compressionKindToString(CompressionKind kind) {
    --- End diff --
    
    Put all these methods pertaining to types in `c++/src/TypeImpl.cc`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113784499
  
    --- Diff: c++/include/orc/Statistics.hh ---
    @@ -37,12 +37,76 @@ namespace orc {
          * of rows because of NULL values and repeated values.
          * @return the number of values
          */
    -    virtual uint64_t getNumberOfValues() const = 0;
    +    virtual uint64_t getNumberOfValues() const {
    +      return valueCount;
    +    }
    +
    +    /**
    +     * Set the number of values in this column
    +     * @param newValueCount new number of values to be set
    +     */
    +    virtual void setNumberOfValues(uint64_t newValueCount) {
    --- End diff --
    
    Are these and the below methods required in the public API?
    I think they belong to the Impl classes in `c++/src/Statistics.hh`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r114033518
  
    --- Diff: c++/src/Stream.cc ---
    @@ -0,0 +1,222 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    --- End diff --
    
    We must clearly distinguish the different kinds of `Streams` :). We have ORC Streams and also the in-memory streams. Maybe we can create a new folder `io` and put all the file and memory streams there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by omalley <gi...@git.apache.org>.
Github user omalley commented on the issue:

    https://github.com/apache/orc/pull/113
  
    I'm sorry, but I committed ORC-183. Can you rebase this patch as a single commit on top of the current master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113729558
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    --- End diff --
    
    Move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac closed the pull request at:

    https://github.com/apache/orc/pull/113


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @majetideepak I would absolutely support 2 and 3 as it will definitely improve the code quality and readability. As for 1, I would say let's do it in caution. At present we have classes like ColumnStatistics and ColumnReader which have a one-to-many inheritance relation. Particularly, class ColumnStatistics has many duplicate code which takes more effort for use when we were adding functions to it for implementing writer. This can be improved first.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113730948
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    +
    +  enum StreamKind {
    +    StreamKind_PRESENT = 0,
    +    StreamKind_DATA = 1,
    +    StreamKind_LENGTH = 2,
    +    StreamKind_DICTIONARY_DATA = 3,
    +    StreamKind_DICTIONARY_COUNT = 4,
    +    StreamKind_SECONDARY = 5,
    +    StreamKind_ROW_INDEX = 6,
    +    StreamKind_BLOOM_FILTER = 7
    +  };
    +
    +  /**
    +   * Get the string representation of the StreamKind.
    +   */
    +  std::string streamKindToString(StreamKind kind);
    +
    +  class StreamInformation {
    +  public:
    +    virtual ~StreamInformation();
    +
    +    virtual StreamKind getKind() const = 0;
    +    virtual uint64_t getColumnId() const = 0;
    +    virtual uint64_t getOffset() const = 0;
    +    virtual uint64_t getLength() const = 0;
    +  };
    +
    +  enum ColumnEncodingKind {
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r114039116
  
    --- Diff: c++/src/Stream.cc ---
    @@ -0,0 +1,222 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    --- End diff --
    
    This is a good idea. I will do that.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113731294
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    +
    +  enum StreamKind {
    +    StreamKind_PRESENT = 0,
    +    StreamKind_DATA = 1,
    +    StreamKind_LENGTH = 2,
    +    StreamKind_DICTIONARY_DATA = 3,
    +    StreamKind_DICTIONARY_COUNT = 4,
    +    StreamKind_SECONDARY = 5,
    +    StreamKind_ROW_INDEX = 6,
    +    StreamKind_BLOOM_FILTER = 7
    +  };
    +
    +  /**
    +   * Get the string representation of the StreamKind.
    +   */
    +  std::string streamKindToString(StreamKind kind);
    +
    +  class StreamInformation {
    +  public:
    +    virtual ~StreamInformation();
    +
    +    virtual StreamKind getKind() const = 0;
    +    virtual uint64_t getColumnId() const = 0;
    +    virtual uint64_t getOffset() const = 0;
    +    virtual uint64_t getLength() const = 0;
    +  };
    +
    +  enum ColumnEncodingKind {
    +    ColumnEncodingKind_DIRECT = 0,
    +    ColumnEncodingKind_DICTIONARY = 1,
    +    ColumnEncodingKind_DIRECT_V2 = 2,
    +    ColumnEncodingKind_DICTIONARY_V2 = 3
    +  };
    +
    +  std::string columnEncodingKindToString(ColumnEncodingKind kind);
    +
    +  class StripeInformation {
    --- End diff --
    
    move to `orc/Stripe.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113730711
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    +
    +  enum StreamKind {
    +    StreamKind_PRESENT = 0,
    +    StreamKind_DATA = 1,
    +    StreamKind_LENGTH = 2,
    +    StreamKind_DICTIONARY_DATA = 3,
    +    StreamKind_DICTIONARY_COUNT = 4,
    +    StreamKind_SECONDARY = 5,
    +    StreamKind_ROW_INDEX = 6,
    +    StreamKind_BLOOM_FILTER = 7
    +  };
    +
    +  /**
    +   * Get the string representation of the StreamKind.
    +   */
    +  std::string streamKindToString(StreamKind kind);
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @wgtmac I will provide feedback shortly. For some reason, Github is not saving my comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113729725
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/113
  
    I will fix the ColumnStatistics code. Templates should help here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113730980
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    +
    +  enum StreamKind {
    +    StreamKind_PRESENT = 0,
    +    StreamKind_DATA = 1,
    +    StreamKind_LENGTH = 2,
    +    StreamKind_DICTIONARY_DATA = 3,
    +    StreamKind_DICTIONARY_COUNT = 4,
    +    StreamKind_SECONDARY = 5,
    +    StreamKind_ROW_INDEX = 6,
    +    StreamKind_BLOOM_FILTER = 7
    +  };
    +
    +  /**
    +   * Get the string representation of the StreamKind.
    +   */
    +  std::string streamKindToString(StreamKind kind);
    +
    +  class StreamInformation {
    +  public:
    +    virtual ~StreamInformation();
    +
    +    virtual StreamKind getKind() const = 0;
    +    virtual uint64_t getColumnId() const = 0;
    +    virtual uint64_t getOffset() const = 0;
    +    virtual uint64_t getLength() const = 0;
    +  };
    +
    +  enum ColumnEncodingKind {
    +    ColumnEncodingKind_DIRECT = 0,
    +    ColumnEncodingKind_DICTIONARY = 1,
    +    ColumnEncodingKind_DIRECT_V2 = 2,
    +    ColumnEncodingKind_DICTIONARY_V2 = 3
    +  };
    +
    +  std::string columnEncodingKindToString(ColumnEncodingKind kind);
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @majetideepak Thanks for your previous review. I have replied to your comments and also have deleted some changes to make this commit more concise. Can you review it again?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113854583
  
    --- Diff: c++/include/orc/Statistics.hh ---
    @@ -37,12 +37,76 @@ namespace orc {
          * of rows because of NULL values and repeated values.
          * @return the number of values
          */
    -    virtual uint64_t getNumberOfValues() const = 0;
    +    virtual uint64_t getNumberOfValues() const {
    +      return valueCount;
    +    }
    +
    +    /**
    +     * Set the number of values in this column
    +     * @param newValueCount new number of values to be set
    +     */
    +    virtual void setNumberOfValues(uint64_t newValueCount) {
    --- End diff --
    
    If we put this into Impl classes, then there are many duplicate code which is not elegant. So we decided to lift them up to the base classes so that duplicate codes are reduced significantly while we were implementing the writer. What do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113787004
  
    --- Diff: c++/src/Stream.cc ---
    @@ -0,0 +1,222 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    --- End diff --
    
    Use `InputStream.cc` instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113800072
  
    --- Diff: c++/src/Stream.cc ---
    @@ -0,0 +1,222 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    --- End diff --
    
    We will add OutputStream to this file so I think name it Stream.cc is fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @omalley Can you help review this change? It has no new code and our main argument is on the correct place (and naming) to put these lines. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113730895
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    +
    +  enum StreamKind {
    +    StreamKind_PRESENT = 0,
    +    StreamKind_DATA = 1,
    +    StreamKind_LENGTH = 2,
    +    StreamKind_DICTIONARY_DATA = 3,
    +    StreamKind_DICTIONARY_COUNT = 4,
    +    StreamKind_SECONDARY = 5,
    +    StreamKind_ROW_INDEX = 6,
    +    StreamKind_BLOOM_FILTER = 7
    +  };
    +
    +  /**
    +   * Get the string representation of the StreamKind.
    +   */
    +  std::string streamKindToString(StreamKind kind);
    +
    +  class StreamInformation {
    --- End diff --
    
    move to a new file `orc/Stripe.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113730572
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @majetideepak Yes I think it will be much easier if I separate it into more smaller PRs. Originally we wanted to do this large PR because it may take a lot of effort for us to split them. As I saw your PRs of ORC-87 and ORC-173, I decide to split them because your code may have conflicts with ours, e.g. in TimestampColumnStatistics. I will update this PR later to make it smaller and create more JIRAs to reflect the change.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @omalley I've created a separate [PR](https://github.com/apache/orc/pull/118/) based on the current master per your request and thus close this one.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113784951
  
    --- Diff: c++/include/orc/Statistics.hh ---
    @@ -37,12 +37,76 @@ namespace orc {
          * of rows because of NULL values and repeated values.
          * @return the number of values
          */
    -    virtual uint64_t getNumberOfValues() const = 0;
    +    virtual uint64_t getNumberOfValues() const {
    +      return valueCount;
    +    }
    +
    +    /**
    +     * Set the number of values in this column
    +     * @param newValueCount new number of values to be set
    +     */
    +    virtual void setNumberOfValues(uint64_t newValueCount) {
    +      valueCount = newValueCount;
    +    }
    +
    +    /**
    +     * Check whether column has null value
    +     * @return true if has null value
    +     */
    +    virtual bool hasNull() const {
    +      return hasNullValue;
    +    }
    +
    +    /**
    +     * Set whether column has null value
    +     * @param newHasNull has null value
    +     */
    +    virtual void setHasNull(bool newHasNull) {
    +      hasNullValue = newHasNull;
    +    }
     
         /**
          * print out statistics of column if any
          */
         virtual std::string toString() const = 0;
    +
    +    /**
    +     * Increases count of values
    +     * @param count number of values to be increased
    +     */
    +    virtual void increase(uint64_t count) {
    +      valueCount += count;
    +    }
    +
    +    /**
    +     * reset column statistics to initial state
    +     */
    +    virtual void reset() {
    +      hasNullValue = false;
    +      valueCount = 0;
    +    }
    +
    +    /**
    +     * Merges another statistics
    +     * @param other statistics to be merged
    +     */
    +    virtual void merge(const ColumnStatistics& other) {
    +      hasNullValue |= other.hasNull();
    +      valueCount += other.getNumberOfValues();
    +    }
    +
    +    /**
    +     * Convert statistics to protobuf version
    +     * @param pbStats output of protobuf stats
    +     */
    +    virtual void toProtoBuf(proto::ColumnStatistics& pbStats) const {
    +      pbStats.set_hasnull(hasNullValue);
    +      pbStats.set_numberofvalues(valueCount);
    +    }
    +
    +  protected:
    +    uint64_t valueCount;
    --- End diff --
    
    The class members should belong to the Impl class in  `c++/include/orc/Statistics.hh` as well.
    What is the reason behind moving them here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @wgtmac I made some more comments. Thanks for minimizing the patches.
    I learnt the following rules of thumb for C++ apache projects, which might also help us improve the ORC C++ library significantly. I feel this is the right opportunity to do this.
    1. An implementation must follow the PIMPL idiom. The public headers must include only the minimal API and all the class members, implementation will go in the Impl classes.
    2. We must follow a coding style. How about we pick the google C++ coding style https://google.github.io/styleguide/cppguide.html?
    3. We must try to disambiguate the class names and file names as much as possible so that anyone new to the code base can quickly understand the code.
    
    Let me know what your thoughts are. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113730656
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc issue #113: ORC-176: Refactor common classes for writer and reader

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on the issue:

    https://github.com/apache/orc/pull/113
  
    @majetideepak That's cool. You may create a separate JIRA for it and I will continue [ORC-184](https://issues.apache.org/jira/browse/ORC-184) after your fix is done. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r114033038
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    --- End diff --
    
    I think all the ORC types including `Compression`, `Stream`, etc. should go to `Type.hh`. 
    I would do away with `Common.hh`.
    @omalley should give his view here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113845656
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    --- End diff --
    
    I think these are better to stay in Common.hh as Type.hh is specific for schema types. Same as below.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] orc pull request #113: ORC-176: Refactor common classes for writer and reade...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/113#discussion_r113729840
  
    --- Diff: c++/include/orc/Common.hh ---
    @@ -0,0 +1,167 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +#ifndef ORC_COMMON_HH
    +#define ORC_COMMON_HH
    +
    +#include "orc/Vector.hh"
    +#include "orc/Type.hh"
    +#include "Exceptions.hh"
    +#include "wrap/orc-proto-wrapper.hh"
    +
    +#include <string>
    +
    +namespace orc {
    +  enum CompressionKind {
    +    CompressionKind_NONE = 0,
    +    CompressionKind_ZLIB = 1,
    +    CompressionKind_SNAPPY = 2,
    +    CompressionKind_LZO = 3,
    +    CompressionKind_LZ4 = 4,
    +    CompressionKind_ZSTD = 5,
    +    CompressionKind_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the CompressionKind.
    +   */
    +  std::string compressionKindToString(CompressionKind kind);
    +
    +  enum WriterVersion {
    +    WriterVersion_ORIGINAL = 0,
    +    WriterVersion_HIVE_8732 = 1,
    +    WriterVersion_HIVE_4243 = 2,
    +    WriterVersion_HIVE_12055 = 3,
    +    WriterVersion_HIVE_13083 = 4,
    +    WriterVersion_ORC_101 = 5,
    +    WriterVersion_ORC_135 = 6,
    +    WriterVersion_MAX = INT64_MAX
    +  };
    +
    +  /**
    +   * Get the name of the WriterVersion.
    +   */
    +  std::string writerVersionToString(WriterVersion kind);
    +
    +  enum StreamKind {
    --- End diff --
    
    move to `orc/Type.hh`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---