You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@orc.apache.org by rip-nsk <gi...@git.apache.org> on 2018/05/04 00:03:45 UTC

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

GitHub user rip-nsk opened a pull request:

    https://github.com/apache/orc/pull/265

    ORC-334: [C++] Add AppVeyor support for integration on windows

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rip-nsk/orc ORC-334

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/265.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #265
    
----
commit 916d5eece4c3483b4a5a639d81b0536bd88aaa41
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:15:40Z

    s/uint/unsigned/

commit 846cb2ebda368e2b5d1efccf2f30b8d13d151c99
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:16:37Z

    s/bzero/memset

commit 3d44b6a6bdad89897ded341a7c9913582716536c
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:21:22Z

    IANA - Time Zone Database

commit 0e16111b00c9bb9dc5f27980fc95dd7d49ac76a0
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:21:49Z

    Adaptor.hh patch

commit 2fb7b3457612418b22f4e15ffa8eda7aa21cdecc
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:22:29Z

    Timezone.cc patch

commit a856d82be5b39c6eea84a9c78f7ecf8e5619d574
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:23:08Z

    Common.hh patch

commit c35e06e0b27e0a7a110d474aa1c1ba6c19e0e316
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:23:42Z

    appveyor.yml

commit 6db24812e3032a9f551c633e87e096cc2d6efc5f
Author: rip-nsk <ri...@...>
Date:   2018-05-03T23:47:45Z

    CMAKE_BUILD_TYPE

----


---

[GitHub] orc issue #265: ORC-334: [C++] Add AppVeyor support for integration on windo...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/265
  
    +1 LGTM


---

[GitHub] orc issue #265: ORC-334: [C++] Add AppVeyor support for integration on windo...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/265
  
    Added an INFRA ticket to do this here https://issues.apache.org/jira/browse/INFRA-16535


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r187922554
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    I also think that it is better to leave the conversion to the client/customer. We should ideally change the conversion for Nix* systems but cannot due to backward compatibility. We should be able to converge both Java and C++ timestamps in ORC 2.0.
    For Windows, since this is the first official build support, we should be okay to use "UTC" and document this behavior.
    I will merge this PR end of today if there are no objections.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186827146
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    Just double checked, we should still use local timezone if writer timezone is not found to keep backward compatibility.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by omalley <gi...@git.apache.org>.
Github user omalley commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186114158
  
    --- Diff: c++/test/CreateTestFiles.cc ---
    @@ -31,8 +31,8 @@
     void writeCustomOrcFile(const std::string& filename,
                             const orc::proto::Metadata& metadata,
                             const orc::proto::Footer& footer,
    -                        const std::vector<uint>& version,
    -                        uint writerVersion) {
    +                        const std::vector<unsigned>& version,
    --- End diff --
    
    We generally prefer to use explicitly sized types, so uint64_t or uint32_t would be better.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/orc/pull/265


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by majetideepak <gi...@git.apache.org>.
Github user majetideepak commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186409327
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    Can you comment on why we look at `UTC` for windows instead of `LOCAL_TIMEZONE`?


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by omalley <gi...@git.apache.org>.
Github user omalley commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186474701
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    Because there isn't a direct windows equivalent.
    
    Here is what Java writes about it-
    https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/time-zone002.html
    
    There is a registry setting, but it uses a non-standard set of names *doh*. The easiest way would be to create a Java jar and run that to determine the local timezone.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186840580
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    If we cannot find writer timezone from the stripe footer, we assume the writer and reader are in the same timezone. TimestampColumnReader tries to convert the timestamp to UTC to get the same wall clock time as the writer. Unless the files are written under UTC, otherwise we may get wrong timestamps if we set local timezone to UTC by default.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186493219
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    getLocalTimezone is still used in RowReaderImpl


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r187164583
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    Any concern on future use cctz library?


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186487106
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    http://unicode.org/cldr/data/common/supplemental/windowsZones.xml - here is a mapping of windows <=> *nix time zones, 
    and here - https://github.com/google/cctz - "timezone" library which probably is a best choice (BTW https://github.com/google/cctz/issues/53).
    Anyway,  it is non trivial work which I think does not related to orc directly.
    I believe it should be customer decision how to pass/extract timestamps from orc library and any timezone conversions should be done on customer side.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186828362
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    Why it can't be UTC in this (or any other) case? 


---

[GitHub] orc issue #265: ORC-334: [C++] Add AppVeyor support for integration on windo...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on the issue:

    https://github.com/apache/orc/pull/265
  
    AppVeyor after this PR:
    [https://ci.appveyor.com/project/rip-nsk/orc/build/1.0.32]


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186869413
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    So, proposed solution (local == UTC) will work until we read of the file created using java writer.


---

[GitHub] orc pull request #265: ORC-334: [C++] Add AppVeyor support for integration o...

Posted by wgtmac <gi...@git.apache.org>.
Github user wgtmac commented on a diff in the pull request:

    https://github.com/apache/orc/pull/265#discussion_r186492444
  
    --- Diff: c++/src/Timezone.cc ---
    @@ -710,7 +710,11 @@ namespace orc {
        * Get the local timezone.
        */
       const Timezone& getLocalTimezone() {
    +#ifdef _MSC_VER
    +    return getTimezoneByName("UTC");
    --- End diff --
    
    Recently we have change to UTC everywhere in C++ code. Is it better to delete getLocalTimezone function to avoid confusion?


---

[GitHub] orc issue #265: ORC-334: [C++] Add AppVeyor support for integration on windo...

Posted by rip-nsk <gi...@git.apache.org>.
Github user rip-nsk commented on the issue:

    https://github.com/apache/orc/pull/265
  
    It seems AppVeyor should be enabled by apache/orc project owner here - https://ci.appveyor.com/projects.


---