Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/07 05:12:26 UTC

[GitHub] [arrow] quinnj opened a new pull request #9121: [ARROW-11158] [Julia] Implement Decimal256 support for Julia

quinnj opened a new pull request #9121:
URL: https://github.com/apache/arrow/pull/9121


   This PR also includes a few other bugfixes since the original Julia code
   donation. The details of the release process of the code here and
   transition from the JuliaData/Arrow.jl repository are still being ironed
   out, so there has been development in both places and this synchronizes
   the two.
   
   cc: @nealrichardson 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson closed pull request #9121: ARROW-11158: [Julia] Implement Decimal256 support for Julia

Posted by GitBox <gi...@apache.org>.
nealrichardson closed pull request #9121:
URL: https://github.com/apache/arrow/pull/9121


   





[GitHub] [arrow] quinnj commented on a change in pull request #9121: ARROW-11158: [Julia] Implement Decimal256 support for Julia

Posted by GitBox <gi...@apache.org>.
quinnj commented on a change in pull request #9121:
URL: https://github.com/apache/arrow/pull/9121#discussion_r554147764



##########
File path: julia/Arrow/Project.toml
##########
@@ -1,9 +1,10 @@
 name = "Arrow"
 uuid = "69666777-d1a9-59fb-9406-91d4454c9d45"
 authors = ["quinnj <qu...@gmail.com>"]
-version = "0.3.0"
+version = "1.1.0"

Review comment:
       This version number tracks the package currently registered in the Julia registry. For the Apache version, I plan to update the README instructions with the `Pkg` command that installs the library at the apache/arrow release tag (e.g. `apache-arrow-2.0.0`); that will ensure people can install the exact, official Apache release code.
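
A minimal sketch of what such a tag-pinned install might look like, assuming `Pkg.add`'s `url`, `subdir`, and `rev` keywords (`subdir` needs Julia 1.5 or later). The helper name `apache_release_spec` is hypothetical, the subdirectory `julia/Arrow` is taken from the file paths in this diff (the README hunk writes `julia/Arrow.jl`), and whether a given tag actually contains the Julia code is not confirmed in this thread:

```julia
using Pkg

# Hypothetical helper: keyword arguments for installing Arrow.jl from the
# apache/arrow monorepo at a specific release tag.
function apache_release_spec(tag::AbstractString)
    return (
        url = "https://github.com/apache/arrow",
        subdir = "julia/Arrow",   # assumption: path used by the files in this PR
        rev = String(tag),        # a git tag, branch, or commit
    )
end

spec = apache_release_spec("apache-arrow-2.0.0")  # tag format quoted in the comment above

# Uncomment to install; this needs network access and a tag that contains
# the julia/Arrow directory.
# Pkg.add(; spec...)
```

Pinning `rev` to a release tag, rather than tracking a branch, is what would tie the installed code to one specific Apache release.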

##########
File path: julia/Arrow/README.md
##########
@@ -1,13 +1,38 @@
 # Arrow
 
-[![Build Status](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)
+[![docs](https://img.shields.io/badge/docs-latest-blue&logo=julia)](https://arrow.juliadata.org/dev/)
+[![CI](https://github.com/JuliaData/Arrow.jl/workflows/CI/badge.svg)](https://github.com/JuliaData/Arrow.jl/actions?query=workflow%3ACI)
 [![codecov](https://codecov.io/gh/JuliaData/Arrow.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/JuliaData/Arrow.jl)
 
+[![deps](https://juliahub.com/docs/Arrow/deps.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w?t=2)
+[![version](https://juliahub.com/docs/Arrow/version.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+[![pkgeval](https://juliahub.com/docs/Arrow/pkgeval.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+
 This is a pure Julia implementation of the [Apache Arrow](https://arrow.apache.org) data standard.  This package provides Julia `AbstractVector` objects for
 referencing data that conforms to the Arrow standard.  This allows users to seamlessly interface Arrow formatted data with a great deal of existing Julia code.
 
 Please see this [document](https://arrow.apache.org/docs/format/Columnar.html#physical-memory-layout) for a description of the Arrow memory layout.
 
+## Installation
+
+The package can be installed by typing in the following in a Julia REPL:
+
+```julia
+julia> using Pkg; Pkg.add(url="https://github.com/apache/arrow", subdir="julia/Arrow.jl")
+```
+
+or to use the non-official-apache code that will sometimes include bugfix patches between apache releases, you can do:
+
+```julia
+julia> using Pkg; Pkg.add("Arrow")
+```
+
+## Difference between this code and the JuliaData/Arrow.jl repository
+
+This code is officially part of the apache/arrow repository and as such follows the regulated release cadence of the entire project, following standard community
+voting protocols. The JuliaData/Arrow.jl repository can be viewed as a sort of "dev" or "latest" branch of this code that may release more frequently, but without following
+official apache release guidelines. The two repositories are synced, however, so any bugfix patches in JuliaData will be upstreamed to apache/arrow for each release.

Review comment:
       Well, we currently cut a bugfix release after every bugfix commit in JuliaData. There are also complications with apache/arrow CI right now that make it pretty difficult to develop against. Another point is that Julia packages have historically lived in their own GitHub repos, so users naturally go to the package's repo to file issues or make PRs. There's much lower friction, institutionally, to do that at JuliaData and then ensure changes get moved up here.
   
   I do think it's something we can transition over time though, with effort. I think it will become easier in the Julia ecosystem for a package to live in a monorepo, and dev efforts can then be encouraged here.







[GitHub] [arrow] nealrichardson commented on pull request #9121: ARROW-11158: [Julia] Implement Decimal256 support for Julia

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #9121:
URL: https://github.com/apache/arrow/pull/9121#issuecomment-758184038


   @quinnj if you rebase (or push additional changes), the Julia CI should be re-enabled.





[GitHub] [arrow] nealrichardson commented on a change in pull request #9121: ARROW-11158: [Julia] Implement Decimal256 support for Julia

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #9121:
URL: https://github.com/apache/arrow/pull/9121#discussion_r553604933



##########
File path: julia/Arrow/Project.toml
##########
@@ -1,9 +1,10 @@
 name = "Arrow"
 uuid = "69666777-d1a9-59fb-9406-91d4454c9d45"
 authors = ["quinnj <qu...@gmail.com>"]
-version = "0.3.0"
+version = "1.1.0"

Review comment:
       Apache Arrow libraries are currently at 2.0.0 and are about to release 3.0.0; should this version number track that as well?

##########
File path: julia/Arrow/README.md
##########
@@ -1,13 +1,38 @@
 # Arrow
 
-[![Build Status](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)
+[![docs](https://img.shields.io/badge/docs-latest-blue&logo=julia)](https://arrow.juliadata.org/dev/)
+[![CI](https://github.com/JuliaData/Arrow.jl/workflows/CI/badge.svg)](https://github.com/JuliaData/Arrow.jl/actions?query=workflow%3ACI)
 [![codecov](https://codecov.io/gh/JuliaData/Arrow.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/JuliaData/Arrow.jl)
 
+[![deps](https://juliahub.com/docs/Arrow/deps.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w?t=2)
+[![version](https://juliahub.com/docs/Arrow/version.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+[![pkgeval](https://juliahub.com/docs/Arrow/pkgeval.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+
 This is a pure Julia implementation of the [Apache Arrow](https://arrow.apache.org) data standard.  This package provides Julia `AbstractVector` objects for
 referencing data that conforms to the Arrow standard.  This allows users to seamlessly interface Arrow formatted data with a great deal of existing Julia code.
 
 Please see this [document](https://arrow.apache.org/docs/format/Columnar.html#physical-memory-layout) for a description of the Arrow memory layout.
 
+## Installation
+
+The package can be installed by typing in the following in a Julia REPL:
+
+```julia
+julia> using Pkg; Pkg.add(url="https://github.com/apache/arrow", subdir="julia/Arrow.jl")
+```
+
+or to use the non-official-apache code that will sometimes include bugfix patches between apache releases, you can do:

Review comment:
       I believe "Apache" should be capitalized throughout

##########
File path: julia/Arrow/docs/src/manual.md
##########
@@ -55,10 +55,50 @@ In the arrow data format, specific logical types are supported, a list of which
 
 * `Date`, `Time`, `Timestamp`, and `Duration` all have natural Julia defintions in `Dates.Date`, `Dates.Time`, `TimeZones.ZonedDateTime`, and `Dates.Period` subtypes, respectively. 
 * `Char` and `Symbol` Julia types are mapped to arrow string types, with additional metadata of the original Julia type; this allows deserializing directly to `Char` and `Symbol` in Julia, while other language implementations will see these columns as just strings
-* `Decimal128` has no corresponding builtin Julia type, so it's deserialized using a compatible type definition in Arrow.jl itself: `Arrow.Decimal`
+* `Decimal128` and `Decimal256` have no corresponding builtin Julia types, so they're deserialized using a compatible type definition in Arrow.jl itself: `Arrow.Decimal`
 
 Note that when `convert=false` is passed, data will be returned in Arrow.jl-defined types that exactly match the arrow definitions of those types; the authoritative source for how each type represents its data can be found in the arrow [`Schema.fbs`](https://github.com/apache/arrow/blob/master/format/Schema.fbs) file.
 
+#### Custom types
+
+To support writing your custom Julia struct, Arrow.jl utilizes the format's mechanism for "extension types" by storing
+the Julia type name in the field metadata. To "hook in" to this machinery, custom types can just call

Review comment:
       FWIW, both Pandas and R store custom metadata that only they recognize in the Schema's key-value metadata, not as extension types. By our understanding, extension types make the most sense when the type definition is meant to be supported in other languages/implementations of Arrow. That's often not the case when you're just looking to ensure high fidelity when writing/reading data to Feather/Parquet/etc.
   
   That's not to say that it is wrong to do what you're doing here: just sharing how we in Python/R have worked through similar issues.
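
To make the distinction in this comment concrete, here is a toy sketch that uses no Arrow library at all. `ToyField`, `ToySchema`, the `MyPkg.UserId` type name, and the `"julia"` schema key are all hypothetical; the only names taken from the Arrow format itself are the reserved field-metadata keys `ARROW:extension:name` and `ARROW:extension:metadata`:

```julia
# Toy stand-ins for an Arrow field and schema; these are NOT Arrow.jl types.
struct ToyField
    name::String
    metadata::Dict{String,String}   # per-field custom metadata
end

struct ToySchema
    fields::Vector{ToyField}
    metadata::Dict{String,String}   # schema-level key-value metadata
end

# Approach 1: extension type, announced through reserved field-level keys.
ext_field = ToyField("id", Dict(
    "ARROW:extension:name" => "MyPkg.UserId",   # hypothetical type name
    "ARROW:extension:metadata" => "",
))
ext_schema = ToySchema([ext_field], Dict{String,String}())

# Approach 2: plain field, plus a schema-level entry that only the writing
# library is expected to understand (the key "julia" is made up here).
plain_field = ToyField("id", Dict{String,String}())
kv_schema = ToySchema([plain_field], Dict("julia" => "{\"id\": \"MyPkg.UserId\"}"))

# A field is treated as an extension type only if it carries the reserved key.
is_extension(f::ToyField) = haskey(f.metadata, "ARROW:extension:name")
```

In both cases a reader that ignores the metadata still gets the underlying storage column; the difference is whether other implementations are expected to recognize the annotation.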

##########
File path: julia/Arrow/README.md
##########
@@ -1,13 +1,38 @@
 # Arrow
 
-[![Build Status](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)
+[![docs](https://img.shields.io/badge/docs-latest-blue&logo=julia)](https://arrow.juliadata.org/dev/)
+[![CI](https://github.com/JuliaData/Arrow.jl/workflows/CI/badge.svg)](https://github.com/JuliaData/Arrow.jl/actions?query=workflow%3ACI)
 [![codecov](https://codecov.io/gh/JuliaData/Arrow.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/JuliaData/Arrow.jl)
 
+[![deps](https://juliahub.com/docs/Arrow/deps.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w?t=2)
+[![version](https://juliahub.com/docs/Arrow/version.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+[![pkgeval](https://juliahub.com/docs/Arrow/pkgeval.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+
 This is a pure Julia implementation of the [Apache Arrow](https://arrow.apache.org) data standard.  This package provides Julia `AbstractVector` objects for
 referencing data that conforms to the Arrow standard.  This allows users to seamlessly interface Arrow formatted data with a great deal of existing Julia code.
 
 Please see this [document](https://arrow.apache.org/docs/format/Columnar.html#physical-memory-layout) for a description of the Arrow memory layout.
 
+## Installation
+
+The package can be installed by typing in the following in a Julia REPL:
+
+```julia
+julia> using Pkg; Pkg.add(url="https://github.com/apache/arrow", subdir="julia/Arrow.jl")
+```
+
+or to use the non-official-apache code that will sometimes include bugfix patches between apache releases, you can do:
+
+```julia
+julia> using Pkg; Pkg.add("Arrow")
+```
+
+## Difference between this code and the JuliaData/Arrow.jl repository
+
+This code is officially part of the apache/arrow repository and as such follows the regulated release cadence of the entire project, following standard community
+voting protocols. The JuliaData/Arrow.jl repository can be viewed as a sort of "dev" or "latest" branch of this code that may release more frequently, but without following
+official apache release guidelines. The two repositories are synced, however, so any bugfix patches in JuliaData will be upstreamed to apache/arrow for each release.

Review comment:
       I would encourage you to consider inverting this workflow: do your issue tracking and bugfixes in apache/arrow, and if you need to cut a bugfix release, sync from here to JuliaData/Arrow.jl.







[GitHub] [arrow] github-actions[bot] commented on pull request #9121: ARROW-11158: [Julia] Implement Decimal256 support for Julia

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9121:
URL: https://github.com/apache/arrow/pull/9121#issuecomment-755889561


   https://issues.apache.org/jira/browse/ARROW-11158





[GitHub] [arrow] github-actions[bot] commented on pull request #9121: [ARROW-11158] [Julia] Implement Decimal256 support for Julia

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9121:
URL: https://github.com/apache/arrow/pull/9121#issuecomment-755888625


   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on JIRA?
   https://issues.apache.org/jira/browse/ARROW
   
   Then could you also rename the pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   

