Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/07 21:50:34 UTC

[GitHub] [arrow] nealrichardson commented on a change in pull request #9121: ARROW-11158: [Julia] Implement Decimal256 support for Julia

nealrichardson commented on a change in pull request #9121:
URL: https://github.com/apache/arrow/pull/9121#discussion_r553604933



##########
File path: julia/Arrow/Project.toml
##########
@@ -1,9 +1,10 @@
 name = "Arrow"
 uuid = "69666777-d1a9-59fb-9406-91d4454c9d45"
 authors = ["quinnj <qu...@gmail.com>"]
-version = "0.3.0"
+version = "1.1.0"

Review comment:
       Apache Arrow libraries are currently at 2.0.0 and are about to release 3.0.0; should this version number track that as well?

##########
File path: julia/Arrow/README.md
##########
@@ -1,13 +1,38 @@
 # Arrow
 
-[![Build Status](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)
+[![docs](https://img.shields.io/badge/docs-latest-blue&logo=julia)](https://arrow.juliadata.org/dev/)
+[![CI](https://github.com/JuliaData/Arrow.jl/workflows/CI/badge.svg)](https://github.com/JuliaData/Arrow.jl/actions?query=workflow%3ACI)
 [![codecov](https://codecov.io/gh/JuliaData/Arrow.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/JuliaData/Arrow.jl)
 
+[![deps](https://juliahub.com/docs/Arrow/deps.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w?t=2)
+[![version](https://juliahub.com/docs/Arrow/version.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+[![pkgeval](https://juliahub.com/docs/Arrow/pkgeval.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+
 This is a pure Julia implementation of the [Apache Arrow](https://arrow.apache.org) data standard.  This package provides Julia `AbstractVector` objects for
 referencing data that conforms to the Arrow standard.  This allows users to seamlessly interface Arrow formatted data with a great deal of existing Julia code.
 
 Please see this [document](https://arrow.apache.org/docs/format/Columnar.html#physical-memory-layout) for a description of the Arrow memory layout.
 
+## Installation
+
+The package can be installed by typing in the following in a Julia REPL:
+
+```julia
+julia> using Pkg; Pkg.add(url="https://github.com/apache/arrow", subdir="julia/Arrow.jl")
+```
+
+or to use the non-official-apache code that will sometimes include bugfix patches between apache releases, you can do:

Review comment:
       I believe "Apache" should be capitalized throughout.

##########
File path: julia/Arrow/docs/src/manual.md
##########
@@ -55,10 +55,50 @@ In the arrow data format, specific logical types are supported, a list of which
 
 * `Date`, `Time`, `Timestamp`, and `Duration` all have natural Julia defintions in `Dates.Date`, `Dates.Time`, `TimeZones.ZonedDateTime`, and `Dates.Period` subtypes, respectively. 
 * `Char` and `Symbol` Julia types are mapped to arrow string types, with additional metadata of the original Julia type; this allows deserializing directly to `Char` and `Symbol` in Julia, while other language implementations will see these columns as just strings
-* `Decimal128` has no corresponding builtin Julia type, so it's deserialized using a compatible type definition in Arrow.jl itself: `Arrow.Decimal`
+* `Decimal128` and `Decimal256` have no corresponding builtin Julia types, so they're deserialized using a compatible type definition in Arrow.jl itself: `Arrow.Decimal`
 
 Note that when `convert=false` is passed, data will be returned in Arrow.jl-defined types that exactly match the arrow definitions of those types; the authoritative source for how each type represents its data can be found in the arrow [`Schema.fbs`](https://github.com/apache/arrow/blob/master/format/Schema.fbs) file.
 
+#### Custom types
+
+To support writing your custom Julia struct, Arrow.jl utilizes the format's mechanism for "extension types" by storing
+the Julia type name in the field metadata. To "hook in" to this machinery, custom types can just call

Review comment:
       FWIW, both Pandas and R store custom metadata that only they recognize in the Schema's key-value metadata, not as extension types. By our understanding, extension types make the most sense when the type definition is meant to be supported in other languages/implementations of Arrow. That's often not the case when you're just looking to ensure high fidelity when writing/reading data to Feather/Parquet/etc.
   
   That's not to say that it is wrong to do what you're doing here: just sharing how we in Python/R have worked through similar issues.

##########
File path: julia/Arrow/README.md
##########
@@ -1,13 +1,38 @@
 # Arrow
 
-[![Build Status](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)](https://travis-ci.com/JuliaData/Arrow.jl.svg?branch=master)
+[![docs](https://img.shields.io/badge/docs-latest-blue&logo=julia)](https://arrow.juliadata.org/dev/)
+[![CI](https://github.com/JuliaData/Arrow.jl/workflows/CI/badge.svg)](https://github.com/JuliaData/Arrow.jl/actions?query=workflow%3ACI)
 [![codecov](https://codecov.io/gh/JuliaData/Arrow.jl/branch/master/graph/badge.svg)](https://codecov.io/gh/JuliaData/Arrow.jl)
 
+[![deps](https://juliahub.com/docs/Arrow/deps.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w?t=2)
+[![version](https://juliahub.com/docs/Arrow/version.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+[![pkgeval](https://juliahub.com/docs/Arrow/pkgeval.svg)](https://juliahub.com/ui/Packages/Arrow/QnF3w)
+
 This is a pure Julia implementation of the [Apache Arrow](https://arrow.apache.org) data standard.  This package provides Julia `AbstractVector` objects for
 referencing data that conforms to the Arrow standard.  This allows users to seamlessly interface Arrow formatted data with a great deal of existing Julia code.
 
 Please see this [document](https://arrow.apache.org/docs/format/Columnar.html#physical-memory-layout) for a description of the Arrow memory layout.
 
+## Installation
+
+The package can be installed by typing in the following in a Julia REPL:
+
+```julia
+julia> using Pkg; Pkg.add(url="https://github.com/apache/arrow", subdir="julia/Arrow.jl")
+```
+
+or to use the non-official-apache code that will sometimes include bugfix patches between apache releases, you can do:
+
+```julia
+julia> using Pkg; Pkg.add("Arrow")
+```
+
+## Difference between this code and the JuliaData/Arrow.jl repository
+
+This code is officially part of the apache/arrow repository and as such follows the regulated release cadence of the entire project, following standard community
+voting protocols. The JuliaData/Arrow.jl repository can be viewed as a sort of "dev" or "latest" branch of this code that may release more frequently, but without following
+official apache release guidelines. The two repositories are synced, however, so any bugfix patches in JuliaData will be upstreamed to apache/arrow for each release.

Review comment:
       I would encourage you to consider inverting this workflow: do your issue tracking and bugfixes in apache/arrow, and if you need to cut a bugfix release, sync from here to JuliaData/Arrow.jl.



