Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/09/19 22:07:02 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

jorgecarleitao opened a new pull request #8224:
URL: https://github.com/apache/arrow/pull/8224


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] andygrove commented on a change in pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

Posted by GitBox <gi...@apache.org>.
andygrove commented on a change in pull request #8224:
URL: https://github.com/apache/arrow/pull/8224#discussion_r492069938



##########
File path: rust/arrow/README.md
##########
@@ -21,10 +21,62 @@
 
 [![Coverage Status](https://coveralls.io/repos/github/apache/arrow/badge.svg)](https://coveralls.io/github/apache/arrow)
 
+This crate contains a native Rust implementation of the [Arrow columnar format](https://arrow.apache.org/docs/format/Columnar.html). It uses nightly Rust.
+
+## Developer's guide
+
+Here you can find general information about this crate's content and its organization.
+
+### DataType, Field, Schema and RecordBatch
+
+Every array in Arrow has a data type and a null bitmap, that specifies whether each value is null or not.

Review comment:
       The null bitmap is optional (arrays that do not contain any null values can omit it), but this wording makes it sound like there always has to be one.
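
To illustrate the point about the optional null bitmap, here is a minimal, std-only sketch. `SimpleArray` and its fields are hypothetical names invented for this example and are not the arrow crate's API. The sketch assumes the bit convention of the Arrow columnar format (a set bit means the value is valid, least-significant bit first) and models "no bitmap allocated" as `None`.

```rust
/// Hypothetical stand-in for an Arrow-style array; NOT the arrow crate's API.
struct SimpleArray {
    values: Vec<u32>,
    /// Validity bitmap, one bit per value (set bit = valid, LSB first).
    /// `None` means the array has no nulls, so no bitmap was allocated.
    validity: Option<Vec<u8>>,
}

impl SimpleArray {
    fn len(&self) -> usize {
        self.values.len()
    }

    /// A value is null only if a bitmap exists and its bit is unset.
    fn is_null(&self, i: usize) -> bool {
        assert!(i < self.values.len(), "index out of bounds");
        match &self.validity {
            None => false,
            Some(bits) => bits[i / 8] & (1 << (i % 8)) == 0,
        }
    }

    fn null_count(&self) -> usize {
        (0..self.len()).filter(|&i| self.is_null(i)).count()
    }
}

fn main() {
    // No nulls: the bitmap is simply absent.
    let dense = SimpleArray { values: vec![1, 2, 3], validity: None };
    // Bits 0 and 2 are set, so only index 1 is null.
    let sparse = SimpleArray { values: vec![1, 0, 3], validity: Some(vec![0b0000_0101]) };
    println!("dense nulls: {}", dense.null_count());
    println!("sparse nulls: {}", sparse.null_count());
}
```

With this shape, an array without nulls pays nothing for the bitmap, which is why describing the bitmap as always present would be misleading.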







[GitHub] [arrow] andygrove commented on a change in pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

Posted by GitBox <gi...@apache.org>.
andygrove commented on a change in pull request #8224:
URL: https://github.com/apache/arrow/pull/8224#discussion_r492070822



##########
File path: rust/arrow/README.md
##########
@@ -21,10 +21,62 @@
 
 [![Coverage Status](https://coveralls.io/repos/github/apache/arrow/badge.svg)](https://coveralls.io/github/apache/arrow)
 
+This crate contains a native Rust implementation of the [Arrow columnar format](https://arrow.apache.org/docs/format/Columnar.html). It uses nightly Rust.
+
+## Developer's guide
+
+Here you can find general information about this crate's content and its organization.
+
+### DataType, Field, Schema and RecordBatch
+
+Every array in Arrow has a data type and a null bitmap, that specifies whether each value is null or not.
+Thus, a central enum of this crate is `arrow::datatypes::DataType`, that contains the set of valid
+DataTypes in the specification. For example, `arrow::datatypes::DataType::Utf8`.
+
+Many (but not all) data types have an associated Rust native type. The trait that represents 
+this relationship is `arrow::datatypes::ArrowNativeType`, that most native types implement.
+
+`arrow::datatypes::Field` is a struct that contains an arrays' metadata (datatype and whether its values
+can be null), and a name. `arrow::datatypes::Schema` is just a vector of fields with some optional metadata.
+
+Finally, `arrow::record_batch::RecordBatch` is a struct with a `Schema` and a vector of `Array`s all with the same `len`. A record batch is the highest order struct that this crate currently offers.
+
+### Array
+
+The central trait of this package is `arrow::array::Array`, a dynamically-typed trait that
+can be down-casted to specific implementations, such as `arrow::array::UInt32Array`.
+
+`Array` has `Array::len()`, `Array::data_type()`, and nullability of each of its entries, that can be obtained via `Array::is_null(index)`. As mentioned above, `Array` can be downcasted to specific implementations, via
+
+```rust
+let specific_array = array.as_any().downcast_ref<UInt32Array>().unwrap();
+```
+
+Once downcasted, it offers two calls to retrieve specific values (and nullability):
+
+```rust
+let is_null_0: bool = specifcic_array.is_null(0)
+let value_at_0: u32 = specifcic_array.value(0)
+```
+
+### Memory and Buffers
+
+You can access the whole buffer of an `Array` via `Array::data()`, which returns an `arrow::data::ArrayData`. This struct holds the array's `DataType`, `arrow::buffer::Buffer`s, and `childs` (which are themselves `ArrayData`).
+
+The central structs that array implementations use to allocate and refer to memory
+aligned according to the specification are the `arrow::buffer::Buffer` and `arrow::buffer::MuttableBuffer`.

Review comment:
       typo in `MutableBuffer`
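
A side note on the downcast snippet quoted in the hunk above: as written, `downcast_ref<UInt32Array>()` would not compile, because Rust needs the turbofish form `downcast_ref::<T>()`. The following is a minimal, std-only sketch of the same pattern. `DynArray`, `U32Array`, and `first_u32` are hypothetical stand-ins invented for this example, not the arrow crate's types.

```rust
use std::any::Any;

/// Hypothetical, simplified stand-in for a dynamically typed array trait.
trait DynArray {
    fn len(&self) -> usize;
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical concrete array of `u32` values.
struct U32Array {
    values: Vec<u32>,
}

impl U32Array {
    fn value(&self, i: usize) -> u32 {
        self.values[i]
    }
}

impl DynArray for U32Array {
    fn len(&self) -> usize {
        self.values.len()
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the first value if `array` is a non-empty `U32Array`.
fn first_u32(array: &dyn DynArray) -> Option<u32> {
    // Turbofish syntax; returns `None` instead of panicking on a type mismatch.
    let specific = array.as_any().downcast_ref::<U32Array>()?;
    if specific.len() == 0 {
        None
    } else {
        Some(specific.value(0))
    }
}

fn main() {
    let array: Box<dyn DynArray> = Box::new(U32Array { values: vec![7, 8, 9] });
    println!("{:?}", first_u32(&*array));
}
```

Propagating the `Option` with `?` rather than calling `unwrap()` is a design choice for this sketch; a README example might reasonably keep `unwrap()` for brevity.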







[GitHub] [arrow] github-actions[bot] commented on pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8224:
URL: https://github.com/apache/arrow/pull/8224#issuecomment-695361993


   https://issues.apache.org/jira/browse/ARROW-10044





[GitHub] [arrow] andygrove commented on a change in pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

Posted by GitBox <gi...@apache.org>.
andygrove commented on a change in pull request #8224:
URL: https://github.com/apache/arrow/pull/8224#discussion_r492071208



##########
File path: rust/arrow/README.md
##########
@@ -21,10 +21,62 @@
 
 [![Coverage Status](https://coveralls.io/repos/github/apache/arrow/badge.svg)](https://coveralls.io/github/apache/arrow)
 
+This crate contains a native Rust implementation of the [Arrow columnar format](https://arrow.apache.org/docs/format/Columnar.html). It uses nightly Rust.
+
+## Developer's guide
+
+Here you can find general information about this crate's content and its organization.
+
+### DataType, Field, Schema and RecordBatch
+
+Every array in Arrow has a data type and a null bitmap, that specifies whether each value is null or not.
+Thus, a central enum of this crate is `arrow::datatypes::DataType`, that contains the set of valid
+DataTypes in the specification. For example, `arrow::datatypes::DataType::Utf8`.
+
+Many (but not all) data types have an associated Rust native type. The trait that represents 
+this relationship is `arrow::datatypes::ArrowNativeType`, that most native types implement.
+
+`arrow::datatypes::Field` is a struct that contains an arrays' metadata (datatype and whether its values
+can be null), and a name. `arrow::datatypes::Schema` is just a vector of fields with some optional metadata.
+
+Finally, `arrow::record_batch::RecordBatch` is a struct with a `Schema` and a vector of `Array`s all with the same `len`. A record batch is the highest order struct that this crate currently offers.
+
+### Array
+
+The central trait of this package is `arrow::array::Array`, a dynamically-typed trait that
+can be down-casted to specific implementations, such as `arrow::array::UInt32Array`.
+
+`Array` has `Array::len()`, `Array::data_type()`, and nullability of each of its entries, that can be obtained via `Array::is_null(index)`. As mentioned above, `Array` can be downcasted to specific implementations, via
+
+```rust
+let specific_array = array.as_any().downcast_ref<UInt32Array>().unwrap();
+```
+
+Once downcasted, it offers two calls to retrieve specific values (and nullability):
+
+```rust
+let is_null_0: bool = specifcic_array.is_null(0)
+let value_at_0: u32 = specifcic_array.value(0)
+```
+
+### Memory and Buffers
+
+You can access the whole buffer of an `Array` via `Array::data()`, which returns an `arrow::data::ArrayData`. This struct holds the array's `DataType`, `arrow::buffer::Buffer`s, and `childs` (which are themselves `ArrayData`).
+
+The central structs that array implementations use to allocate and refer to memory
+aligned according to the specification are the `arrow::buffer::Buffer` and `arrow::buffer::MuttableBuffer`.
+These are the lowest abstraction level of this crate, and are used throughout the crate to 

Review comment:
       nit: `level` should be plural?







[GitHub] [arrow] nevi-me closed pull request #8224: ARROW-10044: [Rust] Improved Arrow's README.

Posted by GitBox <gi...@apache.org>.
nevi-me closed pull request #8224:
URL: https://github.com/apache/arrow/pull/8224


   

