You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/24 14:56:10 UTC

[GitHub] [arrow] manojkarthick opened a new pull request #9306: ARROW-10297: [Rust] Add flag to show json output in parquet-read binary

manojkarthick opened a new pull request #9306:
URL: https://github.com/apache/arrow/pull/9306


   Add an option to print output in JSON format. in the parquet-read binary. Having json output allows for easy analysis using tools like [jq](https://stedolan.github.io/jq/). This PR builds on the changes implemented in https://github.com/apache/arrow/pull/8686 and incorporates the suggestions in that PR.
   
   **Changelog**
   
   * Update all three binaries `parquet-schema`, `parquet-rowcount` and `parquet-read` to use [clap](https://github.com/clap-rs/clap) for argument parsing
   * Add `to_json_value()` method to get `serde_json::Value` from `Row` and `Field` structs (Thanks to @jhorstmann for these changes!)
   * parquet-schema:
      * Convert verbose argument into `-v/--verbose` flag
    * parquet-read:
      * Add a new flag `-j/--json` that prints the file contents in json lines format
      * The feature is gated under the `json_output` cargo feature
   * Update documentation and README with instructions for running
   * The binaries now use version and author information as defined in Cargo.toml
   
   Example output:
   
   ```
   ❯ parquet-read cities.parquet 3 --json
   {"continent":"Europe","country":{"name":"France","city":["Paris","Nice","Marseilles","Cannes"]}}
   {"continent":"Europe","country":{"name":"Greece","city":["Athens","Piraeus","Hania","Heraklion","Rethymnon","Fira"]}}
   {"continent":"North America","country":{"name":"Canada","city":["Toronto","Vancouver","St. John's","Saint John","Montreal","Halifax","Winnipeg","Calgary","Saskatoon","Ottawa","Yellowknife"]}}
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r564465581



##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,98 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-read <file-path> [num-records]
+//!  parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-path>      Path to a parquet file
+//!     <num-records>    Number of records to read. When not provided, all records are read.
 //!
 //! Note that `parquet-read` reads full file schema, no projection or filtering is
 //! applied.
 
 extern crate parquet;
 
-use std::{env, fs::File, path::Path, process};
+use std::{env, fs::File, path::Path};
+
+use clap::{crate_authors, crate_version, App, Arg};
 
 use parquet::file::reader::{FileReader, SerializedFileReader};
+use parquet::record::Row;
 
 fn main() {
-    let args: Vec<String> = env::args().collect();
-    if args.len() != 2 && args.len() != 3 {
-        println!("Usage: parquet-read <file-path> [num-records]");
-        process::exit(1);
-    }
+    #[cfg(not(feature = "json_output"))]
+    let app = App::new("parquet-read")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Read data from parquet file")
+        .arg(
+            Arg::with_name("file_path")
+                .value_name("file-path")
+                .required(true)
+                .index(1)
+                .help("Path to a parquet file"),
+        )
+        .arg(
+            Arg::with_name("num_records")
+                .value_name("num-records")
+                .index(2)
+                .help(
+                    "Number of records to read. When not provided, all records are read.",
+                ),
+        );
+
+    #[cfg(feature = "json_output")]
+    let app = App::new("parquet-read")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Read data from parquet file")
+        .arg(
+            Arg::with_name("file_path")
+                .value_name("file-path")
+                .required(true)
+                .index(1)
+                .help("Path to a parquet file"),
+        )
+        .arg(
+            Arg::with_name("num_records")
+                .value_name("num-records")
+                .index(2)
+                .help(
+                    "Number of records to read. When not provided, all records are read.",
+                ),
+        )
+        .arg(

Review comment:
       I wonder if it would make sense to try to reduce the redundancy. Maybe something like 
   ```
   let app = App::new("parquet-read")...;
   
   #[cfg(feature = "json_output")]
   let app = app.arg(Arg::with_name("json")...)
   
   ```
   
   

##########
File path: rust/parquet/Cargo.toml
##########
@@ -43,6 +43,8 @@ chrono = "0.4"
 num-bigint = "0.3"
 arrow = { path = "../arrow", version = "4.0.0-SNAPSHOT", optional = true }
 base64 = { version = "0.12", optional = true }
+clap = "2.33.3"

Review comment:
       would it be possible to make the `clap` dependency optional as well? Or maybe we can move the command line tools into their own feature?  The hope for parquet is that it becomes a widely used library at the lower levels of a software stack so I think keeping dependencies minimal is a good goal
   
   @nevi-me  what do you think?

##########
File path: rust/parquet/src/record/api.rs
##########
@@ -635,6 +648,55 @@ impl Field {
             _ => nyi!(descr, value),
         }
     }
+
+    #[cfg(feature = "json_output")]
+    pub fn to_json_value(&self) -> Value {
+        match &self {
+            Field::Null => Value::Null,
+            Field::Bool(b) => Value::Bool(*b),
+            Field::Byte(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Short(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Int(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Long(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UByte(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UShort(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UInt(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::ULong(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Float(n) => serde_json::Number::from_f64(*n as f64)
+                .map(Value::Number)
+                .unwrap_or(Value::Null),
+            Field::Double(n) => serde_json::Number::from_f64(*n)
+                .map(Value::Number)
+                .unwrap_or(Value::Null),
+            Field::Decimal(n) => Value::String(convert_decimal_to_string(&n)),
+            Field::Str(s) => Value::String(s.to_owned()),
+            Field::Bytes(b) => Value::String(base64::encode(b.data())),
+            Field::Date(d) => Value::String(convert_date_to_string(*d)),
+            Field::TimestampMillis(ts) => {
+                Value::String(convert_timestamp_millis_to_string(*ts))
+            }
+            Field::TimestampMicros(ts) => {
+                Value::String(convert_timestamp_micros_to_string(*ts))
+            }
+            Field::Group(row) => row.to_json_value(),
+            Field::ListInternal(fields) => {
+                Value::Array(fields.elements.iter().map(|f| f.to_json_value()).collect())
+            }
+            Field::MapInternal(map) => Value::Object(
+                map.entries
+                    .iter()
+                    .map(|(key_field, value_field)| {
+                        let key_val = key_field.to_json_value();
+                        let key_str = key_val
+                            .as_str()
+                            .map(|s| s.to_owned())
+                            .unwrap_or_else(|| key_val.to_string());
+                        (key_str, value_field.to_json_value())
+                    })
+                    .collect(),
+            ),
+        }
+    }
 }

Review comment:
       I suggest adding some tests to show this code working as well as avoid regressions in the future (aka avoid someone breaking this code accidentally in the future but not realizing it)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=h1) Report
   > Merging [#9306](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=desc) (d18cfd7) into [master](https://codecov.io/gh/apache/arrow/commit/437c8c944acb3479b76804f041f5f8cbce309fa7?el=desc) (437c8c9) will **increase** coverage by `0.09%`.
   > The diff coverage is `8.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9306/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9306      +/-   ##
   ==========================================
   + Coverage   81.88%   81.98%   +0.09%     
   ==========================================
     Files         215      227      +12     
     Lines       52988    53363     +375     
   ==========================================
   + Hits        43391    43751     +360     
   - Misses       9597     9612      +15     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.63% <8.33%> (-5.26%)` | :arrow_down: |
   | [rust/datafusion/src/datasource/csv.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9kYXRhc291cmNlL2Nzdi5ycw==) | `60.46% <0.00%> (-4.54%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `55.85% <0.00%> (-3.12%)` | :arrow_down: |
   | [rust/arrow/src/array/equal/boolean.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZXF1YWwvYm9vbGVhbi5ycw==) | `97.43% <0.00%> (-2.57%)` | :arrow_down: |
   | [rust/datafusion/src/physical\_plan/projection.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL3Byb2plY3Rpb24ucnM=) | `84.93% <0.00%> (-0.99%)` | :arrow_down: |
   | [rust/datafusion/src/sql/parser.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGFyc2VyLnJz) | `81.45% <0.00%> (-0.90%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.41% <0.00%> (-0.80%)` | :arrow_down: |
   | [...t/datafusion/src/physical\_plan/coalesce\_batches.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2NvYWxlc2NlX2JhdGNoZXMucnM=) | `84.11% <0.00%> (-0.80%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/filter.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2ZpbHRlci5ycw==) | `97.71% <0.00%> (-0.77%)` | :arrow_down: |
   | [...datafusion/src/physical\_plan/string\_expressions.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL3N0cmluZ19leHByZXNzaW9ucy5ycw==) | `86.95% <0.00%> (-0.55%)` | :arrow_down: |
   | ... and [87 more](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=footer). Last update [437c8c9...d18cfd7](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-770443904


   https://issues.apache.org/jira/browse/ARROW-10297


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-770432060


   Thanks for the reviews @alamb and @sunchao ! I've addressed all the comments on the PR. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567471430



##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,72 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
 //! parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-path>      Path to a parquet file

Review comment:
       My bad: Missed capitalizing in the doc comments! Thank you :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #9306: ARROW-10297: [Rust] Add flag to show json output in parquet-read binary

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766360952


   <!--
     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at
   
       http://www.apache.org/licenses/LICENSE-2.0
   
     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.
   -->
   
   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on JIRA?
   https://issues.apache.org/jira/browse/ARROW
   
   Then could you also rename pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567264445



##########
File path: rust/parquet/src/record/api.rs
##########
@@ -635,6 +648,55 @@ impl Field {
             _ => nyi!(descr, value),
         }
     }
+
+    #[cfg(feature = "json_output")]
+    pub fn to_json_value(&self) -> Value {
+        match &self {
+            Field::Null => Value::Null,
+            Field::Bool(b) => Value::Bool(*b),
+            Field::Byte(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Short(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Int(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Long(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UByte(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UShort(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UInt(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::ULong(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Float(n) => serde_json::Number::from_f64(*n as f64)
+                .map(Value::Number)
+                .unwrap_or(Value::Null),
+            Field::Double(n) => serde_json::Number::from_f64(*n)
+                .map(Value::Number)
+                .unwrap_or(Value::Null),
+            Field::Decimal(n) => Value::String(convert_decimal_to_string(&n)),
+            Field::Str(s) => Value::String(s.to_owned()),
+            Field::Bytes(b) => Value::String(base64::encode(b.data())),
+            Field::Date(d) => Value::String(convert_date_to_string(*d)),
+            Field::TimestampMillis(ts) => {
+                Value::String(convert_timestamp_millis_to_string(*ts))
+            }
+            Field::TimestampMicros(ts) => {
+                Value::String(convert_timestamp_micros_to_string(*ts))
+            }
+            Field::Group(row) => row.to_json_value(),
+            Field::ListInternal(fields) => {
+                Value::Array(fields.elements.iter().map(|f| f.to_json_value()).collect())
+            }
+            Field::MapInternal(map) => Value::Object(
+                map.entries
+                    .iter()
+                    .map(|(key_field, value_field)| {
+                        let key_val = key_field.to_json_value();
+                        let key_str = key_val
+                            .as_str()
+                            .map(|s| s.to_owned())
+                            .unwrap_or_else(|| key_val.to_string());
+                        (key_str, value_field.to_json_value())
+                    })
+                    .collect(),
+            ),
+        }
+    }
 }

Review comment:
       I've also updated the PR and added the test: [test_to_json_value](https://github.com/apache/arrow/pull/9306/files#diff-d6de2aa116b82bf7bfdf40fcbad13dc947139ce7a703990517662ae25eab0159R1676)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao closed pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
sunchao closed pull request #9306:
URL: https://github.com/apache/arrow/pull/9306


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=h1) Report
   > Merging [#9306](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=desc) (0bd412e) into [master](https://codecov.io/gh/apache/arrow/commit/437c8c944acb3479b76804f041f5f8cbce309fa7?el=desc) (437c8c9) will **decrease** coverage by `0.06%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9306/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9306      +/-   ##
   ==========================================
   - Coverage   81.88%   81.82%   -0.07%     
   ==========================================
     Files         215      215              
     Lines       52988    53032      +44     
   ==========================================
   + Hits        43391    43392       +1     
   - Misses       9597     9640      +43     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.50% <0.00%> (-5.38%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `95.43% <0.00%> (+0.19%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=footer). Last update [437c8c9...0bd412e](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567462432



##########
File path: rust/parquet/README.md
##########
@@ -79,23 +79,23 @@ enabled by adding `RUSTFLAGS="-C target-feature=+sse4.2"` before the
 `cargo build` command.
 
 ## Test
-Run `cargo test` for unit tests.
+Run `cargo test` for unit tests. To also run tests related to the binaries, use `cargo test --features cli`.
 
 ## Binaries
-The following binaries are provided (use `cargo install` to install them):
+The following binaries are provided (use `cargo install --features cli` to install them):
 - **parquet-schema** for printing Parquet file schema and metadata.
-`Usage: parquet-schema <file-path> [verbose]`, where `file-path` is the path to a Parquet file,
-and optional `verbose` is the boolean flag that allows to print full metadata or schema only
-(when not specified only schema will be printed).
+`Usage: parquet-schema <file-path>`, where `file-path` is the path to a Parquet file. Use `-v/--verbose` flag 
+to print full metadata or schema only (when not specified only schema will be printed).
 
 - **parquet-read** for reading records from a Parquet file.
 `Usage: parquet-read <file-path> [num-records]`, where `file-path` is the path to a Parquet file,
 and `num-records` is the number of records to read from a file (when not specified all records will
-be printed).
+be printed). Use `cargo install --features json_output` to enable `-j/--json` flag for printing output

Review comment:
       Hmm is this correct? I don't see `json_output` in the Cargo file. The flag is associated with the command?

##########
File path: rust/parquet/src/bin/parquet-rowcount.rs
##########
@@ -34,30 +34,44 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-rowcount <file-path> ...
+//! parquet-rowcount <file-paths>...
 //! ```
-//! where `file-path` is the path to a Parquet file and `...` is any additional number of
-//! parquet files to count the number of rows from.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-paths>...    List of parquet files to read from
 //!
 //! Note that `parquet-rowcount` reads full file schema, no projection or filtering is
 //! applied.
 
 extern crate parquet;
 
-use std::{env, fs::File, path::Path, process};
+use std::{env, fs::File, path::Path};
+
+use clap::{crate_authors, crate_version, App, Arg};
 
 use parquet::file::reader::{FileReader, SerializedFileReader};
 
 fn main() {
-    let args: Vec<String> = env::args().collect();
-    if args.len() < 2 {
-        println!("Usage: parquet-rowcount <file-path> ...");
-        process::exit(1);
-    }
+    let matches = App::new("parquet-rowcount")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Return number of rows in parquet file")
+        .arg(
+            Arg::with_name("file_paths")
+                .value_name("file-paths")
+                .required(true)
+                .multiple(true)
+                .help("List of parquet files to read from"),

Review comment:
       maybe add "separated by space"

##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,72 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-read <file-path> [num-records]
+//!  parquet-read <file-path> [num-records]

Review comment:
       nit: remove space?

##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,72 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-read <file-path> [num-records]
+//!  parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-path>      Path to a parquet file
+//!     <num-records>    Number of records to read. When not provided, all records are read.
 //!
 //! Note that `parquet-read` reads full file schema, no projection or filtering is
 //! applied.
 
 extern crate parquet;
 
-use std::{env, fs::File, path::Path, process};
+use std::{env, fs::File, path::Path};
+
+use clap::{crate_authors, crate_version, App, Arg};
 
 use parquet::file::reader::{FileReader, SerializedFileReader};
+use parquet::record::Row;
 
 fn main() {
-    let args: Vec<String> = env::args().collect();
-    if args.len() != 2 && args.len() != 3 {
-        println!("Usage: parquet-read <file-path> [num-records]");
-        process::exit(1);
-    }
+    let app = App::new("parquet-read")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Read data from parquet file")

Review comment:
       maybe say "Read data from a Parquet file and print output in console, in either built-in or JSON format"?

##########
File path: rust/parquet/src/bin/parquet-rowcount.rs
##########
@@ -34,30 +34,44 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-rowcount <file-path> ...
+//! parquet-rowcount <file-paths>...
 //! ```
-//! where `file-path` is the path to a Parquet file and `...` is any additional number of
-//! parquet files to count the number of rows from.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-paths>...    List of parquet files to read from
 //!
 //! Note that `parquet-rowcount` reads full file schema, no projection or filtering is
 //! applied.
 
 extern crate parquet;
 
-use std::{env, fs::File, path::Path, process};
+use std::{env, fs::File, path::Path};
+
+use clap::{crate_authors, crate_version, App, Arg};
 
 use parquet::file::reader::{FileReader, SerializedFileReader};
 
 fn main() {
-    let args: Vec<String> = env::args().collect();
-    if args.len() < 2 {
-        println!("Usage: parquet-rowcount <file-path> ...");
-        process::exit(1);
-    }
+    let matches = App::new("parquet-rowcount")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Return number of rows in parquet file")

Review comment:
       nit: let's use Parquet instead of parquet everywhere

##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,72 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-read <file-path> [num-records]
+//!  parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-path>      Path to a parquet file
+//!     <num-records>    Number of records to read. When not provided, all records are read.
 //!
 //! Note that `parquet-read` reads full file schema, no projection or filtering is
 //! applied.
 
 extern crate parquet;
 
-use std::{env, fs::File, path::Path, process};
+use std::{env, fs::File, path::Path};
+
+use clap::{crate_authors, crate_version, App, Arg};
 
 use parquet::file::reader::{FileReader, SerializedFileReader};
+use parquet::record::Row;
 
 fn main() {
-    let args: Vec<String> = env::args().collect();
-    if args.len() != 2 && args.len() != 3 {
-        println!("Usage: parquet-read <file-path> [num-records]");
-        process::exit(1);
-    }
+    let app = App::new("parquet-read")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Read data from parquet file")
+        .arg(
+            Arg::with_name("file_path")
+                .value_name("file-path")
+                .required(true)
+                .index(1)
+                .help("Path to a parquet file"),
+        )
+        .arg(
+            Arg::with_name("num_records")
+                .value_name("num-records")
+                .index(2)
+                .help(
+                    "Number of records to read. When not provided, all records are read.",
+                ),
+        )
+        .arg(
+            Arg::with_name("json")
+                .short("j")
+                .long("json")
+                .takes_value(false)
+                .help("Print parquet file in JSON lines Format"),

Review comment:
       nit: Format -> format




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=h1) Report
   > Merging [#9306](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=desc) (01c0333) into [master](https://codecov.io/gh/apache/arrow/commit/437c8c944acb3479b76804f041f5f8cbce309fa7?el=desc) (437c8c9) will **increase** coverage by `0.13%`.
   > The diff coverage is `8.33%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9306/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9306      +/-   ##
   ==========================================
   + Coverage   81.88%   82.02%   +0.13%     
   ==========================================
     Files         215      227      +12     
     Lines       52988    53476     +488     
   ==========================================
   + Hits        43391    43864     +473     
   - Misses       9597     9612      +15     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.63% <8.33%> (-5.26%)` | :arrow_down: |
   | [rust/datafusion/src/datasource/csv.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9kYXRhc291cmNlL2Nzdi5ycw==) | `60.46% <0.00%> (-4.54%)` | :arrow_down: |
   | [rust/datafusion/src/scalar.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zY2FsYXIucnM=) | `55.85% <0.00%> (-3.12%)` | :arrow_down: |
   | [rust/arrow/src/array/equal/boolean.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZXF1YWwvYm9vbGVhbi5ycw==) | `97.43% <0.00%> (-2.57%)` | :arrow_down: |
   | [rust/datafusion/src/physical\_plan/projection.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL3Byb2plY3Rpb24ucnM=) | `84.93% <0.00%> (-0.99%)` | :arrow_down: |
   | [rust/datafusion/src/sql/parser.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9zcWwvcGFyc2VyLnJz) | `81.45% <0.00%> (-0.90%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.41% <0.00%> (-0.80%)` | :arrow_down: |
   | [...t/datafusion/src/physical\_plan/coalesce\_batches.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2NvYWxlc2NlX2JhdGNoZXMucnM=) | `84.11% <0.00%> (-0.80%)` | :arrow_down: |
   | [rust/arrow/src/compute/kernels/filter.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2ZpbHRlci5ycw==) | `97.71% <0.00%> (-0.77%)` | :arrow_down: |
   | [...datafusion/src/physical\_plan/string\_expressions.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL3N0cmluZ19leHByZXNzaW9ucy5ycw==) | `86.95% <0.00%> (-0.55%)` | :arrow_down: |
   | ... and [90 more](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=footer). Last update [437c8c9...01c0333](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r565660664



##########
File path: rust/parquet/Cargo.toml
##########
@@ -43,6 +43,8 @@ chrono = "0.4"
 num-bigint = "0.3"
 arrow = { path = "../arrow", version = "4.0.0-SNAPSHOT", optional = true }
 base64 = { version = "0.12", optional = true }
+clap = "2.33.3"

Review comment:
       That is what I would suggest -- do you have any thoughts @nevi-me  or @sunchao ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567470915



##########
File path: rust/parquet/src/bin/parquet-rowcount.rs
##########
@@ -34,30 +34,44 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-rowcount <file-path> ...
+//! parquet-rowcount <file-paths>...
 //! ```
-//! where `file-path` is the path to a Parquet file and `...` is any additional number of
-//! parquet files to count the number of rows from.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-paths>...    List of parquet files to read from

Review comment:
       ditto

##########
File path: rust/parquet/src/bin/parquet-schema.rs
##########
@@ -34,49 +34,65 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-schema <file-path> [verbose]
+//! parquet-schema [FLAGS] <file-path>
 //! ```
-//! where `file-path` is the path to a Parquet file and `verbose` is the optional boolean
-//! flag that allows to print schema only, when set to `false` (default behaviour when
-//! not provided), or print full file metadata, when set to `true`.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -V, --version    Prints version information
+//!     -v, --verbose    Enable printing full file metadata
+//!
+//! ## Args
+//!     <file-path>    Path to a parquet file

Review comment:
       ditto

##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,72 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
 //! parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-path>      Path to a parquet file

Review comment:
       nit: let's use Parquet instead of parquet :)

##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,72 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
 //! parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format

Review comment:
       ditto




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567264338



##########
File path: rust/parquet/Cargo.toml
##########
@@ -43,6 +43,8 @@ chrono = "0.4"
 num-bigint = "0.3"
 arrow = { path = "../arrow", version = "4.0.0-SNAPSHOT", optional = true }
 base64 = { version = "0.12", optional = true }
+clap = "2.33.3"

Review comment:
       Thanks @alamb - I've added a new feature called `cli` that encapsulates the binaries and has `serde_json` and `clap` as dependencies. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io commented on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=h1) Report
   > Merging [#9306](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=desc) (1a064a3) into [master](https://codecov.io/gh/apache/arrow/commit/67d0c2e38011cd883059e3a9fd0ea08088661707?el=desc) (67d0c2e) will **decrease** coverage by `0.06%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9306/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9306      +/-   ##
   ==========================================
   - Coverage   81.84%   81.77%   -0.07%     
   ==========================================
     Files         215      215              
     Lines       52949    52993      +44     
   ==========================================
     Hits        43336    43336              
   - Misses       9613     9657      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.50% <0.00%> (-5.38%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=footer). Last update [67d0c2e...1a064a3](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-771160489


   Thank you @alamb and @sunchao ! 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r564488177



##########
File path: rust/parquet/src/record/api.rs
##########
@@ -635,6 +648,55 @@ impl Field {
             _ => nyi!(descr, value),
         }
     }
+
+    #[cfg(feature = "json_output")]
+    pub fn to_json_value(&self) -> Value {
+        match &self {
+            Field::Null => Value::Null,
+            Field::Bool(b) => Value::Bool(*b),
+            Field::Byte(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Short(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Int(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Long(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UByte(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UShort(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::UInt(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::ULong(n) => Value::Number(serde_json::Number::from(*n)),
+            Field::Float(n) => serde_json::Number::from_f64(*n as f64)
+                .map(Value::Number)
+                .unwrap_or(Value::Null),
+            Field::Double(n) => serde_json::Number::from_f64(*n)
+                .map(Value::Number)
+                .unwrap_or(Value::Null),
+            Field::Decimal(n) => Value::String(convert_decimal_to_string(&n)),
+            Field::Str(s) => Value::String(s.to_owned()),
+            Field::Bytes(b) => Value::String(base64::encode(b.data())),
+            Field::Date(d) => Value::String(convert_date_to_string(*d)),
+            Field::TimestampMillis(ts) => {
+                Value::String(convert_timestamp_millis_to_string(*ts))
+            }
+            Field::TimestampMicros(ts) => {
+                Value::String(convert_timestamp_micros_to_string(*ts))
+            }
+            Field::Group(row) => row.to_json_value(),
+            Field::ListInternal(fields) => {
+                Value::Array(fields.elements.iter().map(|f| f.to_json_value()).collect())
+            }
+            Field::MapInternal(map) => Value::Object(
+                map.entries
+                    .iter()
+                    .map(|(key_field, value_field)| {
+                        let key_val = key_field.to_json_value();
+                        let key_str = key_val
+                            .as_str()
+                            .map(|s| s.to_owned())
+                            .unwrap_or_else(|| key_val.to_string());
+                        (key_str, value_field.to_json_value())
+                    })
+                    .collect(),
+            ),
+        }
+    }
 }

Review comment:
       Thank you - I will work on that.

##########
File path: rust/parquet/src/bin/parquet-read.rs
##########
@@ -34,39 +34,98 @@
 //! ```
 //!
 //! # Usage
-//!
 //! ```
-//! parquet-read <file-path> [num-records]
+//!  parquet-read <file-path> [num-records]
 //! ```
-//! where `file-path` is the path to a Parquet file and `num-records` is the optional
-//! numeric option that allows to specify number of records to read from a file.
-//! When not provided, all records are read.
+//!
+//! ## Flags
+//!     -h, --help       Prints help information
+//!     -j, --json       Print parquet file in JSON lines Format
+//!     -V, --version    Prints version information
+//!
+//! ## Args
+//!     <file-path>      Path to a parquet file
+//!     <num-records>    Number of records to read. When not provided, all records are read.
 //!
 //! Note that `parquet-read` reads full file schema, no projection or filtering is
 //! applied.
 
 extern crate parquet;
 
-use std::{env, fs::File, path::Path, process};
+use std::{env, fs::File, path::Path};
+
+use clap::{crate_authors, crate_version, App, Arg};
 
 use parquet::file::reader::{FileReader, SerializedFileReader};
+use parquet::record::Row;
 
 fn main() {
-    let args: Vec<String> = env::args().collect();
-    if args.len() != 2 && args.len() != 3 {
-        println!("Usage: parquet-read <file-path> [num-records]");
-        process::exit(1);
-    }
+    #[cfg(not(feature = "json_output"))]
+    let app = App::new("parquet-read")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Read data from parquet file")
+        .arg(
+            Arg::with_name("file_path")
+                .value_name("file-path")
+                .required(true)
+                .index(1)
+                .help("Path to a parquet file"),
+        )
+        .arg(
+            Arg::with_name("num_records")
+                .value_name("num-records")
+                .index(2)
+                .help(
+                    "Number of records to read. When not provided, all records are read.",
+                ),
+        );
+
+    #[cfg(feature = "json_output")]
+    let app = App::new("parquet-read")
+        .version(crate_version!())
+        .author(crate_authors!())
+        .about("Read data from parquet file")
+        .arg(
+            Arg::with_name("file_path")
+                .value_name("file-path")
+                .required(true)
+                .index(1)
+                .help("Path to a parquet file"),
+        )
+        .arg(
+            Arg::with_name("num_records")
+                .value_name("num-records")
+                .index(2)
+                .help(
+                    "Number of records to read. When not provided, all records are read.",
+                ),
+        )
+        .arg(

Review comment:
       I tried but that didn't work since `clap::App` doesn't implement the `Copy` trait. OTOH, if we move this into a CLI tools feature, I don't think we need to have a separate feature for json output. Thoughts?

##########
File path: rust/parquet/Cargo.toml
##########
@@ -43,6 +43,8 @@ chrono = "0.4"
 num-bigint = "0.3"
 arrow = { path = "../arrow", version = "4.0.0-SNAPSHOT", optional = true }
 base64 = { version = "0.12", optional = true }
+clap = "2.33.3"

Review comment:
       Thanks @alamb ! Just to confirm: Having one feature for command line tools that has both `clap` + `serde_json` as dependencies and when used without that feature, no binaries are produced. Is my understanding correct?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567465341



##########
File path: rust/parquet/README.md
##########
@@ -79,23 +79,23 @@ enabled by adding `RUSTFLAGS="-C target-feature=+sse4.2"` before the
 `cargo build` command.
 
 ## Test
-Run `cargo test` for unit tests.
+Run `cargo test` for unit tests. To also run tests related to the binaries, use `cargo test --features cli`.
 
 ## Binaries
-The following binaries are provided (use `cargo install` to install them):
+The following binaries are provided (use `cargo install --features cli` to install them):
 - **parquet-schema** for printing Parquet file schema and metadata.
-`Usage: parquet-schema <file-path> [verbose]`, where `file-path` is the path to a Parquet file,
-and optional `verbose` is the boolean flag that allows to print full metadata or schema only
-(when not specified only schema will be printed).
+`Usage: parquet-schema <file-path>`, where `file-path` is the path to a Parquet file. Use `-v/--verbose` flag 
+to print full metadata or schema only (when not specified only schema will be printed).
 
 - **parquet-read** for reading records from a Parquet file.
 `Usage: parquet-read <file-path> [num-records]`, where `file-path` is the path to a Parquet file,
 and `num-records` is the number of records to read from a file (when not specified all records will
-be printed).
+be printed). Use `cargo install --features json_output` to enable `-j/--json` flag for printing output

Review comment:
       That was an outdated comment as I had removed that feature based on earlier feedback. I've updated the PR to reflect your other comments as well. Thanks @sunchao !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=h1) Report
   > Merging [#9306](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=desc) (e9e6fe3) into [master](https://codecov.io/gh/apache/arrow/commit/67d0c2e38011cd883059e3a9fd0ea08088661707?el=desc) (67d0c2e) will **decrease** coverage by `0.06%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9306/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9306      +/-   ##
   ==========================================
   - Coverage   81.84%   81.77%   -0.07%     
   ==========================================
     Files         215      215              
     Lines       52949    52993      +44     
   ==========================================
     Hits        43336    43336              
   - Misses       9613     9657      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.50% <0.00%> (-5.38%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=footer). Last update [67d0c2e...e9e6fe3](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-766366191


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=h1) Report
   > Merging [#9306](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=desc) (5fd156a) into [master](https://codecov.io/gh/apache/arrow/commit/67d0c2e38011cd883059e3a9fd0ea08088661707?el=desc) (67d0c2e) will **decrease** coverage by `0.06%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9306/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9306      +/-   ##
   ==========================================
   - Coverage   81.84%   81.77%   -0.07%     
   ==========================================
     Files         215      215              
     Lines       52949    52993      +44     
   ==========================================
     Hits        43336    43336              
   - Misses       9613     9657      +44     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/bin/parquet-read.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yZWFkLnJz) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-rowcount.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1yb3djb3VudC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/bin/parquet-schema.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iaW4vcGFycXVldC1zY2hlbWEucnM=) | `0.00% <0.00%> (ø)` | |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/9306/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `92.50% <0.00%> (-5.38%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=footer). Last update [67d0c2e...e9e6fe3](https://codecov.io/gh/apache/arrow/pull/9306?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#issuecomment-771158985


   Thanks again @manojkarthick  👍 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567413050



##########
File path: rust/parquet/src/record/api.rs
##########
@@ -1608,6 +1670,117 @@ mod tests {
             );
         }
     }
+
+    #[test]
+    #[cfg(feature = "cli")]
+    fn test_to_json_value() {
+        assert_eq!(Field::Null.to_json_value(), Value::Null);
+        assert_eq!(Field::Bool(true).to_json_value(), Value::Bool(true));
+        assert_eq!(Field::Bool(false).to_json_value(), Value::Bool(false));
+        assert_eq!(
+            Field::Byte(1).to_json_value(),
+            Value::Number(serde_json::Number::from(1))
+        );
+        assert_eq!(
+            Field::Short(2).to_json_value(),
+            Value::Number(serde_json::Number::from(2))
+        );
+        assert_eq!(
+            Field::Int(3).to_json_value(),
+            Value::Number(serde_json::Number::from(3))
+        );
+        assert_eq!(
+            Field::Long(4).to_json_value(),
+            Value::Number(serde_json::Number::from(4))
+        );
+        assert_eq!(
+            Field::UByte(1).to_json_value(),
+            Value::Number(serde_json::Number::from(1))
+        );
+        assert_eq!(
+            Field::UShort(2).to_json_value(),
+            Value::Number(serde_json::Number::from(2))
+        );
+        assert_eq!(
+            Field::UInt(3).to_json_value(),
+            Value::Number(serde_json::Number::from(3))
+        );
+        assert_eq!(
+            Field::ULong(4).to_json_value(),
+            Value::Number(serde_json::Number::from(4))
+        );
+        assert_eq!(
+            Field::Float(5.0).to_json_value(),
+            Value::Number(serde_json::Number::from_f64(f64::from(5.0 as f32)).unwrap())
+        );
+        assert_eq!(
+            Field::Float(5.1234).to_json_value(),
+            Value::Number(
+                serde_json::Number::from_f64(f64::from(5.1234 as f32)).unwrap()
+            )
+        );
+        assert_eq!(
+            Field::Double(6.0).to_json_value(),
+            Value::Number(serde_json::Number::from_f64(6.0 as f64).unwrap())
+        );
+        assert_eq!(
+            Field::Double(6.1234).to_json_value(),
+            Value::Number(serde_json::Number::from_f64(6.1234 as f64).unwrap())
+        );
+        assert_eq!(
+            Field::Str("abc".to_string()).to_json_value(),
+            Value::String(String::from("abc"))
+        );
+        assert_eq!(
+            Field::Decimal(Decimal::from_i32(4, 8, 2)).to_json_value(),
+            Value::String(String::from("0.04"))
+        );
+        assert_eq!(
+            Field::Bytes(ByteArray::from(vec![1, 2, 3])).to_json_value(),
+            Value::String(String::from("AQID"))
+        );
+        assert_eq!(

Review comment:
       When I ran the new test, it failed for me (probably because my timezone is different). Maybe we should print out UTF timestamps for consistency? 
   
   ```
   cargo test --features=cli -- test_to_json_value
   cargo test --features=cli -- test_to_json_value
       Finished test [unoptimized + debuginfo] target(s) in 0.10s
        Running /Users/alamb/Software/arrow2/rust/target/debug/deps/parquet-113eb074b13ea986
   
   running 1 test
   test record::api::tests::test_to_json_value ... FAILED
   
   failures:
   
   ---- record::api::tests::test_to_json_value stdout ----
   thread 'record::api::tests::test_to_json_value' panicked at 'assertion failed: `(left == right)`
     left: `String("1969-12-31 19:00:12 -05:00")`,
    right: `String("1969-12-31 16:00:12 -08:00")`', parquet/src/record/api.rs:1742:9
   note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   
   
   failures:
       record::api::tests::test_to_json_value
   
   test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 429 filtered out
   
   error: test failed, to rerun pass '--lib'
   alamb@ip-10-0-0-49 parquet % 
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] manojkarthick commented on a change in pull request #9306: ARROW-10297: [Rust] Parameter for parquet-read to output data in json format, add "cli" feature to parquet crate

Posted by GitBox <gi...@apache.org>.
manojkarthick commented on a change in pull request #9306:
URL: https://github.com/apache/arrow/pull/9306#discussion_r567423337



##########
File path: rust/parquet/src/record/api.rs
##########
@@ -1608,6 +1670,117 @@ mod tests {
             );
         }
     }
+
+    #[test]
+    #[cfg(feature = "cli")]
+    fn test_to_json_value() {
+        assert_eq!(Field::Null.to_json_value(), Value::Null);
+        assert_eq!(Field::Bool(true).to_json_value(), Value::Bool(true));
+        assert_eq!(Field::Bool(false).to_json_value(), Value::Bool(false));
+        assert_eq!(
+            Field::Byte(1).to_json_value(),
+            Value::Number(serde_json::Number::from(1))
+        );
+        assert_eq!(
+            Field::Short(2).to_json_value(),
+            Value::Number(serde_json::Number::from(2))
+        );
+        assert_eq!(
+            Field::Int(3).to_json_value(),
+            Value::Number(serde_json::Number::from(3))
+        );
+        assert_eq!(
+            Field::Long(4).to_json_value(),
+            Value::Number(serde_json::Number::from(4))
+        );
+        assert_eq!(
+            Field::UByte(1).to_json_value(),
+            Value::Number(serde_json::Number::from(1))
+        );
+        assert_eq!(
+            Field::UShort(2).to_json_value(),
+            Value::Number(serde_json::Number::from(2))
+        );
+        assert_eq!(
+            Field::UInt(3).to_json_value(),
+            Value::Number(serde_json::Number::from(3))
+        );
+        assert_eq!(
+            Field::ULong(4).to_json_value(),
+            Value::Number(serde_json::Number::from(4))
+        );
+        assert_eq!(
+            Field::Float(5.0).to_json_value(),
+            Value::Number(serde_json::Number::from_f64(f64::from(5.0 as f32)).unwrap())
+        );
+        assert_eq!(
+            Field::Float(5.1234).to_json_value(),
+            Value::Number(
+                serde_json::Number::from_f64(f64::from(5.1234 as f32)).unwrap()
+            )
+        );
+        assert_eq!(
+            Field::Double(6.0).to_json_value(),
+            Value::Number(serde_json::Number::from_f64(6.0 as f64).unwrap())
+        );
+        assert_eq!(
+            Field::Double(6.1234).to_json_value(),
+            Value::Number(serde_json::Number::from_f64(6.1234 as f64).unwrap())
+        );
+        assert_eq!(
+            Field::Str("abc".to_string()).to_json_value(),
+            Value::String(String::from("abc"))
+        );
+        assert_eq!(
+            Field::Decimal(Decimal::from_i32(4, 8, 2)).to_json_value(),
+            Value::String(String::from("0.04"))
+        );
+        assert_eq!(
+            Field::Bytes(ByteArray::from(vec![1, 2, 3])).to_json_value(),
+            Value::String(String::from("AQID"))
+        );
+        assert_eq!(

Review comment:
       Thanks for the quick feedback @alamb ! I've updated all the tests to use UTC timestamps.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org