Posted to issues@arrow.apache.org by "Vadim Goy (Jira)" <ji...@apache.org> on 2022/08/23 14:35:00 UTC

[jira] [Created] (ARROW-17506) [Python][C++] pyarrow parquet writer - missing time logical type

Vadim Goy created ARROW-17506:
---------------------------------

             Summary: [Python][C++] pyarrow parquet writer - missing time logical type
                 Key: ARROW-17506
                 URL: https://issues.apache.org/jira/browse/ARROW-17506
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Parquet, Python
    Affects Versions: 9.0.0, 8.0.0
            Reporter: Vadim Goy


 

I use pyarrow.parquet.write_table to write a Parquet file.
In the resulting Parquet schema, the logical type for the TIME column is missing; the column is reported as a plain long type.

PyArrow Schema
{code:java}
NUMBER: int64
DECIMAL: int64
NUMERIC: int64
INT: int64
FLOAT: double
VARCHAR: string
TEXT: string
CHAR: string
BOOLEAN: bool
ARR: string
VAR: string
OBJ: string
TIMESTAMP: timestamp[ns]
DATE: date64[ms]
TIME: time64[ns]
PK: int64
UUID: binary
UUID2: string
UUID3: string {code}

Parquet schema
{code:java}
{
  "type" : "record",
  "name" : "schema",
  "fields" : [ {
    "name" : "NUMBER",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "DECIMAL",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "NUMERIC",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "INT",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "FLOAT",
    "type" : [ "null", "double" ],
    "default" : null
  }, {
    "name" : "VARCHAR",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "TEXT",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "CHAR",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "BOOLEAN",
    "type" : [ "null", "boolean" ],
    "default" : null
  }, {
    "name" : "ARR",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "VAR",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "OBJ",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "TIMESTAMP",
    "type" : [ "null", {
      "type" : "long",
      "logicalType" : "timestamp-micros"
    } ],
    "default" : null
  }, {
    "name" : "DATE",
    "type" : [ "null", {
      "type" : "int",
      "logicalType" : "date"
    } ],
    "default" : null
  }, {
    "name" : "TIME",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "PK",
    "type" : [ "null", "long" ],
    "default" : null
  }, {
    "name" : "UUID",
    "type" : [ "null", "bytes" ],
    "default" : null
  }, {
    "name" : "UUID2",
    "type" : [ "null", "string" ],
    "default" : null
  }, {
    "name" : "UUID3",
    "type" : [ "null", "string" ],
    "default" : null
  } ]
}{code}
 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)