Posted to jira@arrow.apache.org by "Pavel Kovalenko (Jira)" <ji...@apache.org> on 2022/10/12 07:44:00 UTC
[jira] [Updated] (ARROW-17998) [Java] JSON representation of pojo.Schema is incompatible with flatbuffers JSON generated via C++ API
[ https://issues.apache.org/jira/browse/ARROW-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Kovalenko updated ARROW-17998:
------------------------------------
Description:
I have a JSON representation of an arrow::Schema, generated from the flatbuffers format in C++:
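{code:cpp}
#include <string>
#include "flatbuffers/idl.h"   // flatbuffers::Parser, GenerateTextFromTable
#include "flatbuffers/util.h"  // flatbuffers::LoadFile

// Points at a serialized org.apache.arrow.flatbuf.Schema table (obtained elsewhere).
const void* schemaBytes;
std::string fbsSchemaFile;
// Load and parse Schema.fbs so the parser knows the table layout.
flatbuffers::LoadFile("/path/to/Schema.fbs", false, &fbsSchemaFile);
flatbuffers::Parser parser;
parser.Parse(fbsSchemaFile.c_str());
std::string json;
// Render the binary table as flatbuffers text output.
flatbuffers::GenerateTextFromTable(parser, schemaBytes, "org.apache.arrow.flatbuf.Schema", &json);
return json;{code}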
When I try to read this JSON in Java and create a pojo.Schema:
{code:java}
String json; // Read from file.
Schema.fromJson(json);{code}
It fails because the JSON produced by the flatbuffers generator and the JSON used by the Java Jackson bindings differ slightly:
C++ Schema Flatbuffers JSON example:
{code:json}
{
fields: [
{
name: "cc_call_center_sk",
type_type: "Int",
type: {
bitWidth: 32,
is_signed: true
},
children: [
],
custom_metadata: [
{
key: "metadata",
value: "some_metadata"
}
]
},
],
custom_metadata: [
{
key: "metadata",
value: "some_metadata"
}
]
}{code}
Java Schema JSON example:
{code:json}
{
"fields" : [ {
"name" : "cc_call_center_sk",
"nullable" : true,
"type" : {
"name" : "int",
"bitWidth" : 32,
"isSigned" : true
},
"children" : [ ],
"metadata" : [ {
"value" : "some_metadata",
"key" : "metadata"
} ]
} ],
"metadata" : [ {
"value" : "some_metadata",
"key" : "metadata"
} ]
} {code}
The type id is declared differently:
`{*}type_type{*}` field is used in C++ flatbuffers
`{*}name{*}` field inside the `{*}type{*}` field is used in Java
The metadata field is also named differently:
`{*}custom_metadata{*}` is used in C++ flatbuffers
`{*}metadata{*}` is used in Java
This makes it impossible to reuse the JSON representation from Java in C++ and vice versa.
The same issue probably exists in other languages.
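As a possible workaround until the formats are aligned, the key differences visible in the two examples could be bridged by renaming keys before calling Schema.fromJson. The following is only a minimal sketch under several assumptions: FlatbufSchemaJsonAdapter, adaptSchema and adaptField are hypothetical names that are not part of Arrow; the flatbuffers text is assumed to have already been parsed into nested Map/List objects by some JSON library (the flatbuffers output above has unquoted keys, so the generator would likely need its strict JSON option enabled first); a missing nullable key is assumed to mean false; and lowercasing type_type is only known to match the Int example shown here. It also renames is_signed to isSigned, a further difference visible in the examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of Arrow): rewrites a parsed flatbuffers-style
// schema into the key layout shown in the Java (Jackson) example.
final class FlatbufSchemaJsonAdapter {
    private FlatbufSchemaJsonAdapter() {}

    // Converts the top-level schema object: adapts each field and renames
    // custom_metadata to metadata.
    static Map<String, Object> adaptSchema(Map<String, Object> fbSchema) {
        Map<String, Object> out = new LinkedHashMap<>();
        List<Object> fields = new ArrayList<>();
        for (Object f : listOf(fbSchema.get("fields"))) {
            fields.add(adaptField(asMap(f)));
        }
        out.put("fields", fields);
        out.put("metadata", listOf(fbSchema.get("custom_metadata")));
        return out;
    }

    // Converts one field, recursing into its children.
    static Map<String, Object> adaptField(Map<String, Object> fbField) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("name", fbField.get("name"));
        // Assumption: an absent "nullable" key means the default value, false.
        out.put("nullable", fbField.getOrDefault("nullable", Boolean.FALSE));

        // Fold type_type into type.name; lowercasing matches the Int example only.
        Map<String, Object> type = new LinkedHashMap<>();
        Object typeName = fbField.get("type_type");
        if (typeName != null) {
            type.put("name", typeName.toString().toLowerCase(Locale.ROOT));
        }
        Object fbType = fbField.get("type");
        if (fbType != null) {
            for (Map.Entry<String, Object> e : asMap(fbType).entrySet()) {
                // Only the is_signed/isSigned difference from the examples is handled.
                String key = e.getKey().equals("is_signed") ? "isSigned" : e.getKey();
                type.put(key, e.getValue());
            }
        }
        out.put("type", type);

        List<Object> children = new ArrayList<>();
        for (Object c : listOf(fbField.get("children"))) {
            children.add(adaptField(asMap(c)));
        }
        out.put("children", children);
        out.put("metadata", listOf(fbField.get("custom_metadata")));
        return out;
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object o) {
        return (Map<String, Object>) o;
    }

    // Treats a missing list as empty.
    @SuppressWarnings("unchecked")
    private static List<Object> listOf(Object o) {
        return o == null ? new ArrayList<>() : (List<Object>) o;
    }

    public static void main(String[] args) {
        // Rebuilds the single-field example from the description by hand.
        Map<String, Object> type = new LinkedHashMap<>();
        type.put("bitWidth", 32);
        type.put("is_signed", true);
        Map<String, Object> field = new LinkedHashMap<>();
        field.put("name", "cc_call_center_sk");
        field.put("type_type", "Int");
        field.put("type", type);
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("fields", Arrays.asList(field));
        System.out.println(adaptSchema(schema));
    }
}
```

The adapted map would still need to be serialized back to a JSON string (for example with Jackson) before being passed to Schema.fromJson; whether the result is accepted for types other than Int has not been checked here.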
was:
I have JSON arrow::Schema representation generated from flatbuffers format in C++:
{code:java}
const void* schemaBytes;
std::string fbsSchemaFile;
flatbuffers::LoadFile("/path/to/Schema.fbs", false, &fbsSchemaFile);
flatbuffers::Parser parser;
parser.Parse(fbsSchemaFile.c_str());
std::string json;
flatbuffers::GenerateTextFromTable(parser, schemaBytes, "org.apache.arrow.flatbuf.Schema", &json);
return json;{code}
When I'm trying to read this JSON in Java and create pojo.Schema:
{code:java}
String json; // Read from file.
Schema.fromJson(json);{code}
It fails because JSON formats in flatbuffers generation and in Java using Jackson bindings are a bit different:
C++ Schema Flatbuffers JSON example:
{code:java}
{
fields: [
{
name: "cc_call_center_sk",
type_type: "Int",
type: {
bitWidth: 32,
is_signed: true
},
children: [
],
custom_metadata: [
{
key: "metadata",
value: "some_metadata"
}
]
},
],
custom_metadata: [
{
key: "metadata",
value: "some_metadata"
}
]
}{code}
Java Schema JSON example:
{code:java}
{
"fields" : [ {
"name" : "cc_call_center_sk",
"nullable" : true,
"type" : {
"name" : "int",
"bitWidth" : 32,
"isSigned" : true
},
"children" : [ ],
"metadata" : [ {
"value" : "some_metadata",
"key" : "metadata"
} ]
} ],
"metadata" : [ {
"value" : "some_metadata",
"key" : "metadata"
} ]
} {code}
There is a difference in type id declaration:
`type_type` field is used in C++ flatbuffers
`name` field inside `type` field is used in Java
Also, there is a difference in `metadata` field:
`custom_metadata` name is used in C++ flatbuffers
`metadata` name is used in Java
It makes it impossible to re-use JSON representation from Java in C++ and vice-versa
> [Java] JSON representation of pojo.Schema is incompatible with flatbuffers JSON generated via C++ API
> -----------------------------------------------------------------------------------------------------
>
> Key: ARROW-17998
> URL: https://issues.apache.org/jira/browse/ARROW-17998
> Project: Apache Arrow
> Issue Type: Bug
> Components: Format, Java
> Affects Versions: 6.0.1
> Reporter: Pavel Kovalenko
> Priority: Major