Posted to issues@arrow.apache.org by "Bryan Cutler (JIRA)" <ji...@apache.org> on 2017/06/09 22:49:18 UTC

[jira] [Comment Edited] (ARROW-692) Java<->C++ Integration tests for dictionary-encoded vectors

    [ https://issues.apache.org/jira/browse/ARROW-692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037284#comment-16037284 ] 

Bryan Cutler edited comment on ARROW-692 at 6/9/17 10:48 PM:
-------------------------------------------------------------

Here is an example of the JSON data with a dictionary that I'll be working towards. Let me know if you had something else in mind, [~wesmckinn] [~julienledem].

{noformat}
{
  "schema": {
    "fields": [
      {
        "name": "foo",
        "type": {"name": "utf8"},
        "nullable": true, "children": [],
        "typeLayout": {
          "vectors": [
            {"type": "VALIDITY", "typeBitWidth": 1},
            {"type": "OFFSET", "typeBitWidth": 32},
            {"type": "DATA", "typeBitWidth": 8}
          ]
        },
        "dictionary": {
          "id": 0, 
          "indexType": {"name": "int", "bitWidth": 8, "isSigned": false}, 
          "isOrdered": false
        }
      }
    ]
  },
  "dictionaries": [
    {
      "id": 0,
      "data": {
        "count": 3,
        "columns": [
          {
            "name": "foo",
            "count": 3,
            "VALIDITY": [1, 1, 1],
            "OFFSET": [0, 3, 6, 9], 
            "DATA": ["foo", "bar", "baz"]
          }
        ]
      }
    }
  ],
  "batches": [
    {
      "count": 6,
      "columns": [
        {
          "name": "foo",
          "count": 6,
          "VALIDITY": [1, 1, 0, 1, 1, 1],
          "DATA": [0, 1, 0, 1, 2]
        }
      ]
    }
  ]
}
{noformat}
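
To make the intended reading of this layout concrete, here is a minimal Python sketch of how a reader might resolve a dictionary-encoded column back to its values. It is only an illustration under stated assumptions, not the Java or C++ integration code: the helper names `column_values` and `decode` are hypothetical, the schema is omitted, and the batch `DATA` is assumed to carry one index per row with a placeholder 0 at the null slot, so the batch in the sketch has six indices.

```python
import json

# Hypothetical input modelled on the example above (schema omitted).
# Assumption: batch DATA holds one index per row, with a placeholder 0
# at the slot whose VALIDITY bit is 0.
DOC = json.loads("""
{
  "dictionaries": [
    {"id": 0,
     "data": {"count": 3,
              "columns": [{"name": "foo", "count": 3,
                           "VALIDITY": [1, 1, 1],
                           "OFFSET": [0, 3, 6, 9],
                           "DATA": ["foo", "bar", "baz"]}]}}
  ],
  "batches": [
    {"count": 6,
     "columns": [{"name": "foo", "count": 6,
                  "VALIDITY": [1, 1, 0, 1, 1, 1],
                  "DATA": [0, 1, 0, 0, 1, 2]}]}
  ]
}
""")


def column_values(column):
    """Return the column's values, with None wherever VALIDITY is 0."""
    count = column["count"]
    if len(column["DATA"]) != count or len(column["VALIDITY"]) != count:
        raise ValueError("column %r: buffer lengths do not match count"
                         % column["name"])
    return [value if valid else None
            for value, valid in zip(column["DATA"], column["VALIDITY"])]


def decode(doc, dictionary_id, column_name):
    """Replace each dictionary index in a batch column with its value."""
    dictionary = next(d for d in doc["dictionaries"]
                      if d["id"] == dictionary_id)
    values = column_values(dictionary["data"]["columns"][0])
    decoded = []
    for batch in doc["batches"]:
        column = next(c for c in batch["columns"]
                      if c["name"] == column_name)
        decoded.extend(None if index is None else values[index]
                       for index in column_values(column))
    return decoded


print(decode(DOC, 0, "foo"))
# prints ['foo', 'bar', None, 'foo', 'bar', 'baz']
```

The length check in `column_values` is a deliberate choice: a column whose `DATA` or `VALIDITY` length differs from `count` is rejected instead of being silently truncated by `zip`.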


was (Author: bryanc):
Here is an example of the JSON data with a dictionary that I'll be working towards. Let me know if you had something else in mind, [~wesmckinn] [~julienledem].

{noformat}
{
  "schema": {
    "fields": [
      {
        "name": "foo",
        "type": {"name": "int", "isSigned": false, "bitWidth": 32},
        "nullable": true, "children": [],
        "typeLayout": {
          "vectors": [
            {"type": "VALIDITY", "typeBitWidth": 1},
            {"type": "DATA", "typeBitWidth": 32}
          ]
        },
        "dictionary": {
          "id": 0, 
          "indexType": {"bitWidth": 8, "isSigned": false}, 
          "isOrdered": false
        }
      }
    ]
  },
  "dictionaries": [
    {
      "id": 0,
      "data": {
        "count": 3,
        "columns": [
          {
            "name": "foo",
            "count": 3,
            "VALIDITY": [1, 1, 1],
            "DATA": ["foo", "bar", "baz"]
          }
        ]
      }
    }
  ],
  "batches": [
    {
      "count": 6,
      "columns": [
        {
          "name": "foo",
          "count": 6,
          "VALIDITY": [1, 1, 0, 1, 1, 1],
          "DATA": [0, 1, 0, 1, 2]
        }
      ]
    }
  ]
}
{noformat}

> Java<->C++ Integration tests for dictionary-encoded vectors
> -----------------------------------------------------------
>
>                 Key: ARROW-692
>                 URL: https://issues.apache.org/jira/browse/ARROW-692
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Java - Vectors
>            Reporter: Wes McKinney
>
