Posted to issues@drill.apache.org by "aditya menon (JIRA)" <ji...@apache.org> on 2015/11/17 13:44:11 UTC
[jira] [Created] (DRILL-4102) Only one row found in a JSON document that contains multiple items.
aditya menon created DRILL-4102:
-----------------------------------
Summary: Only one row found in a JSON document that contains multiple items.
Key: DRILL-4102
URL: https://issues.apache.org/jira/browse/DRILL-4102
Project: Apache Drill
Issue Type: Bug
Environment: OS X, Drill embedded, v1.1.0 installed via HomeBrew
Reporter: aditya menon
I tried to analyse a JSON file that had the following (sample) structure:
```
{
  "Key1": {
    "htmltags": "<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />"
  },
  "Key2": {
    "htmltags": "<htmltag attr1='kilo' /><htmltag attr2='lima' /><htmltag attr3='mike' />"
  },
  "Key3": {
    "htmltags": "<htmltag attr1='november' /><htmltag attr2='foxtrot' /><htmltag attr3='sierra' />"
  }
}
```
(Apologies for the obfuscation; I am unable to publish the original dataset, but the structure is exactly the same. Note especially how the keys and other data points *differ* in some places and remain identical in others.)
When I run `SELECT * FROM DataFile.json`, I get a single row listed under three columns: `"<htmltag attr1='bravo' /><htmltag attr2='delta' /><htmltag attr3='charlie' />"` [i.e., only the entry `Key1.htmltags`].
Ideally, I should see three rows, each with entries from Key1..Key3, listed under the correct respective column.
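To make the expected result concrete, here is a minimal Python sketch of the reshaping that would yield the three-row layout described above. It assumes, without confirmation from this report, that the single row comes from the whole file being one top-level JSON object, and that a reader fed one object per line would return one row per object. The helper names `to_records` and `to_json_lines` and the `key` field are hypothetical, and the sample strings are shortened from the structure shown earlier.

```
import json


def to_records(doc, key_field="key"):
    """Turn {"Key1": {...}, "Key2": {...}} into one flat record per top-level key.

    key_field is a hypothetical column name holding the original key;
    a field of the same name inside a nested object would overwrite it.
    """
    return [{key_field: name, **fields} for name, fields in doc.items()]


def to_json_lines(doc, key_field="key"):
    """Serialize the records as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(record) for record in to_records(doc, key_field))


# Shortened stand-in for the obfuscated sample in this report.
sample = {
    "Key1": {"htmltags": "<htmltag attr1='bravo' />"},
    "Key2": {"htmltags": "<htmltag attr1='kilo' />"},
    "Key3": {"htmltags": "<htmltag attr1='november' />"},
}

if __name__ == "__main__":
    # Emits three lines, each a standalone JSON object with "key" and "htmltags".
    print(to_json_lines(sample))
```

Each output line carries the original key alongside its `htmltags` value, which matches the "three rows, each under the correct column" shape requested here. Whether Drill itself should produce that shape from the unmodified file is the open question of this issue.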
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)