You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by Khurram Faraaz <kf...@maprtech.com> on 2016/11/08 08:43:00 UTC

Query JSON that has null as value for each key

Hi All,

Drill 1.9.0 git commit ID : 83513daf

Drill returns same result with or without `store.json.all_text_mode`=true

[root@cent01 null_eq_joins]# cat right_all_nulls.json
{
     "intKey" : null,
     "bgintKey": null,
     "strKey": null,
     "boolKey": null,
     "fltKey": null,
     "dblKey": null,
     "timKey": null,
     "dtKey": null,
     "tmstmpKey": null,
     "intrvldyKey": null,
     "intrvlyrKey": null
}
[root@cent01 null_eq_joins]#

Querying the above JSON file results in null as query result.
 -  We should see each of the keys in the JSON as a column in query result.
 -  And in each column the value should be a null value.
Current behavior does not look right.

{noformat}
0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
+-------+
|   *   |
+-------+
| null  |
+-------+
1 row selected (0.313 seconds)
{noformat}

Thanks,
Khurram

Re: Query JSON that has null as value for each key

Posted by Khurram Faraaz <kf...@maprtech.com>.
Thanks Julian. DRILL-5033 <https://issues.apache.org/jira/browse/DRILL-5033> is
filed to track this.

On Thu, Nov 10, 2016 at 10:50 PM, Julian Hyde <jh...@apache.org> wrote:

> IMHO it is similar but not the same as DRILL-1256. Worth logging an issue
> and let Jacques (or someone) put on the record what should be the behavior
> of an empty record (empty JSON map) when it is top-level (as in this case)
> or in a collection.
>
> Julian
>
>
> > On Nov 10, 2016, at 9:12 AM, rahul challapalli <
> challapallirahul@gmail.com> wrote:
> >
> > Khurram,
> >
> > Take a look at this jira [1]. It looks similar to what you have
> mentioned.
> >
> > [1] https://issues.apache.org/jira/browse/DRILL-1256
> >
> > - Rahul
> >
> > On Wed, Nov 9, 2016 at 10:49 AM, Khurram Faraaz <kf...@maprtech.com>
> > wrote:
> >
> >> I dont think it is by design. Some one from dev please confirm.
> >>
> >> That is because having several columns in a CSV and each column has a
> null
> >> value, select * on such CSV returns null for each column. Why is it that
> >> JSON is treated differently ?
> >>
> >> 0: jdbc:drill:schema=dfs.tmp> select * from `r1.csv`;
> >> +-----------------------------------------------------------
> >> ----------------------+
> >> |                                     columns
> >>      |
> >> +-----------------------------------------------------------
> >> ----------------------+
> >> |
> >> ["null","null","null","null","null","null","null","null","
> >> null","null","null"]
> >> |
> >> +-----------------------------------------------------------
> >> ----------------------+
> >> 1 row selected (0.318 seconds)
> >>
> >> On Thu, Nov 10, 2016 at 12:11 AM, rahul challapalli <
> >> challapallirahul@gmail.com> wrote:
> >>
> >>> I think this is expected as drill does not differentiate between
> missing
> >>> field and a field which has a null value for all records.
> >>>
> >>> On Wed, Nov 9, 2016 at 10:20 AM, Khurram Faraaz <kf...@maprtech.com>
> >>> wrote:
> >>>
> >>>> Is this by design or is this a bug ?
> >>>>
> >>>> On Tue, Nov 8, 2016 at 2:13 PM, Khurram Faraaz <kf...@maprtech.com>
> >>>> wrote:
> >>>>
> >>>>> Hi All,
> >>>>>
> >>>>> Drill 1.9.0 git commit ID : 83513daf
> >>>>>
> >>>>> Drill returns same result with or without `store.json.all_text_mode`=
> >>>> true
> >>>>>
> >>>>> [root@cent01 null_eq_joins]# cat right_all_nulls.json
> >>>>> {
> >>>>>     "intKey" : null,
> >>>>>     "bgintKey": null,
> >>>>>     "strKey": null,
> >>>>>     "boolKey": null,
> >>>>>     "fltKey": null,
> >>>>>     "dblKey": null,
> >>>>>     "timKey": null,
> >>>>>     "dtKey": null,
> >>>>>     "tmstmpKey": null,
> >>>>>     "intrvldyKey": null,
> >>>>>     "intrvlyrKey": null
> >>>>> }
> >>>>> [root@cent01 null_eq_joins]#
> >>>>>
> >>>>> Querying the above JSON file results in null as query result.
> >>>>> -  We should see each of the keys in the JSON as a column in query
> >>>> result.
> >>>>> -  And in each column the value should be a null value.
> >>>>> Current behavior does not look right.
> >>>>>
> >>>>> {noformat}
> >>>>> 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> >>>>> +-------+
> >>>>> |   *   |
> >>>>> +-------+
> >>>>> | null  |
> >>>>> +-------+
> >>>>> 1 row selected (0.313 seconds)
> >>>>> {noformat}
> >>>>>
> >>>>> Thanks,
> >>>>> Khurram
> >>>>>
> >>>>
> >>>
> >>
>
>

Re: Query JSON that has null as value for each key

Posted by Julian Hyde <jh...@apache.org>.
IMHO it is similar but not the same as DRILL-1256. Worth logging an issue and let Jacques (or someone) put on the record what should be the behavior of an empty record (empty JSON map) when it is top-level (as in this case) or in a collection.

Julian


> On Nov 10, 2016, at 9:12 AM, rahul challapalli <ch...@gmail.com> wrote:
> 
> Khurram,
> 
> Take a look at this jira [1]. It looks similar to what you have mentioned.
> 
> [1] https://issues.apache.org/jira/browse/DRILL-1256
> 
> - Rahul
> 
> On Wed, Nov 9, 2016 at 10:49 AM, Khurram Faraaz <kf...@maprtech.com>
> wrote:
> 
>> I dont think it is by design. Some one from dev please confirm.
>> 
>> That is because having several columns in a CSV and each column has a null
>> value, select * on such CSV returns null for each column. Why is it that
>> JSON is treated differently ?
>> 
>> 0: jdbc:drill:schema=dfs.tmp> select * from `r1.csv`;
>> +-----------------------------------------------------------
>> ----------------------+
>> |                                     columns
>>      |
>> +-----------------------------------------------------------
>> ----------------------+
>> |
>> ["null","null","null","null","null","null","null","null","
>> null","null","null"]
>> |
>> +-----------------------------------------------------------
>> ----------------------+
>> 1 row selected (0.318 seconds)
>> 
>> On Thu, Nov 10, 2016 at 12:11 AM, rahul challapalli <
>> challapallirahul@gmail.com> wrote:
>> 
>>> I think this is expected as drill does not differentiate between missing
>>> field and a field which has a null value for all records.
>>> 
>>> On Wed, Nov 9, 2016 at 10:20 AM, Khurram Faraaz <kf...@maprtech.com>
>>> wrote:
>>> 
>>>> Is this by design or is this a bug ?
>>>> 
>>>> On Tue, Nov 8, 2016 at 2:13 PM, Khurram Faraaz <kf...@maprtech.com>
>>>> wrote:
>>>> 
>>>>> Hi All,
>>>>> 
>>>>> Drill 1.9.0 git commit ID : 83513daf
>>>>> 
>>>>> Drill returns same result with or without `store.json.all_text_mode`=
>>>> true
>>>>> 
>>>>> [root@cent01 null_eq_joins]# cat right_all_nulls.json
>>>>> {
>>>>>     "intKey" : null,
>>>>>     "bgintKey": null,
>>>>>     "strKey": null,
>>>>>     "boolKey": null,
>>>>>     "fltKey": null,
>>>>>     "dblKey": null,
>>>>>     "timKey": null,
>>>>>     "dtKey": null,
>>>>>     "tmstmpKey": null,
>>>>>     "intrvldyKey": null,
>>>>>     "intrvlyrKey": null
>>>>> }
>>>>> [root@cent01 null_eq_joins]#
>>>>> 
>>>>> Querying the above JSON file results in null as query result.
>>>>> -  We should see each of the keys in the JSON as a column in query
>>>> result.
>>>>> -  And in each column the value should be a null value.
>>>>> Current behavior does not look right.
>>>>> 
>>>>> {noformat}
>>>>> 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
>>>>> +-------+
>>>>> |   *   |
>>>>> +-------+
>>>>> | null  |
>>>>> +-------+
>>>>> 1 row selected (0.313 seconds)
>>>>> {noformat}
>>>>> 
>>>>> Thanks,
>>>>> Khurram
>>>>> 
>>>> 
>>> 
>> 


Re: Query JSON that has null as value for each key

Posted by rahul challapalli <ch...@gmail.com>.
Khurram,

Take a look at this jira [1]. It looks similar to what you have mentioned.

[1] https://issues.apache.org/jira/browse/DRILL-1256

- Rahul

On Wed, Nov 9, 2016 at 10:49 AM, Khurram Faraaz <kf...@maprtech.com>
wrote:

> I dont think it is by design. Some one from dev please confirm.
>
> That is because having several columns in a CSV and each column has a null
> value, select * on such CSV returns null for each column. Why is it that
> JSON is treated differently ?
>
> 0: jdbc:drill:schema=dfs.tmp> select * from `r1.csv`;
> +-----------------------------------------------------------
> ----------------------+
> |                                     columns
>       |
> +-----------------------------------------------------------
> ----------------------+
> |
> ["null","null","null","null","null","null","null","null","
> null","null","null"]
>  |
> +-----------------------------------------------------------
> ----------------------+
> 1 row selected (0.318 seconds)
>
> On Thu, Nov 10, 2016 at 12:11 AM, rahul challapalli <
> challapallirahul@gmail.com> wrote:
>
> > I think this is expected as drill does not differentiate between missing
> > field and a field which has a null value for all records.
> >
> > On Wed, Nov 9, 2016 at 10:20 AM, Khurram Faraaz <kf...@maprtech.com>
> > wrote:
> >
> > > Is this by design or is this a bug ?
> > >
> > > On Tue, Nov 8, 2016 at 2:13 PM, Khurram Faraaz <kf...@maprtech.com>
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > Drill 1.9.0 git commit ID : 83513daf
> > > >
> > > > Drill returns same result with or without `store.json.all_text_mode`=
> > > true
> > > >
> > > > [root@cent01 null_eq_joins]# cat right_all_nulls.json
> > > > {
> > > >      "intKey" : null,
> > > >      "bgintKey": null,
> > > >      "strKey": null,
> > > >      "boolKey": null,
> > > >      "fltKey": null,
> > > >      "dblKey": null,
> > > >      "timKey": null,
> > > >      "dtKey": null,
> > > >      "tmstmpKey": null,
> > > >      "intrvldyKey": null,
> > > >      "intrvlyrKey": null
> > > > }
> > > > [root@cent01 null_eq_joins]#
> > > >
> > > > Querying the above JSON file results in null as query result.
> > > >  -  We should see each of the keys in the JSON as a column in query
> > > result.
> > > >  -  And in each column the value should be a null value.
> > > > Current behavior does not look right.
> > > >
> > > > {noformat}
> > > > 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> > > > +-------+
> > > > |   *   |
> > > > +-------+
> > > > | null  |
> > > > +-------+
> > > > 1 row selected (0.313 seconds)
> > > > {noformat}
> > > >
> > > > Thanks,
> > > > Khurram
> > > >
> > >
> >
>

Re: Query JSON that has null as value for each key

Posted by Khurram Faraaz <kf...@maprtech.com>.
I dont think it is by design. Some one from dev please confirm.

That is because having several columns in a CSV and each column has a null
value, select * on such CSV returns null for each column. Why is it that
JSON is treated differently ?

0: jdbc:drill:schema=dfs.tmp> select * from `r1.csv`;
+---------------------------------------------------------------------------------+
|                                     columns
      |
+---------------------------------------------------------------------------------+
|
["null","null","null","null","null","null","null","null","null","null","null"]
 |
+---------------------------------------------------------------------------------+
1 row selected (0.318 seconds)

On Thu, Nov 10, 2016 at 12:11 AM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> I think this is expected as drill does not differentiate between missing
> field and a field which has a null value for all records.
>
> On Wed, Nov 9, 2016 at 10:20 AM, Khurram Faraaz <kf...@maprtech.com>
> wrote:
>
> > Is this by design or is this a bug ?
> >
> > On Tue, Nov 8, 2016 at 2:13 PM, Khurram Faraaz <kf...@maprtech.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Drill 1.9.0 git commit ID : 83513daf
> > >
> > > Drill returns same result with or without `store.json.all_text_mode`=
> > true
> > >
> > > [root@cent01 null_eq_joins]# cat right_all_nulls.json
> > > {
> > >      "intKey" : null,
> > >      "bgintKey": null,
> > >      "strKey": null,
> > >      "boolKey": null,
> > >      "fltKey": null,
> > >      "dblKey": null,
> > >      "timKey": null,
> > >      "dtKey": null,
> > >      "tmstmpKey": null,
> > >      "intrvldyKey": null,
> > >      "intrvlyrKey": null
> > > }
> > > [root@cent01 null_eq_joins]#
> > >
> > > Querying the above JSON file results in null as query result.
> > >  -  We should see each of the keys in the JSON as a column in query
> > result.
> > >  -  And in each column the value should be a null value.
> > > Current behavior does not look right.
> > >
> > > {noformat}
> > > 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> > > +-------+
> > > |   *   |
> > > +-------+
> > > | null  |
> > > +-------+
> > > 1 row selected (0.313 seconds)
> > > {noformat}
> > >
> > > Thanks,
> > > Khurram
> > >
> >
>

Re: Query JSON that has null as value for each key

Posted by rahul challapalli <ch...@gmail.com>.
I think this is expected as drill does not differentiate between missing
field and a field which has a null value for all records.

On Wed, Nov 9, 2016 at 10:20 AM, Khurram Faraaz <kf...@maprtech.com>
wrote:

> Is this by design or is this a bug ?
>
> On Tue, Nov 8, 2016 at 2:13 PM, Khurram Faraaz <kf...@maprtech.com>
> wrote:
>
> > Hi All,
> >
> > Drill 1.9.0 git commit ID : 83513daf
> >
> > Drill returns same result with or without `store.json.all_text_mode`=
> true
> >
> > [root@cent01 null_eq_joins]# cat right_all_nulls.json
> > {
> >      "intKey" : null,
> >      "bgintKey": null,
> >      "strKey": null,
> >      "boolKey": null,
> >      "fltKey": null,
> >      "dblKey": null,
> >      "timKey": null,
> >      "dtKey": null,
> >      "tmstmpKey": null,
> >      "intrvldyKey": null,
> >      "intrvlyrKey": null
> > }
> > [root@cent01 null_eq_joins]#
> >
> > Querying the above JSON file results in null as query result.
> >  -  We should see each of the keys in the JSON as a column in query
> result.
> >  -  And in each column the value should be a null value.
> > Current behavior does not look right.
> >
> > {noformat}
> > 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> > +-------+
> > |   *   |
> > +-------+
> > | null  |
> > +-------+
> > 1 row selected (0.313 seconds)
> > {noformat}
> >
> > Thanks,
> > Khurram
> >
>

Re: Query JSON that has null as value for each key

Posted by Khurram Faraaz <kf...@maprtech.com>.
Is this by design or is this a bug ?

On Tue, Nov 8, 2016 at 2:13 PM, Khurram Faraaz <kf...@maprtech.com> wrote:

> Hi All,
>
> Drill 1.9.0 git commit ID : 83513daf
>
> Drill returns same result with or without `store.json.all_text_mode`=true
>
> [root@cent01 null_eq_joins]# cat right_all_nulls.json
> {
>      "intKey" : null,
>      "bgintKey": null,
>      "strKey": null,
>      "boolKey": null,
>      "fltKey": null,
>      "dblKey": null,
>      "timKey": null,
>      "dtKey": null,
>      "tmstmpKey": null,
>      "intrvldyKey": null,
>      "intrvlyrKey": null
> }
> [root@cent01 null_eq_joins]#
>
> Querying the above JSON file results in null as query result.
>  -  We should see each of the keys in the JSON as a column in query result.
>  -  And in each column the value should be a null value.
> Current behavior does not look right.
>
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select * from `right_all_nulls.json`;
> +-------+
> |   *   |
> +-------+
> | null  |
> +-------+
> 1 row selected (0.313 seconds)
> {noformat}
>
> Thanks,
> Khurram
>