Posted to user@drill.apache.org by John Schneider <js...@apixio.com> on 2015/12/01 18:57:33 UTC

Having difficulties using CASE statement to manage heterogeneous schemas

Hi All,

I'm trying to use CASE statements to manage a heterogeneous stream of JSON
objects, as shown in the example from
https://drill.apache.org/blog/2015/11/23/drill-1.3-released/
but I'm not getting any love yet. Drill 1.1 -> 1.3 is chock full of
goodness, and CASE statements will help me clear the last real hurdles I have
using Drill with my logs.
Would you please review the tests I created below and tell me if I'm just
missing something obvious?

Thanks
/jos

## first test: two lines, one where the field is a string and one where
## it's a map
## first, let's just select all records; I expect this to barf since there
## are two schemas
0: jdbc:drill:zk=local> select * from dfs.`/Users/jos/work/drill/casetest.json` t ;
Error: DATA_READ ERROR: Error parsing JSON - You tried to start when you are using a ValueWriter of type NullableVarCharWriterImpl.

File  /Users/jos/work/drill/casetest.json
Record  2
Fragment 0:0

[Error Id: 1385aea5-68cb-4775-ae17-fad6b4901ea6 on 10.0.1.9:31010]
(state=,code=0)

## now let's use a case statement to sort out the schemas; I don't expect
## this to barf, but barf it does. seems like this should have worked; what
## am I missing?

0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then 'map' else 'string' end from dfs.`/Users/jos/work/drill/casetest.json` t ;
Error: DATA_READ ERROR: Error parsing JSON - You tried to start when you are using a ValueWriter of type NullableVarCharWriterImpl.

File  /Users/jos/Downloads/2015-11-30-bad-3.json
Record  2
Fragment 0:0

[Error Id: 872a5347-93dd-49ae-a55c-e861b807b4a6 on 10.0.1.9:31010]
(state=,code=0)
0: jdbc:drill:zk=local>


## the data I used is this
## casetest.json has two lines in it

{"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":"ndagdagan_apex@apixio.com"}}
{"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"ndagdagan_apex@apixio.com","roles":null,"isNotadmins":true,"iscoders":true}}}
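## aside: the same classification, done outside Drill in plain Python as a
## sanity check on the data itself (session values shortened for brevity;
## this sketch is mine, not anything from Drill):

```python
import json

# The two casetest.json records, with the session value shortened.
# In record 1, user_info.user is a plain string; in record 2 it is a map.
lines = [
    '{"level":"EVENT","time":1448844983160,'
    '"user_info":{"session":"9OOLJ8...","user":"ndagdagan_apex@apixio.com"}}',
    '{"level":"EVENT","time":1448844983160,'
    '"user_info":{"session":"9OOLJ8...","user":{"id":"ndagdagan_apex@apixio.com",'
    '"roles":null,"isNotadmins":true,"iscoders":true}}}',
]

# Python-side equivalent of:
#   CASE WHEN is_map(t.user_info.`user`) THEN 'map' ELSE 'string' END
kinds = ["map" if isinstance(json.loads(l)["user_info"]["user"], dict) else "string"
         for l in lines]
print(kinds)  # ['string', 'map']
```

## so the data really does carry two schemas for the same field, which is
## exactly what the CASE statement is supposed to sort out.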


## now let's see if any case will work on any structure
## new test file with same line in it twice
## select * works as expected
0: jdbc:drill:zk=local> select * from dfs.`/Users/jos/work/drill/testcase2.json` t ;
+-------+------+-----------+
| level | time | user_info |
+-------+------+-----------+
| EVENT | 1448844983160 | {"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"ndagdagan_apex@apixio.com","isNotadmins":true,"iscoders":true}} |
| EVENT | 1448844983160 | {"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"ndagdagan_apex@apixio.com","isNotadmins":true,"iscoders":true}} |
+-------+------+-----------+
2 rows selected (1.701 seconds)

## now let's try to use the line in a case statement
## it doesn't work, but we get different, more puzzling errors this time
0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then 'map' else 'string' end from dfs.`/Users/jos/work/drill/testcase2.json` t ;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to materialize incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation: [is_map(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
Error in expression at index -1.  Error: Failure composing If Expression.  All conditions must return a boolean type.  Condition was of Type NULL..  Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: c3a7f989-4d93-48c0-9a16-a38dd195314c on 10.19.220.63:31010]
(state=,code=0)
0: jdbc:drill:zk=local>


## the data I used is this
## testcase2.json has two lines in it
{"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"ndagdagan_apex@apixio.com","roles":null,"isNotadmins":true,"iscoders":true}}}
{"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"ndagdagan_apex@apixio.com","roles":null,"isNotadmins":true,"iscoders":true}}}
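## aside: going back to the mixed-schema file from the first test, a
## workaround I've been considering (my own idea, nothing official from
## Drill) is to normalize records before Drill reads them so `user` is
## always a map; the `id` key below is an arbitrary, made-up choice:

```python
import json

def normalize(line: str) -> str:
    """Wrap a bare-string user_info.user into a map so every record
    presents the same schema to Drill.  The 'id' key is an arbitrary,
    made-up choice for this sketch."""
    rec = json.loads(line)
    user = rec.get("user_info", {}).get("user")
    if isinstance(user, str):
        rec["user_info"]["user"] = {"id": user}
    return json.dumps(rec)

# A string-valued user gets wrapped; a map-valued user passes through.
print(normalize('{"user_info":{"session":"s","user":"a@b.com"}}'))
# {"user_info": {"session": "s", "user": {"id": "a@b.com"}}}
```

## with every line run through normalize() first, select * and the CASE
## query would both see a single schema.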





_____________

john o schneider
jos@apixio.com
408-203-7891

Re: Having difficulties using CASE statement to manage heterogeneous schemas

Posted by John Schneider <js...@apixio.com>.
Thanks guys - jos


_____________

john o schneider
jos@apixio.com
408-203-7891


On Wed, Dec 2, 2015 at 10:19 PM, Steven Phillips <st...@dremio.com> wrote:

> You can file a jira at https://issues.apache.org/jira/browse/DRILL/
>
> When you file, go ahead and assign it to me.
>
> On Wed, Dec 2, 2015 at 1:41 PM, Jacques Nadeau <ja...@dremio.com> wrote:
>
> > Steven, can you look at this?
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Wed, Dec 2, 2015 at 10:05 AM, John Schneider <js...@apixio.com>
> > wrote:
> >
> >> All, Is there a link to Jira where I can log this error?
> >>
> >> _____________
> >>
> >> john o schneider
> >> jos@apixio.com
> >> 408-203-7891
> >>
> >>
> >> On Tue, Dec 1, 2015 at 10:43 AM, John Schneider <js...@apixio.com>
> >> wrote:
> >>
> >> > Hi Jacques, I did not know about that. I get one similar and one
> >> > different result after setting union types on.
> >> >
> >> > ## enabling works ok
> >> > ##
> >> > 0: jdbc:drill:zk=local> ALTER SESSION SET `exec.enable_union_type` =
> >> true;
> >> > +-------+----------------------------------+
> >> > |  ok   |             summary              |
> >> > +-------+----------------------------------+
> >> > | true  | exec.enable_union_type updated.  |
> >> > +-------+----------------------------------+
> >> > 1 row selected (0.157 seconds)
> >> >
> >> > ## now let's try the query over two rows that are the same
> >> > ## got same error as before
> >> > 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`)
> then
> >> > 'map' else 'string' end  from
> dfs.`/Users/jos/Downloads/testcase2.json`
> >> t ;
> >> > Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
> >> > materialize incoming schema.  Errors:
> >> >
> >> > Error in expression at index -1.  Error: Missing function
> >> implementation:
> >> > [is_map(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
> >> > Error in expression at index -1.  Error: Failure composing If
> >> Expression.
> >> > All conditions must return a boolean type.  Condition was of Type
> NULL..
> >> > Full expression: --UNKNOWN EXPRESSION--..
> >> >
> >> > Fragment 0:0
> >> >
> >> > [Error Id: 7c1b4dd4-8485-4429-a082-d936f5b3b95a on 10.19.220.63:31010
> ]
> >> > (state=,code=0)
> >> >
> >> > ## now let's try the query over different row types,
> >> > ## this time we get an exception; I will cut and paste the full stack
> >> > trace at the end
> >> >
> >> > 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`)
> then
> >> > 'map' else 'string' end from dfs.`/Users/jos/Downloads/testcase.json`
> t
> >> ;
> >> > Error: SYSTEM ERROR: NullPointerException
> >> >
> >> > Fragment 0:0
> >> >
> >> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010
> ]
> >> > (state=,code=0)
> >> >
> >> >
> >> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010
> ]
> >> > at
> >> >
> >>
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
> >> > ~[drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> >> > [drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >> > [na:1.8.0_51]
> >> > at
> >> >
> >>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> >> > [na:1.8.0_51]
> >> > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> >> > Caused by: java.lang.NullPointerException: null
> >> > at
> >> >
> >>
> org.apache.drill.exec.vector.complex.UnionVector.getFieldIdIfMatches(UnionVector.java:729)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches(FieldIdUtil.java:95)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.vector.complex.AbstractContainerVector.getFieldIdIfMatches(AbstractContainerVector.java:114)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches(SimpleVectorWrapper.java:146)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.record.VectorContainer.getValueVectorId(VectorContainer.java:252)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId(ScanBatch.java:307)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath(ExpressionTreeMaterializer.java:628)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath(ExpressionTreeMaterializer.java:217)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.common.expression.SchemaPath.accept(SchemaPath.java:152)
> >> > ~[drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:274)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:217)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60)
> >> > ~[drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitIfExpression(ExpressionTreeMaterializer.java:494)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitIfExpression(ExpressionTreeMaterializer.java:217)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.common.expression.IfExpression.accept(IfExpression.java:64)
> >> > ~[drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:120)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:386)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:131)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:156)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:256)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:250)
> >> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> >> > at java.security.AccessController.doPrivileged(Native Method)
> >> > ~[na:1.8.0_51]
> >> > at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_51]
> >> > at
> >> >
> >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> >> > ~[hadoop-common-2.7.1.jar:na]
> >> > at
> >> >
> >>
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > ... 4 common frames omitted
> >> > 2015-12-01 10:36:15,231 [CONTROL-rpc-event-queue] WARN
> >> >  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
> >> > COMPLETED state as query is already at FAILED state (which is
> terminal).
> >> > 2015-12-01 10:36:15,232 [CONTROL-rpc-event-queue] WARN
> >> >  o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel
> >> fragment.
> >> > 29a2175f-d3f2-caf9-2b51-12754264abe9:0:0 does not exist.
> >> > 2015-12-01 10:36:15,234 [USER-rpc-event-queue] INFO
> >> >  o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed:
> >> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> >> > NullPointerException
> >> >
> >> > Fragment 0:0
> >> >
> >> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010
> ]
> >> > at
> >> >
> >>
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
> >> > [drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142)
> >> > [drill-common-1.3.0.jar:1.3.0]
> >> > at
> >> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269)
> >> > [drill-java-exec-1.3.0.jar:1.3.0]
> >> > at
> >> >
> >>
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
> >> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
> >> > [netty-handler-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> >> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
> >> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> >> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >> > at
> >> >
> >>
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> >> > [netty-common-4.0.27.Final.jar:4.0.27.Final]
> >> > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> >> >
> >> >
> >> > _____________
> >> >
> >> > john o schneider
> >> > jos@apixio.com
> >> > 408-203-7891
> >> >
> >> >
> >> > On Tue, Dec 1, 2015 at 10:01 AM, Jacques Nadeau <ja...@dremio.com>
> >> > wrote:
> >> >
> >> >> Did you enable the union type? You'll need to do that (for now) as
> >> >> Heterogeneous type support is currently an experimental feature.
> >> >>
> >> >> ALTER SESSION SET `exec.enable_union_type` = true;
> >> >>
> >> >> See here:
> >> >>
> >> >>
> >> >>
> >>
> https://drill.apache.org/docs/json-data-model/#experimental-feature:-heterogeneous-types
> >> >>
> >> >> --
> >> >> Jacques Nadeau
> >> >> CTO and Co-Founder, Dremio
> >> >>

Re: Having difficulties using CASE statement to manage heterogeneous schemas

Posted by Steven Phillips <st...@dremio.com>.
You can file a jira at https://issues.apache.org/jira/browse/DRILL/

When you file, go ahead and assign it to me.

On Wed, Dec 2, 2015 at 1:41 PM, Jacques Nadeau <ja...@dremio.com> wrote:

> Steven, can you look at this?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Wed, Dec 2, 2015 at 10:05 AM, John Schneider <js...@apixio.com>
> wrote:
>
>> All, Is there a link to Jira where I can log this error?
>>
>> _____________
>>
>> john o schneider
>> jos@apixio.com
>> 408-203-7891
>>
>>
>> On Tue, Dec 1, 2015 at 10:43 AM, John Schneider <js...@apixio.com>
>> wrote:
>>
>> > Hi Jacques, I did not know about that. I get one similar and one
>> > different result after setting union types on.
>> >
>> > ## enabling works ok
>> > ##
>> > 0: jdbc:drill:zk=local> ALTER SESSION SET `exec.enable_union_type` =
>> true;
>> > +-------+----------------------------------+
>> > |  ok   |             summary              |
>> > +-------+----------------------------------+
>> > | true  | exec.enable_union_type updated.  |
>> > +-------+----------------------------------+
>> > 1 row selected (0.157 seconds)
>> >
>> > ## now let's try the query over two rows that are the same
>> > ## got same error as before
>> > 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
>> > 'map' else 'string' end  from dfs.`/Users/jos/Downloads/testcase2.json`
>> t ;
>> > Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
>> > materialize incoming schema.  Errors:
>> >
>> > Error in expression at index -1.  Error: Missing function
>> implementation:
>> > [is_map(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
>> > Error in expression at index -1.  Error: Failure composing If
>> Expression.
>> > All conditions must return a boolean type.  Condition was of Type NULL..
>> > Full expression: --UNKNOWN EXPRESSION--..
>> >
>> > Fragment 0:0
>> >
>> > [Error Id: 7c1b4dd4-8485-4429-a082-d936f5b3b95a on 10.19.220.63:31010]
>> > (state=,code=0)
>> >
>> > ## now let's try the query over different row types,
>> > ## this time we get an exception; I will cut and paste the full stack
>> > trace at the end
>> >
>> > 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
>> > 'map' else 'string' end from dfs.`/Users/jos/Downloads/testcase.json` t
>> ;
>> > Error: SYSTEM ERROR: NullPointerException
>> >
>> > Fragment 0:0
>> >
>> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
>> > (state=,code=0)
>> >
>> >
>> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
>> > NullPointerException
>> >
>> > Fragment 0:0
>> >
>> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
>> > at
>> >
>> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> >
>> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> >
>> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> >
>> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> >
>> org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
>> > [drill-common-1.3.0.jar:1.3.0]
>> > at
>> org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> >
>> org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142)
>> > [drill-common-1.3.0.jar:1.3.0]
>> > at
>> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269)
>> > [drill-java-exec-1.3.0.jar:1.3.0]
>> > at
>> >
>> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>> > [netty-handler-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
>> > at
>> >
>> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>> > [netty-common-4.0.27.Final.jar:4.0.27.Final]
>> > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
>> >
>> >
>> > _____________
>> >
>> > john o schneider
>> > jos@apixio.com
>> > 408-203-7891
>> >
>> >

Re: Having difficulties using CASE statement to manage heterogeneous schemas

Posted by Jacques Nadeau <ja...@dremio.com>.
Steven, can you look at this?

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Wed, Dec 2, 2015 at 10:05 AM, John Schneider <js...@apixio.com>
wrote:

> All, Is there a link to Jira where I can log this error?
>
> _____________
>
> john o schneider
> jos@apixio.com
> 408-203-7891
>
>
> On Tue, Dec 1, 2015 at 10:43 AM, John Schneider <js...@apixio.com>
> wrote:
>
> > Hi Jacques, I did not know about that. I have one similar and one
> > different result after turning union types on:
> >
> > ## enabling works ok
> > ##
> > 0: jdbc:drill:zk=local> ALTER SESSION SET `exec.enable_union_type` =
> true;
> > +-------+----------------------------------+
> > |  ok   |             summary              |
> > +-------+----------------------------------+
> > | true  | exec.enable_union_type updated.  |
> > +-------+----------------------------------+
> > 1 row selected (0.157 seconds)
> >
> > ## now let's try the query over two rows that are the same
> > ## we get the same error as before
> > 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
> > 'map' else 'string' end  from dfs.`/Users/jos/Downloads/testcase2.json`
> t ;
> > Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
> > materialize incoming schema.  Errors:
> >
> > Error in expression at index -1.  Error: Missing function implementation:
> > [is_map(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
> > Error in expression at index -1.  Error: Failure composing If Expression.
> > All conditions must return a boolean type.  Condition was of Type NULL..
> > Full expression: --UNKNOWN EXPRESSION--..
> >
> > Fragment 0:0
> >
> > [Error Id: 7c1b4dd4-8485-4429-a082-d936f5b3b95a on 10.19.220.63:31010]
> > (state=,code=0)
> >
> > ## now let's try the query over different row types
> > ## this time we get an exception - I will cut and paste the full stack
> > trace at the end
> >
> > 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
> > 'map' else 'string' end from dfs.`/Users/jos/Downloads/testcase.json` t ;
> > Error: SYSTEM ERROR: NullPointerException
> >
> > Fragment 0:0
> >
> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
> > (state=,code=0)
> >
> >
> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
> > at
> >
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
> > ~[drill-common-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> > [drill-common-1.3.0.jar:1.3.0]
> > at
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > [na:1.8.0_51]
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > [na:1.8.0_51]
> > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> > Caused by: java.lang.NullPointerException: null
> > at
> >
> org.apache.drill.exec.vector.complex.UnionVector.getFieldIdIfMatches(UnionVector.java:729)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches(FieldIdUtil.java:95)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.vector.complex.AbstractContainerVector.getFieldIdIfMatches(AbstractContainerVector.java:114)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches(SimpleVectorWrapper.java:146)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.record.VectorContainer.getValueVectorId(VectorContainer.java:252)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId(ScanBatch.java:307)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath(ExpressionTreeMaterializer.java:628)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath(ExpressionTreeMaterializer.java:217)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> > org.apache.drill.common.expression.SchemaPath.accept(SchemaPath.java:152)
> > ~[drill-common-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:274)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:217)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60)
> > ~[drill-common-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitIfExpression(ExpressionTreeMaterializer.java:494)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitIfExpression(ExpressionTreeMaterializer.java:217)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.common.expression.IfExpression.accept(IfExpression.java:64)
> > ~[drill-common-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:120)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:386)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:131)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:156)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:256)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:250)
> > ~[drill-java-exec-1.3.0.jar:1.3.0]
> > at java.security.AccessController.doPrivileged(Native Method)
> > ~[na:1.8.0_51]
> > at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_51]
> > at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> > ~[hadoop-common-2.7.1.jar:na]
> > at
> >
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > ... 4 common frames omitted
> > 2015-12-01 10:36:15,231 [CONTROL-rpc-event-queue] WARN
> >  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to
> > COMPLETED state as query is already at FAILED state (which is terminal).
> > 2015-12-01 10:36:15,232 [CONTROL-rpc-event-queue] WARN
> >  o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment.
> > 29a2175f-d3f2-caf9-2b51-12754264abe9:0:0 does not exist.
> > 2015-12-01 10:36:15,234 [USER-rpc-event-queue] INFO
> >  o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed:
> > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> > NullPointerException
> >
> > Fragment 0:0
> >
> > [Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
> > at
> >
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
> > [drill-common-1.3.0.jar:1.3.0]
> > at org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142)
> > [drill-common-1.3.0.jar:1.3.0]
> > at
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269)
> > [drill-java-exec-1.3.0.jar:1.3.0]
> > at
> >
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
> > [netty-handler-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
> > [netty-codec-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
> > [netty-transport-4.0.27.Final.jar:4.0.27.Final]
> > at
> >
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
> > [netty-common-4.0.27.Final.jar:4.0.27.Final]
> > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
> >
> >
> > _____________
> >
> > john o schneider
> > jos@apixio.com
> > 408-203-7891
> >
> >

Re: Having difficulties using CASE statement to manage heterogeneous schemas

Posted by John Schneider <js...@apixio.com>.
All, is there a link to the Jira where I can log this error?

_____________

john o schneider
jos@apixio.com
408-203-7891



Re: Having difficulties using CASE statement to manage heterogeneous schemas

Posted by John Schneider <js...@apixio.com>.
Hi Jacques, I did not know about that. After turning union types on I get one
similar and one different result.

## enabling works ok
##
0: jdbc:drill:zk=local> ALTER SESSION SET `exec.enable_union_type` = true;
+-------+----------------------------------+
|  ok   |             summary              |
+-------+----------------------------------+
| true  | exec.enable_union_type updated.  |
+-------+----------------------------------+
1 row selected (0.157 seconds)

## now let's try the query over two rows that are the same
## I get the same error as before
0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
'map' else 'string' end  from dfs.`/Users/jos/Downloads/testcase2.json` t ;
Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
materialize incoming schema.  Errors:

Error in expression at index -1.  Error: Missing function implementation:
[is_map(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
Error in expression at index -1.  Error: Failure composing If Expression.
All conditions must return a boolean type.  Condition was of Type NULL..
Full expression: --UNKNOWN EXPRESSION--..

Fragment 0:0

[Error Id: 7c1b4dd4-8485-4429-a082-d936f5b3b95a on 10.19.220.63:31010]
(state=,code=0)
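[Editor's note: the classification the CASE expression is after can be sanity-checked outside Drill. A minimal Python sketch over records shaped like the thread's test data — the session value and email are abbreviated placeholders, not the original values:

```python
import json

def classify_user(record):
    # Mirrors: CASE WHEN is_map(t.user_info.`user`) THEN 'map' ELSE 'string' END
    user = record.get("user_info", {}).get("user")
    return "map" if isinstance(user, dict) else "string"

# Records shaped like the two casetest.json lines (values abbreviated):
lines = [
    '{"level":"EVENT","user_info":{"session":"s","user":"someone@example.com"}}',
    '{"level":"EVENT","user_info":{"session":"s","user":{"id":"someone@example.com"}}}',
]
for line in lines:
    print(classify_user(json.loads(line)))   # prints "string", then "map"
```

This is only an illustration of the intended semantics; inside Drill the same check still depends on is_map accepting the materialized type.]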

## now let's try the query over different row types;
## this time we get an exception - I will cut and paste the full stack trace
## at the end

0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
'map' else 'string' end from dfs.`/Users/jos/Downloads/testcase.json` t ;
Error: SYSTEM ERROR: NullPointerException

Fragment 0:0

[Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
(state=,code=0)


[Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.3.0.jar:1.3.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.3.0.jar:1.3.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_51]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_51]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
Caused by: java.lang.NullPointerException: null
at org.apache.drill.exec.vector.complex.UnionVector.getFieldIdIfMatches(UnionVector.java:729) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.vector.complex.FieldIdUtil.getFieldIdIfMatches(FieldIdUtil.java:95) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.vector.complex.AbstractContainerVector.getFieldIdIfMatches(AbstractContainerVector.java:114) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.record.SimpleVectorWrapper.getFieldIdIfMatches(SimpleVectorWrapper.java:146) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.record.VectorContainer.getValueVectorId(VectorContainer.java:252) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.physical.impl.ScanBatch.getValueVectorId(ScanBatch.java:307) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath(ExpressionTreeMaterializer.java:628) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitSchemaPath(ExpressionTreeMaterializer.java:217) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.common.expression.SchemaPath.accept(SchemaPath.java:152) ~[drill-common-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:274) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitFunctionCall(ExpressionTreeMaterializer.java:217) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.common.expression.FunctionCall.accept(FunctionCall.java:60) ~[drill-common-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitIfExpression(ExpressionTreeMaterializer.java:494) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer$MaterializeVisitor.visitIfExpression(ExpressionTreeMaterializer.java:217) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.common.expression.IfExpression.accept(IfExpression.java:64) ~[drill-common-1.3.0.jar:1.3.0]
at org.apache.drill.exec.expr.ExpressionTreeMaterializer.materialize(ExpressionTreeMaterializer.java:120) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:386) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:131) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:156) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:256) ~[drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:250) ~[drill-java-exec-1.3.0.jar:1.3.0]
at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_51]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_51]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:na]
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250) [drill-java-exec-1.3.0.jar:1.3.0]
... 4 common frames omitted
2015-12-01 10:36:15,231 [CONTROL-rpc-event-queue] WARN  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal).
2015-12-01 10:36:15,232 [CONTROL-rpc-event-queue] WARN  o.a.d.e.w.b.ControlMessageHandler - Dropping request to cancel fragment. 29a2175f-d3f2-caf9-2b51-12754264abe9:0:0 does not exist.
2015-12-01 10:36:15,234 [USER-rpc-event-queue] INFO  o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#7] Query failed: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException

Fragment 0:0

[Error Id: 9daa2496-d774-47b6-b786-014aac9abe59 on 10.19.220.63:31010]
at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:112) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105) [drill-common-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142) [drill-common-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298) [drill-java-exec-1.3.0.jar:1.3.0]
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269) [drill-java-exec-1.3.0.jar:1.3.0]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254) [netty-handler-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242) [netty-codec-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
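[Editor's note: while the union-type path is experimental, one interim workaround is to normalize the stream before Drill reads it, so user_info.user always has a single shape. A hedged sketch - the {"id": ...} wrapping convention and the file-based helper are assumptions for illustration, not anything Drill requires:

```python
import json

def normalize_line(line):
    # Coerce user_info.user to one shape: wrap bare string values as
    # {"id": ...} so every record presents the same schema to the reader.
    record = json.loads(line)
    user = record.get("user_info", {}).get("user")
    if isinstance(user, str):
        record["user_info"]["user"] = {"id": user}
    return json.dumps(record)

def normalize_file(src_path, dst_path):
    # Rewrite newline-delimited JSON into a uniform-schema copy for Drill.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if line.strip():
                dst.write(normalize_line(line) + "\n")
```

After normalization the is_map/CASE workaround becomes unnecessary for this field, since every record carries a map.]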


_____________

john o schneider
jos@apixio.com
408-203-7891


On Tue, Dec 1, 2015 at 10:01 AM, Jacques Nadeau <ja...@dremio.com> wrote:

> Did you enable the union type? You'll need to do that (for now) as
> Heterogeneous type support is currently an experimental feature.
>
> ALTER SESSION SET `exec.enable_union_type` = true;
>
> See here:
>
>
> https://drill.apache.org/docs/json-data-model/#experimental-feature:-heterogeneous-types
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Tue, Dec 1, 2015 at 9:57 AM, John Schneider <js...@apixio.com>
> wrote:
>
> > Hi All,
> >
Re: Having difficulties using CASE statement to manage heterogeneous schemas

Posted by Jacques Nadeau <ja...@dremio.com>.
Did you enable the union type? You'll need to do that (for now), as
heterogeneous type support is currently an experimental feature.

ALTER SESSION SET `exec.enable_union_type` = true;

See here:

https://drill.apache.org/docs/json-data-model/#experimental-feature:-heterogeneous-types
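
For reference, the whole session would look roughly like the sketch below
(the file path is taken from John's examples; the `user_type` alias is mine):

```sql
-- enable experimental heterogeneous (union) type support for this session
ALTER SESSION SET `exec.enable_union_type` = true;

-- with the union type enabled, `user` can be read as a union of
-- VARCHAR and MAP, so is_map() has an implementation to bind to
SELECT CASE WHEN is_map(t.user_info.`user`) THEN 'map'
            ELSE 'string'
       END AS user_type
FROM dfs.`/Users/jos/work/drill/casetest.json` t;
```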

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Tue, Dec 1, 2015 at 9:57 AM, John Schneider <js...@apixio.com>
wrote:

> Hi All,
>
> I'm trying to use case statements to manage a heterogeneous stream of json
> objects as
> shown in the example from
> https://drill.apache.org/blog/2015/11/23/drill-1.3-released/
> but I'm not getting any love yet. Drill 1.1 -> 1.3 is chock full of
> goodness, and CASE statements will help me clear the last real hurdles I
> have using Drill with my logs.
> Would you please review the tests I created below and tell me if I'm just
> missing something obvious?
>
> Thanks
> /jos
>
> ## first test: two lines, one where the field is a string and one where
> ## it is a map
> ## first let's just select all records; I expect this to barf since there
> ## are two schemas
> : jdbc:drill:zk=local> select *  from
> dfs.`/Users/jos/work/drill/casetest.json` t ;
> Error: DATA_READ ERROR: Error parsing JSON - You tried to start when you
> are using a ValueWriter of type NullableVarCharWriterImpl.
>
> File  /Users/jos/work/drill/casetest.json
> Record  2
> Fragment 0:0
>
> [Error Id: 1385aea5-68cb-4775-ae17-fad6b4901ea6 on 10.0.1.9:31010]
> (state=,code=0)
>
> ## now let's use a CASE statement to sort out the schemas; I don't expect
> ## this to barf, but barf it does. It seems like this should have worked.
> ## What am I missing?
>
> 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
> 'map' else 'string' end from dfs.`/Users/jos/work/drill/casetest.json` t ;
> Error: DATA_READ ERROR: Error parsing JSON - You tried to start when you
> are using a ValueWriter of type NullableVarCharWriterImpl.
>
> File  /Users/jos/Downloads/2015-11-30-bad-3.json
> Record  2
> Fragment 0:0
>
> [Error Id: 872a5347-93dd-49ae-a55c-e861b807b4a6 on 10.0.1.9:31010]
> (state=,code=0)
> 0: jdbc:drill:zk=local>
>
>
> ## data I used is this
> ## casetest.json has two lines in it
>
>
> {"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":"
> ndagdagan_apex@apixio.com"}}
>
> {"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"
> ndagdagan_apex@apixio.com
> ","roles":null,"isNotadmins":true,"iscoders":true}}}
>
>
> ## now let's see if any CASE will work on any structure
> ## new test file with same line in it twice
> ## select * works as expected
> 0: jdbc:drill:zk=local> select * from
> dfs.`/Users/jos/work/drill/testcase2.json` t ;
> +-------+------+-----------+
> | level | time | user_info |
> +-------+------+-----------+
> | EVENT | 1448844983160 |
> {"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"
> ndagdagan_apex@apixio.com","isNotadmins":true,"iscoders":true}} |
> | EVENT | 1448844983160 |
> {"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"
> ndagdagan_apex@apixio.com","isNotadmins":true,"iscoders":true}} |
> +-------+------+-----------+
> 2 rows selected (1.701 seconds)
>
> ## now let's try to use the line in a CASE statement
> ## it doesn't work, but we get different, more puzzling errors this time
> 0: jdbc:drill:zk=local> select case when is_map(t.user_info.`user`) then
> 'map' else 'string' end  from dfs.`/Users/jos/work/drill/testcase2.json` t
> ;
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to
> materialize incoming schema.  Errors:
>
> Error in expression at index -1.  Error: Missing function implementation:
> [is_map(MAP-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.
> Error in expression at index -1.  Error: Failure composing If Expression.
> All conditions must return a boolean type.  Condition was of Type NULL..
> Full expression: --UNKNOWN EXPRESSION--..
>
> Fragment 0:0
>
> [Error Id: c3a7f989-4d93-48c0-9a16-a38dd195314c on 10.19.220.63:31010]
> (state=,code=0)
> 0: jdbc:drill:zk=local>
>
>
> ## the data I used for this test is below
> ## casetest2.json has two lines in it
>
> {"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"
> ndagdagan_apex@apixio.com
> ","roles":null,"isNotadmins":true,"iscoders":true}}}
>
> {"level":"EVENT","time":1448844983160,"user_info":{"session":"9OOLJ8HEGEQ0sTCVSXsK9ddJWVpFM5wM","user":{"id":"
> ndagdagan_apex@apixio.com
> ","roles":null,"isNotadmins":true,"iscoders":true}}}
>
>
>
>
>
> _____________
>
> john o schneider
> jos@apixio.com
> 408-203-7891
>