Posted to issues@drill.apache.org by "liyun Liu (JIRA)" <ji...@apache.org> on 2016/02/17 08:16:18 UTC

[jira] [Commented] (DRILL-4039) Query fails when non-ascii characters are used in string literals

    [ https://issues.apache.org/jira/browse/DRILL-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150025#comment-15150025 ] 

liyun Liu commented on DRILL-4039:
----------------------------------

I am using Drill 1.4.0.

I launched the Drill shell with bin/drill-embedded and issued a SQL query
containing Chinese characters. It failed with the following error:

{quote}
0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json` WHERE last_name = '世界' LIMIT 3;
Feb 17, 2016 11:17:38 AM org.apache.calcite.runtime.CalciteException <init>
SEVERE: org.apache.calcite.runtime.CalciteException: Failed to encode '世界' in character set 'ISO-8859-1'
Error: SYSTEM ERROR: CalciteException: Failed to encode '世界' in character set 'ISO-8859-1'


[Error Id: 33cfc8ba-acde-4122-9020-cf61abbf2b42 on jinglin:31010] (state=,code=0)
{quote}

log/sqlline.log contains the following stack trace:

{quote}
[Error Id: 33cfc8ba-acde-4122-9020-cf61abbf2b42 on jinglin:31010]
	at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:742) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:841) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:786) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) [drill-common-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:788) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:894) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:255) [drill-java-exec-1.4.0.jar:1.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_72]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Internal error: while converting `employee.json`.`last_name` = '世界'
	... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: while converting `employee.json`.`last_name` = '世界'
	at org.apache.calcite.util.Util.newInternal(Util.java:792) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.ReflectiveConvertletTable$1.convertCall(ReflectiveConvertletTable.java:96) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:59) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4165) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3598) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:130) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4057) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertWhere(SqlToRelConverter.java:920) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:606) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:583) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2790) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:537) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.prepare.PlannerImpl.convert(PlannerImpl.java:214) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:471) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:201) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:167) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:197) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:909) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.4.0.jar:1.4.0]
	... 3 common frames omitted
Caused by: java.lang.reflect.InvocationTargetException: null
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_72]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_72]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_72]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_72]
	at org.apache.calcite.sql2rel.ReflectiveConvertletTable$1.convertCall(ReflectiveConvertletTable.java:87) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	... 20 common frames omitted
Caused by: org.apache.calcite.runtime.CalciteException: Failed to encode '世界' in character set 'ISO-8859-1'
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_72]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_72]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_72]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_72]
	at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:405) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:514) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.util.NlsString.<init>(NlsString.java:81) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:810) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.rex.RexBuilder.makeCharLiteral(RexBuilder.java:985) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertLiteral(SqlNodeToRexConverterImpl.java:115) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:4153) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.visit(SqlToRelConverter.java:3598) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlLiteral.accept(SqlLiteral.java:404) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:4057) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.StandardConvertletTable.convertExpressionList(StandardConvertletTable.java:810) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:785) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql2rel.StandardConvertletTable.convertCall(StandardConvertletTable.java:770) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	... 25 common frames omitted
{quote}
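
The innermost cause in the trace above (the CalciteException raised from NlsString) can be reproduced with the JDK alone. The following is a minimal sketch, not Calcite's actual code: the class name LiteralEncodingCheck and its helper are made up for illustration, and it only shows that the two characters from the query have no representation in ISO-8859-1 while UTF-16LE can hold them.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of Drill or Calcite.
class LiteralEncodingCheck {

    // Returns true when every character of the literal can be
    // represented in the named charset.
    static boolean canEncode(String literal, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(literal);
    }

    public static void main(String[] args) {
        // The two Chinese characters from the failing query, written as
        // Unicode escapes so the source file encoding does not matter.
        String literal = "\u4e16\u754c";

        System.out.println(canEncode(literal, "ISO-8859-1")); // false
        System.out.println(canEncode(literal, "UTF-16LE"));   // true
        System.out.println(canEncode("Smith", "ISO-8859-1")); // true
    }
}
```

This is consistent with ASCII-only literals working while any literal outside Latin-1 is rejected under the default charset.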

I searched for a solution and found one at
http://nagix.hatenablog.com/entry/2015/05/29/150215. Following the advice in
that blog post, I added {{-Dsaffron.default.charset=UTF-16LE}} to
DRILL_JAVA_OPTS in conf/drill-env.sh, which fixed the problem above. But
when I ran a similar query against Hive, I got another error:

{quote}
0: jdbc:drill:zk=local> select * from hive.adminlog where stuname = '世界';
Feb 17, 2016 2:11:18 PM org.apache.calcite.sql.validate.SqlValidatorException <init>
SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE
Feb 17, 2016 2:11:18 PM org.apache.calcite.runtime.CalciteException <init>
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 35 to line 1, column 48: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE
Error: VALIDATION ERROR: From line 1, column 35 to line 1, column 48: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE


[Error Id: 964a6e7b-84f1-468e-bc6b-4256f1057da7 on jinglin:31010] (state=,code=0)
{quote}

log/sqlline.log contains the following stack trace:

{quote}
[Error Id: 964a6e7b-84f1-468e-bc6b-4256f1057da7 ]
	at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[drill-common-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:200) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:909) [drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) [drill-java-exec-1.4.0.jar:1.4.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_72]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_72]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
Caused by: org.apache.calcite.tools.ValidationException: org.apache.calcite.runtime.CalciteContextException: From line 1, column 35 to line 1, column 48: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE
	at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:189) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:198) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:451) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:198) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:167) ~[drill-java-exec-1.4.0.jar:1.4.0]
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:197) [drill-java-exec-1.4.0.jar:1.4.0]
	... 5 common frames omitted
Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, column 35 to line 1, column 48: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_72]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_72]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_72]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_72]
	at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:405) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:714) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:702) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:3931) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlBinaryOperator.adjustType(SqlBinaryOperator.java:112) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlOperator.deriveType(SqlOperator.java:491) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlBinaryOperator.deriveType(SqlBinaryOperator.java:143) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:4268) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:4255) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:130) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1495) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1478) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validateWhereOrOn(SqlValidatorImpl.java:3375) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validateWhereClause(SqlValidatorImpl.java:3362) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:2987) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:877) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:863) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:837) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:551) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:187) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	... 10 common frames omitted
Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_72]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_72]
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_72]
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_72]
	at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:405) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:514) ~[calcite-core-1.4.0-drill-r10.jar:1.4.0-drill-r10]
	... 32 common frames omitted
2016-02-17 14:11:18,560 [USER-rpc-event-queue] INFO  o.a.d.j.i.DrillResultSetImpl$ResultsListener - [#1] Query failed:
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 35 to line 1, column 48: Cannot apply = to the two different charsets ISO-8859-1 and UTF-16LE
{quote}

org.apache.calcite.util.SaffronProperties#defaultCharset determines the charset of
a SQL string literal. But org.apache.drill.exec.store.hive.schema.DrillHiveTable#getRelDataTypeFromHivePrimitiveType
hard-codes ISO-8859-1 as the charset of a Hive VARCHAR column in the following code:

{code}
case VARCHAR: {
	int maxLen = TypeInfoUtils.getCharacterLengthForType(pTypeInfo);
	return typeFactory.createTypeWithCharsetAndCollation(
		typeFactory.createSqlType(SqlTypeName.VARCHAR, maxLen), /*input type*/
		Charset.forName("ISO-8859-1"), /*unicode char set*/
		SqlCollation.IMPLICIT /* TODO: need to decide if implicit is the correct one */
	);
}
{code}

After replacing Charset.forName("ISO-8859-1") with org.apache.calcite.util.Util.getDefaultCharset(), 
the above SQL works against Hive.
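
The mismatch and the effect of the change can be modelled in a few lines of plain Java. This is only a sketch under stated assumptions: CharsetAgreementSketch, plannerDefaultCharset and comparable are hypothetical names, not Calcite or Drill APIs. The rule "both operands of = must carry the same charset" is a simplification of what the validator message reports, and the ISO-8859-1 fallback is assumed from the behavior observed when the property is not set.

```java
import java.nio.charset.Charset;

// Hypothetical model of the charset check, not Calcite or Drill code.
class CharsetAgreementSketch {

    // Stand-in for the planner's default charset: reads the system
    // property set through DRILL_JAVA_OPTS and falls back to ISO-8859-1
    // (assumed fallback, matching the error seen without the property).
    static Charset plannerDefaultCharset() {
        String name = System.getProperty("saffron.default.charset", "ISO-8859-1");
        return Charset.forName(name);
    }

    // Simplified validator rule: '=' is accepted only when the column
    // and the literal carry the same charset.
    static boolean comparable(Charset column, Charset literal) {
        return column.equals(literal);
    }

    public static void main(String[] args) {
        System.setProperty("saffron.default.charset", "UTF-16LE");

        // The literal takes the planner default.
        Charset literal = plannerDefaultCharset();

        // Before the change: the Hive VARCHAR column is pinned to ISO-8859-1.
        Charset hardCoded = Charset.forName("ISO-8859-1");
        System.out.println(comparable(hardCoded, literal));   // false

        // After the change: the column follows the same default as the literal.
        Charset fromDefault = plannerDefaultCharset();
        System.out.println(comparable(fromDefault, literal)); // true
    }
}
```

Deriving both charsets from one source means the column and the literal cannot disagree, whatever value the property is set to.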

I will submit a patch for review soon.


> Query fails when non-ascii characters are used in string literals
> -----------------------------------------------------------------
>
>                 Key: DRILL-4039
>                 URL: https://issues.apache.org/jira/browse/DRILL-4039
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - JDBC
>    Affects Versions: 1.1.0
>         Environment: Linux lnxx64r6 2.6.32-131.0.15.el6.x86_64 #1 SMP Tue May 10 15:42:40 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Sergio Lob
>
> The following query against DRILL returns this error:
> SYSTEM ERROR: CalciteException: Failed to encode  'НАСТРОЕние' in character set 'ISO-8859-1'
>  cc39118a-cde6-4a6e-a1d6-4b6b7e847b8a on maprd
> Query is:
>     SELECT
>    T1.`F01INT`,
>    T1.`F02UCHAR_10`,
>    T1.`F03UVARCHAR_10`
>     FROM
>    DPRV64R6_TRDUNI01T T1
>     WHERE
>    (T1.`F03UVARCHAR_10` =  'НАСТРОЕние')
>     ORDER BY
>    T1.`F01INT`;
> This issue looks similar to jira HIVE-12207.
> Is there a fix or workaround for this?


