Posted to issues@ignite.apache.org by "Vyacheslav Koptilin (Jira)" <ji...@apache.org> on 2021/11/09 22:17:00 UTC

[jira] [Commented] (IGNITE-15892) Calcite tests can be used with a limited list of charsets

    [ https://issues.apache.org/jira/browse/IGNITE-15892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441389#comment-17441389 ] 

Vyacheslav Koptilin commented on IGNITE-15892:
----------------------------------------------

Hello [~korlov],
Could you please take a look?

> Calcite tests can be used with a limited list of charsets
> ---------------------------------------------------------
>
>                 Key: IGNITE-15892
>                 URL: https://issues.apache.org/jira/browse/IGNITE-15892
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-alpha4
>
>
> Running the Calcite tests on a platform whose default charset is not supported by Calcite (WINDOWS-1252, for example) fails with the following exception:
> {noformat}
> java.nio.charset.UnsupportedCharsetException: WINDOWS-1252
>     at org.apache.calcite.sql.SqlUtil.getCharset(SqlUtil.java:1008)
>     at org.apache.calcite.util.NlsString.<init>(NlsString.java:124)
>     at org.apache.calcite.util.NlsString.<init>(NlsString.java:116)
>     at org.apache.calcite.rex.RexBuilder.makeLiteral(RexBuilder.java:964)
>     at org.apache.calcite.rex.RexBuilder.<init>(RexBuilder.java:132)
>     at org.apache.ignite.internal.processors.query.calcite.prepare.IgnitePlanner.<init>(IgnitePlanner.java:126)
>     at org.apache.ignite.internal.processors.query.calcite.prepare.PlanningContext.planner(PlanningContext.java:163)
>     at org.apache.ignite.internal.processors.query.calcite.planner.AbstractPlannerTest.physicalPlan(AbstractPlannerTest.java:232)
>     at org.apache.ignite.internal.processors.query.calcite.planner.TableDmlPlannerTest.insertCachesTableScan(TableDmlPlannerTest.java:57)
> {noformat}
> It seems the list of supported charsets can be found here:
> {code:java|title=org.apache.calcite.sql.SqlUtil}
>   /**
>    * Translates a character set name from a SQL-level name into a Java-level
>    * name.
>    *
>    * @param name SQL-level name
>    * @return Java-level name, or null if SQL-level name is unknown
>    */
>   public static @Nullable String translateCharacterSetName(String name) {
>     switch (name) {
>     case "BIG5":
>       return "Big5";
>     case "LATIN1":
>       return "ISO-8859-1";
>     case "UTF8":
>       return "UTF-8";
>     case "UTF16":
>     case "UTF-16":
>       return ConversionUtil.NATIVE_UTF16_CHARSET_NAME;
>     case "GB2312":
>     case "GBK":
>     case "UTF-16BE":
>     case "UTF-16LE":
>     case "ISO-8859-1":
>     case "UTF-8":
>       return name;
>     default:
>       return null;
>     }
>   }
> {code}
> On the other hand, IGNITE-14545 proposes taking the default charset from the JVM, which may not be in the list of supported charsets:
> {code:java|title=IgniteTypeFactory}
>     @Override
>     public Charset getDefaultCharset() {
>         // Use JVM default charset rather than Calcite default charset (ISO-8859-1).
>         return Charset.defaultCharset();
>     }
> {code}
> The obvious but dirty workaround is to specify the default charset explicitly for the test JVM:
> {code:xml|title=calcite/pom.xml}
>             <!--
>                 Need to override file.encoding property as calcite tests support a limited list of charsets.
>             -->
>             <plugin>
>                 <groupId>org.apache.maven.plugins</groupId>
>                 <artifactId>maven-surefire-plugin</artifactId>
>                 <configuration>
>                     <argLine>-Djava.util.logging.config.file=../../config/java.util.logging.properties -Dfile.encoding=UTF-8</argLine>
>                 </configuration>
>             </plugin>
> {code}
> Another option is to fall back to UTF-8 or UTF-16 when the default JVM charset is not supported by Calcite:
> {code:java|title=IgniteTypeFactory}
>     @Override
>     public Charset getDefaultCharset() {
>         // Use JVM default charset rather than Calcite default charset (ISO-8859-1),
>         // falling back to UTF-8 if Calcite does not know the JVM default.
>         Charset jvmDefault = Charset.defaultCharset();
>         
>         if (SqlUtil.translateCharacterSetName(jvmDefault.name().toUpperCase(Locale.ROOT)) == null)
>             jvmDefault = StandardCharsets.UTF_8;
>         
>         return jvmDefault;
>     }
> {code}
> [Calcite charset support proposal|https://lists.apache.org/thread/944838cb40a34d19f5d0a50ed1e86e5fbd181cd8a529c3d98ef40f46@%3cdev.calcite.apache.org%3e]
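
The fallback idea from the description can be tried without Calcite on the classpath. The following is only a minimal sketch under stated assumptions: the class `CharsetFallback` and its methods are hypothetical names, and the set of supported names is copied by hand from the `translateCharacterSetName` switch quoted above rather than obtained from Calcite, so it may drift from the real list.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Ignite or Calcite. */
final class CharsetFallback {
    // Assumption: these names mirror the cases of SqlUtil.translateCharacterSetName
    // as quoted in the issue description.
    private static final Set<String> SUPPORTED = Set.of(
        "BIG5", "LATIN1", "UTF8", "UTF16", "UTF-16", "GB2312", "GBK",
        "UTF-16BE", "UTF-16LE", "ISO-8859-1", "UTF-8");

    /** Returns true if the charset's canonical name (upper-cased) is in the copied list. */
    static boolean isSupported(Charset cs) {
        return SUPPORTED.contains(cs.name().toUpperCase(Locale.ROOT));
    }

    /** Returns the candidate if it is in the copied list, otherwise UTF-8. */
    static Charset supportedOrUtf8(Charset candidate) {
        return isSupported(candidate) ? candidate : StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        Charset jvmDefault = Charset.defaultCharset();
        System.out.println("JVM default: " + jvmDefault
            + ", effective: " + supportedOrUtf8(jvmDefault));
    }
}
```

Running `main` on an affected platform shows which charset the fallback would pick. Unlike the `pom.xml` workaround, this approach does not depend on how the test JVM is launched.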


