Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/12/08 02:23:02 UTC

[GitHub] [incubator-doris] demon-gu opened a new issue #5036: Chinese text is garbled when Spark SQL reads Doris data on YARN

demon-gu opened a new issue #5036:
URL: https://github.com/apache/incubator-doris/issues/5036


   val dorisSparkDF = sqlContext.read.format("doris")
                   .option("doris.table.identifier", "rms.ods_rms_device")
                   .option("doris.fenodes", "aidata-node003:8035")
                   .option("user", "root")
                   .option("password", "")
                   .load()
   
   dorisSparkDF.show()
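   The code is shown above. When run locally, Chinese text displays correctly; when the job is submitted to YARN, Chinese text is garbled.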


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] vagetablechicken commented on issue #5036: Chinese text is garbled when Spark SQL reads Doris data on YARN

Posted by GitBox <gi...@apache.org>.
vagetablechicken commented on issue #5036:
URL: https://github.com/apache/incubator-doris/issues/5036#issuecomment-755848621


   I ran into a similar error. The cause is a charset mismatch:
   1. BE side: strings are encoded with the UTF-8 charset.
   https://github.com/apache/incubator-doris/blob/65d33cf43c837e56a2a36e78b358bfc0a9d1916b/be/src/util/arrow/row_batch.cpp#L80
   2. spark-doris-connector side: the JVM default charset is used.
   https://github.com/apache/incubator-doris/blob/65d33cf43c837e56a2a36e78b358bfc0a9d1916b/extension/spark-doris-connector/src/main/java/org/apache/doris/spark/serialization/RowBatch.java#L271
   
   In my environment the default charset is US-ASCII, so Chinese characters come out garbled.
   It would be better to specify the `UTF_8` charset explicitly in `serialization/RowBatch`.
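
   The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the connector's actual code: the class name `CharsetDecodeSketch` and the helper `decode` are hypothetical, and the sample string simply stands in for a value read from Doris. It encodes two Chinese characters as UTF-8 and then decodes the bytes with different charsets, using only the standard `java.nio.charset` API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only illustrates the charset mismatch.
public class CharsetDecodeSketch {

    // Decode raw bytes with an explicitly named charset,
    // so the result does not depend on JVM or OS settings.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the word for "Chinese", written with escapes
        // so the source file itself is encoding-independent.
        String original = "\u4e2d\u6587";

        // Assumption: the producer side emits UTF-8 bytes.
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        System.out.println("default charset: " + Charset.defaultCharset());

        // new String(byte[]) uses the default charset, which can differ
        // between a local machine and a cluster node.
        System.out.println("default round-trips: "
                + new String(raw).equals(original));

        // Naming UTF_8 explicitly matches the producer's encoding.
        System.out.println("UTF_8 round-trips: "
                + decode(raw, StandardCharsets.UTF_8).equals(original));

        // Forcing US_ASCII simulates the environment reported above.
        System.out.println("US_ASCII round-trips: "
                + decode(raw, StandardCharsets.US_ASCII).equals(original));
    }
}
```

   Whether the default-charset line round-trips depends on the JVM it runs in, which would be consistent with the same job behaving differently locally and on YARN.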




[GitHub] [incubator-doris] morningman closed issue #5036: Chinese text is garbled when Spark SQL reads Doris data on YARN

Posted by GitBox <gi...@apache.org>.
morningman closed issue #5036:
URL: https://github.com/apache/incubator-doris/issues/5036


   




[GitHub] [incubator-doris] HappenLee commented on issue #5036: Chinese text is garbled when Spark SQL reads Doris data on YARN

Posted by GitBox <gi...@apache.org>.
HappenLee commented on issue #5036:
URL: https://github.com/apache/incubator-doris/issues/5036#issuecomment-740330735


   This may be an encoding problem. Doris currently supports only UTF-8, so please check the encoding settings of Spark and YARN.




[GitHub] [incubator-doris] demon-gu commented on issue #5036: Chinese text is garbled when Spark SQL reads Doris data on YARN

Posted by GitBox <gi...@apache.org>.
demon-gu commented on issue #5036:
URL: https://github.com/apache/incubator-doris/issues/5036#issuecomment-740333242


   > This may be an encoding problem. Doris currently supports only UTF-8, so please check the encoding settings of Spark and YARN.
   
   OK, thanks.

