Posted to dev@arrow.apache.org by Shao Grant <sd...@live.com> on 2022/08/19 02:49:08 UTC

[datafusion] datafusion cannot recognize Chinese characters?

I'm testing DataFusion and like it very much,
but I found that it cannot recognize Chinese characters in a query. How can I solve this?
My simple code is below (using Python, for example):
```
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
  // register the table
  let ctx = SessionContext::new();
  ctx.register_csv("example", "lite.csv", CsvReadOptions::new()).await?;

  // create a plan to run a SQL query
  let df = ctx.sql("SELECT distinct 扫描人 FROM example").await?;

  // execute and print results
  df.show().await?;
  Ok(())
}

Error: SQL(ParserError("Expected an expression:, found: 扫"))
```

Re: [datafusion] datafusion cannot recognize Chinese characters?

Posted by Andrew Lamb <al...@influxdata.com>.
For completeness, I believe there is a corresponding issue [1], with a
resolution, in the datafusion repo (the column name needs to be wrapped in
double quotes).

Andrew

[1] https://github.com/apache/arrow-datafusion/issues/3203
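
As a rough illustration of the double-quote resolution mentioned above, the sketch below builds the query string with the column name wrapped in double quotes. The helper `quote_ident` is hypothetical (it is not part of DataFusion), and doubling embedded quotes is the usual SQL convention for delimited identifiers, assumed here rather than taken from the thread. It uses only the standard library.

```rust
/// Hypothetical helper: wrap a column name in double quotes so the SQL
/// parser treats it as a delimited identifier. Embedded double quotes are
/// doubled, following the usual SQL convention.
fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

fn main() {
    // Build the query from the thread with the Chinese column name quoted.
    let sql = format!("SELECT DISTINCT {} FROM example", quote_ident("扫描人"));

    // Prints: SELECT DISTINCT "扫描人" FROM example
    println!("{}", sql);
}
```

In the program from the original message, the resulting string would be passed to `ctx.sql(&sql)` in place of the unquoted query. Writing the quotes directly in the literal, as `"SELECT DISTINCT \"扫描人\" FROM example"`, produces the same query text.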

On Mon, Aug 22, 2022 at 2:16 PM Shao Grant <sd...@live.com> wrote:

> Sorry, it's in Rust, not Python. My bad.

Re: [datafusion] datafusion cannot recognize Chinese characters?

Posted by Shao Grant <sd...@live.com>.
Sorry, it's in Rust, not Python. My bad.

