
Posted to issues@arrow.apache.org by "Andy Grove (Jira)" <ji...@apache.org> on 2019/09/25 13:12:00 UTC

[jira] [Created] (ARROW-6687) [Rust] [DataFusion]

Andy Grove created ARROW-6687:
---------------------------------

             Summary: [Rust] [DataFusion]
                 Key: ARROW-6687
                 URL: https://issues.apache.org/jira/browse/ARROW-6687
             Project: Apache Arrow
          Issue Type: Bug
          Components: Rust, Rust - DataFusion
    Affects Versions: 0.15.0
            Reporter: Andy Grove


I received this bug report directly via email:

 

Hi,
 
I've just tried out the SQL interface on the master branch of the Arrow library, using a Parquet file generated by pyarrow 0.14.1 and pandas 0.25.1.
 
It returns an incorrect num_rows for my file (~3000 columns x 2456 rows): the value is the batch size, 1024*1024, instead of the actual 2456 rows. The query is a simple SELECT col FROM data, run with the sample code you created, which works for the test file in the arrow testing repo.
 
Sorry for reporting the issue via mail; it was faster and easier this way.
 
I'm super happy and grateful that you decided to add parquet support. This is an awesome project, keep up the good work!
 
Best regards,
Adam Lippai



--
This message was sent by Atlassian Jira
(v8.3.4#803005)