Posted to dev@parquet.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2019/05/08 01:44:01 UTC

[jira] [Commented] (PARQUET-1571) [C++] Can't read data from parquet file in C++ library

    [ https://issues.apache.org/jira/browse/PARQUET-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835235#comment-16835235 ] 

Wes McKinney commented on PARQUET-1571:
---------------------------------------

I moved this from ARROW to PARQUET. Can you provide a way to reproduce the issue? Very large page statistics are a known limitation.

> [C++] Can't read data from parquet file in C++ library
> ------------------------------------------------------
>
>                 Key: PARQUET-1571
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1571
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: worker24h
>            Priority: Critical
>
> When I pass a *parquet::ReaderProperties* as the second parameter to parquet::ParquetFileReader::Open, reading does not work.
>  The code is as follows:
> {code:java}
> parquet::ReaderProperties _properties;
> _properties = parquet::ReaderProperties(); 
> _properties.enable_buffered_stream();  // use a buffered stream; the buffer size is not set explicitly
> parquet_reader = parquet::ParquetFileReader::Open(_parquet, _properties);
> ...
> int32_t value;
> parquet::Int32Reader* int32_reader =
>     static_cast<parquet::Int32Reader*>(column_reader.get());
> int32_reader->Skip(_current_line_of_group);  // skip the rows already processed
> rows_read = int32_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);  
> {code}
> The *Skip* call throws an exception:
> {color:#FF0000}{{Couldn't deserialize thrift: TProtocolException: Invalid data Deserializing page header failed.}}{color}
>  


