Posted to jira@arrow.apache.org by "yzr (Jira)" <ji...@apache.org> on 2020/11/20 08:59:00 UTC

[jira] [Updated] (ARROW-10650) [C++][parquet][hadoop]memory leak when read parquet file from hadoop

     [ https://issues.apache.org/jira/browse/ARROW-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yzr updated ARROW-10650:
------------------------
    Summary: [C++][parquet][hadoop]memory leak when read parquet file from hadoop  (was: [C++]memory leak when read parquet file from hadoop)

> [C++][parquet][hadoop]memory leak when read parquet file from hadoop
> --------------------------------------------------------------------
>
>                 Key: ARROW-10650
>                 URL: https://issues.apache.org/jira/browse/ARROW-10650
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 2.0.0
>         Environment: linux
>            Reporter: yzr
>            Priority: Major
>
> When I use the hdfs interface under the arrow/io folder to open and read parquet files, it leads to a memory leak. Example code below:
>  
> std::shared_ptr<arrow::io::HadoopFileSystem> hdfs = nullptr;
> ... // set hdfs conf var
> msg = arrow::io::HadoopFileSystem::Connect(&conf, &hdfs);
> ... // check whether the connection succeeded
> std::shared_ptr<arrow::io::HdfsReadableFile> file_reader;
> ... // set file_path var
> arrow::Status ret = hdfs->OpenReadable(file_path, &file_reader);
> ... // check whether the open succeeded
> std::unique_ptr<parquet::ParquetFileReader> parquet_reader;
> parquet_reader = parquet::ParquetFileReader::Open(file_reader);
> ... // read data through parquet_reader
>  
> I also checked reading a parquet file from a local path with the OpenFile API:
> static std::unique_ptr<ParquetFileReader> OpenFile(
>     const std::string& path, bool memory_map = true,
>     const ReaderProperties& props = default_reader_properties(),
>     std::shared_ptr<FileMetaData> metadata = NULLPTR);
> That path is fine; it does not leak memory.
>  
> Is there a step I am missing when using the hdfs API, or is this an existing issue?
> Any questions, please reply to me. Thanks.
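> A fuller sketch of the HDFS read path above with explicit cleanup, assuming the Arrow 2.0 arrow/io/hdfs.h API; the conf field values, the file path, the scoping of parquet_reader, and the Close()/Disconnect() calls are assumptions added for illustration, not part of the original report:

```cpp
#include <iostream>
#include <memory>

#include <arrow/io/hdfs.h>
#include <arrow/status.h>
#include <parquet/file_reader.h>

int main() {
  // Hypothetical connection settings; fill in for your cluster.
  arrow::io::HdfsConnectionConfig conf;
  conf.host = "default";
  conf.port = 0;

  std::shared_ptr<arrow::io::HadoopFileSystem> hdfs;
  arrow::Status st = arrow::io::HadoopFileSystem::Connect(&conf, &hdfs);
  if (!st.ok()) { std::cerr << st.ToString() << "\n"; return 1; }

  std::shared_ptr<arrow::io::HdfsReadableFile> file_reader;
  st = hdfs->OpenReadable("/path/to/file.parquet", &file_reader);
  if (!st.ok()) { std::cerr << st.ToString() << "\n"; return 1; }

  {
    // ParquetFileReader::Open holds a reference to the file; limit the
    // reader's lifetime so that reference is dropped before Close().
    std::unique_ptr<parquet::ParquetFileReader> parquet_reader =
        parquet::ParquetFileReader::Open(file_reader);
    // ... read row groups through parquet_reader ...
  }

  // Explicit cleanup, so libhdfs-side buffers are released promptly
  // rather than held until process exit.
  st = file_reader->Close();
  if (!st.ok()) std::cerr << st.ToString() << "\n";
  st = hdfs->Disconnect();
  if (!st.ok()) std::cerr << st.ToString() << "\n";
  return 0;
}
```

> If the leak persists with explicit Close()/Disconnect(), it is worth checking whether it is reported against the Arrow handles themselves or against buffers allocated inside libhdfs/the JVM, which tools like valgrind attribute differently.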



--
This message was sent by Atlassian Jira
(v8.3.4#803005)