Posted to issues@beam.apache.org by "Chamikara Madhusanka Jayalath (Jira)" <ji...@apache.org> on 2022/05/26 17:53:00 UTC
[jira] [Commented] (BEAM-14493) HdfsDownloader gets wrong range
[ https://issues.apache.org/jira/browse/BEAM-14493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542611#comment-17542611 ]
Chamikara Madhusanka Jayalath commented on BEAM-14493:
------------------------------------------------------
Hi, thanks for reporting this. Should we reopen [the same Jira|https://issues.apache.org/jira/browse/BEAM-9152] instead of creating a new one?
Also, would you be able to send a PR with the fix and a test?
> HdfsDownloader gets wrong range
> -------------------------------
>
> Key: BEAM-14493
> URL: https://issues.apache.org/jira/browse/BEAM-14493
> Project: Beam
> Issue Type: Bug
> Components: io-py-hadoop
> Affects Versions: 2.31.0
> Reporter: Vincent Bernardi
> Priority: P2
>
> Reading Avro data from HDFS from a Python sidecar worker fails with:
> File "python3.7/site-packages/apache_beam/io/filesystemio.py", line 123, in readinto
> b[:len(data)] = data
> ValueError: memoryview assignment: lvalue and rvalue have different structures
> This is the same issue as https://issues.apache.org/jira/browse/BEAM-9152, which was marked as resolved but was never actually fixed.
> As noted by Jean-Christophe CARLES on https://issues.apache.org/jira/browse/BEAM-9152, patching hadoopfilesystem.py by removing " + 1" in HdfsDownloader.get_range fixes it for us.
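
For anyone picking this up, a minimal sketch of how an extra " + 1" in a range computation could produce the ValueError quoted above. This is not Beam's code: InMemoryDownloader and read_into are hypothetical stand-ins, and the sketch assumes the caller passes an exclusive end offset and a memoryview buffer. Only the assignment b[:len(data)] = data is taken from the traceback.

```python
class InMemoryDownloader:
    """Hypothetical stand-in for a range downloader (not Beam's HdfsDownloader)."""

    def __init__(self, payload, inclusive_end):
        self._payload = payload
        # inclusive_end=True models the " + 1" variant of the length computation.
        self._inclusive_end = inclusive_end

    @property
    def size(self):
        return len(self._payload)

    def get_range(self, start, end):
        # Assumption: callers pass an exclusive end offset.
        length = end - start + 1 if self._inclusive_end else end - start
        return self._payload[start:start + length]


def read_into(downloader, position, b):
    """Hypothetical reader that fills buffer b starting at the given position."""
    end = min(position + len(b), downloader.size)
    data = downloader.get_range(position, end)
    # Same assignment as in the traceback; a memoryview cannot be resized,
    # so it raises ValueError when data is longer than b.
    b[:len(data)] = data
    return len(data)


payload = b"0123456789"

# Without the " + 1": exactly len(b) bytes come back and the copy succeeds.
buf = memoryview(bytearray(4))
read_into(InMemoryDownloader(payload, inclusive_end=False), 0, buf)

# With the " + 1": one byte too many comes back and the copy fails.
try:
    read_into(InMemoryDownloader(payload, inclusive_end=True), 0, memoryview(bytearray(4)))
except ValueError as exc:
    print("ValueError:", exc)
```

In this sketch the failure does not occur for a read that reaches the end of the payload, because the slice is truncated there. A regression test would therefore need a buffer smaller than the remaining data.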
--
This message was sent by Atlassian Jira
(v8.20.7#820007)