Posted to issues@beam.apache.org by "Chamikara Madhusanka Jayalath (Jira)" <ji...@apache.org> on 2022/05/26 17:53:00 UTC

[jira] [Commented] (BEAM-14493) HdfsDownloader gets wrong range

    [ https://issues.apache.org/jira/browse/BEAM-14493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542611#comment-17542611 ] 

Chamikara Madhusanka Jayalath commented on BEAM-14493:
------------------------------------------------------

Hi, thanks for reporting this. Should we reopen [the same Jira|https://issues.apache.org/jira/browse/BEAM-9152] instead of creating a new one?

Also, would you be able to send a PR with the fix and a test?



> HdfsDownloader gets wrong range
> -------------------------------
>
>                 Key: BEAM-14493
>                 URL: https://issues.apache.org/jira/browse/BEAM-14493
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-hadoop
>    Affects Versions: 2.31.0
>            Reporter: Vincent Bernardi
>            Priority: P2
>
> Reading Avro data from HDFS in a Python sidecar worker fails with:
> File "python3.7/site-packages/apache_beam/io/filesystemio.py", line 123, in readinto
> b[:len(data)] = data
> ValueError: memoryview assignment: lvalue and rvalue have different structures
> This is the same issue as https://issues.apache.org/jira/browse/BEAM-9152, which was marked as resolved even though the bug is still present.
> As noted by Jean-Christophe CARLES on https://issues.apache.org/jira/browse/BEAM-9152, patching hadoopfilesystem.py to remove the " + 1" in HdfsDownloader.get_range fixes it for us.
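For reference, here is a minimal sketch of how an off-by-one in get_range could produce the ValueError quoted above. It does not use Beam or HDFS. FakeDownloader and the standalone readinto function are hypothetical stand-ins, and the assumption that the real get_range requests end - start + 1 bytes is inferred from the description, not checked against the source.

```python
class FakeDownloader:
    """Hypothetical stand-in for HdfsDownloader, backed by in-memory bytes."""

    def __init__(self, payload, inclusive_end):
        self._payload = payload
        self._inclusive_end = inclusive_end

    @property
    def size(self):
        return len(self._payload)

    def get_range(self, start, end):
        # inclusive_end=True mimics the reported " + 1" (assumed behavior):
        # one byte more than end - start is returned.
        length = end - start + (1 if self._inclusive_end else 0)
        return self._payload[start:start + length]


def readinto(downloader, position, b):
    """Simplified copy step, modeled on the line named in the traceback."""
    end = min(position + len(b), downloader.size)
    data = downloader.get_range(position, end)
    # A memoryview cannot grow, so this fails if data is longer than b.
    b[:len(data)] = data
    return len(data)


payload = b"0123456789"

# Exclusive end: exactly len(buf) bytes come back and the copy succeeds.
buf = memoryview(bytearray(4))
n_ok = readinto(FakeDownloader(payload, inclusive_end=False), 0, buf)

# Inclusive end: five bytes come back for a four-byte buffer.
try:
    readinto(FakeDownloader(payload, inclusive_end=True), 0,
             memoryview(bytearray(4)))
    error = None
except ValueError as exc:
    error = exc

print(n_ok, bytes(buf), repr(error))
```

If that assumption holds, a regression test for the fix could check that get_range(start, end) never returns more than end - start bytes.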



--
This message was sent by Atlassian Jira
(v8.20.7#820007)