Posted to commits@beam.apache.org by "Jean-Baptiste Onofré (JIRA)" <ji...@apache.org> on 2017/03/02 20:23:45 UTC
[jira] [Assigned] (BEAM-1592) Unify HdfsIO and HadoopInputFormatIO
[ https://issues.apache.org/jira/browse/BEAM-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Baptiste Onofré reassigned BEAM-1592:
------------------------------------------
Assignee: Jean-Baptiste Onofré (was: Davor Bonaci)
> Unify HdfsIO and HadoopInputFormatIO
> ------------------------------------
>
> Key: BEAM-1592
> URL: https://issues.apache.org/jira/browse/BEAM-1592
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-extensions
> Reporter: Stephen Sisk
> Assignee: Jean-Baptiste Onofré
>
> HIFIO (HadoopInputFormatIO) is currently in PR (https://github.com/apache/beam/pull/1994). As per the discussion in https://lists.apache.org/thread.html/803857877804165e798cf31edf079e6603eb9682b7690d52124c31e7@%3Cdev.beam.apache.org%3E, we'd like to check HIFIO in as-is, then unify it with HdfsIO, since the two share a lot of code.
> [~dhalperi@google.com] has mentioned: "the FileInputFormat reader gets to call some special APIs that the generic InputFormat reader cannot -- so they are not completely redundant. Specifically, FileInputFormat reader can do size-based splitting."
> Dan recommended: "See if we can 'inline' the FileInputFormat specific parts of HdfsIO inside of HadoopInputFormatIO via reflection. If so, we can get the best of both worlds with shared code."
> This seems reasonable to me.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)