Posted to commits@beam.apache.org by "Stephen Sisk (JIRA)" <ji...@apache.org> on 2017/04/26 17:13:04 UTC
[jira] [Created] (BEAM-2081) I/O Authoring overview - better clarify how to read from files
Stephen Sisk created BEAM-2081:
----------------------------------
Summary: I/O Authoring overview - better clarify how to read from files
Key: BEAM-2081
URL: https://issues.apache.org/jira/browse/BEAM-2081
Project: Beam
Issue Type: Improvement
Components: website
Reporter: Stephen Sisk
Assignee: Davor Bonaci
Priority: Minor
The I/O authoring doc is a little confusing: it has an example of reading from file globs and says to use ParDos, but then it says "A class derived from FileBasedSource is often the best option when reading from files".
It'd be nice to better clarify this and provide guidance as to when to use which.
I *think* the right answer here is that if your file is splittable you use FBS (FileBasedSource) and let it handle the glob splitting, and if it's not splittable you use ParDos.
I believe SDF (Splittable DoFn) will make all of this easier.
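To make the non-splittable case concrete, here is a rough sketch of the "expand the glob, then read each file whole" pattern. It deliberately uses only the Python standard library rather than Beam APIs: the function names (expand_glob, read_whole_file, read_files, write_demo_files) are hypothetical, and the two plain functions only stand in for what would be two consecutive ParDos in a pipeline. Gzip is used as the example of a format that cannot be split at arbitrary offsets.

```python
import glob
import gzip
import os
import tempfile


def expand_glob(pattern):
    """Step 1 (stand-in for the first ParDo): pattern -> concrete paths."""
    return sorted(glob.glob(pattern))


def read_whole_file(path):
    """Step 2 (stand-in for the second ParDo): read one file start to finish.

    A gzip stream cannot be entered at an arbitrary offset, so each file
    is treated as a single, indivisible unit of work.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield line.rstrip("\n")


def read_files(pattern):
    """Chain the two steps: every matched file is read in full, in order."""
    for path in expand_glob(pattern):
        yield from read_whole_file(path)


def write_demo_files(directory):
    """Hypothetical helper: create two small gzip files to read back."""
    for name, lines in (("a.gz", ["1", "2"]), ("b.gz", ["3"])):
        path = os.path.join(directory, name)
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_demo_files(tmp)
        print(list(read_files(os.path.join(tmp, "*.gz"))))
```

In this shape the parallelism is per file, which is all a non-splittable format allows. For a splittable format, a FileBasedSource subclass would additionally be able to divide a single large file into byte ranges, which this sketch does not attempt.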
cc [~kirpichov] [~dhalperi@google.com]
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)