Posted to commits@samza.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2014/05/06 00:05:22 UTC
[jira] [Created] (SAMZA-263) Create SystemConsumer and SystemProducer for HDFS
Jakob Homan created SAMZA-263:
---------------------------------
Summary: Create SystemConsumer and SystemProducer for HDFS
Key: SAMZA-263
URL: https://issues.apache.org/jira/browse/SAMZA-263
Project: Samza
Issue Type: Improvement
Reporter: Jakob Homan
Assignee: Jakob Homan
It would be nice to be able to read from and write to HDFS, particularly for bootstrapping purposes. A few points:
* Per the discussion [about leveldb|https://issues.apache.org/jira/browse/SAMZA-236?focusedCommentId=13985982&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985982], this support should be separated into its own package and project (jar) for easy testing and severability.
* Similar to the Kafka RegexTopicGenerator, we can enumerate (recursively or not) the files in an HDFS directory during job startup.
* Connectivity with HCatalog would be interesting as well, but should be handled in a separate JIRA.
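As a rough illustration of the file enumeration point above, here is a minimal sketch of listing the files under a directory at startup, with a flag for recursion. It is not a proposed implementation. The class and method names (InputFileEnumerator, enumerate) are hypothetical, and it uses java.nio.file against the local filesystem as a stand-in so it runs without a Hadoop cluster. A real version would presumably go through Hadoop's FileSystem API instead (for example FileSystem.listFiles(Path, boolean)).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Samza. Uses the local filesystem as a
// stand-in for HDFS (assumption made so the sketch runs without Hadoop).
final class InputFileEnumerator {

    // Returns the regular files under dir, sorted by path.
    // When recursive is false, only direct children of dir are considered.
    static List<Path> enumerate(Path dir, boolean recursive) throws IOException {
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        // Files.walk returns a lazy stream that must be closed.
        try (Stream<Path> entries = Files.walk(dir, maxDepth)) {
            return entries
                    .filter(Files::isRegularFile) // skip directories, including dir itself
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small throwaway directory tree to enumerate.
        Path root = Files.createTempDirectory("samza-input");
        Files.createFile(root.resolve("part-0"));
        Path nested = Files.createDirectory(root.resolve("nested"));
        Files.createFile(nested.resolve("part-1"));

        System.out.println(enumerate(root, false).size()); // 1 (part-0 only)
        System.out.println(enumerate(root, true).size());  // 2 (part-0 and nested/part-1)
    }
}
```

Each enumerated file could then be treated much like a topic matched by the regex generator, i.e. as one input to the job, though how files map onto streams and partitions is left open here.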
--
This message was sent by Atlassian JIRA
(v6.2#6252)