Posted to commits@samza.apache.org by "Vinoth Chandar (JIRA)" <ji...@apache.org> on 2015/03/31 17:20:53 UTC

[jira] [Created] (SAMZA-622) Persisting Samza State on HDFS

Vinoth Chandar created SAMZA-622:
------------------------------------

             Summary: Persisting Samza State on HDFS
                 Key: SAMZA-622
                 URL: https://issues.apache.org/jira/browse/SAMZA-622
             Project: Samza
          Issue Type: Improvement
            Reporter: Vinoth Chandar
            Assignee: Jakob Homan


It would be nice to be able to read from and write to HDFS, particularly for bootstrapping purposes.  A few points:

* Per the discussion [about leveldb|https://issues.apache.org/jira/browse/SAMZA-236?focusedCommentId=13985982&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13985982], this support should be separated into its own package and project (jar) for easy testing and severability.
* Similar to the Kafka RegexTopicGenerator, we can enumerate (recursively or not) the files in an HDFS directory during job startup.
* Connectivity with HCatalog would be interesting as well, but should be handled in a separate JIRA.
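
To make the enumeration point concrete, here is a minimal sketch of listing input files at startup, either recursively or not, filtered by a regex on the file name. It is an illustration only, not a proposed implementation: the class name InputFileEnumerator, the method enumerate, and the nameRegex parameter are all hypothetical, and java.nio.file on a local directory stands in for HDFS so the example runs without a cluster. Against a real cluster, Hadoop's FileSystem.listFiles(Path, boolean) exposes a comparable recursive flag.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: a local-filesystem stand-in for an HDFS directory listing.
final class InputFileEnumerator {

    private InputFileEnumerator() {}

    /**
     * Returns the regular files under root whose file name matches nameRegex.
     * With recursive == false only the direct children of root are considered.
     */
    static List<Path> enumerate(Path root, boolean recursive, String nameRegex)
            throws IOException {
        Pattern pattern = Pattern.compile(nameRegex);
        // Depth 1 visits root and its direct children only.
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(root, maxDepth)) {
            return paths
                    .filter(Files::isRegularFile) // drops root and subdirectories
                    .filter(p -> pattern.matcher(p.getFileName().toString()).matches())
                    .sorted() // stable order, so the same listing is produced on every startup
                    .collect(Collectors.toList());
        }
    }
}
```

A job could call something like InputFileEnumerator.enumerate(dir, true, ".*\\.log") once during startup and treat each returned path as one input, much as a regex over topic names selects Kafka inputs.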



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)