Posted to dev@streams.apache.org by "Steve Blackmon (JIRA)" <ji...@apache.org> on 2015/03/03 23:05:04 UTC

[jira] [Updated] (STREAMS-293) allow for missing metadata fields in streams-persist-hdfs

     [ https://issues.apache.org/jira/browse/STREAMS-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Blackmon updated STREAMS-293:
-----------------------------------
    Description: 
Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.  This could be made much more flexible without much effort.

Update the reader and writer to support additional use cases:
a) file paths containing one JSON document per line
b) file paths containing just an id and a JSON document on each line
c) file paths containing an id, a timestamp, and a JSON document on each line
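
To illustrate the layouts listed above, here is a rough reader-side sketch. It is written in Python for brevity and is not the streams-persist-hdfs implementation. The names Record and parse_line are hypothetical, the tab field delimiter is taken from the earlier description text, and the four-column order (id, timestamp, metadata, document) is an assumption, since this issue does not name the columns.

```python
import json
from typing import Any, Dict, NamedTuple, Optional


class Record(NamedTuple):
    """One parsed line; fields that are absent from the line are None."""
    id: Optional[str]
    timestamp: Optional[str]
    metadata: Optional[Dict[str, Any]]
    document: Any


def parse_line(line: str, delimiter: str = "\t") -> Record:
    """Parse a line holding 1 to 4 delimited columns (hypothetical helper).

    Assumed layouts, chosen by column count:
      1 column:  document
      2 columns: id, document
      3 columns: id, timestamp, document
      4 columns: id, timestamp, metadata, document  (assumed order)
    """
    parts = line.rstrip("\r\n").split(delimiter)
    count = len(parts)
    if count == 1:
        return Record(None, None, None, json.loads(parts[0]))
    if count == 2:
        return Record(parts[0], None, None, json.loads(parts[1]))
    if count == 3:
        # The timestamp is kept as a raw string; its format is not
        # specified in the issue, so no parsing is attempted here.
        return Record(parts[0], parts[1], None, json.loads(parts[2]))
    if count == 4:
        return Record(parts[0], parts[1],
                      json.loads(parts[2]), json.loads(parts[3]))
    raise ValueError("expected 1 to 4 columns, got %d" % count)
```

Choosing the layout by column count keeps the reader free of extra configuration, but it only works if the JSON columns never contain the raw delimiter character. A configurable field list would be the more explicit alternative.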




  was:
Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.  This could be made much more flexible without much effort.

Update the reader and writer to support additional use cases:
a) files with a field delimiter other than \t
b) files with a line delimiter other than \n
c) file paths containing one JSON document per line
d) file paths containing just an id and a JSON document on each line
e) file paths containing an id, a timestamp, and a JSON document on each line





> allow for missing metadata fields in streams-persist-hdfs
> ---------------------------------------------------------
>
>                 Key: STREAMS-293
>                 URL: https://issues.apache.org/jira/browse/STREAMS-293
>             Project: Streams
>          Issue Type: Improvement
>            Reporter: Steve Blackmon
>
> Currently the streams-persist-hdfs writer creates (and the reader expects) exactly four columns.  This could be made much more flexible without much effort.
> Update the reader and writer to support additional use cases:
> a) file paths containing one JSON document per line
> b) file paths containing just an id and a JSON document on each line
> c) file paths containing an id, a timestamp, and a JSON document on each line



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)