Posted to issues@storm.apache.org by "Sachin Pasalkar (JIRA)" <ji...@apache.org> on 2017/02/14 02:07:41 UTC

[jira] [Commented] (STORM-2358) Update storm hdfs spout to remove specific implementation handlings

    [ https://issues.apache.org/jira/browse/STORM-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864880#comment-15864880 ] 

Sachin Pasalkar commented on STORM-2358:
----------------------------------------

Comments from Roshan:

1.  Make org.apache.storm.hdfs.spout.AbstractFileReader as public so that it can be used in generics.

Java generics and making a class public are unrelated, to my knowledge. But making it public sounds OK to me if it's useful for "user defined" readers... although it doesn't really have that much going on in it. For future built-in reader types it is immaterial, as they can derive from it anyway, just like the existing ones. The HdfsSpout class itself doesn't care about the 'AbstractFileReader' type. For that there is the 'FileReader' interface.
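
To illustrate the point about the interface, here is a minimal, self-contained sketch. It does not use the real storm-hdfs classes: the FileReader, AbstractFileReader, BuiltInReader and UserReader types below are simplified, hypothetical stand-ins, and drain() is an invented helper that plays the role of the spout. It only shows that a consumer written against the interface accepts a reader whether or not it derives from the abstract base class.

```java
import java.util.ArrayList;
import java.util.List;

public class ReaderInterfaceSketch {
    /** Hypothetical, simplified reader contract: returns null when exhausted. */
    interface FileReader {
        String next();
    }

    /** Hypothetical base class holding state shared by built-in readers. */
    static abstract class AbstractFileReader implements FileReader {
        private final String path;

        AbstractFileReader(String path) {
            this.path = path;
        }

        String getPath() {
            return path;
        }
    }

    /** A built-in style reader that derives from the abstract base. */
    static class BuiltInReader extends AbstractFileReader {
        private boolean done = false;

        BuiltInReader(String path) {
            super(path);
        }

        @Override
        public String next() {
            if (done) {
                return null;
            }
            done = true;
            return "built-in:" + getPath();
        }
    }

    /** A user-defined reader that implements the interface directly. */
    static class UserReader implements FileReader {
        private int remaining = 2;

        @Override
        public String next() {
            return remaining-- > 0 ? "user-record" : null;
        }
    }

    /** Stand-in for the spout: it depends only on the FileReader interface. */
    static List<String> drain(FileReader reader) {
        List<String> records = new ArrayList<>();
        for (String record = reader.next(); record != null; record = reader.next()) {
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(drain(new BuiltInReader("/data/in")));
        System.out.println(drain(new UserReader()));
    }
}
```

In this sketch the abstract class is only a convenience for sharing code, so its visibility matters to reader authors and not to the consumer.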



  2.  org.apache.storm.hdfs.spout.HdfsSpout requires readerType as String. It will be great to have class<? extends AbstractFileReader>
readerType; So we will not use Class.forName at multiple places also it will help in below point.

The reason it is a string is that, for built-in readers, we wanted to support 'short aliases' like 'text' and 'seq' instead of the FQCN.
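
A rough sketch of how a string reader type can serve both purposes follows. This is not the HdfsSpout code: the class names, the ALIASES map and the resolve() helper are hypothetical, and only the 'text' and 'seq' alias names come from the comment above. The idea is to check the short aliases first and fall back to Class.forName in one place for anything else.

```java
import java.util.HashMap;
import java.util.Map;

public class ReaderTypeSketch {
    /** Hypothetical stand-in for the spout's reader interface. */
    interface FileReader {}

    static class TextFileReader implements FileReader {}
    static class SequenceFileReader implements FileReader {}

    // Short aliases for built-in readers (alias names taken from the comment).
    private static final Map<String, Class<? extends FileReader>> ALIASES = new HashMap<>();
    static {
        ALIASES.put("text", TextFileReader.class);
        ALIASES.put("seq", SequenceFileReader.class);
    }

    /** Resolves an alias or a fully qualified class name to a reader class. */
    static Class<? extends FileReader> resolve(String readerType) {
        Class<? extends FileReader> builtIn = ALIASES.get(readerType);
        if (builtIn != null) {
            return builtIn;
        }
        try {
            // Not an alias: treat the string as a class name. asSubclass
            // throws ClassCastException if the class is not a FileReader.
            return Class.forName(readerType).asSubclass(FileReader.class);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Unknown reader type: " + readerType, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("text").getSimpleName());
        System.out.println(resolve("seq").getSimpleName());
        System.out.println(resolve(TextFileReader.class.getName()).getSimpleName());
    }
}
```

Keeping the lookup in a single helper like this would also address the concern about Class.forName being called in multiple places, without giving up the aliases.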

3.  HdfsSpout also needs to provide outFields which are declared as constants in each reader(e.g.SequenceFileReader). We can have abstract
API AbstractFileReader in which return them to user to make it generic.


These consts can't go into the AbstractFileReader as they are reader specific.
They are there just for convenience. Users can call withOutputFields() on the spout and set it to these predefined names or anything else.
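
As a sketch of that usage pattern, assuming nothing about the real classes beyond the withOutputFields() method named above: the reader classes, the DEFAULT_FIELDS constants, their field names and the SpoutSketch type below are all hypothetical stand-ins. It shows why the constants are per-reader (a text reader and a sequence file reader emit differently shaped tuples) and that the caller is free to pass its own names instead.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OutputFieldsSketch {
    /** Hypothetical text reader: one field per record. */
    static class TextReaderSketch {
        static final String[] DEFAULT_FIELDS = {"line"};
    }

    /** Hypothetical sequence file reader: a key and a value per record. */
    static class SeqReaderSketch {
        static final String[] DEFAULT_FIELDS = {"key", "value"};
    }

    /** Hypothetical stand-in for the spout's fluent configuration. */
    static class SpoutSketch {
        private List<String> outputFields = Collections.emptyList();

        SpoutSketch withOutputFields(String... fields) {
            // Copy the array so later changes by the caller have no effect.
            this.outputFields = Collections.unmodifiableList(Arrays.asList(fields.clone()));
            return this;
        }

        List<String> getOutputFields() {
            return outputFields;
        }
    }

    public static void main(String[] args) {
        // Use a reader's predefined names for convenience...
        SpoutSketch seqSpout = new SpoutSketch().withOutputFields(SeqReaderSketch.DEFAULT_FIELDS);
        // ...or choose names freely.
        SpoutSketch customSpout = new SpoutSketch().withOutputFields("sentence");

        System.out.println(seqSpout.getOutputFields());
        System.out.println(customSpout.getOutputFields());
    }
}
```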


> Update storm hdfs spout to remove specific implementation handlings
> -------------------------------------------------------------------
>
>                 Key: STORM-2358
>                 URL: https://issues.apache.org/jira/browse/STORM-2358
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-hdfs
>    Affects Versions: 1.x
>            Reporter: Sachin Pasalkar
>            Assignee: Sachin Pasalkar
>              Labels: newbie
>         Attachments: AbstractFileReader.java, FileReader.java, HDFSSpout.java, SequenceFileReader.java, TextFileReader.java
>
>
> I was looking at storm hdfs spout code in 1.x branch, I found below
> improvements can be made in below code.
>   1.  Make org.apache.storm.hdfs.spout.AbstractFileReader as public so
> that it can be used in generics.
>   2.  org.apache.storm.hdfs.spout.HdfsSpout requires readerType as
> String. It will be great to have class<? extends AbstractFileReader>
> readerType; So we will not use Class.forName at multiple places also it
> will help in below point.
>   3.  HdfsSpout also needs to provide outFields which are declared as
> constants in each reader(e.g.SequenceFileReader). We can have abstract
> API AbstractFileReader in which return them to user to make it generic.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)