You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by pasalkarsachin1 <gi...@git.apache.org> on 2017/02/15 19:51:05 UTC

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

GitHub user pasalkarsachin1 opened a pull request:

    https://github.com/apache/storm/pull/1942

    STORM-2358: Update storm hdfs spout to remove specific implementation handling

    As part of this change I have removed specific handling in code for TextFileReader & SequenceFileReader also made AbstractFileReader as public

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pasalkarsachin1/storm STORM-2358

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1942.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1942
    
----
commit 6f9c199c6a453773c768a9700f2286326079afdf
Author: Sachin Pasalkar <sa...@symantec.com>
Date:   2017-02-15T19:42:02Z

    STORM-2358: Update storm hdfs spout to remove specific implementation handling
    
    As part of this change we have removed specific handling in code for TextFileReader & SequenceFileReader also made AbstractFileReader as public

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1942


---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @HeartSaVioR  @roshannaik @arunmahadevan  are you OK with this approach?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101697143
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -105,87 +105,87 @@
     
       private String configKey = Configs.DEFAULT_HDFS_CONFIG_KEY; // key for hdfs Kerberos configs
     
    -  public HdfsSpout() {
    +  public HDFSSpout() {
       }
     
    -  public HdfsSpout setHdfsUri(String hdfsUri) {
    +  public HDFSSpout setHdfsUri(String hdfsUri) {
         this.hdfsUri = hdfsUri;
         return this;
       }
     
    -  public HdfsSpout setReaderType(String readerType) {
    +  public HDFSSpout setReaderType(Class<? extends FileReader> readerType) {
    --- End diff --
    
    1. How about backward compatibility ? Changing the signature will break existing clients. May be you can deprecate the earlier method and add a new one like `setFileReader(Class<? extends FileReader> reader)`
    
    2. How would one initialize a `HDFSSpout` via Flux? Earlier since the method accepted a String it was pretty trivial. Try it out and add some example, or add an additional method like `setFileReader(String className)` that takes the class name.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101698425
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -722,19 +703,10 @@ private FileReader createFileReader(Path file)
        */
       private FileReader createFileReader(Path file, String offset)
               throws IOException {
    -    if ( readerType.equalsIgnoreCase(Configs.SEQ) ) {
    -      return new SequenceFileReader(this.hdfs, file, conf, offset);
    -    }
    -    if ( readerType.equalsIgnoreCase(Configs.TEXT) ) {
    -      return new TextFileReader(this.hdfs, file, conf, offset);
    -    }
    -
         try {
    -      Class<?> clsType = Class.forName(readerType);
    -      Constructor<?> constructor = clsType.getConstructor(FileSystem.class, Path.class, Map.class, String.class);
    +      Constructor<?> constructor = readerType.getConstructor(FileSystem.class, Path.class, Map.class, String.class);
    --- End diff --
    
    Same code is being there before my changes, I can create JIRA for this. ATM I am just trying to get rid of specific implementations


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101699216
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -105,87 +105,87 @@
     
       private String configKey = Configs.DEFAULT_HDFS_CONFIG_KEY; // key for hdfs Kerberos configs
     
    -  public HdfsSpout() {
    +  public HDFSSpout() {
       }
     
    -  public HdfsSpout setHdfsUri(String hdfsUri) {
    +  public HDFSSpout setHdfsUri(String hdfsUri) {
         this.hdfsUri = hdfsUri;
         return this;
       }
     
    -  public HdfsSpout setReaderType(String readerType) {
    +  public HDFSSpout setReaderType(Class<? extends FileReader> readerType) {
    --- End diff --
    
    Sure I will deprecate this. However adding deprecation will add more code


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by roshannaik <gi...@git.apache.org>.
Github user roshannaik commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    Clink on this link : 
    http://flume.apache.org/FlumeUserGuide.html
    and search for 'class name'



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @HeartSaVioR @roshannaik @arunmahadevan Can you review this? Let me know if there is any issue you see with backward compatibility.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by roshannaik <gi...@git.apache.org>.
Github user roshannaik commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101704609
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -45,11 +46,12 @@
     import org.apache.storm.topology.base.BaseRichSpout;
     import org.apache.storm.tuple.Fields;
     
    -public class HdfsSpout extends BaseRichSpout {
    +public class HDFSSpout extends BaseRichSpout {
    --- End diff --
    
    Do you realize what you are doing here ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101696141
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -722,19 +703,10 @@ private FileReader createFileReader(Path file)
        */
       private FileReader createFileReader(Path file, String offset)
               throws IOException {
    -    if ( readerType.equalsIgnoreCase(Configs.SEQ) ) {
    -      return new SequenceFileReader(this.hdfs, file, conf, offset);
    -    }
    -    if ( readerType.equalsIgnoreCase(Configs.TEXT) ) {
    -      return new TextFileReader(this.hdfs, file, conf, offset);
    -    }
    -
         try {
    -      Class<?> clsType = Class.forName(readerType);
    -      Constructor<?> constructor = clsType.getConstructor(FileSystem.class, Path.class, Map.class, String.class);
    +      Constructor<?> constructor = readerType.getConstructor(FileSystem.class, Path.class, Map.class, String.class);
    --- End diff --
    
    This assumes a ctor with expected number and type of params is available and tries to invoke it via reflection. Adding an init/open with required params in the `FileReader` is much cleaner IMO and avoids having to check and throw exceptions at runtime.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    For points made by @HeartSaVioR 
    
    For point 2, I have various implementations in mind to add to open source one of them is implemented by megha10 for ZipTextFileReader another can be anySepratorReader like (pipe,comma). If we add enum for every implementation contributor has to do other extra work to enable it rather just adding reader.
    
    Also if user adds his class where he can add it? He has to provide FCQN which comes back to same implementation which I did.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by roshannaik <gi...@git.apache.org>.
Github user roshannaik commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101703840
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -105,87 +105,87 @@
     
       private String configKey = Configs.DEFAULT_HDFS_CONFIG_KEY; // key for hdfs Kerberos configs
     
    -  public HdfsSpout() {
    +  public HDFSSpout() {
       }
     
    -  public HdfsSpout setHdfsUri(String hdfsUri) {
    +  public HDFSSpout setHdfsUri(String hdfsUri) {
         this.hdfsUri = hdfsUri;
         return this;
       }
     
    -  public HdfsSpout setReaderType(String readerType) {
    +  public HDFSSpout setReaderType(Class<? extends FileReader> readerType) {
    --- End diff --
    
    Looks like you are removing support for aliases. Is that right ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101698782
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -722,19 +703,10 @@ private FileReader createFileReader(Path file)
        */
       private FileReader createFileReader(Path file, String offset)
               throws IOException {
    -    if ( readerType.equalsIgnoreCase(Configs.SEQ) ) {
    -      return new SequenceFileReader(this.hdfs, file, conf, offset);
    -    }
    -    if ( readerType.equalsIgnoreCase(Configs.TEXT) ) {
    -      return new TextFileReader(this.hdfs, file, conf, offset);
    -    }
    -
         try {
    -      Class<?> clsType = Class.forName(readerType);
    -      Constructor<?> constructor = clsType.getConstructor(FileSystem.class, Path.class, Map.class, String.class);
    +      Constructor<?> constructor = readerType.getConstructor(FileSystem.class, Path.class, Map.class, String.class);
    --- End diff --
    
    Not sure what you mean by "get rid of specific implementations". I see that you are mainly cleaning up and refactoring the code and so its better if you do it with this patch.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    > The current FileReader implementation is expected to have constructor with a specific signature, which cannot be enforced at compile time. IMO, this should also be fixed along with this patch by introducing some init method in the FileReader interface.`
    
    I am thinking to work on this after this patch we can have init() api which verifies things before it proceeds. 
    
    > It enforces implementation to be Serializable, which HDFS classes (fields in FileReader implementations) are not. That seems why HdfsSpout instantiates FileReader in open method.`
    
    Once we provide init method we are providing user a way to initialize the non Serializable member.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101698420
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -409,9 +409,14 @@ public void open(Map conf, TopologyContext context, SpoutOutputCollector collect
     
         // Reader type config
         if ( readerType==null && conf.containsKey(Configs.READER_TYPE) ) {
    -      readerType = conf.get(Configs.READER_TYPE).toString();
    +    	String className = (String) conf.get(Configs.READER_TYPE);
    --- End diff --
    
    It was already there so I updated as per requirement. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    I have reverted class renaming. Other than this I don't see break in backward compatibility 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101702796
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -105,87 +105,87 @@
     
       private String configKey = Configs.DEFAULT_HDFS_CONFIG_KEY; // key for hdfs Kerberos configs
     
    -  public HdfsSpout() {
    +  public HDFSSpout() {
       }
     
    -  public HdfsSpout setHdfsUri(String hdfsUri) {
    +  public HDFSSpout setHdfsUri(String hdfsUri) {
         this.hdfsUri = hdfsUri;
         return this;
       }
     
    -  public HdfsSpout setReaderType(String readerType) {
    +  public HDFSSpout setReaderType(Class<? extends FileReader> readerType) {
    --- End diff --
    
    Done please review


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @arunmahadevan 
    > I am not sure why cant we directly pass the FileReader implementation instead of the class (like setReader(FileReader reader)) and I believe flux will be able to handle this.
    
    It enforces implementation to be `Serializable`, which HDFS classes (fields in FileReader implementations) are not. That seems why HdfsSpout instantiates FileReader in `open` method.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @pasalkarsachin1 
    
    > For point 2, I have various implementations in mind to add to open source one of them is implemented by megha10 for ZipTextFileReader another can be anySepratorReader like (pipe,comma). If we add enum for every implementation contributor has to do other extra work to enable it rather just adding reader.
    > Also if user adds his class where he can add it? He has to provide FCQN which comes back to same implementation which I did.
    
    For point 2 I suggest **adding** shortcut, not removing previous one. Providing FQDN is not easiest way to do so we can make some implementation taking precedence and support it via easiest way.
    
    IMHO providing shortcut method is easiest (or similar to providing enum), and providing enum is good enough, and receiving only FQDN is not convenient, but just has to use. First two things make IDE providing various hints, and the last thing doesn't.
    
    Adding it to enum or even providing shortcut method will be done in 10 lines. I don't think it gives contributors more extra works, but we might need to guide it properly.
    
    
    
    > The current FileReader implementation is expected to have constructor with a specific signature, which cannot be enforced at compile time. IMO, this should also be fixed along with this patch by introducing some init method in the FileReader interface.`
    > I am thinking to work on this after this patch we can have init() api which verifies things before it proceeds.
    > Once we provide init method we are providing user a way to initialize the non Serializable member.
    
    There're various ways to do it.
    First of all, your suggestion is valid and I've been using that approach.
    One other approach is adding method which receives **builder instance** and build the instance in open() method. Yes builder instance must be serializable, but if it only has configurations it is not that hard.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    >We should check that it works with Flux. In other words, if Flux doesn't support instantiating Class object and passing it, the change just breaks backward compatibility.
    
    Agree and pointed out in one of the earlier comments. One suggestion was to have two methods,
    
    1. setReader(Class<? extends FileReader> reader)
    2. setReaderClass(String className)
    
    >To be honest, accepting alias and class in one method doesn't look intuitive. Users should memorize the alias (TEXT, SEQ) to use.
    
    Right now the method accepts special strings "text", "seq" or class name, which looks confusing and not type safe. The proposed patch addresses this by accepting a FileReader class. I am not sure why cant we directly pass the FileReader implementation instead of the class (like `setReader(FileReader reader)`) and I believe flux will be able to handle this.
    
    The current FileReader implementation is expected to have constructor with a specific signature, which cannot be enforced at compile time. IMO, this should also be fixed along with this patch by introducing some init method in the FileReader interface.
    
    Regarding backward compatibility, if we are fine to break it in the next major version (2.0) we can keep this out of 1.x branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by roshannaik <gi...@git.apache.org>.
Github user roshannaik commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    Deprecating/removing aliasing support **IS** naturally a backward compat issue. 
    
    You have expressed your concern that you don't like registering aliases in HdfsSpout. But removing aliasing is not the solution to address your concern.
    
    Like I said before, If you know how to implement aliasing without registering in HdfsSpout then it would address your concern and we can look into this PR.... otherwise there is nothing left to look into here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @roshannaik  Like I said even on Jira STORM-2358. I don't see point having code *just to specifically handle alias*. Code should be *generic* to handle all cases. If you want to push to have specific handling I don't think its right contribution & we already have this discussion on STORM-2358. I would like to see view from community.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @pasalkarsachin1 @roshannaik 
    Two things from me...
    
    1. We should check that it works with Flux. In other words, if Flux doesn't support instantiating Class object and passing it, the change just breaks backward compatibility.
    
    2. To be honest, accepting alias and class in one method doesn't look intuitive. Users should memorize the alias (TEXT, SEQ) to use. Instead of doing this, why not adding shortcut methods so that IDE can help for us? Like `text()`, `sequence()` or a bit more longer name if it doesn't feel enough. Other options we can create Enum for known implementations and full class name for others, but the former looks simpler unless we're aliasing too much.
    
    cc. @arunmahadevan 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101704350
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -105,87 +105,87 @@
     
       private String configKey = Configs.DEFAULT_HDFS_CONFIG_KEY; // key for hdfs Kerberos configs
     
    -  public HdfsSpout() {
    +  public HDFSSpout() {
       }
     
    -  public HdfsSpout setHdfsUri(String hdfsUri) {
    +  public HDFSSpout setHdfsUri(String hdfsUri) {
         this.hdfsUri = hdfsUri;
         return this;
       }
     
    -  public HdfsSpout setReaderType(String readerType) {
    +  public HDFSSpout setReaderType(Class<? extends FileReader> readerType) {
    --- End diff --
    
    I have deprecated as per comments, not removed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1942: STORM-2358: Update storm hdfs spout to remove spec...

Posted by arunmahadevan <gi...@git.apache.org>.
Github user arunmahadevan commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1942#discussion_r101696530
  
    --- Diff: external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HDFSSpout.java ---
    @@ -409,9 +409,14 @@ public void open(Map conf, TopologyContext context, SpoutOutputCollector collect
     
         // Reader type config
         if ( readerType==null && conf.containsKey(Configs.READER_TYPE) ) {
    -      readerType = conf.get(Configs.READER_TYPE).toString();
    +    	String className = (String) conf.get(Configs.READER_TYPE);
    --- End diff --
    
    What is the need for this when the spout accepts the readerType via `setReaderType ` ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by roshannaik <gi...@git.apache.org>.
Github user roshannaik commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @HeartSaVioR 
    I am OK with that line of thinking  of handling aliases, as you are looking to make it more intuitive to user. .. although we may not have a real problem to justify public interface changes. 
    
    However...  the author's concern is quite the opposite  ... it is not from a user perspective.   He is trying to make code  *more generic*  (as he puts it)  by **removing** aliasing  ... because we need to make the alias known to Hdfs Spout.... Which means no enums, no text() or sequence() etc.
    
    *Side note:* This approach of using aliases OR class names in the same place has precedence in other projects as well...  Apache Flume uses this pattern a fair amount.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @roshannaik 
    Interesting. Could you point out where Apache Flume uses that approach? Just curious. :)
    Btw, if it's from configuration backed by text file, we might have no option. But I think there're still some options to choose since we're assigning it from codebase.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    @HeartSaVioR 
    
    > For point 2 I suggest adding shortcut, not removing previous one. Providing FQDN is not easiest way to do so we can make some implementations taking precedence and support it via easiest way.
    
    IMHO providing shortcut method is easiest (or similar to providing enum), and providing enum is good enough, and receiving only FQDN is not convenient, but just has to use. First two things make IDE providing various hints, and the last thing doesn't.
    
    Adding it to enum or even providing shortcut method will be done in 10 lines. I don't think it gives contributors more extra works, but we might need to guide it properly.`
    
    My intention was never to use FQCN it was just to use className rather than Constants. Even you look at my code I am just using class name.
    ```java
    setReader(Class<? extends FileReader> reader)
    ```
    However what you are suggesting is in HdfsSpout I should add something like below
    
    ```java
    public hdfsSpout useTextFileReader() {
    this.readerType = TextFileReader.class;
    return this;
    }
    
    public hdfsSpout useSequenceFileReader() {
    this.readerType = SequenceFileReader.class;
    }
    ```
    Also have a way to provide way to support FQCN, Correct?
    
    If everyone agrees I will do the change. 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    First of all, thanks for the contribution.
    
    I'm seeing backward incompatible as @roshannaik stated, and like renaming class is not strictly necessary to. Huge part of maintaining open source is fighting with backward compatibility. What you change will affect number of developers. It's joyful thing for open source developers but also huge pain point.
    
    Unless there are critical places so need to change or get rid of, breaking backward compatibility would not be accepted, and even it is really needed, it would be better to discuss widely.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1942: STORM-2358: Update storm hdfs spout to remove specific im...

Posted by pasalkarsachin1 <gi...@git.apache.org>.
Github user pasalkarsachin1 commented on the issue:

    https://github.com/apache/storm/pull/1942
  
    Can someone take a look at this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---