Posted to user@storm.apache.org by 马哲超 <ma...@gmail.com> on 2016/05/19 10:37:39 UTC

Re: Storm + HDFS

Yes, I have gotten the same error and can't resolve it.

2016-02-05 0:29 GMT+08:00 K Zharas <kg...@gmail.com>:

> Thank you for your reply, it worked. However, I ran into another problem.
>
> Basically, I'm trying to implement the HdfsBolt in Storm. I wanted to start
> with a basic topology, so I used the TestWordSpout provided by Storm.
>
> I can successfully compile and submit the topology, but it doesn't write
> into HDFS.
>
> In the Storm UI, I can see that the spout is emitting continuously. The bolt
> doesn't do anything, and it reports this error:
>
> java.lang.NoClassDefFoundError: org/apache/hadoop/fs/CanUnbuffer at
> java.lang.ClassLoader.defineClass1(Native Method) at
> java.lang.ClassLoader.defineClass(ClassLoader.java:800) at
> java.security.Sec
>
> Here is my topology (imports shown for the 1.x package layout):
>
> import org.apache.storm.Config;
> import org.apache.storm.StormSubmitter;
> import org.apache.storm.hdfs.bolt.HdfsBolt;
> import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
> import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
> import org.apache.storm.hdfs.bolt.format.FileNameFormat;
> import org.apache.storm.hdfs.bolt.format.RecordFormat;
> import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
> import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
> import org.apache.storm.hdfs.bolt.sync.SyncPolicy;
> import org.apache.storm.testing.TestWordSpout;
> import org.apache.storm.topology.TopologyBuilder;
>
> public class HdfsFileTopology {
>   public static void main(String[] args) throws Exception {
>     // write tuple fields as comma-separated records
>     RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter(",");
>     // sync the filesystem after every 100 tuples
>     SyncPolicy syncPolicy = new CountSyncPolicy(100);
>     // rotate output files once they reach 10 KB
>     FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(10.0f, Units.KB);
>     FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/user");
>     HdfsBolt bolt = new HdfsBolt()
>             .withFsUrl("hdfs://localhost:9000")
>             .withFileNameFormat(fileNameFormat)
>             .withRecordFormat(format)
>             .withRotationPolicy(rotationPolicy)
>             .withSyncPolicy(syncPolicy);
>
>     TopologyBuilder builder = new TopologyBuilder();
>     builder.setSpout("word", new TestWordSpout(), 1);
>     builder.setBolt("output", bolt, 1).shuffleGrouping("word");
>     Config conf = new Config();
>     conf.setDebug(true);
>     conf.setNumWorkers(3);
>     StormSubmitter.submitTopology("HdfsFileTopology", conf, builder.createTopology());
>   }
> }
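>
> For completeness, I submit it with the storm client like this (the jar name
> is just a placeholder for my build output):
>
> storm jar target/my-topology.jar HdfsFileTopology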
>
>
> On Thu, Feb 4, 2016 at 5:04 AM, P. Taylor Goetz <pt...@gmail.com> wrote:
>
>> Assuming you have git and maven installed:
>>
>> git clone git@github.com:apache/storm.git
>> cd storm
>> git checkout -b 1.x origin/1.x-branch
>> mvn install -DskipTests
>>
>> The third step checks out the 1.x-branch branch, which is the base for
>> the upcoming 1.0 release.
>>
>> You can then include the storm-hdfs dependency in your project:
>>
>> <dependency>
>>   <groupId>org.apache.storm</groupId>
>>   <artifactId>storm-hdfs</artifactId>
>>   <version>1.0.0-SNAPSHOT</version>
>> </dependency>
>>
>> You can find more information on using the spout and other HDFS
>> components here:
>>
>>
>> https://github.com/apache/storm/tree/1.x-branch/external/storm-hdfs#hdfs-spout
>>
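>> From the README there, a minimal text-file spout looks roughly like this
>> (method names as on the 1.x branch; the HDFS URI and directories are
>> placeholders):
>>
>> HdfsSpout textReaderSpout = new HdfsSpout()
>>         .setReaderType("text")
>>         .withOutputFields(TextFileReader.defaultFields)
>>         .setHdfsUri("hdfs://localhost:9000")
>>         .setSourceDir("/data/in")           // directory to watch for new files
>>         .setArchiveDir("/data/done")        // fully-read files are moved here
>>         .setBadFilesDir("/data/badfiles");  // unparsable files end up here
>>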
>> -Taylor
>>
>> On Feb 3, 2016, at 2:54 PM, K Zharas <kg...@gmail.com> wrote:
>>
>> Oh OK. Can you please give me an idea of how I can do it manually? I'm
>> quite a beginner :)
>>
>> On Thu, Feb 4, 2016 at 3:43 AM, Parth Brahmbhatt <
>> pbrahmbhatt@hortonworks.com> wrote:
>>
>>> The storm-hdfs spout is not yet published to Maven. You will have to
>>> check out Storm locally and build it to make it available for development.
>>>
>>> From: K Zharas <kg...@gmail.com>
>>> Reply-To: "user@storm.apache.org" <us...@storm.apache.org>
>>> Date: Wednesday, February 3, 2016 at 11:41 AM
>>> To: "user@storm.apache.org" <us...@storm.apache.org>
>>> Subject: Re: Storm + HDFS
>>>
>>> Yes, it looks like it is. But I have added the dependencies required by
>>> storm-hdfs as stated in the guide.
>>>
>>> On Thu, Feb 4, 2016 at 3:33 AM, Nick R. Katsipoulakis <
>>> nick.katsip@gmail.com> wrote:
>>>
>>>> Well,
>>>>
>>>> those errors look like a problem with the way you build your jar file.
>>>> Please make sure that you build your jar with the proper Storm Maven
>>>> dependencies.
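>>>>
>>>> For example, one common way to bundle storm-hdfs and the hadoop client
>>>> libraries into the topology jar is the maven-shade-plugin (a minimal
>>>> sketch; the plugin version here is an assumption):
>>>>
>>>> <plugin>
>>>>   <groupId>org.apache.maven.plugins</groupId>
>>>>   <artifactId>maven-shade-plugin</artifactId>
>>>>   <version>2.4.1</version>
>>>>   <executions>
>>>>     <execution>
>>>>       <phase>package</phase>
>>>>       <goals>
>>>>         <goal>shade</goal>
>>>>       </goals>
>>>>     </execution>
>>>>   </executions>
>>>> </plugin>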
>>>>
>>>> Cheers,
>>>> Nick
>>>>
>>>> On Wed, Feb 3, 2016 at 2:31 PM, K Zharas <kg...@gmail.com> wrote:
>>>>
>>>>> It throws an error that packages do not exist. I have also tried
>>>>> changing org.apache to backtype; I still got an error, but only for
>>>>> storm.hdfs.spout. Btw, I use Storm 0.10.0 and Hadoop 2.7.1
>>>>>
>>>>>    package org.apache.storm does not exist
>>>>>    package org.apache.storm does not exist
>>>>>    package org.apache.storm.generated does not exist
>>>>>    package org.apache.storm.metric does not exist
>>>>>    package org.apache.storm.topology does not exist
>>>>>    package org.apache.storm.utils does not exist
>>>>>    package org.apache.storm.utils does not exist
>>>>>    package org.apache.storm.hdfs.spout does not exist
>>>>>    package org.apache.storm.hdfs.spout does not exist
>>>>>    package org.apache.storm.topology.base does not exist
>>>>>    package org.apache.storm.topology does not exist
>>>>>    package org.apache.storm.tuple does not exist
>>>>>    package org.apache.storm.task does not exist
>>>>>
>>>>> On Wed, Feb 3, 2016 at 8:57 PM, Matthias J. Sax <mj...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Storm does provide HdfsSpout and HdfsBolt already. Just use those
>>>>>> instead of writing your own spout/bolt:
>>>>>>
>>>>>> https://github.com/apache/storm/tree/master/external/storm-hdfs
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>> On 02/03/2016 12:34 PM, K Zharas wrote:
>>>>>> > Can anyone help me create a spout that reads a file from HDFS?
>>>>>> > I have tried the code below, but it is not working.
>>>>>> >
>>>>>> > public void nextTuple() {
>>>>>> >     Path pt = new Path("hdfs://localhost:50070/user/BCpredict.txt");
>>>>>> >     FileSystem fs = FileSystem.get(new Configuration());
>>>>>> >     BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)));
>>>>>> >     String line = br.readLine();
>>>>>> >     while (line != null) {
>>>>>> >         System.out.println(line);
>>>>>> >         line = br.readLine();
>>>>>> >         _collector.emit(new Values(line));
>>>>>> >     }
>>>>>> > }
>>>>>> >
>>>>>> > On Tue, Feb 2, 2016 at 1:19 PM, K Zharas <kgzharas@gmail.com
>>>>>> > <ma...@gmail.com>> wrote:
>>>>>> >
>>>>>> >     Hi.
>>>>>> >
>>>>>> >     I have a project I'm currently working on. The idea is to
>>>>>> >     implement "scikit-learn" in Storm and integrate it with HDFS.
>>>>>> >
>>>>>> >     I've already implemented the "scikit-learn" part, but currently I'm
>>>>>> >     using a plain text file to read and write. I need to use HDFS
>>>>>> >     instead, and I'm finding it hard to integrate.
>>>>>> >
>>>>>> >     Here is the link to GitHub
>>>>>> >     <https://github.com/kgzharas/StormTopologyTest>. (I only included
>>>>>> >     the files that I used, not the whole project.)
>>>>>> >
>>>>>> >     Basically, I have a few questions, if you don't mind answering them:
>>>>>> >     1) How do I use HDFS to read and write?
>>>>>> >     2) Is my "scikit-learn" implementation correct?
>>>>>> >     3) How do I create a Storm project? (Currently I'm working in
>>>>>> >     "storm-starter".)
>>>>>> >
>>>>>> >     These questions may sound a bit silly, but I really can't find a
>>>>>> >     proper solution.
>>>>>> >
>>>>>> >     Thank you for your attention to this matter.
>>>>>> >     Sincerely, Zharas.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Best regards,
>>>>>> > Zharas
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Zharas
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nick R. Katsipoulakis,
>>>> Department of Computer Science
>>>> University of Pittsburgh
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Zharas
>>>
>>
>>
>>
>> --
>> Best regards,
>> Zharas
>>
>>
>>
>
>
> --
> Best regards,
> Zharas
>

Re: Storm + HDFS

Posted by 马哲超 <ma...@gmail.com>.
I'd like to make a small fix to the solution in my last mail: exclude
slf4j-log4j12 from hadoop-common, since it can otherwise clash with the
SLF4J binding that Storm itself ships.

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.7.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>
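
Putting the pieces from this thread together, the dependencies section would
look roughly like this (storm-hdfs version taken from Taylor's mail):

<dependency>
  <groupId>org.apache.storm</groupId>
  <artifactId>storm-hdfs</artifactId>
  <version>1.0.0-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.7.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>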

2016-05-20 10:54 GMT+08:00 马哲超 <ma...@gmail.com>:

> I have already worked it out. Adding the hadoop-common dependency to the
> pom.xml file solves the problem.
>
> For example,
>
> <dependency>
>     <groupId>org.apache.hadoop</groupId>
>     <artifactId>hadoop-common</artifactId>
>     <version>2.7.1</version>
> </dependency>

Re: Storm + HDFS

Posted by 马哲超 <ma...@gmail.com>.
I have already worked it out. Adding the hadoop-common dependency to the
pom.xml file solves the problem. (The CanUnbuffer interface was, I believe,
only added in Hadoop 2.7.0, so the NoClassDefFoundError presumably came from
an older hadoop-common being pulled in transitively.)

For example,

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.7.1</version>
</dependency>
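
To confirm the bolt is actually writing, the output under the configured path
can be listed with the HDFS CLI:

hdfs dfs -ls /user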

2016-05-19 18:37 GMT+08:00 马哲超 <ma...@gmail.com>:

> Yes, I have gotten the same error and can't resolve it.