Posted to user@storm.apache.org by 马哲超 <ma...@gmail.com> on 2016/05/19 10:37:39 UTC
Re: Storm + HDFS
Yes, I got the same error, and couldn't resolve it.
2016-02-05 0:29 GMT+08:00 K Zharas <kg...@gmail.com>:
> Thank you for your reply, it worked. However, I ran into another problem.
>
> Basically, I'm trying to use HdfsBolt in Storm. I wanted to start with a
> basic one, so I used the TestWordSpout provided by Storm.
>
> I can successfully compile and submit the topology, but it doesn't write
> into HDFS.
>
> In the Storm UI, I can see that the spout is emitting continuously. The
> bolt doesn't do anything, and it reports this error:
>
> java.lang.NoClassDefFoundError: org/apache/hadoop/fs/CanUnbuffer at
> java.lang.ClassLoader.defineClass1(Native Method) at
> java.lang.ClassLoader.defineClass(ClassLoader.java:800) at
> java.security.Sec
>
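The truncated stack trace above points at a missing Hadoop class (`org.apache.hadoop.fs.CanUnbuffer` ships in hadoop-common in the 2.7.x line). One quick way to check whether a class actually made it into the topology jar is a plain-Java classpath probe; this is a generic sketch, not part of the original mail:

```java
public class ClasspathProbe {
    // Probe whether a class is visible on the current classpath without
    // initializing it. If this returns false for
    // "org.apache.hadoop.fs.CanUnbuffer", hadoop-common is missing from the jar.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isOnClasspath("java.lang.String")); // true
        System.out.println(isOnClasspath("org.apache.hadoop.fs.CanUnbuffer"));
    }
}
```

Running this from inside the deployed jar (or adding a similar check to the bolt's `prepare()`) shows immediately whether the Hadoop classes were bundled.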
> Here is my topology
>
> // Imports assume the org.apache.storm.* packages from the 1.x branch
> import org.apache.storm.Config;
> import org.apache.storm.StormSubmitter;
> import org.apache.storm.hdfs.bolt.HdfsBolt;
> import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
> import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
> import org.apache.storm.hdfs.bolt.format.FileNameFormat;
> import org.apache.storm.hdfs.bolt.format.RecordFormat;
> import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
> import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
> import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
> import org.apache.storm.hdfs.bolt.sync.SyncPolicy;
> import org.apache.storm.testing.TestWordSpout;
> import org.apache.storm.topology.TopologyBuilder;
>
> public class HdfsFileTopology {
>     public static void main(String[] args) throws Exception {
>         // Comma-delimited records, sync every 100 tuples, rotate at 10 KB
>         RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter(",");
>         SyncPolicy syncPolicy = new CountSyncPolicy(100);
>         FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(10.0f, Units.KB);
>         FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/user");
>         HdfsBolt bolt = new HdfsBolt()
>                 .withFsUrl("hdfs://localhost:9000")
>                 .withFileNameFormat(fileNameFormat)
>                 .withRecordFormat(format)
>                 .withRotationPolicy(rotationPolicy)
>                 .withSyncPolicy(syncPolicy);
>
>         TopologyBuilder builder = new TopologyBuilder();
>         builder.setSpout("word", new TestWordSpout(), 1);
>         builder.setBolt("output", bolt, 1).shuffleGrouping("word");
>         Config conf = new Config();
>         conf.setDebug(true);
>         conf.setNumWorkers(3);
>         StormSubmitter.submitTopology("HdfsFileTopology", conf, builder.createTopology());
>     }
> }
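For intuition about what the bolt writes: a delimited record format like the one configured above essentially joins a tuple's fields with the delimiter and terminates the record with a newline. The following is a plain-Java sketch of that behavior, not the actual storm-hdfs class:

```java
import java.util.Arrays;
import java.util.List;

public class DelimitedFormatSketch {
    // Roughly what a delimited record format does: join the tuple's fields
    // with the field delimiter and terminate the record with a newline.
    static String format(List<Object> fields, String delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) sb.append(delimiter);
            sb.append(fields.get(i));
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(format(Arrays.asList("nathan", 7), ","));
        // prints: nathan,7
    }
}
```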
>
>
> On Thu, Feb 4, 2016 at 5:04 AM, P. Taylor Goetz <pt...@gmail.com> wrote:
>
>> Assuming you have git and maven installed:
>>
>> git clone git@github.com:apache/storm.git
>> cd storm
>> git checkout -b 1.x origin/1.x-branch
>> mvn install -DskipTests
>>
>> The third step checks out the 1.x-branch branch, which is the base for
>> the upcoming 1.0 release.
>>
>> You can then include the storm-hdfs dependency in your project:
>>
>> <dependency>
>>   <groupId>org.apache.storm</groupId>
>>   <artifactId>storm-hdfs</artifactId>
>>   <version>1.0.0-SNAPSHOT</version>
>> </dependency>
>>
>> You can find more information on using the spout and other HDFS
>> components here:
>>
>>
>> https://github.com/apache/storm/tree/1.x-branch/external/storm-hdfs#hdfs-spout
>>
>> -Taylor
>>
>> On Feb 3, 2016, at 2:54 PM, K Zharas <kg...@gmail.com> wrote:
>>
>> Oh, OK. Can you please give me an idea of how I can do it manually? I'm
>> quite a beginner :)
>>
>> On Thu, Feb 4, 2016 at 3:43 AM, Parth Brahmbhatt <
>> pbrahmbhatt@hortonworks.com> wrote:
>>
>>> The storm-hdfs spout is not yet published in Maven. You will have to
>>> check out Storm locally and build it to make it available for development.
>>>
>>> From: K Zharas <kg...@gmail.com>
>>> Reply-To: "user@storm.apache.org" <us...@storm.apache.org>
>>> Date: Wednesday, February 3, 2016 at 11:41 AM
>>> To: "user@storm.apache.org" <us...@storm.apache.org>
>>> Subject: Re: Storm + HDFS
>>>
>>> Yes, it looks like it is. But I have added the dependencies required by
>>> storm-hdfs as stated in the guide.
>>>
>>> On Thu, Feb 4, 2016 at 3:33 AM, Nick R. Katsipoulakis <
>>> nick.katsip@gmail.com> wrote:
>>>
>>>> Well,
>>>>
>>>> those errors look like a problem with the way you build your jar file.
>>>> Please make sure that you build your jar with the proper Storm Maven
>>>> dependency.
>>>>
>>>> Cheers,
>>>> Nick
>>>>
>>>> On Wed, Feb 3, 2016 at 2:31 PM, K Zharas <kg...@gmail.com> wrote:
>>>>
>>>>> It throws an error that packages do not exist. I have also tried
>>>>> changing org.apache to backtype, and still got an error, but only for
>>>>> storm.hdfs.spout. Btw, I use Storm 0.10.0 and Hadoop 2.7.1.
>>>>>
>>>>> package org.apache.storm does not exist
>>>>> package org.apache.storm does not exist
>>>>> package org.apache.storm.generated does not exist
>>>>> package org.apache.storm.metric does not exist
>>>>> package org.apache.storm.topology does not exist
>>>>> package org.apache.storm.utils does not exist
>>>>> package org.apache.storm.utils does not exist
>>>>> package org.apache.storm.hdfs.spout does not exist
>>>>> package org.apache.storm.hdfs.spout does not exist
>>>>> package org.apache.storm.topology.base does not exist
>>>>> package org.apache.storm.topology does not exist
>>>>> package org.apache.storm.tuple does not exist
>>>>> package org.apache.storm.task does not exist
>>>>>
>>>>> On Wed, Feb 3, 2016 at 8:57 PM, Matthias J. Sax <mj...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Storm does provide HdfsSpout and HdfsBolt already. Just use those,
>>>>>> instead of writing your own spout/bolt:
>>>>>>
>>>>>> https://github.com/apache/storm/tree/master/external/storm-hdfs
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>>
>>>>>> On 02/03/2016 12:34 PM, K Zharas wrote:
>>>>>> > Can anyone help to create a Spout which reads a file from HDFS?
>>>>>> > I have tried with the code below, but it is not working.
>>>>>> >
>>>>>> > public void nextTuple() {
>>>>>> >     Path pt = new Path("hdfs://localhost:50070/user/BCpredict.txt");
>>>>>> >     FileSystem fs = FileSystem.get(new Configuration());
>>>>>> >     BufferedReader br = new BufferedReader(new InputStreamReader(fs.open(pt)));
>>>>>> >     String line = br.readLine();
>>>>>> >     while (line != null) {
>>>>>> >         System.out.println(line);
>>>>>> >         line = br.readLine();
>>>>>> >         _collector.emit(new Values(line));
>>>>>> >     }
>>>>>> > }
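Aside from the HDFS setup, the read loop above has a small bug: it prints the current line but then advances and emits the *next* line, so the final iteration emits null. A corrected loop, as a plain-Java sketch with a StringReader standing in for the HDFS stream (`fs.open(pt)`):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ReadLoopSketch {
    // Corrected loop: emit each non-null line as it is read, instead of
    // advancing first and emitting the (possibly null) next line.
    static List<String> readLines(BufferedReader br) {
        List<String> emitted = new ArrayList<>();
        try {
            String line;
            while ((line = br.readLine()) != null) {
                emitted.add(line); // in the spout: _collector.emit(new Values(line))
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // StringReader stands in for the HDFS input stream.
        BufferedReader br = new BufferedReader(new StringReader("a,1\nb,2"));
        System.out.println(readLines(br)); // prints: [a,1, b,2]
    }
}
```

Note that even with the loop fixed, `nextTuple()` would reopen and re-read the whole file on every call, which is why the replies below point at the ready-made HdfsSpout instead.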
>>>>>> >
>>>>>> > On Tue, Feb 2, 2016 at 1:19 PM, K Zharas <kgzharas@gmail.com
>>>>>> > <ma...@gmail.com>> wrote:
>>>>>> >
>>>>>> > Hi.
>>>>>> >
>>>>>> > I have a project I'm currently working on. The idea is to implement
>>>>>> > "scikit-learn" in Storm and integrate it with HDFS.
>>>>>> >
>>>>>> > I've already implemented "scikit-learn", but currently I'm using a
>>>>>> > text file to read and write. I need to use HDFS instead, but I'm
>>>>>> > finding it hard to integrate.
>>>>>> >
>>>>>> > Here is the link to github
>>>>>> > <https://github.com/kgzharas/StormTopologyTest>. (I only included
>>>>>> > the files that I used, not the whole project.)
>>>>>> >
>>>>>> > Basically, I have a few questions, if you don't mind answering
>>>>>> > them:
>>>>>> > 1) How to use HDFS to read and write?
>>>>>> > 2) Is my "scikit-learn" implementation correct?
>>>>>> > 3) How to create a Storm project? (Currently working in
>>>>>> "storm-starter")
>>>>>> >
>>>>>> > These questions may sound a bit silly, but I really can't find a
>>>>>> > proper solution.
>>>>>> >
>>>>>> > Thank you for your attention to this matter.
>>>>>> > Sincerely, Zharas.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Best regards,
>>>>>> > Zharas
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Zharas
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nick R. Katsipoulakis,
>>>> Department of Computer Science
>>>> University of Pittsburgh
>>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Zharas
>>>
>>
>>
>>
>> --
>> Best regards,
>> Zharas
>>
>>
>>
>
>
> --
> Best regards,
> Zharas
>
Re: Storm + HDFS
Posted by 马哲超 <ma...@gmail.com>.
I'd like to make a small fix for the solution in my last mail.
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.7.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
    </exclusion>
  </exclusions>
</dependency>
2016-05-20 10:54 GMT+08:00 马哲超 <ma...@gmail.com>:
> I have already worked it out. Adding the hadoop-common dependency to the
> pom.xml file solves the problem.
>
> For example,
>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-common</artifactId>
>   <version>2.7.1</version>
> </dependency>
>
Re: Storm + HDFS
Posted by 马哲超 <ma...@gmail.com>.
I have already worked it out. Adding the hadoop-common dependency to the
pom.xml file solves the problem.
For example,
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.7.1</version>
</dependency>