Posted to user@nutch.apache.org by Muhamad Muchlis <tr...@gmail.com> on 2015/09/30 09:02:27 UTC

Nutch with MongoDB

Hi,

Does anyone have a tutorial on using Nutch with MongoDB?

Thanks.

Re: Nutch with MongoDB

Posted by Alexis Hope <ba...@gmail.com>.
Nice! Happy crawling :)


Re: Nutch with MongoDB

Posted by Muhamad Muchlis <tr...@gmail.com>.
Hi Drulea,

I use Elasticsearch and Kibana:

https://www.elastic.co/products/elasticsearch
https://www.elastic.co/products/kibana




Re: Nutch with MongoDB

Posted by "Drulea, Sherban" <sd...@rand.org>.
Nice job, Muhamad. I'm trying to do the same thing.

What versions of Solr and Nutch are you using? Did you use the Nutch
schema.xml in Solr?

What plugins did you enable in nutch-site.xml?




__________________________________________________________________________

This email message is for the sole use of the intended recipient(s) and
may contain confidential information. Any unauthorized review, use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy all copies
of the original message.


Re: Nutch with MongoDB

Posted by Muhamad Muchlis <tr...@gmail.com>.
Good news. I did it.

Re: Nutch with MongoDB

Posted by Muhamad Muchlis <tr...@gmail.com>.
And this is another error I get when I try to run it again:

2015-09-30 15:59:48,061 INFO  crawl.InjectorJob - InjectorJob: starting at 2015-09-30 15:59:48
2015-09-30 15:59:48,061 INFO  crawl.InjectorJob - InjectorJob: Injecting urlDir: urls
2015-09-30 15:59:49,601 INFO  crawl.InjectorJob - InjectorJob: Using class org.apache.gora.mongodb.store.MongoStore as the Gora storage class.
2015-09-30 15:59:49,681 WARN  util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-09-30 15:59:49,682 ERROR security.UserGroupInformation - PriviledgedActionException as:creoactive cause:java.io.IOException: Failed to set permissions of path: \tmp\hadoop-creoactive\mapred\staging\user795133740\.staging to 0700
2015-09-30 15:59:49,696 ERROR crawl.InjectorJob - InjectorJob: java.io.IOException: Failed to set permissions of path: \tmp\hadoop-user\mapred\staging\user795133740\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:50)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:231)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)






Re: Nutch with MongoDB

Posted by Muhamad Muchlis <tr...@gmail.com>.
Thank you, Alexis.

I'm following the tutorial on this blog:

http://www.aossama.com/search-engine-with-apache-nutch-mongodb-and-elasticsearch/

But I get this error:

2015-09-30 15:30:52.231 FATAL conf.Configuration - error parsing conf file: org.xml.sax.SAXParseException; systemId: file:/C:/cygwin64/home/user/apache-nutch-2.3/runtime/local/conf/nutch-site.xml; lineNumber: 1; columnNumber: 2; The markup in the document preceding the root element must be well-formed.
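
A SAXParseException at line 1, column 2 of nutch-site.xml often means
something is wrong at the very start of the file, for example a stray
character before or inside the first tag. As a rough way to check a config
file outside Nutch, the sketch below parses the file contents with Python's
standard xml.sax module and reports where parsing fails. The function name
find_xml_error is made up for this illustration, and Python's parser words
its messages (and may count columns) differently from the Java parser in the
log above.

```python
import xml.sax


def find_xml_error(xml_bytes):
    """Return None if xml_bytes is well-formed XML.

    Otherwise return a (line, column, message) tuple describing the
    first parse error. find_xml_error is a hypothetical helper name,
    not part of Nutch or Hadoop.
    """
    try:
        # A bare ContentHandler ignores all content; we only care
        # whether the parser gets through the document without error.
        xml.sax.parseString(xml_bytes, xml.sax.ContentHandler())
    except xml.sax.SAXParseException as err:
        return (err.getLineNumber(), err.getColumnNumber(), err.getMessage())
    return None
```

To check a real file, read it in binary mode and pass the bytes in, e.g.
find_xml_error(open(path, "rb").read()), where path points at your own
nutch-site.xml.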





Re: Nutch with MongoDB

Posted by Alexis Hope <ba...@gmail.com>.
I don't have any tutorials written up, but I use Nutch with MongoDB.
Have you used MongoDB or Nutch before?
I can send you my Gora config and nutch-site config, and you should be good
to go.
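
The configs themselves were not posted to the list, so the following is only
a sketch of what the nutch-site side might look like, not the poster's
actual setup. It assumes Nutch 2.x, where the storage.data.store.class
property selects the Gora backend, and uses the MongoStore class name that
appears in the InjectorJob log elsewhere in this thread. The helper name
build_nutch_site is invented for the example. Gora's own gora.properties
usually needs matching MongoDB settings as well; those keys depend on the
Gora version and are not shown here.

```python
import xml.etree.ElementTree as ET


def build_nutch_site(properties):
    """Render a dict of name -> value as a Hadoop-style <configuration>.

    build_nutch_site is a hypothetical helper; building the XML with
    ElementTree guarantees the output is well-formed.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumption: a minimal nutch-site.xml that only selects the MongoDB store.
# A real setup will need further properties (crawler agent name, plugins).
NUTCH_SITE = build_nutch_site({
    "storage.data.store.class": "org.apache.gora.mongodb.store.MongoStore",
})
```

Writing NUTCH_SITE to conf/nutch-site.xml would be the next step; any other
properties your crawl needs go into the same dict.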
