Posted to user@nutch.apache.org by Dimanshu Parihar <pa...@outlook.com> on 2020/08/10 09:31:05 UTC

Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Hello Sir,
I have been using Nutch 1.17 in local mode and now want to move to deploy mode. For this, I followed the Apache Nutch Hadoop cluster setup guide, but I am stuck at the following point:

Problem:

First copy the files from the nutch build to the deploy directory using something like the following command:

cp -R /path/to/build/* /nutch/search

Then make sure that all of the shell scripts are in unix format and are executable.

dos2unix /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch

chmod 700 /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch

dos2unix /nutch/search/config/*.sh

chmod 700 /nutch/search/config/*.sh

Issue:
I ran the ant command in the nutch folder, which created a runtime folder and a build folder. I copied the build/* files to a search folder that I created inside the nutch folder. But when I run these dos2unix commands, they report that the bin/hadoop and bin/nutch files are not found, which is expected because my build folder does not contain these files.
Could you please clarify how I should follow these steps?
I have only one user, which is not root, and I am setting up all three (Hadoop, Solr and Nutch) under it.

Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Posted by Sebastian Nagel <sn...@apache.org>.
Hi Dimanshu,

Nutch is a community project. If you can, please take the time to be part of the community
and improve the documentation. Unlike the source code, the wiki has a low barrier:
anybody can, and *is welcome* to, register and update the Nutch Wiki. As a 100% volunteer project
we rely on contributions from the community, including our users.

Thanks,
Sebastian

On 9/4/20 9:17 PM, Dimanshu Parihar wrote:
> Thanks Sebastian,
> This helps a lot. I got the point. They should change the documentation. A lot of people get confused because of that.


RE: Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Posted by Dimanshu Parihar <pa...@outlook.com>.
Thanks Sebastian,
This helps a lot. I got the point. They should change the documentation. A lot of people get confused because of that.



Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode

Posted by Sebastian Nagel <wa...@googlemail.com.INVALID>.
Hi,

Nutch does not include a search component anymore. These steps are obsolete.

All you need is to set up your Hadoop cluster, then run
   $NUTCH_HOME/runtime/deploy/bin/nutch ...
(instead of .../runtime/local/bin/nutch ...)
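
As a minimal sketch of the difference between the two modes: the helper below only resolves which bin/nutch script to call under the runtime/local and runtime/deploy layout mentioned above. The function name nutch_script is hypothetical (it is not part of Nutch), and the inject arguments in the usage comment are placeholder paths. As far as I know, the deploy script also expects the hadoop command to be on the PATH.

```shell
# Hypothetical helper (not part of Nutch): print the path of the bin/nutch
# script for the requested mode ("local" or "deploy", default "deploy").
nutch_script() {
  nutch_mode="${2:-deploy}"
  nutch_path="$1/runtime/$nutch_mode/bin/nutch"
  # Fail early if the build did not produce this script.
  if [ ! -x "$nutch_path" ]; then
    echo "not found or not executable: $nutch_path" >&2
    return 1
  fi
  printf '%s\n' "$nutch_path"
}

# Usage (assumes NUTCH_HOME points at the Nutch tree where ant was run;
# crawl/crawldb and urls are placeholder paths):
#   "$(nutch_script "$NUTCH_HOME" deploy)" inject crawl/crawldb urls
```

The helper only checks that the script exists and is executable; it does not verify that a Hadoop cluster is reachable.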

Alternatively, you could launch a Nutch tool, e.g. the Injector,
the following way:

hadoop jar $NUTCH_HOME/runtime/deploy/apache-nutch-1.15-SNAPSHOT.job \
   org.apache.nutch.crawl.Injector ...
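
The name of the .job file contains the Nutch version, so it may differ from the one shown above depending on the build. A hedged sketch for locating it with a glob and printing (not running) the resulting hadoop jar command follows. The function names nutch_job_file and nutch_hadoop_cmd are hypothetical helpers, not part of Nutch or Hadoop, and the Injector arguments in the usage comment are placeholder paths.

```shell
# Hypothetical helper (not part of Nutch): find the .job file that the build
# placed in runtime/deploy. The file name is matched with a glob because it
# contains the Nutch version.
nutch_job_file() {
  for f in "$1"/runtime/deploy/apache-nutch-*.job; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  echo "no apache-nutch-*.job under $1/runtime/deploy" >&2
  return 1
}

# Hypothetical helper: print the hadoop command line for a Nutch tool class
# and its arguments, so it can be inspected before being run.
nutch_hadoop_cmd() {
  job=$(nutch_job_file "$1") || return 1
  shift
  printf 'hadoop jar %s' "$job"
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
}

# Usage (assumes NUTCH_HOME is set; crawl/crawldb and urls are placeholders):
#   nutch_hadoop_cmd "$NUTCH_HOME" org.apache.nutch.crawl.Injector crawl/crawldb urls
```

Printing the command first is a deliberate choice in this sketch: it lets you confirm which .job file was picked up before submitting anything to the cluster.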

Best,
Sebastian

