Posted to user@nutch.apache.org by Dimanshu Parihar <pa...@outlook.com> on 2020/08/10 09:31:05 UTC
Regarding Nutch Hadoop Cluster Setup in Deploy Mode
Hello Sir,
I have been using Nutch 1.17 in local mode and now want to move from local mode to deploy mode. For this, I tried following the Apache Nutch Hadoop cluster setup link, but I am stuck at the point given below:
Problem:
First copy the files from the nutch build to the deploy directory using something like the following command:
cp -R /path/to/build/* /nutch/search
Then make sure that all of the shell scripts are in unix format and are executable.
dos2unix /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch
chmod 700 /nutch/search/bin/*.sh /nutch/search/bin/hadoop /nutch/search/bin/nutch
dos2unix /nutch/search/config/*.sh
chmod 700 /nutch/search/config/*.sh
Issue:
I ran the ant command in the nutch folder, which created a runtime folder and a build folder. I copied the build/* files to a search folder that I created inside the nutch folder itself. But when I run these dos2unix commands, they report that bin/hadoop and bin/nutch are not found, which is expected because my build folder does not contain these files.
So could you please clarify how I can follow these steps?
I have only one user, which is not the root user, and I am setting up all three (Hadoop, Solr and Nutch) under it.
Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode
Posted by Sebastian Nagel <sn...@apache.org>.
Hi Dimanshu,
Nutch is a community project. If you can, please take the time to be part of the community
and improve the documentation. Unlike the source code, the wiki has a low barrier to entry:
anybody can and *is welcome* to register and update the Nutch Wiki. As a 100% volunteer project,
we rely on contributions from the community, including our users.
Thanks,
Sebastian
On 9/4/20 9:17 PM, Dimanshu Parihar wrote:
> Thanks Sebastian,
> This helps a lot. I got the point. They should change the documentation. A lot of people get confused because of that.
>
> From: Sebastian Nagel<ma...@googlemail.com.INVALID>
> Sent: Tuesday, August 11, 2020 4:56 PM
> To: user@nutch.apache.org<ma...@nutch.apache.org>
> Subject: Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode
RE: Regarding Nutch Hadoop Cluster Setup in Deploy Mode
Posted by Dimanshu Parihar <pa...@outlook.com>.
Thanks Sebastian,
This helps a lot. I got the point. They should change the documentation. A lot of people get confused because of that.
Re: Regarding Nutch Hadoop Cluster Setup in Deploy Mode
Posted by Sebastian Nagel <wa...@googlemail.com.INVALID>.
Hi,
Nutch does not include a search component anymore. These steps are obsolete.
All you need is to set up your Hadoop cluster, then run
$NUTCH_HOME/runtime/deploy/bin/nutch ...
(instead of .../runtime/local/bin/nutch ...)
Alternatively, you could launch a Nutch tool, e.g. Injector,
the following way:
hadoop jar $NUTCH_HOME/runtime/deploy/apache-nutch-1.15-SNAPSHOT.job \
org.apache.nutch.crawl.Injector ...
Best,
Sebastian
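The two launch options described in the reply above can be sketched as a pair of small shell helpers. This is only an illustration under assumptions: the helper names nutch_launcher and nutch_job_file are made up for this sketch, NUTCH_HOME is assumed to point at the directory where the ant build created runtime/, and the job file is located by a glob because its exact name depends on the Nutch version that was built.

```shell
#!/usr/bin/env bash
# Sketch only: resolve the launcher script and the job file under $NUTCH_HOME/runtime.
# The function names are hypothetical; NUTCH_HOME must be set by the caller.

# Print the path of bin/nutch for a given mode ("local" or "deploy").
nutch_launcher() {
  printf '%s\n' "${NUTCH_HOME:?set NUTCH_HOME first}/runtime/$1/bin/nutch"
}

# Print the first apache-nutch-*.job file under runtime/deploy.
# Returns non-zero if no such file exists (for example, if the build was not run).
nutch_job_file() {
  local f
  for f in "${NUTCH_HOME:?set NUTCH_HOME first}"/runtime/deploy/apache-nutch-*.job; do
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  echo "no .job file found under $NUTCH_HOME/runtime/deploy" >&2
  return 1
}

# Example use, left commented out so that sourcing this file has no side effects:
#   "$(nutch_launcher deploy)" ...
#   hadoop jar "$(nutch_job_file)" org.apache.nutch.crawl.Injector ...
```

Globbing for the job file avoids hard-coding a version string such as 1.15-SNAPSHOT, which may not match a local 1.17 build. Nothing from build/ needs to be copied elsewhere for either option.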