Posted to user@nutch.apache.org by Cement Xianyu <ce...@gmail.com> on 2006/04/30 17:57:25 UTC
Startscript in windows
Hi,
Because I want to use Nutch on my ThinkPad under Windows,
I read the original start script and rewrote it as a Windows batch
script.
There are two files.
The first file only makes sure that delayed environment variable expansion is enabled.
The files can also be found on my blog: http://dwangel.3322.org/2006/04/29/44/
(in Chinese :P)
I hope these will be helpful. Maybe they can be included in the package.
The contents of the two files follow.
nutch.bat
@cmd /V:on /c %~dp0nutch1.bat %*
nutch1.bat
@echo on
rem *****************************************************************
rem * A script to launch nutch on Windows 2000/XP System.
rem *
rem * Written by Cement Xianyu
rem * ( cement.xianyu@gmail.com blog: http://dwangel.3322.org)
rem *
rem * Because delayed environment variable expansion is used,
rem * cmd /V:on must be used to run this script.
rem *****************************************************************
if "%OS%"=="Windows_NT" @setlocal
if "%OS%"=="WINNT" @setlocal
if "%1" == "" goto :msg
goto :begin
:msg
echo Usage: nutch COMMAND
echo where COMMAND is one of:
echo   crawl             one-step crawler for intranets
echo   readdb            read / dump crawl db
echo   readlinkdb        read / dump link db
echo   inject            inject new urls into the database
echo   generate          generate new segments to fetch
echo   fetch             fetch a segment's pages
echo   parse             parse a segment's pages
echo   segread           read / dump segment data
echo   updatedb          update crawl db from segments after fetching
echo   invertlinks       create a linkdb from parsed segments
echo   index             run the indexer on parsed segments and linkdb
echo   merge             merge several segment indexes
echo   dedup             remove duplicates from a set of segment indexes
echo   plugin            load a plugin and run one of its classes main()
echo   server            run a search server
echo  or
echo   CLASSNAME         run the class named CLASSNAME
echo Most commands print help when invoked w/o parameters.
pause
goto :end
:begin
rem %~dp0 is expanded pathname of the current script under NT
set DEFAULT_NUTCH_HOME=%~dp0..
rem set DEFAULT_NUTCH_HOME=..
if "%NUTCH_HOME%"=="" set NUTCH_HOME=%DEFAULT_NUTCH_HOME%
rem clear the helper variable (set VAR="" would store two literal quotes)
set DEFAULT_NUTCH_HOME=
echo %NUTCH_HOME%
rem set _USE_CLASSPATH=yes
if "%CLASSPATH%"=="" ( set CLASSPATH=%JAVA_HOME%\lib\tools.jar) ELSE set CLASSPATH=%CLASSPATH%;%JAVA_HOME%\lib\tools.jar
set CLASSPATH=%CLASSPATH%;%NUTCH_HOME%\conf;
echo %CLASSPATH%
echo before other
rem for developers, add plugins, job & test code to CLASSPATH
if exist %NUTCH_HOME%\build\plugins set CLASSPATH=%CLASSPATH%;%NUTCH_HOME%\build
for /R %NUTCH_HOME%\build %%i in (nutch*.job) do set CLASSPATH=!CLASSPATH!;%%i
if exist %NUTCH_HOME%\build\test\classes set CLASSPATH=%CLASSPATH%;%NUTCH_HOME%\build\test\classes
rem for releases, add Nutch job to CLASSPATH
for /R %NUTCH_HOME% %%i in (nutch*.job) do set CLASSPATH=!CLASSPATH!;%%i
rem add plugins to classpath
if exist %NUTCH_HOME%\plugins set CLASSPATH=%CLASSPATH%;%NUTCH_HOME%
rem add libs to CLASSPATH
for /R %NUTCH_HOME%\lib %%f in (*.jar) do set CLASSPATH=!CLASSPATH!;%%f
echo %CLASSPATH%
rem translate command
if "%1"=="crawl" set CLASS=org.apache.nutch.crawl.Crawl
if "%1"=="inject" set CLASS=org.apache.nutch.crawl.Injector
if "%1"=="generate" set CLASS=org.apache.nutch.crawl.Generator
if "%1"=="fetch" set CLASS=org.apache.nutch.fetcher.Fetcher
if "%1"=="parse" set CLASS=org.apache.nutch.parse.ParseSegment
if "%1"=="readdb" set CLASS=org.apache.nutch.crawl.CrawlDbReader
if "%1"=="readlinkdb" set CLASS=org.apache.nutch.crawl.LinkDbReader
if "%1"=="segread" set CLASS=org.apache.nutch.segment.SegmentReader
if "%1"=="updatedb" set CLASS=org.apache.nutch.crawl.CrawlDb
if "%1"=="invertlinks" set CLASS=org.apache.nutch.crawl.LinkDb
if "%1"=="index" set CLASS=org.apache.nutch.indexer.Indexer
if "%1"=="dedup" set CLASS=org.apache.nutch.indexer.DeleteDuplicates
if "%1"=="merge" set CLASS=org.apache.nutch.indexer.IndexMerger
if "%1"=="plugin" set CLASS=org.apache.nutch.plugin.PluginRepository
rem no quotes around the class name: cmd treats single quotes as literal characters
if "%1"=="server" set CLASS=org.apache.nutch.searcher.DistributedSearch$Server
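if "%CLASS%"=="" set CLASS=%1
rem %* still contains the command name (shift does not change %*),
rem so collect the remaining arguments by hand before calling java
set ARGS=
:args
shift
if "%~1"=="" goto :run
set ARGS=%ARGS% %1
goto :args
:run
rem quote the java path and the classpath so directories with spaces work
"%JAVA_HOME%\bin\java" -cp "%CLASSPATH%" %CLASS% %ARGS%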
if "%OS%"=="Windows_NT" @endlocal
if "%OS%"=="WINNT" @endlocal
:end
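A note on why the script needs delayed expansion: the for loops above append to CLASSPATH with !CLASSPATH! because %CLASSPATH% in a for body is expanded only once, when the line is parsed, not on every iteration. The small standalone sketch below shows the difference. The file name demo.bat and the variables LIST and LIST2 are made up for illustration and are not part of the posted scripts. It assumes Windows 2000 or later, and it turns delayed expansion on with setlocal EnableDelayedExpansion instead of cmd /V:on.

```bat
@echo off
rem demo.bat - hypothetical example, not part of nutch.bat / nutch1.bat
setlocal EnableDelayedExpansion

rem !LIST! is read again on every iteration, so each item is appended
set LIST=
for %%i in (a b c) do set LIST=!LIST!;%%i
echo delayed:   !LIST!

rem %LIST2% is expanded once, while LIST2 is still empty,
rem so every iteration overwrites the value and only the last item survives
set LIST2=
for %%i in (a b c) do set LIST2=%LIST2%;%%i
echo immediate: %LIST2%

endlocal
```

The first echo should print ";a;b;c" and the second only ";c". If the same setlocal EnableDelayedExpansion line were put at the top of nutch1.bat, the nutch.bat wrapper would probably not be needed, but I have not tested that with Nutch.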
Re: Startscript in windows
Posted by Nutch Newbie <nu...@gmail.com>.
AJ,
Did you update the script to reflect the changes in 0.8? If not, I can
update it. However, I am getting a class-not-found error when I try to
run nutch crawl or nutch inject, even though I pointed the script at the
current classes in 0.8. Any suggestions?
Thanks
On 4/30/06, ArentJan Banck <aj...@planet.nl> wrote:
> I also wrote a Windows batch file, and created a Jira case for this, see
> http://issues.apache.org/jira/browse/NUTCH-82
>
> -Arent-Jan
Re: Startscript in windows
Posted by ArentJan Banck <aj...@planet.nl>.
I also wrote a Windows batch file, and created a Jira case for this, see
http://issues.apache.org/jira/browse/NUTCH-82
-Arent-Jan