Posted to user@nutch.apache.org by Vimal Varghese <vi...@tcs.com> on 2009/01/22 05:50:20 UTC

Re: login failed exception

Hi Pradeep,

I was able to run the nutch 0.9 stable version, but it was crawling only very 
few urls. So I tried the latest nightly build, and now I am getting this 
login failed exception. I am still not able to resolve it... 



Vimal Varghese




Pradeep Pujari <pr...@macys.com>
20-01-09 11:42 PM
To: Vimal Varghese <vi...@tcs.com>
Subject: Re: login failed exception






Hi Vimal,

I am getting exactly the same error. How did you resolve this? I imported
the code into Eclipse, built it, and followed the wiki instructions.

Thanks/Regards
Pradeep Pujari
415-422-1678


 
Vimal Varghese <vimal.varghese@tcs.com>
01/19/2009 02:03 AM
To: nutch-dev@lucene.apache.org
Subject: login failed exception
Please respond to: nutch-dev@lucene.apache.org





Hi,

I have configured the latest nutch from the nightly build in eclipse.

I am getting the following error.

Exception in thread "main" java.io.IOException: Failed to get the current
user's information.
        at org.apache.hadoop.mapred.JobClient.getUGI(JobClient.java:717)
        at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:592)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:774)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1127)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:112)
Caused by: javax.security.auth.login.LoginException: Login failed: Cannot
run program "whoami": CreateProcess error=2, The system cannot find the
file specified
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
        at org.apache.hadoop.mapred.JobClient.getUGI(JobClient.java:715)
        ... 5 more


Is there any way to overcome this?
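For anyone hitting this on a plain Windows box: Hadoop resolves the current user by launching an external `whoami` process (see UnixUserGroupInformation in the stack trace), so the login fails whenever no `whoami` executable is on the PATH. The sketch below is only an illustration of that lookup, not Hadoop's actual code, and the class name is ours:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class WhoamiCheck {

    /**
     * Runs the external `whoami` command and returns its first output line,
     * or null when the executable cannot be found -- the situation that
     * Hadoop reports as "CreateProcess error=2" on stock Windows.
     */
    public static String currentUser() {
        try {
            Process p = new ProcessBuilder("whoami").start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                return r.readLine();
            }
        } catch (IOException e) {
            // Hadoop wraps this IOException in a LoginException:
            // Login failed: Cannot run program "whoami"
            return null;
        }
    }

    public static void main(String[] args) {
        String user = currentUser();
        System.out.println(user == null
                ? "whoami not found on PATH -- Hadoop login would fail here"
                : "whoami resolved user: " + user);
    }
}
```

Running this from the same Eclipse launch configuration tells you immediately whether `whoami` is reachable.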

Vimal Varghese
Tata Consultancy Services
TEJOMAYA, L & T TECH PARK LIMITED
INFOPARK, KUSUMAGIRI POST, KAKKANAD,
Kochi - 682030,
India
Ph:- +91 484 6618791
Cell:- 9446557234
Mailto: vimal.varghese@tcs.com
Website: http://www.tcs.com
____________________________________________
Experience certainty.        IT Services
                       Business Solutions
                       Outsourcing
____________________________________________
=====-----=====-----=====
Notice: The information contained in this e-mail
message and/or attachments to it may contain
confidential or privileged information. If you are
not the intended recipient, any dissemination, use,
review, distribution, printing or copying of the
information contained in this e-mail message
and/or attachments to it are strictly prohibited. If
you have received this communication in error,
please notify us by reply e-mail or telephone and
immediately and permanently delete the message
and any attachments. Thank you









RE: login failed exception

Posted by sa...@thomsonreuters.com.
Ignore this if you have already done it.  But have you built all the plugins in src/plugin using the build.xml in that folder?  If you have done that, and also pointed plugin.folders to the build folder, and it's still not working, then it's quite likely because the plugin.xml files have not been copied to the built plugins.  The Ant build.xml in src/plugin doesn't copy the plugin.xml for each plugin as it builds them.  You can try hand-copying some of them to see if that works.

Most likely not all of the plugins are getting loaded, which throws a RuntimeException that Hadoop happily swallows and notes with a JobStatus.FAILED.  As Bartosz suggests, these are logged as WARNs in hadoop.log.
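A quick way to spot the situation described above is to scan the built plugin folders for a missing plugin.xml. The sketch below is illustrative only; the class name and the default `build/plugins` path are assumptions, so point it at wherever your plugin.folders property actually points:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class PluginXmlCheck {

    /** Returns the names of plugin subfolders that lack a plugin.xml file. */
    public static List<String> missingPluginXml(File buildDir) {
        List<String> missing = new ArrayList<>();
        File[] plugins = buildDir.listFiles(File::isDirectory);
        if (plugins == null) {
            return missing;  // folder does not exist or is not a directory
        }
        for (File plugin : plugins) {
            if (!new File(plugin, "plugin.xml").isFile()) {
                missing.add(plugin.getName());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed default location; pass your real plugin.folders path instead.
        File buildDir = new File(args.length > 0 ? args[0] : "build/plugins");
        System.out.println("Plugins missing plugin.xml: "
                + missingPluginXml(buildDir));
    }
}
```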

Sanjoy

-----Original Message-----
From: Frank McCown [mailto:fmccown@harding.edu] 
Sent: Friday, April 10, 2009 2:29 PM
To: nutch-dev@lucene.apache.org
Subject: Re: login failed exception

Adding cygwin to my PATH solved my problem with whoami.  But now I'm
getting an exception when running the crawler:

Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
	at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
	at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

I know from searching the mailing list that this is normally due to a
bad plugin.folders setting in the nutch-default.xml, but I used the
same value as the tutorial (./src/plugin) to no avail.

(As an aside, it seems like Hadoop should provide a better error message
if the plugin folder doesn't exist.)

Anyway, thanks, Bartosz, for your help.

Frank


2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
> Hello,
>
> So now you have to install cygwin and be sure that you add it to your PATH.
>
> It's in http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> After this you should be able to run the "bash" command from the command
> prompt (Menu Start > Run > cmd.exe).
>
> Then you're done - everything will be working.
>
> I must add it to the wiki; I forgot about the whoami problem.
>
> Take care,
> Bartosz
>
> sanjoy.ghosh@thomsonreuters.com pisze:
>>
>> Thanks for the suggestion, Bartosz.  I downloaded whoami, and it promptly
>> crashed on "bash".
>>
>> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
>> javax.security.auth.login.LoginException: Login failed: Cannot run
>> program "bash": CreateProcess error=2, The system cannot find the file
>> specified
>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
>>        at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
>>        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
>>        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>>
>> Where am I going to find "bash" on Windows without running commandline
>> cygwin?  Is there a way to turn off this security in Hadoop?
>>
>> Thanks,
>> Sanjoy
>>
>> -----Original Message-----
>> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl]
>> Sent: Friday, April 10, 2009 5:06 AM
>> To: nutch-dev@lucene.apache.org
>> Subject: Re: login failed exception
>>
>> Hello,
>>
>> I am not sure if it's the case, but you should try to add whoami to your
>> Windows box.
>>
>> For example, for Windows XP SP2:
>> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>>
>>
>> Thanks,
>> Bartosz
>>
>> Frank McCown pisze:
>>>
>>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>>> successful in running the NutchBean from version 1.0 in Eclipse, but
>>> the crawler gave me the same exception as it gave this individual.
>>> Maybe there's something else I'm overlooking, but I followed the
>>> Tutorial at
>>>
>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>
>>> to a T.  I'll keep working on it though.
>>>
>>> Frank
>>>
>>>
>>> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>>>>
>>>> fmccown pisze:
>>>>>
>>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>>> has the whoami program.  If you run it from Eclipse on Windows, it
>>>>> can't use cygwin's whoami program and will fail with the exceptions
>>>>> you saw.  This is an unfortunate design decision in Hadoop which
>>>>> makes anything after version 0.9 not work in Eclipse on Windows.
>>>>>
>>>> It's not true, please look at
>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>
>>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>>
>>>> Thanks,
>>>> Bartosz
>>>>
>>


Re: login failed exception

Posted by Frank McCown <fm...@harding.edu>.
Bartosz,

That fixed the problem.  Thanks for your help and for updating the wiki.

Frank



Re: login failed exception

Posted by Bartosz Gadzimski <ba...@o2.pl>.
Hello Frank,

Yes, it is a memory issue; you must increase the Java heap size.

Just follow these instructions (another thing to add to the wiki ;)

Eclipse -> Window -> Preferences -> Java -> Installed JREs -> edit -> 
Default VM arguments

I've set mine to -Xms5m -Xmx150m because I have about 200 MB of RAM left 
after running all apps.

-Xms (minimum amount of memory for running applications)
-Xmx (maximum)

It should help.
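To confirm that the Default VM arguments actually reached a run launched from Eclipse, you can print the JVM's maximum heap from any scratch class; this uses only the standard Runtime API, nothing Nutch-specific:

```java
public class HeapInfo {

    /** Maximum heap the JVM will grow to, in megabytes (reflects -Xmx). */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

With -Xmx150m the printed value should be roughly 150 MB; if it shows the JVM default instead, the argument was not picked up.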

Thanks,
Bartosz



Re: login failed exception

Posted by Frank McCown <fm...@harding.edu>.
Hello Bartosz,

I'm running the default Nutch 1.0 version on Windows XP (2 GB RAM)
with Eclipse 3.3.0.  I followed the directions at

http://wiki.apache.org/nutch/RunNutchInEclipse0.9

exactly as stated.  I'm able to run the default Nutch 0.9 release
without any problems in Eclipse.  But when I run 1.0, I always get the
java.io.IOException as stated in my last email.  I had assumed it was
due to the plugin issue, but maybe not.  I'm just running a very small
crawl with two seed URLs.

Here's what hadoop.log says:

2009-04-13 13:41:03,010 INFO  crawl.Crawl - crawl started in: crawl
2009-04-13 13:41:03,025 INFO  crawl.Crawl - rootUrlDir = urls
2009-04-13 13:41:03,025 INFO  crawl.Crawl - threads = 10
2009-04-13 13:41:03,025 INFO  crawl.Crawl - depth = 3
2009-04-13 13:41:03,025 INFO  crawl.Crawl - topN = 5
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: starting
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: urlDir: urls
2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: Converting
injected urls to crawl db entries.
2009-04-13 13:41:03,588 WARN  mapred.JobClient - Use
GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)


I have not tried Sanjoy's advice yet... it looks like this is a memory issue.

Any advice would be much appreciated,
Frank



PDF Parse fails with Nutch 1.0 - BadSecurityHandlerException

Posted by sa...@thomsonreuters.com.
Hello,

I am using the Nutch 1.0 release.  Parsing PDF keeps failing with


2009-04-28 19:16:01,154 INFO  fetcher.Fetcher - fetch of
http://public.xxx.com/life_events/forms/le24_b.pdf failed with:
java.lang.NoClassDefFoundError:
org/pdfbox/pdmodel/encryption/BadSecurityHandlerException

Any idea why this is happening?  I am including PDFBox-0.7.4-dev.jar in
my plugin.xml etc.

Thanks,
Sanjoy
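For anyone hitting the same NoClassDefFoundError: it usually means the PDFBox jar that the parse-pdf plugin was compiled against is not on the runtime classpath. A quick probe is to ask the JVM directly whether it can see the class named in the log. The class name below is taken from the error above; the probe itself is just a diagnostic sketch, not part of Nutch:

```java
// Diagnostic sketch (not part of Nutch): check whether the class named in
// the NoClassDefFoundError is visible on the current runtime classpath.
public class ClasspathProbe {

    // Returns true if the named class can be loaded by the current
    // classloader, false if it is missing from the classpath.
    static boolean isVisible(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class from the fetcher log above.
        System.out.println(isVisible(
            "org.pdfbox.pdmodel.encryption.BadSecurityHandlerException"));
    }
}
```

If this prints false when run with the same classpath as the crawl, the PDFBox jar declared in the plugin's plugin.xml is not actually being loaded.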


RE: login failed exception

Posted by sa...@thomsonreuters.com.
IMHO, missing plugins should be logged as FATAL errors rather than
WARNings.  It pretty much ends the run right there.

Sanjoy

-----Original Message-----
From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] 
Sent: Friday, April 10, 2009 2:54 PM
To: nutch-dev@lucene.apache.org
Subject: Re: login failed exception

Hello Frank,

Please look into hadoop.log; maybe there is something more there.

About your error: you must give us more specifics about your Nutch
configuration.

The default Nutch installation works with no problems (I've never
changed the src/plugin path).

Please tell us: the version of Nutch, any changes you made, and any
configuration differences (other than adding your domain to
crawl-urlfilter).

Thanks,
Bartosz

Frank McCown pisze:
> Adding cygwin to my PATH solved my problem with whoami.  But now I'm
> getting an exception when running the crawler:
>
> Injector: Converting injected urls to crawl db entries.
> Exception in thread "main" java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
> 	at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
> 	at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)
>
> I know from searching the mailing list that this is normally due to a
> bad plugin.folders setting in the nutch-default.xml, but I used the
> same value as the tutorial (./src/plugin) to no avail.
>
> (As an aside, seems like Hadoop should provide a better error message
> if the plugin folder doesn't exist.)
>
> Anyway, thanks, Bartosz, for your help.
>
> Frank
>
>
> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>   
>> Hello,
>>
>> So now you have to install cygwin and be sure that you add it to PATH
>>
>> it's in http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> After this you should be able to run "bash" command from command prompt
>> (Menu Start > RUN > cmd.exe)
>>
>> Then you're done; everything will be working.
>>
>> I must add it to the wiki; I forgot about the whoami problem.
>>
>> Take care,
>> Bartosz
>>
>> sanjoy.ghosh@thomsonreuters.com pisze:
>>     
>>> Thanks for the suggestion Bartosz.  I downloaded whoami, and it
>>> promptly crashed on "bash".
>>>
>>> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
>>> javax.security.auth.login.LoginException: Login failed: Cannot run
>>> program "bash": CreateProcess error=2, The system cannot find the
>>> file specified
>>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
>>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
>>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
>>>        at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
>>>        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
>>>        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
>>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>>>
>>> Where am I going to find "bash" on Windows without running commandline
>>> cygwin?  Is there a way to turn off this security in Hadoop?
>>>
>>> Thanks,
>>> Sanjoy
>>>
>>> -----Original Message-----
>>> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl]
>>> Sent: Friday, April 10, 2009 5:06 AM
>>> To: nutch-dev@lucene.apache.org
>>> Subject: Re: login failed exception
>>>
>>> Hello,
>>>
>>> I am not sure if it's the case but you should try to add whoami to your
>>> windows box.
>>>
>>> for example for windows xp and sp2:
>>> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>>>
>>> Thanks,
>>> Bartosz
>>>
>>> Frank McCown pisze:
>>>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>>>> successful in running the NutchBean from version 1.0 in Eclipse, but
>>>> the crawler gave me the same exception as it gave this individual.
>>>> Maybe there's something else I'm overlooking, but I followed the
>>>> Tutorial at
>>>>
>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>
>>>> to a T.  I'll keep working on it though.
>>>>
>>>> Frank
>>>>
>>>> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>>>>> fmccown pisze:
>>>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>>>> has the whoami program.  If you run it from Eclipse on Windows, it
>>>>>> can't use cygwin's whoami program and will fail with the exceptions
>>>>>> you saw.  This is an unfortunate design decision in Hadoop which
>>>>>> makes anything after version 0.9 not work in Eclipse on Windows.
>>>>>
>>>>> It's not true, please look at
>>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>
>>>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>>>
>>>>> Thanks,
>>>>> Bartosz
>
>   



Re: login failed exception

Posted by Bartosz Gadzimski <ba...@o2.pl>.
Hello Frank,

Please look into hadoop.log; maybe there is something more there.

About your error: you must give us more specifics about your Nutch
configuration.

The default Nutch installation works with no problems (I've never
changed the src/plugin path).

Please tell us: the version of Nutch, any changes you made, and any
configuration differences (other than adding your domain to
crawl-urlfilter).

Thanks,
Bartosz

Frank McCown pisze:
> Adding cygwin to my PATH solved my problem with whoami.  But now I'm
> getting an exception when running the crawler:
>
> Injector: Converting injected urls to crawl db entries.
> Exception in thread "main" java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
> 	at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
> 	at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)
>
> I know from searching the mailing list that this is normally due to a
> bad plugin.folders setting in the nutch-default.xml, but I used the
> same value as the tutorial (./src/plugin) to no avail.
>
> (As an aside, seems like Hadoop should provide a better error message
> if the plugin folder doesn't exist.)
>
> Anyway, thanks, Bartosz, for your help.
>
> Frank
>
>
> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>   
>> Hello,
>>
>> So now you have to install cygwin and be sure that you add it to PATH
>>
>> it's in http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> After this you should be able to run "bash" command from command prompt
>> (Menu Start > RUN > cmd.exe)
>>
>> Then you're done; everything will be working.
>>
>> I must add it to the wiki; I forgot about the whoami problem.
>>
>> Take care,
>> Bartosz
>>
>> sanjoy.ghosh@thomsonreuters.com pisze:
>>     
>>> Thanks for the suggestion Bartosz.  I downloaded whoami, and it promptly
>>> crashed on "bash".
>>>
>>> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
>>> javax.security.auth.login.LoginException: Login failed: Cannot run
>>> program "bash": CreateProcess error=2, The system cannot find the file
>>> specified
>>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
>>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
>>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
>>>        at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
>>>        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
>>>        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
>>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>>>
>>> Where am I going to find "bash" on Windows without running commandline
>>> cygwin?  Is there a way to turn off this security in Hadoop?
>>>
>>> Thanks,
>>> Sanjoy
>>>
>>> -----Original Message-----
>>> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] Sent: Friday, April 10,
>>> 2009 5:06 AM
>>> To: nutch-dev@lucene.apache.org
>>> Subject: Re: login failed exception
>>>
>>> Hello,
>>>
>>> I am not sure if it's the case but you should try to add whoami to your
>>> windows box.
>>>
>>> for example for windows xp and sp2:
>>> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>>>
>>>
>>> Thanks,
>>> Bartosz
>>>
>>> Frank McCown pisze:
>>>
>>>       
>>>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>>>> successful in running the NutchBean from version 1.0 in Eclipse, but
>>>> the crawler gave me the same exception as it gave this individual.
>>>> Maybe there's something else I'm overlooking, but I followed the
>>>> Tutorial at
>>>>
>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>
>>>> to a T.  I'll keep working on it though.
>>>>
>>>> Frank
>>>>
>>>>
>>>> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>>>>
>>>>         
>>>>> fmccown pisze:
>>>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>>>> has the whoami program.  If you run it from Eclipse on Windows, it can't use
>>>>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>>>>> is an unfortunate design decision in Hadoop which makes anything after
>>>>>> version 0.9 not work in Eclipse on Windows.
>>>>>
>>>>> It's not true, please look at
>>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>
>>>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>>>
>>>>> Thanks,
>>>>> Bartosz
>
>   


Re: login failed exception

Posted by Frank McCown <fm...@harding.edu>.
Adding cygwin to my PATH solved my problem with whoami.  But now I'm
getting an exception when running the crawler:

Injector: Converting injected urls to crawl db entries.
Exception in thread "main" java.io.IOException: Job failed!
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
	at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
	at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

I know from searching the mailing list that this is normally due to a
bad plugin.folders setting in the nutch-default.xml, but I used the
same value as the tutorial (./src/plugin) to no avail.

(As an aside, seems like Hadoop should provide a better error message
if the plugin folder doesn't exist.)

Anyway, thanks, Bartosz, for your help.

Frank
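Frank's diagnosis above (a plugin.folders value that silently fails to resolve) can be checked before launching a crawl. The sketch below is hypothetical, not part of Nutch or Hadoop: it just resolves the configured folder against the working directory and reports whether it exists, which is the error message the thread wishes Hadoop produced.

```java
import java.io.File;

// Hypothetical pre-flight check (not part of Nutch): verify that the
// plugin.folders setting resolves to a real directory before starting
// a crawl, so a bad path fails fast with a clear message.
public class PluginFolderCheck {

    static String check(String workingDir, String pluginFolders) {
        File f = new File(pluginFolders);
        if (!f.isAbsolute()) {
            // Relative values like "./src/plugin" resolve against the
            // directory the JVM was started from. In Eclipse that is the
            // project root, which is why the tutorial value works there
            // but can break when launched from elsewhere.
            f = new File(workingDir, pluginFolders);
        }
        return f.isDirectory() ? "OK: " + f.getPath()
                               : "MISSING: " + f.getPath();
    }

    public static void main(String[] args) {
        // The value from nutch-default.xml in the tutorial.
        System.out.println(check(System.getProperty("user.dir"), "./src/plugin"));
    }
}
```

Printing "MISSING" here corresponds to the opaque "Job failed!" Frank saw.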


2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
> Hello,
>
> So now you have to install cygwin and be sure that you add it to PATH
>
> it's in http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> After this you should be able to run "bash" command from command prompt
> (Menu Start > RUN > cmd.exe)
>
> Then you're done; everything will be working.
>
> I must add it to the wiki; I forgot about the whoami problem.
>
> Take care,
> Bartosz
>
> sanjoy.ghosh@thomsonreuters.com pisze:
>>
>> Thanks for the suggestion Bartosz.  I downloaded whoami, and it promptly
>> crashed on "bash".
>>
>> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
>> javax.security.auth.login.LoginException: Login failed: Cannot run
>> program "bash": CreateProcess error=2, The system cannot find the file
>> specified
>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
>>        at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
>>        at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
>>        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
>>        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>>        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>>
>> Where am I going to find "bash" on Windows without running commandline
>> cygwin?  Is there a way to turn off this security in Hadoop?
>>
>> Thanks,
>> Sanjoy
>>
>> -----Original Message-----
>> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] Sent: Friday, April 10,
>> 2009 5:06 AM
>> To: nutch-dev@lucene.apache.org
>> Subject: Re: login failed exception
>>
>> Hello,
>>
>> I am not sure if it's the case but you should try to add whoami to your
>> windows box.
>>
>> for example for windows xp and sp2:
>> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>>
>>
>> Thanks,
>> Bartosz
>>
>> Frank McCown pisze:
>>
>>>
>>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>>> successful in running the NutchBean from version 1.0 in Eclipse, but
>>> the crawler gave me the same exception as it gave this individual.
>>> Maybe there's something else I'm overlooking, but I followed the
>>> Tutorial at
>>>
>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>
>>> to a T.  I'll keep working on it though.
>>>
>>> Frank
>>>
>>>
>>> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>>>
>>>>
>>>> fmccown pisze:
>>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>>> has the whoami program.  If you run it from Eclipse on Windows, it can't use
>>>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>>>> is an unfortunate design decision in Hadoop which makes anything after
>>>>> version 0.9 not work in Eclipse on Windows.
>>>>
>>>> It's not true, please look at
>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>
>>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>>
>>>> Thanks,
>>>> Bartosz
>>
>>
>>
>>
>

RE: login failed exception

Posted by sa...@thomsonreuters.com.
Thanks Bartosz.  That worked like a charm.  

Yes please add it to the Wiki.  Millions will stumble on it otherwise.

Sanjoy

-----Original Message-----
From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] 
Sent: Friday, April 10, 2009 12:34 PM
To: nutch-dev@lucene.apache.org
Subject: Re: login failed exception

Hello,

So now you have to install cygwin and be sure that you add it to PATH

it's in http://wiki.apache.org/nutch/RunNutchInEclipse0.9

After this you should be able to run "bash" command from command prompt 
(Menu Start > RUN > cmd.exe)

Then you're done; everything will be working.

I must add it to the wiki; I forgot about the whoami problem.

Take care,
Bartosz

sanjoy.ghosh@thomsonreuters.com pisze:
> Thanks for the suggestion Bartosz.  I downloaded whoami, and it
> promptly crashed on "bash".
>
> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
> javax.security.auth.login.LoginException: Login failed: Cannot run
> program "bash": CreateProcess error=2, The system cannot find the file
> specified
> 	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
> 	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
> 	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
> 	at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
> 	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
> 	at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>
> Where am I going to find "bash" on Windows without running commandline
> cygwin?  Is there a way to turn off this security in Hadoop?
>
> Thanks,
> Sanjoy
>
> -----Original Message-----
> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] 
> Sent: Friday, April 10, 2009 5:06 AM
> To: nutch-dev@lucene.apache.org
> Subject: Re: login failed exception
>
> Hello,
>
> I am not sure if it's the case but you should try to add whoami to your
> windows box.
>
> for example for windows xp and sp2:
> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>
>
> Thanks,
> Bartosz
>
> Frank McCown pisze:
>   
>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>> successful in running the NutchBean from version 1.0 in Eclipse, but
>> the crawler gave me the same exception as it gave this individual.
>> Maybe there's something else I'm overlooking, but I followed the
>> Tutorial at
>>
>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> to a T.  I'll keep working on it though.
>>
>> Frank
>>
>>
>> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>>   
>>     
>>> fmccown pisze:
>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>> has the whoami program.  If you run it from Eclipse on Windows, it can't use
>>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>>> is an unfortunate design decision in Hadoop which makes anything after
>>>> version 0.9 not work in Eclipse on Windows.
>>>
>>> It's not true, please look at
>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>
>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>
>>> Thanks,
>>> Bartosz
>>
>>   
>>     
>
>
>
>   



Re: login failed exception

Posted by Bartosz Gadzimski <ba...@o2.pl>.
Hello,

So now you have to install cygwin and be sure that you add it to PATH

it's in http://wiki.apache.org/nutch/RunNutchInEclipse0.9

After this you should be able to run "bash" command from command prompt 
(Menu Start > RUN > cmd.exe)

Then you're done; everything will be working.

I must add it to the wiki; I forgot about the whoami problem.

Take care,
Bartosz

sanjoy.ghosh@thomsonreuters.com pisze:
> Thanks for the suggestion Bartosz.  I downloaded whoami, and it promptly
> crashed on "bash".
>
> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
> javax.security.auth.login.LoginException: Login failed: Cannot run
> program "bash": CreateProcess error=2, The system cannot find the file
> specified
> 	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
> 	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
> 	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
> 	at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
> 	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
> 	at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>
> Where am I going to find "bash" on Windows without running commandline
> cygwin?  Is there a way to turn off this security in Hadoop?
>
> Thanks,
> Sanjoy
>
> -----Original Message-----
> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] 
> Sent: Friday, April 10, 2009 5:06 AM
> To: nutch-dev@lucene.apache.org
> Subject: Re: login failed exception
>
> Hello,
>
> I am not sure if it's the case but you should try to add whoami to your 
> windows box.
>
> for example for windows xp and sp2:
> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
>
>
> Thanks,
> Bartosz
>
> Frank McCown pisze:
>   
>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>> successful in running the NutchBean from version 1.0 in Eclipse, but
>> the crawler gave me the same exception as it gave this individual.
>> Maybe there's something else I'm overlooking, but I followed the
>> Tutorial at
>>
>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> to a T.  I'll keep working on it though.
>>
>> Frank
>>
>>
>> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>>   
>>     
>>> fmccown pisze:
>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>> has the whoami program.  If you run it from Eclipse on Windows, it can't use
>>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>>> is an unfortunate design decision in Hadoop which makes anything after
>>>> version 0.9 not work in Eclipse on Windows.
>>>
>>> It's not true, please look at
>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>
>>> I am using nutch 1.0 with eclipse on windows with no problems.
>>>
>>> Thanks,
>>> Bartosz
>>
>>   
>>     
>
>
>
>   


RE: login failed exception

Posted by sa...@thomsonreuters.com.
Thanks for the suggestion Bartosz.  I downloaded whoami, and it promptly
crashed on "bash".

09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
javax.security.auth.login.LoginException: Login failed: Cannot run
program "bash": CreateProcess error=2, The system cannot find the file
specified
	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
	at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
	at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
	at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
	at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)

Where am I going to find "bash" on Windows without running commandline
cygwin?  Is there a way to turn off this security in Hadoop?

Thanks,
Sanjoy
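On the question of turning this security off: if memory serves, 0.19-era Hadoop checks the hadoop.job.ugi configuration property before shelling out to whoami, so supplying the user and group explicitly should avoid the exec entirely. The property name and behaviour are recalled from that Hadoop line and not verified against this exact build, and the user/group values below are placeholders; treat this as a sketch to verify, not a confirmed fix:

```xml
<!-- hadoop-site.xml (or nutch-site.xml, which Nutch merges in):
     supply the user/group list directly so Hadoop never runs whoami.
     "nutchuser" and "nutchgroup" are placeholder values. -->
<property>
  <name>hadoop.job.ugi</name>
  <value>nutchuser,nutchgroup</value>
  <description>Comma-separated user name and group(s) to use instead of
  asking the OS via whoami.</description>
</property>
```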

-----Original Message-----
From: Bartosz Gadzimski [mailto:bartek--g@o2.pl] 
Sent: Friday, April 10, 2009 5:06 AM
To: nutch-dev@lucene.apache.org
Subject: Re: login failed exception

Hello,

I am not sure if it's the case but you should try to add whoami to your 
windows box.

for example for windows xp and sp2:
http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en


Thanks,
Bartosz

Frank McCown pisze:
> I've been running 0.9 in Eclipse on Windows for some time, and I was
> successful in running the NutchBean from version 1.0 in Eclipse, but
> the crawler gave me the same exception as it gave this individual.
> Maybe there's something else I'm overlooking, but I followed the
> Tutorial at
>
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> to a T.  I'll keep working on it though.
>
> Frank
>
>
> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>   
>> fmccown pisze:
>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>> has the whoami program.  If you run it from Eclipse on Windows, it can't use
>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>> is an unfortunate design decision in Hadoop which makes anything after
>>> version 0.9 not work in Eclipse on Windows.
>> It's not true, please look at
>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> I am using nutch 1.0 with eclipse on windows with no problems.
>>
>> Thanks,
>> Bartosz
>>
>>     
>
>
>
>   



Re: login failed exception

Posted by Bartosz Gadzimski <ba...@o2.pl>.
Hello,

I am not sure if it's the case but you should try to add whoami to your 
windows box.

for example for windows xp and sp2:
http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en


Thanks,
Bartosz

Frank McCown pisze:
> I've been running 0.9 in Eclipse on Windows for some time, and I was
> successful in running the NutchBean from version 1.0 in Eclipse, but
> the crawler gave me the same exception as it gave this individual.
> Maybe there's something else I'm overlooking, but I followed the
> Tutorial at
>
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> to a T.  I'll keep working on it though.
>
> Frank
>
>
> 2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
>   
>> fmccown pisze:
>>     
>>> You must run Nutch's crawler using cygwin on Windows since cygwin has the
>>> whoami program.  If you run it from Eclipse on Windows, it can't use
>>> cygwin's whoami program and will fail with the exceptions you saw.  This
>>> is
>>> an unfortunate design decision in Hadoop which makes anything after
>>> version 0.9 not work in Eclipse on Windows.
>>>
>>>
>>>       
>> It's not true, please look at
>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>
>> I am using nutch 1.0 with eclipse on windows with no problems.
>>
>> Thanks,
>> Bartosz
>>
>>     
>
>
>
>   


Re: login failed exception

Posted by Frank McCown <fm...@harding.edu>.
I've been running 0.9 in Eclipse on Windows for some time, and I was
successful in running the NutchBean from version 1.0 in Eclipse, but
the crawler gave me the same exception as it gave this individual.
Maybe there's something else I'm overlooking, but I followed the
Tutorial at

http://wiki.apache.org/nutch/RunNutchInEclipse0.9

to a T.  I'll keep working on it though.

Frank


2009/4/10 Bartosz Gadzimski <ba...@o2.pl>:
> fmccown pisze:
>>
>> You must run Nutch's crawler using cygwin on Windows since cygwin has the
>> whoami program.  If you run it from Eclipse on Windows, it can't use
>> cygwin's whoami program and will fail with the exceptions you saw.  This
>> is
>> an unfortunate design decision in Hadoop which makes anything after
>> version 0.9 not work in Eclipse on Windows.
>>
>>
>
> It's not true, please look at
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> I am using nutch 1.0 with eclipse on windows with no problems.
>
> Thanks,
> Bartosz
>



-- 
Frank McCown, Ph.D.
Assistant Professor of Computer Science
Harding University
http://www.harding.edu/fmccown/

Re: login failed exception

Posted by Bartosz Gadzimski <ba...@o2.pl>.
fmccown pisze:
> You must run Nutch's crawler using cygwin on Windows since cygwin has the
> whoami program.  If you run it from Eclipse on Windows, it can't use
> cygwin's whoami program and will fail with the exceptions you saw.  This is
> an unfortunate design decision in Hadoop which makes anything after
> version 0.9 not work in Eclipse on Windows.
>
>   
It's not true, please look at 
http://wiki.apache.org/nutch/RunNutchInEclipse0.9

I am using nutch 1.0 with eclipse on windows with no problems.

Thanks,
Bartosz

Re: login failed exception

Posted by fmccown <fm...@harding.edu>.
You must run Nutch's crawler using cygwin on Windows since cygwin has the
whoami program.  If you run it from Eclipse on Windows, it can't use
cygwin's whoami program and will fail with the exceptions you saw.  This is
an unfortunate design decision in Hadoop which makes anything after
version 0.9 not work in Eclipse on Windows.

-- 
View this message in context: http://www.nabble.com/login-failed-exception-tp21539952p22979522.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.
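The mechanism fmccown describes can be sketched in a few lines. This is not Hadoop's actual code: 0.19-era Hadoop determines the current user by executing the external whoami program, and when that binary is not on PATH (a bare Windows install without cygwin), the exec call itself throws IOException, which Hadoop surfaces as the LoginException seen earlier in this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Sketch of Hadoop's user lookup: shell out to "whoami" and read one line.
// If the binary cannot be started, exec throws IOException -- on Windows
// this is the "CreateProcess error=2" wrapped in the LoginException.
public class WhoAmI {

    // Runs an external command and returns its first line of output,
    // or null if the command could not be started at all.
    static String firstLineOf(String command) {
        try {
            Process p = Runtime.getRuntime().exec(new String[] { command });
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                return r.readLine();
            }
        } catch (IOException e) {
            // The failure mode from the stack traces in this thread.
            return null;
        }
    }

    public static void main(String[] args) {
        String user = firstLineOf("whoami");
        System.out.println(user == null ? "whoami not found on PATH" : user);
    }
}
```

Adding cygwin's bin directory to PATH makes whoami resolvable, which is why Frank's fix earlier in the thread works.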