Posted to dev@nutch.apache.org by Mohammad Al-Mohsin <me...@mem9.net> on 2015/02/16 10:57:00 UTC

Nutch-Selenium Error

Hi,

I'm trying to use the Nutch-Selenium plugin with Nutch 1.10 trunk on
Mac OS X Yosemite.

I applied the patch from NUTCH-1933
<https://issues.apache.org/jira/browse/NUTCH-1933>, installed X11, and
included the protocol-selenium plugin in the nutch-site.xml config file.

Now when I start crawling, at the first fetch, Firefox opens and closes
immediately, and I get this error in the console:

fetch of http://www.mywebsite.com failed with:
java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.conn.ManagedHttpClientConnectionFactory

Any idea how to fix this error?

Best regards,
Mohammad Al-Mohsin

Re: Nutch-Selenium Error

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Mohammad, did you get this fixed?

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Re: Nutch-Selenium Error

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Good to hear!



Re: Nutch-Selenium Error

Posted by Mohammad Al-Mohsin <me...@mem9.net>.
FYI, the issue was resolved by deleting the 'runtime' directory and then
recompiling Nutch.

cd nutch/trunk
rm -r runtime
ant runtime


Best regards,
Mohammad Al-Mohsin
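
The thread does not say why a clean rebuild helped. One plausible explanation, offered only as an assumption, is that an older build left a different version of the Apache HttpComponents jars under 'runtime', and the Selenium client picked up a mismatched pair. A minimal sketch for checking this before deleting the directory follows; list_http_jars is a hypothetical helper, not part of Nutch, and the default directory name 'runtime' is taken from the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): list every httpclient/httpcore jar
# under a directory so that mixed versions left by an older build stand out.
list_http_jars() {
  dir="${1:-runtime}"   # default matches the 'runtime' directory in this thread
  find "$dir" -type f \
    \( -name 'httpclient-*.jar' -o -name 'httpcore-*.jar' \) | sort
}

# Example (run from the Nutch source root):
#   list_http_jars runtime
```

If the listing shows more than one version of the same jar on paths that end up on the same classpath, that would be consistent with the assumption above; if it does not, the cause lies elsewhere.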


Re: Nutch-Selenium Error

Posted by Mohammad Al-Mohsin <me...@mem9.net>.
Here is the error stack:

2015-02-16 01:32:29,699 ERROR selenium.Http - Failed to get protocol output
java.lang.NoClassDefFoundError: Could not initialize class org.apache.http.impl.conn.ManagedHttpClientConnectionFactory
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$InternalConnectionFactory.<init>(PoolingHttpClientConnectionManager.java:493)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:149)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:138)
        at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.<init>(PoolingHttpClientConnectionManager.java:114)
        at org.openqa.selenium.remote.internal.HttpClientFactory.getClientConnectionManager(HttpClientFactory.java:68)
        at org.openqa.selenium.remote.internal.HttpClientFactory.<init>(HttpClientFactory.java:54)
        at org.openqa.selenium.remote.HttpCommandExecutor.<init>(HttpCommandExecutor.java:98)
        at org.openqa.selenium.remote.HttpCommandExecutor.<init>(HttpCommandExecutor.java:81)
        at org.openqa.selenium.firefox.internal.NewProfileExtensionConnection.start(NewProfileExtensionConnection.java:93)
        at org.openqa.selenium.firefox.FirefoxDriver.startClient(FirefoxDriver.java:246)
        at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:114)
        at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:191)
        at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:186)
        at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:182)
        at org.openqa.selenium.firefox.FirefoxDriver.<init>(FirefoxDriver.java:95)
        at org.apache.nutch.protocol.selenium.HttpWebClient.getHtmlPage(HttpWebClient.java:53)
        at org.apache.nutch.protocol.selenium.HttpResponse.readPlainContent(HttpResponse.java:199)
        at org.apache.nutch.protocol.selenium.HttpResponse.<init>(HttpResponse.java:161)
        at org.apache.nutch.protocol.selenium.Http.getResponse(Http.java:56)
        at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:206)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:758)

Best regards,
Mohammad Al-Mohsin
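
A note on reading this trace: the JVM reports "Could not initialize class" when an earlier attempt to run that class's static initializer has already failed, so the root cause is normally recorded at the first failure, not in the trace above. A rough sketch for pulling that first mention out of a log file follows. first_mention is a hypothetical helper, and the log path in the usage comment (logs/hadoop.log) is an assumption about a typical local Nutch setup, so adjust it as needed.

```shell
#!/bin/sh
# Hypothetical helper: print the first line in a log that matches a pattern,
# plus a few lines of context before and after it.
#   $1 = log file, $2 = pattern (basic regex), $3 = context lines (default 10)
first_mention() {
  log="$1"; pattern="$2"; ctx="${3:-10}"
  # Line number of the first match; empty if the pattern never occurs.
  line=$(grep -n -- "$pattern" "$log" | head -n 1 | cut -d: -f1)
  [ -n "$line" ] || return 1
  start=$((line - ctx))
  if [ "$start" -lt 1 ]; then start=1; fi
  sed -n "${start},$((line + ctx))p" "$log"
}

# Example (log path is an assumption):
#   first_mention logs/hadoop.log 'ManagedHttpClientConnectionFactory' 15
```

The helper returns a non-zero status when the pattern is not found, which makes it easy to tell "no earlier failure logged" apart from an empty context window.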
