Posted to dev@nutch.apache.org by "Tony Wang (JIRA)" <ji...@apache.org> on 2009/01/10 04:40:00 UTC

[jira] Commented: (NUTCH-442) Integrate Solr/Nutch

    [ https://issues.apache.org/jira/browse/NUTCH-442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662613#action_12662613 ] 

Tony Wang commented on NUTCH-442:
---------------------------------

Hello everyone!

I am trying to integrate Nutch with Solr by applying the NUTCH-442_v8.patch file, but the patching process has not been very successful. See below:

The text below shown in red is my input on the SSH client window:


I've just downloaded NUTCH-442_v8.patch from https://issues.apache.org/jira/browse/NUTCH-442, but the patching process gave me lots of errors (see below):

webby88 /opt/tomcat6/webapps/nutch: patch < NUTCH-442_v8.patch (Is this the right way to apply patches on Linux CentOS 5.2?)

The next patch would delete the file TestDistributedSearch.java,
which does not exist!  Assume -R? [n]
Apply anyway? [n] y   (I chose yes)
can't find file to patch at input line 5
Perhaps you should have used the -p or --strip option?
The text leading up to this was:
--------------------------
|Index: src/test/org/apache/nutch/searcher/TestDistributedSearch.java
|===================================================================
|--- src/test/org/apache/nutch/searcher/TestDistributedSearch.java      (revision 701044)
|+++ src/test/org/apache/nutch/searcher/TestDistributedSearch.java      (working copy)
--------------------------
File to patch:
Skip this patch? [y] n
File to patch: src/test/org/apache/nutch/searcher/TestDistributedSearch.java (I copied the path from the revision 701044 line to here)
patching file src/test/org/apache/nutch/searcher/TestDistributedSearch.java
can't find file to patch at input line 154
Perhaps you should have used the -p or --strip option?
The text leading up to this was:
--------------------------
|Index: src/test/org/apache/nutch/indexer/TestIndexingFilters.java
|===================================================================
|--- src/test/org/apache/nutch/indexer/TestIndexingFilters.java (revision 701044)
|+++ src/test/org/apache/nutch/indexer/TestIndexingFilters.java (working copy)
--------------------------
File to patch: src/test/org/apache/nutch/indexer/TestIndexingFilters.java (I copied the path from the revision 701044 line to here)

There were too many similar 'can't find file to patch' errors here, so I have cut the rest off.

File to patch:
Skip this patch? [y] y
Skipping patch.
11 out of 11 hunks ignored
patching file build.xml

When I tried to run 'ant war' in the Nutch installation directory, I got this error:

BUILD FAILED
/opt/tomcat6/webapps/nutch/build.xml:107: Compile failed; see the compiler error output for details.

I wonder whether my way of applying this patch is correct. Could you please correct me if I did something wrong? My system is CentOS 5.2, by the way.
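
A likely cause of the "can't find file to patch" prompts in the output above: when patch is run without -p, GNU patch keeps only the base name of each path in the patch file, so it looks for TestDistributedSearch.java in the current directory rather than under src/test/org/apache/nutch/searcher/. The sketch below shows the -p0 form on a throwaway tree. The directory, file, and patch names in it are made up for illustration, and it assumes GNU patch (for --dry-run). It does not guarantee that NUTCH-442_v8.patch applies cleanly, since that patch was made against revision 701044 and the source tree has to match it.

```shell
#!/bin/sh
# Demo in a scratch directory; all names below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# A tiny "source tree" with one file in a subdirectory.
mkdir -p src/demo
printf 'hello\n' > src/demo/Greeting.txt

# A patch whose headers carry the full relative path, in the same
# spirit as the "Index: src/test/..." headers in an SVN-generated patch.
cat > demo.patch <<'EOF'
Index: src/demo/Greeting.txt
===================================================================
--- src/demo/Greeting.txt
+++ src/demo/Greeting.txt
@@ -1 +1 @@
-hello
+hello, world
EOF

# -p0 tells patch to use the paths exactly as written in the patch.
# --dry-run only checks whether the hunks would apply; nothing is changed.
patch -p0 --dry-run < demo.patch

# Apply for real once the dry run is clean.
patch -p0 < demo.patch
cat src/demo/Greeting.txt
```

Under the same assumptions, the equivalent for this issue would be to run `patch -p0 --dry-run < NUTCH-442_v8.patch` from the top of the Nutch source checkout, and to drop --dry-run only once no hunks are reported as failed.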


> Integrate Solr/Nutch
> --------------------
>
>                 Key: NUTCH-442
>                 URL: https://issues.apache.org/jira/browse/NUTCH-442
>             Project: Nutch
>          Issue Type: New Feature
>          Components: indexer, searcher
>         Environment: Ubuntu linux
>            Reporter: rubdabadub
>            Assignee: Doğacan Güney
>             Fix For: 1.0.0
>
>         Attachments: Crawl.patch, Indexer.patch, NUTCH-442_v4.patch, NUTCH-442_v5.patch, NUTCH-442_v6.patch.txt, NUTCH-442_v7.patch.txt, NUTCH-442_v7a.patch.txt, NUTCH-442_v8.patch, NUTCH_442_v3.patch, RFC_multiple_search_backends.patch, schema.xml
>
>
> Hi:
> After trying out Sami's patch regarding Solr/Nutch, which can be found here (http://blog.foofactory.fi/2007/02/online-indexing-integrating-nutch-with.html), I can confirm it worked :-) That led me to request the following:
> I would be very grateful if this could be included in Nutch 0.9, as I am trying to eliminate my Python-based crawler, which posts documents to Solr. As I am in a corporate environment, I can't install the trunk version in production, so I am asking for this to be included in the 0.9 release. I hope my wish will be granted.
> I look forward to getting some feedback.
> Thank you.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (NUTCH-442) Integrate Solr/Nutch

Posted by Doğacan Güney <do...@gmail.com>.
On Sat, Jan 10, 2009 at 8:16 AM, Otis Gospodnetic
<og...@yahoo.com> wrote:
> Tony,
>
> You've sent about 10 emails about this already, both on the Nutch and on the Solr list.
> Please have a bit more patience and wait for the Nutch 1.0 release.  My guess is this Nutch-Solr integration will be in Nutch 1.0.
>

Yes, this one is my fault. NUTCH-442 will be in 1.0; I haven't had time
to test it one last time, so I am the holdup :)

I will test it over the weekend and commit it soon.

>
> Thanks,
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>



-- 
Doğacan Güney

Re: [jira] Commented: (NUTCH-442) Integrate Solr/Nutch

Posted by Otis Gospodnetic <og...@yahoo.com>.
Tony,

You've sent about 10 emails about this already, both on the Nutch and on the Solr list.
Please have a bit more patience and wait for the Nutch 1.0 release.  My guess is this Nutch-Solr integration will be in Nutch 1.0.


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


