Posted to solr-user@lucene.apache.org by Jean-Philip EIMECKE <jp...@gmail.com> on 2009/01/07 12:46:18 UTC

Clustering Carrot2 + Solr

Hi!

I want to set up clustering, but I don't know how to go about it.
I have seen that the SOLR-769 patch is recommended, but I don't know what
to do with clustering-libs.tar and SOLR-769.patch.
Can you explain the procedure for running clustering with Solr and Carrot2?

Thank you in advance

-- 
Jean-Philip Eimecke

Re: Clustering Carrot2 + Solr

Posted by Grant Ingersoll <gs...@apache.org>.
No problem, please add any and all comments onto the JIRA issue.   
Especially your take on the formats, etc.  I will probably get to  
committing by the end of the month.

Also, do you have any interest in other clustering algorithms?  I have
it in mind to let Mahout do "offline" clustering of the whole
collection, but haven't worked through the details of that just yet.

-Grant

On Jan 13, 2009, at 9:46 AM, Jean-Philip EIMECKE wrote:

> Thank you so much Grant
>
> Cheers
>
> -- 
> Jean-Philip Eimecke
> jpeimecke@gmail.com



Re: Clustering Carrot2 + Solr

Posted by Jean-Philip EIMECKE <jp...@gmail.com>.
Thank you so much Grant

Cheers

-- 
Jean-Philip Eimecke
jpeimecke@gmail.com

Re: Clustering Carrot2 + Solr

Posted by Grant Ingersoll <gs...@apache.org>.
I've updated the patch for trunk.  I _believe_ it should now work.

-Grant

On Jan 8, 2009, at 9:32 AM, Jean-Philip EIMECKE wrote:

> Thanks for considering my problem
>
> Cheers,
> Jean-Philip Eimecke



Re: Clustering Carrot2 + Solr

Posted by Jean-Philip EIMECKE <jp...@gmail.com>.
Thanks for considering my problem

Cheers,
Jean-Philip Eimecke

Re: Clustering Carrot2 + Solr

Posted by Grant Ingersoll <gs...@apache.org>.
Hmm, OK.  This is due, I bet, to some source being moved around in  
trunk and being in a different location in the build area.  The trick  
would be to change the classpath as appropriate in the clustering  
contrib build.

I will try to put up a new patch this weekend.

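The missing org.apache.solr.* packages suggest the contrib build is not
seeing the compiled core classes. Here is a minimal shell sketch for finding
where those classes actually landed, so the classpath in
contrib/clustering/build.xml can be pointed at the right directory. The
helper name find_class_roots is made up for illustration, and it assumes the
core has already been compiled somewhere under the checkout:

```shell
#!/bin/sh
# find_class_roots is a hypothetical helper, not part of the Solr build.
# $1 = checkout root
# $2 = class file path relative to a classes directory,
#      e.g. org/apache/solr/core/SolrCore.class
# Prints each directory that would work as a classpath entry for that class.
find_class_roots() {
    find "$1" -type f -path "*/$2" | while IFS= read -r hit; do
        # Strip the trailing "/<relative class path>" to get the classes root.
        printf '%s\n' "${hit%/"$2"}"
    done | sort
}
```

From the top of the checkout this would be called as
find_class_roots . org/apache/solr/core/SolrCore.class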

On Jan 8, 2009, at 4:51 AM, Jean-Philip EIMECKE wrote:
>
> I have these errors messages :
>
> compile:
>    [javac] Compiling 9 source files to
> /home/jeimecke/Desktop/solr/contrib/clustering/target/classes
>
> ......
>
>    [javac] 100 errors
>
> BUILD FAILED
> /home/jeimecke/Desktop/solr/common-build.xml:335: The following error
> occurred while executing this line:
> /home/jeimecke/Desktop/solr/common-build.xml:212: The following error
> occurred while executing this line:
> /home/jeimecke/Desktop/solr/contrib/clustering/build.xml:58: The following
> error occurred while executing this line:
> /home/jeimecke/Desktop/solr/common-build.xml:152: Compile failed; see the
> compiler error output for details.
>
> Total time: 16 seconds
>
> Could you tell me which version of Solr the patch was designed for?
> Maybe the problems will disappear with that version!
>
> Cheers,
> Jean-Philip Eimecke

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ

Re: Clustering Carrot2 + Solr

Posted by Jean-Philip EIMECKE <jp...@gmail.com>.
Thanks for the answers.

So, I downloaded Solr from SVN ->
https://svn.apache.org/repos/asf/lucene/solr/trunk/
I applied the SOLR-769 patch and got these error messages:
debian:/home/jeimecke/Desktop/solr# patch -p 0 -i SOLR-769.patch
patching file NOTICE.txt
Hunk #5 FAILED at 106.
1 out of 5 hunks FAILED -- saving rejects to file NOTICE.txt.rej
patching file LICENSE.txt
Reversed (or previously applied) patch detected!  Assume -R? [n] n
Apply anyway? [n] n
Skipping patch.
7 out of 7 hunks ignored -- saving rejects to file LICENSE.txt.rej
patching file contrib/clustering/lib/solr-carrot2-filter-lingo-pom.xml.template
patching file contrib/clustering/lib/solr-jama-pom.xml.template
patching file contrib/clustering/lib/solr-violinstrings-pom.xml.template
patching file contrib/clustering/lib/solr-carrot2-util-common-pom.xml.template
patching file contrib/clustering/lib/solr-carrot2-snowball-stemmers-pom.xml.template
patching file contrib/clustering/lib/solr-carrot2-local-core-pom.xml.template
patching file contrib/clustering/lib/solr-carrot2-util-tokenizer-pom.xml.template
patching file contrib/clustering/src/test/java/org/apache/solr/handler/clustering/ClusteringComponentTest.java
patching file contrib/clustering/src/test/java/org/apache/solr/handler/clustering/AbstractClusteringTest.java
patching file contrib/clustering/src/test/java/org/apache/solr/handler/clustering/carrot2/CarrotClusteringEngineTest.java
patching file contrib/clustering/src/test/java/org/apache/solr/handler/clustering/MockDocumentClusteringEngine.java
patching file contrib/clustering/src/test/resources/solr/conf/schema.xml
patching file contrib/clustering/src/test/resources/solr/conf/solrconfig.xml
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringParams.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/DocumentClusteringEngine.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringEngine.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/SearchClusteringEngine.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/CarrotClusteringEngine.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/SolrInputComponent.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/SolrCarrotDocument.java
patching file contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/CarrotParams.java
patching file contrib/clustering/build.xml
patching file contrib/clustering/solr-clustering-pom.xml.template
patching file contrib/dataimporthandler/build.xml
Hunk #1 FAILED at 30.
1 out of 1 hunk FAILED -- saving rejects to file contrib/dataimporthandler/build.xml.rej
patching file CHANGES.txt
Hunk #1 succeeded at 139 with fuzz 2 (offset 69 lines).
patching file src/java/org/apache/solr/handler/component/QueryComponent.java
patching file example/clustering/solr/conf/schema.xml
patching file example/clustering/solr/conf/solrconfig.xml
patching file example/clustering/README.txt
patching file build.xml
Hunk #1 FAILED at 192.
Hunk #2 FAILED at 468.
Hunk #3 FAILED at 555.
Hunk #4 FAILED at 571.
Hunk #5 FAILED at 740.
Hunk #6 FAILED at 816.
Hunk #7 succeeded at 766 with fuzz 1 (offset -60 lines).
Hunk #8 succeeded at 784 with fuzz 2 (offset -50 lines).
Hunk #9 FAILED at 860.
7 out of 9 hunks FAILED -- saving rejects to file build.xml.rej
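
Before fixing rejects by hand, it can help to see how much of a patch
applies at all, and afterwards to list what was left behind. This is a rough
sketch, assuming GNU patch (for --dry-run and -N). The names check_patch and
list_rejects are illustrative only:

```shell
#!/bin/sh
# check_patch and list_rejects are hypothetical helpers.

# $1 = patch file. Run from the top of the checkout.
# Returns 0 only if every hunk would apply; nothing is modified (--dry-run).
# -N skips hunks that look already applied instead of prompting.
check_patch() {
    patch -p0 -N --dry-run -i "$1" >/dev/null 2>&1
}

# $1 = directory to scan (defaults to the current directory).
# Prints every reject file left behind by a partially applied patch.
list_rejects() {
    find "${1:-.}" -type f -name '*.rej' | LC_ALL=C sort
}
```

For example: check_patch SOLR-769.patch || echo "does not apply cleanly",
and after a real run, list_rejects . shows which files still need manual work.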

So, I applied the patch to build.xml manually and copied
clustering-libs.tar into contrib/clustering/lib.

Then I ran the command ant run-example

I get these error messages:

compile:
    [javac] Compiling 9 source files to /home/jeimecke/Desktop/solr/contrib/clustering/target/classes
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:19: package org.apache.solr.common.params does not exist
    [javac] import org.apache.solr.common.params.SolrParams;
    [javac]                                      ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:20: package org.apache.solr.common.util does not exist
    [javac] import org.apache.solr.common.util.NamedList;
    [javac]                                    ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:21: package org.apache.solr.core does not exist
    [javac] import org.apache.solr.core.SolrCore;
    [javac]                             ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:22: package org.apache.solr.core does not exist
    [javac] import org.apache.solr.core.SolrResourceLoader;
    [javac]                             ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:24: package org.apache.solr.handler.component does not exist
    [javac] import org.apache.solr.handler.component.ResponseBuilder;
    [javac]                                          ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:25: package org.apache.solr.handler.component does not exist
    [javac] import org.apache.solr.handler.component.SearchComponent;
    [javac]                                          ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:26: package org.apache.solr.search does not exist
    [javac] import org.apache.solr.search.DocListAndSet;
    [javac]                               ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:27: package org.apache.solr.util.plugin does not exist

......

    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:96: method does not override a method from its superclass
    [javac]   @Override
    [javac]    ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:110: cannot find symbol
    [javac] symbol  : class NamedList
    [javac] location: class org.apache.solr.handler.clustering.ClusteringComponent
    [javac]           NamedList engineNL = (NamedList) initParams.getVal(i);
    [javac]           ^
    [javac] /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/ClusteringComponent.java:110: cannot find symbol
    [javac] symbol  : class NamedList
    [javac] location: class org.apache.solr.handler.clustering.ClusteringComponent
    [javac]           NamedList engineNL = (NamedList) initParams.getVal(i);
    [javac]                                 ^
    [javac] Note: /home/jeimecke/Desktop/solr/contrib/clustering/src/main/java/org/apache/solr/handler/clustering/carrot2/CarrotClusteringEngine.java uses unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 100 errors

BUILD FAILED
/home/jeimecke/Desktop/solr/common-build.xml:335: The following error occurred while executing this line:
/home/jeimecke/Desktop/solr/common-build.xml:212: The following error occurred while executing this line:
/home/jeimecke/Desktop/solr/contrib/clustering/build.xml:58: The following error occurred while executing this line:
/home/jeimecke/Desktop/solr/common-build.xml:152: Compile failed; see the compiler error output for details.

Total time: 16 seconds

Could you tell me which version of Solr the patch was designed for?
Maybe the problems will disappear with that version!

Cheers,
Jean-Philip Eimecke
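
With 100 errors, a per-package tally makes it easier to see whether
everything that is missing comes from the Solr core. This small sketch reads
a saved copy of the build output. The helper name missing_packages is
hypothetical, and build.log is an assumed file name:

```shell
#!/bin/sh
# missing_packages is a hypothetical helper.
# $1 = file holding the saved output of the failed build
# Prints "<count> <package>" for every "package X does not exist" error,
# sorted by package name.
missing_packages() {
    sed -n 's/.*package \([A-Za-z0-9_.]*\) does not exist.*/\1/p' "$1" \
        | LC_ALL=C sort | uniq -c | awk '{ print $1, $2 }'
}
```

One way to use it: ant run-example > build.log 2>&1, then
missing_packages build.log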

Re: Clustering Carrot2 + Solr

Posted by Grant Ingersoll <gs...@apache.org>.
Hi Jean-Philip,

The patch should be standalone in that it creates an area under  
contrib, but it may not be completely up to date, since there have  
been some minor tweaks to the ANT builds for contrib since I wrote the  
clustering stuff.   However, it should still work once you get past  
that.

So, to get it working, apply the patch, and then put
clustering-libs.tar into the contrib/clustering/lib directory (I think
the dir is called clustering).

If I recall, you should be able to run "ant example" from the top level
and it should get added into the WAR, etc. (although this will be changed
when I get a chance to update the patch).
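
The library step above can be sketched in shell. It is only a guess that the
archive should be unpacked rather than dropped in as a .tar, and unpack_libs
is a made-up helper name:

```shell
#!/bin/sh
# unpack_libs is a hypothetical helper.
# $1 = path to clustering-libs.tar
# $2 = destination directory, e.g. contrib/clustering/lib
# Assumption: the build wants the archive's contents, not the .tar itself.
unpack_libs() {
    mkdir -p "$2" && tar -xf "$1" -C "$2"
}
```

From the top of the checkout the whole sequence would then be
patch -p 0 -i SOLR-769.patch, followed by
unpack_libs clustering-libs.tar contrib/clustering/lib, followed by ant example.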

From there, have a look at http://wiki.apache.org/solr/ClusteringComponent

I may have a moment or two tomorrow, in which case I can look at any  
specific issues you might have.

Cheers,
Grant



Re: Clustering Carrot2 + Solr

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi,

Most likely (didn't look at SOLR-769) you need to:
1) apply the patch
2) untar the .tar file and copy the jars from it to solr home's lib/ dir


But the patch may be outdated and may not apply cleanly.
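
Step 2 could look roughly like this. The lib/ location under Solr home comes
from the message above. The name copy_jars is invented, and the layout inside
the .tar is not known here, so the sketch simply collects every .jar it finds:

```shell
#!/bin/sh
# copy_jars is a hypothetical helper.
# $1 = path to the .tar file
# $2 = Solr home directory
# Extracts the archive into a scratch directory and copies every .jar
# found inside it into <solr home>/lib.
copy_jars() {
    scratch=$(mktemp -d) || return 1
    tar -xf "$1" -C "$scratch" || { rm -rf "$scratch"; return 1; }
    mkdir -p "$2/lib" \
        && find "$scratch" -type f -name '*.jar' -exec cp {} "$2/lib/" \;
    status=$?
    rm -rf "$scratch"
    return "$status"
}
```

Called as copy_jars clustering-libs.tar /path/to/solr/home, where the Solr
home path is whatever your installation uses.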

Otis 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


