Posted to dev@lucene.apache.org by "Koji Sekiguchi (JIRA)" <ji...@apache.org> on 2010/12/13 10:49:01 UTC

[jira] Created: (SOLR-2282) Distributed Support for Search Result Clustering

Distributed Support for Search Result Clustering
------------------------------------------------

                 Key: SOLR-2282
                 URL: https://issues.apache.org/jira/browse/SOLR-2282
             Project: Solr
          Issue Type: New Feature
          Components: contrib - Clustering
    Affects Versions: 1.4.1, 1.4
            Reporter: Koji Sekiguchi
            Priority: Minor
             Fix For: 3.1, 4.0


Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2282:
---------------------------------

    Attachment: SOLR-2282.patch

Updated the test to avoid the deprecated version of the cluster method.

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2282:
---------------------------------

    Attachment: SOLR-2282.patch

Added a simple test for distributed mode, contributed by Brad Giaccio in SOLR-769.

Currently, the test fails due to a classpath problem when launching Jetty:

{quote}
org/mortbay/jetty/SessionIdManager
java.lang.NoClassDefFoundError: org/mortbay/jetty/SessionIdManager
	at org.apache.solr.BaseDistributedSearchTestCase.createJetty(BaseDistributedSearchTestCase.java:211)
	at org.apache.solr.BaseDistributedSearchTestCase.createJetty(BaseDistributedSearchTestCase.java:202)
	at org.apache.solr.BaseDistributedSearchTestCase.createServers(BaseDistributedSearchTestCase.java:148)
	at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:566)
	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1104)
	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1042)
Caused by: java.lang.ClassNotFoundException: org.mortbay.jetty.SessionIdManager
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
{quote}

(I have to leave now; if someone can solve the problem, that would be welcome :)
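
One way to confirm that the class is really absent from the test classpath is a small probe like the one below. This is only a minimal sketch: ClasspathProbe and isLoadable are hypothetical names introduced for illustration and are not part of Solr or Jetty.

```java
// Hypothetical diagnostic helper, not part of Solr or Jetty.
class ClasspathProbe {
    /** Returns true if the named class can be located by the given loader. */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClasspathProbe.class.getClassLoader();
        String[] names = {
            "org.mortbay.jetty.SessionIdManager", // the class missing in the trace above
            "java.util.HashMap"                   // sanity check: always present
        };
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name, loader));
        }
        // The effective classpath shows which JARs the test JVM actually received.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

Running it with the same classpath as the failing test should show whether the Jetty JAR is reaching the test JVM at all.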

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stanislaw Osinski updated SOLR-2282:
------------------------------------

    Attachment: SOLR-2282-diagnostics.patch

Robert: I was using the random seed from the build result in the hope that it will fail the test for me. I'm still unable to get the exception though, with or without the seed. I suppose it shouldn't matter whether I run the complete test suite or just this one test method? (I was doing the latter to save time)

If you have a spare moment, would you be able to check the following two things on your machine:

1. Apply the attached diagnostics patch and run the tests. If the test doesn't fail after the change, this means there's some concurrency issue in Carrot2's internal resource pooling mechanisms that we'll need to find. This patch is not a solution to the problem though, just a diagnostic measure.

2. It's paranoid, but can you run the test with the {{-Dargs=-XX:+TraceClassLoading}} option and check that there's no old (v3.4.0) Carrot2 JAR hiding in the bushes? Version 3.4.0 had a subtle bug that could be causing the exception. If there are no traces of the Carrot2 3.4.0 JAR on the classpath, we'll need to inspect our code further.
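
As a complement to the class-loading trace, the location a class was loaded from can also be printed from Java. The snippet below is a minimal sketch: JarLocator is a hypothetical helper, and the class name to inspect (for example a Carrot2 class) is passed as an argument rather than assumed.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper for finding the JAR behind a class.
class JarLocator {
    /** Returns where the class was loaded from, or null for bootstrap (JDK) classes. */
    static URL locationOf(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; defaults to this class itself.
        String name = args.length > 0 ? args[0] : JarLocator.class.getName();
        Class<?> clazz = Class.forName(name);
        System.out.println(name + " loaded from: " + locationOf(clazz));
    }
}
```

If the printed location points at an unexpected JAR, that JAR is the one winning on the classpath.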

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982062#action_12982062 ] 

Dawid Weiss commented on SOLR-2282:
-----------------------------------

One more side comment for those interested. I used my favorite technique for debugging such things: I created another project in Eclipse (AspectJ-enabled), created a runtime weaving launch config in Eclipse that started that particular test, and wrote this aspect:

{noformat}
package com.carrotsearch.aspects;

import java.util.HashMap;

/**
 * Check for multithreaded access in supposedly single-threaded objects.
 */
public aspect Solr2282
{
    pointcut guardedMethods() :
        execution(* org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.*(..));

    private HashMap<Object, Thread> t = new HashMap<Object, Thread>();
    
    Object around() : guardedMethods()
    {
        Object tokenizer = thisJoinPoint.getThis();
        Thread current = Thread.currentThread();
        try {
            synchronized (Solr2282.class) {
                Thread owner = t.get(tokenizer);
                if (owner != null && owner != current)
                    halt();
                t.put(tokenizer, current);
            }

            return proceed();
        } catch (Throwable e) {
            halt();
            return null;
        } finally {
            synchronized (Solr2282.class) {
                Thread owner = t.get(tokenizer);
                if (owner != null && owner != current)
                    halt();
                t.remove(tokenizer);
            }
        }
    }

    private void halt()
    {
        System.out.println("## HALT! ");
    }
}
{noformat}

and placed a VM-halting breakpoint on the System.out.println call inside halt()... Once I got two threads running on the same tokenizer instance, it was a matter of inspecting which objects were shared and how this could possibly happen.

Aspect-oriented programming never really won me over, but as a debugging/performance analysis tool it simply rocks.
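
For readers without AspectJ at hand, the ownership check at the heart of the aspect can be sketched in plain Java. This is an illustrative reduction, not the aspect itself: OwnershipGuard, enter and exit are hypothetical names, calls have to be inserted by hand instead of being woven in, and nested calls from the same thread are not counted.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical helper: detects two threads using the same guarded object at once. */
class OwnershipGuard {
    // Identity-based map, so guarded objects are compared by reference only.
    private final Map<Object, Thread> owners = new IdentityHashMap<Object, Thread>();

    /** Claims target for the current thread; returns false if another thread holds it. */
    synchronized boolean enter(Object target) {
        Thread owner = owners.get(target);
        if (owner != null && owner != Thread.currentThread()) {
            return false; // overlapping use detected
        }
        owners.put(target, Thread.currentThread());
        return true;
    }

    /** Releases the claim made by enter(), if the current thread holds it. */
    synchronized void exit(Object target) {
        if (owners.get(target) == Thread.currentThread()) {
            owners.remove(target);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final OwnershipGuard guard = new OwnershipGuard();
        final Object tokenizer = new Object(); // stand-in for a tokenizer instance
        System.out.println("main enters: " + guard.enter(tokenizer));

        final boolean[] result = new boolean[1];
        Thread other = new Thread(() -> result[0] = guard.enter(tokenizer));
        other.start();
        other.join();
        System.out.println("other thread enters while main holds it: " + result[0]);
        guard.exit(tokenizer);
    }
}
```

A breakpoint on the "return false" line plays the same role as the breakpoint in halt().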

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981100#action_12981100 ] 

Koji Sekiguchi commented on SOLR-2282:
--------------------------------------

Thanks, Robert! I committed the fix. (I still couldn't reproduce the Hudson problem on my Mac, even with @Ignore commented out in DistributedClusteringComponentTest.java.)

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2282:
---------------------------------

    Attachment: SOLR-2282.patch

Updated patch. I think this is ready to go.

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2282:
---------------------------------

    Attachment: SOLR-2282.patch

Patch attached. Currently, carrot.produceSummary doesn't work in distributed mode:

{code:title=ClusteringComponent.finishStage()}
// TODO: Currently, docIds is set to null in a distributed environment.
// This prevents CarrotParams.PRODUCE_SUMMARY from working.
// To make CarrotParams.PRODUCE_SUMMARY work in distributed mode, we can choose either of:
// (a) In each shard, ClusteringComponent produces the summary and finishStage()
//     merges these summaries.
// (b) Add a doHighlighting(SolrDocumentList, ...) method to SolrHighlighter and
//     make SolrHighlighter use "external text" rather than stored values to produce snippets.
{code}


> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981501#action_12981501 ] 

Robert Muir commented on SOLR-2282:
-----------------------------------

Guys, thanks for the debugging help already.

Just as a side note: for these tricky non-reproducible ones, it's sometimes helpful to use something like -Dtests.iter=10.
It's just a convenient way to run the test method multiple times.
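
Outside the Lucene test framework, the same idea can be approximated with a plain loop. The following is a rough sketch only: RepeatRunner and countFailures are hypothetical names, and the "test" is a stand-in that fails on every fifth call to imitate an intermittent failure.

```java
/** Hypothetical helper: runs a test body repeatedly and counts failing runs. */
class RepeatRunner {
    /** Runs body the given number of times and returns how many runs threw. */
    static int countFailures(int iterations, Runnable body) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                body.run();
            } catch (Throwable t) { // also catches AssertionError and java.lang.Error
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        // Stand-in for a flaky test: fails on every fifth call.
        int failures = countFailures(10, () -> {
            if (++calls[0] % 5 == 0) {
                throw new AssertionError("simulated intermittent failure");
            }
        });
        System.out.println(failures + " of 10 iterations failed");
    }
}
```

Catching Throwable matters here because the failure in this issue surfaces as a java.lang.Error, not as an Exception.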


> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Reopened: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man reopened SOLR-2282:
----------------------------


Reopening issue.

The new test added by this issue...

org.apache.solr.handler.clustering.DistributedClusteringComponentTest.testDistribSearch

...was failing consistently on both Hudson and Robert Muir's machine, so rmuir disabled it with @Ignore.

We should get to the bottom of this before resolving.

error from hudson...

{quote}
Error Message

Some threads threw uncaught exceptions!

Stacktrace

junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:950)
	at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:888)
	at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:371)
	at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:78)
	at org.apache.solr.BaseDistributedSearchTestCase.tearDown(BaseDistributedSearchTestCase.java:130)

Standard Error

22-Dec-2010 6:27:38 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.Error: Error: could not match input
	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzScanError(ExtendedWhitespaceTokenizerImpl.java:687)
	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:836)
	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
	at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
	at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
	at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:199)
	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:44)
	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:178)
	at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:222)
	at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:110)
	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:171)
	at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:101)
	at org.carrot2.core.Controller.process(Controller.java:287)
	at org.carrot2.core.Controller.process(Controller.java:180)
	at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:105)
	at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:171)
	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1358)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

NOTE: reproduce with: ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=412049972111174180:6405396687385598457 -Dtests.multiplier=3
The following exceptions were thrown by threads:
*** Thread: Thread-13 ***
junit.framework.AssertionFailedError: .clusters.length:4!=5
	at junit.framework.Assert.fail(Assert.java:47)
	at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:494)
	at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:262)
*** Thread: Thread-14 ***
java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: Error executing query
	at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:265)
Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
	at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
	at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
	at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:260)
Caused by: org.apache.solr.common.SolrException: Error: could not match input  java.lang.Error: Error: could not match input 	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzScanError(ExtendedWhitespaceTokenizerImpl.java:687) 	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:836) 	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46) 	at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147) 	at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54) 	at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:199) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:44) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:178) 	at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:222) 	at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:110) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:171) 	at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:101) 	at org.carrot2.core.Controller.process(Controller.java:287) 	at org.carrot2.core.Controller.process(Controller.java:180) 	at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:105) 	at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:171) 	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296) 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) 	at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:1358) 	at org.apache.solr.servlet.SolrDispatchFilter.

Error: could not match input  java.lang.Error: Error: could not match input 	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzScanError(ExtendedWhitespaceTokenizerImpl.java:687) 	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:836) 	at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46) 	at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147) 	at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54) 	at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:199) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:44) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:178) 	at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:222) 	at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:110) 	at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:171) 	at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:101) 	at org.carrot2.core.Controller.process(Controller.java:287) 	at org.carrot2.core.Controller.process(Controller.java:180) 	at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:105) 	at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:171) 	at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:296) 	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) 	at 
org.apache.solr.core.SolrCore.execute(SolrCore.java:1358) 	at org.apache.solr.servlet.SolrDispatchFilter.

request: http://localhost:14333/solr/select?clustering=true&q=*:*&sort=id desc&clustering.results=true&shards=localhost:14333/solr&wt=javabin&version=2
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
	at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
	at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
	... 2 more
WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
WARNING: test class left thread running: Thread[pool-7-thread-1,5,main]
WARNING: test class left thread running: Thread[pool-7-thread-3,5,main]
WARNING: test class left thread running: Thread[pool-7-thread-2,5,main]
WARNING: test class left thread running: Thread[pool-7-thread-4,5,main]
WARNING: test class left thread running: Thread[pool-7-thread-5,5,main]
WARNING: test class left thread running: Thread[pool-7-thread-6,5,main]
RESOURCE LEAK: test class left 7 thread(s) running
NOTE: test params are: locale=en_CA, timezone=Asia/Ashgabat
NOTE: all tests run in this JVM:
[ClusteringComponentTest, DistributedClusteringComponentTest]
{quote}

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982114#action_12982114 ] 

Robert Muir commented on SOLR-2282:
-----------------------------------

Thanks a lot for tracking this one down... I tested the patch over here and didn't have any test failures.

+1 from me to commit the patch and enable the distributed test


> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-concurrency-branch_3x.patch, SOLR-2282-concurrency-trunk.patch, SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982061#action_12982061 ] 

Dawid Weiss commented on SOLR-2282:
-----------------------------------

I think I nailed it. I did a white-box inspection of the Carrot2 code and thought it impossible for a concurrency bug to creep in (in particular with a simple controller), but what we didn't take into account is that the Carrot2 infrastructure itself allows a scenario in which a single object instance is bound to multiple components at runtime (and is then effectively shared in a multithreaded context). The code that triggers this happens to be in Solr's code base, not in Carrot2. The bug happens because of the following series of events:

1) The controller in Solr itself is initialized with a single instance of "new LuceneLanguageModelFactory()" -- this factory is then injected into all components at runtime.
2) The base class of LuceneLanguageModelFactory is DefaultLanguageModelFactory, which has an object-local cache of stemmers and tokenizers. In Carrot2 3.4.2, factories are component-bound anyway, so a factory can reuse its resources. In the trunk version, this is no longer the case (factories simply create new objects as they are requested).
3) Because of the tokenizers/stemmers cache, the same tokenizer or stemmer can be used by two threads in parallel when two requests are made at the same time. I think this should be reproducible on all computers, regardless of the number of cores or their speed; it's just a matter of time. Clustering takes much longer than tokenization, so two tokenizations overlapping (and screwing up internal data structures) is a rare event, and yet, as we could see, frequent enough to manifest itself during tests.

{noformat}
    // Customize the language model factory. The implementation we provide here
    // is included in the code base of Solr, so that it's possible to refactor
    // the Lucene APIs the factory relies on if needed.
    initAttributes.put("PreprocessingPipeline.languageModelFactory",
      new LuceneLanguageModelFactory());
    this.controller.init(initAttributes);
{noformat}
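
The sharing hazard described above can be shown without threads by interleaving two requests by hand. This is a minimal sketch, not Carrot2 code: ToyTokenizer and SharedTokenizerDemo are hypothetical names, and the only assumption taken from the analysis is that a tokenizer keeps per-instance position state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a stateful tokenizer; not a Carrot2 class. */
class ToyTokenizer {
    private String input = "";
    private int pos = 0;

    /** Points the tokenizer at new input and rewinds its position. */
    void reset(String text) {
        input = text;
        pos = 0;
    }

    /** Returns the next space-delimited token, or null at end of input. */
    String next() {
        while (pos < input.length() && input.charAt(pos) == ' ') pos++;
        if (pos >= input.length()) return null;
        int start = pos;
        while (pos < input.length() && input.charAt(pos) != ' ') pos++;
        return input.substring(start, pos);
    }
}

class SharedTokenizerDemo {
    /**
     * Request A tokenizes "alpha beta gamma". After A has read its first
     * token, request B resets its own tokenizer to "x y". If both requests
     * were handed the same object, A's remaining tokens come from B's input.
     */
    static List<String> tokensSeenByA(ToyTokenizer forA, ToyTokenizer forB) {
        List<String> seen = new ArrayList<>();
        forA.reset("alpha beta gamma");
        seen.add(forA.next());
        forB.reset("x y"); // request B starts while A is mid-stream
        String token;
        while ((token = forA.next()) != null) {
            seen.add(token);
        }
        return seen;
    }
}
```

Passing one ToyTokenizer for both arguments yields [alpha, x, y]; passing two separate objects yields [alpha, beta, gamma]. With real threads the interleaving is a matter of timing, which would be consistent with the intermittent failures seen in the tests.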

The fix for the problem would be to:

1) upgrade to trunk/future Carrot2 version (because of different memory management in factories),
2) pass a class instead of an instance to the initialization parameters. So this should do:

{noformat}
    // Customize the language model factory. The implementation we provide here
    // is included in the code base of Solr, so that it's possible to refactor
    // the Lucene APIs the factory relies on if needed.
    initAttributes.put("PreprocessingPipeline.languageModelFactory",
      LuceneLanguageModelFactory.class);
    this.controller.init(initAttributes);
{noformat}

Works on my machine :) But I'll let Staszek review this again so that we're sure it's really this.
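
To illustrate why passing a class rather than an instance changes the sharing behaviour, here is a toy sketch. ToyFactory and ToyBinder are hypothetical names. The sketch only assumes what the fix above implies: a class value leads to a fresh object per component, while an instance value is handed to every component. How Carrot2 actually resolves init attributes is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical factory with per-instance state (its cache). */
class ToyFactory {
    final Map<String, Object> cache = new HashMap<>();

    public ToyFactory() {
    }
}

class ToyBinder {
    /**
     * Resolves an init attribute for one component. A Class value is
     * instantiated on every call, so each component gets its own object.
     * Any other value is returned as-is, so all components share it.
     */
    static Object resolve(Object attributeValue) {
        if (attributeValue instanceof Class) {
            try {
                return ((Class<?>) attributeValue).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot instantiate " + attributeValue, e);
            }
        }
        return attributeValue;
    }
}
```

In this sketch, two calls to ToyBinder.resolve(new ToyFactory()) with the same argument return the same object (and therefore the same cache), whereas two calls to ToyBinder.resolve(ToyFactory.class) return two distinct objects.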






[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981241#action_12981241 ] 

Stanislaw Osinski commented on SOLR-2282:
-----------------------------------------

{quote}
well, it's not completely consistent even with the seed to me (smells like a concurrency issue).
{quote}

This is what I've been suspecting from the beginning. I hope Dawid has better luck reproducing the problem on his 4-core HT machine.

{quote}
Silly question, but did you remove the @Ignore on DistributedClusteringComponentTest?
Otherwise, the reproducibility problem could be that it doesn't consistently fail every time, even with the same seed.
{quote}

Yeah, I did remove the @Ignore. I'm getting "Testsuite: org.apache.solr.handler.clustering.DistributedClusteringComponentTest, Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 59,658 sec" in the test results dir. When it comes to reproducibility, I wasn't able to reproduce a different concurrency issue on my 2-core machine, while on Dawid's 4-core hardware the tests would fail sometimes, so I hope we can eventually get the exception locally.

{quote}
I ran my previous fail three times, with the patch. This failed two out of three times.
{quote}

Thanks for verifying this! It looks like the bug may be in some other place in the C2 code than I initially thought. Let us review the code once again; as soon as we come up with a fix, I'll attach a patch.
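
Since the failure depends on how many threads really overlap, one way to raise the odds locally is to release many workers at the same moment and count how many runs throw. The following is only a sketch of that idea. RaceHarness is a hypothetical name and is not part of Solr's test framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RaceHarness {
    /**
     * Runs the task once on each of `threads` worker threads, releasing all
     * workers together, and returns how many of the runs threw an exception.
     */
    static int countFailures(Callable<Object> task, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<?>> runs = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            runs.add(pool.submit(() -> {
                start.await(); // hold each worker until all tasks are submitted
                return task.call();
            }));
        }
        start.countDown(); // release every worker at once
        int failures = 0;
        try {
            for (Future<?> run : runs) {
                try {
                    run.get();
                } catch (ExecutionException e) {
                    failures++; // the task itself threw
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdownNow();
        }
        return failures;
    }
}
```

A task that touches a shared, non-thread-safe object may report a nonzero count on some runs and zero on others; a task that builds its own objects should always report zero.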





[jira] Resolved: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi resolved SOLR-2282.
----------------------------------

    Resolution: Fixed
      Assignee: Koji Sekiguchi

trunk: Committed revision 1051715.
3x: Committed revision 1051725.




[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980749#action_12980749 ] 

Robert Muir commented on SOLR-2282:
-----------------------------------

Sorry guys, I screwed this up by not adding logic to BaseDistributedSearchTestCase to make it work for contribs, from resources.

I saw that it extended SolrTestCaseJ4, but I neglected to realize that it doesn't use initCore, so I'll take a look at fixing this.




[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981237#action_12981237 ] 

Robert Muir commented on SOLR-2282:
-----------------------------------

bq. Robert: I was using the random seed from the build result in the hope that it will fail the test for me. I'm still unable to get the exception though, with or without the seed. I suppose it shouldn't matter whether I run the complete test suite or just this one test method? (I was doing the latter to save time)

Well, it's not completely consistent even with the seed for me (smells like a concurrency issue).

Silly question, but did you remove the @Ignore on DistributedClusteringComponentTest?
Otherwise, the reproducibility problem could be that it doesn't consistently fail every time, even with the same seed.

I ran my previous fail three times, with the patch: 
{noformat}
ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=8909233178291932652:-4859244606911873252
{noformat}

This failed two out of three times.

I also then ran it with -XX:+TraceClassLoading, logging to a file:
{noformat}
ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=8909233178291932652:-4859244606911873252 -Dargs="-XX:+TraceClassLoading" > test.out
{noformat}

All the Carrot2 classes are being loaded from solr/contrib/clustering/lib/carrot2-core-3.4.2.jar
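
As a complement to the class-loading trace, the origin of a single class can also be checked from code through its protection domain. This is a small sketch using only standard JDK calls. ClassOrigin is a hypothetical helper name.

```java
import java.security.CodeSource;

class ClassOrigin {
    /**
     * Returns the location (typically a jar or directory URL) a class was
     * loaded from, or "bootstrap/unknown" when the JVM reports no code source,
     * as is the case for core JDK classes.
     */
    static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "bootstrap/unknown";
        }
        return source.getLocation().toString();
    }
}
```

Called from within the test with a Carrot2 class on the classpath (for example org.carrot2.core.Controller, which appears in the stack traces in this thread), it would be expected to print the jar the class came from.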





[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974314#action_12974314 ] 

Stanislaw Osinski commented on SOLR-2282:
-----------------------------------------

This may be related to a concurrency bug we fixed in the latest (3.4.2) release of Carrot2. Tomorrow morning I can prepare a Carrot2 upgrade patch, which should hopefully fix the problem.




[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981178#action_12981178 ] 

Robert Muir commented on SOLR-2282:
-----------------------------------

Stanislaw: it is true that with that exact random seed, the test passes for me.
But if I just run 'ant test', it often fails.

Below is the output. I put my OS configuration first; sorry for the noise.

{noformat}
    [junit] NOTE: Windows Vista 6.0 x86/Sun Microsystems Inc. 1.6.0_23 (32-bit)/cpus=4,threads=4,free=5267640,total=16384000

test:
    [junit] Testsuite: org.apache.solr.handler.clustering.DistributedClusteringComponentTest
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 13.18 sec
    [junit] ------------- Standard Error -----------------
    [junit] 2011-1-13 3:35:19 org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine cluster
    [junit] SEVERE: Carrot2 clustering failed
    [junit] java.lang.IndexOutOfBoundsException
    [junit]     at java.io.StringReader.read(StringReader.java:76)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespaceTokenizerImpl.java:557)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:754)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
    [junit]     at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
    [junit]     at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
    [junit]     at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
    [junit]     at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:347)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:239)
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
    [junit]     at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
    [junit]     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
    [junit]     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    [junit]     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    [junit]     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    [junit]     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    [junit]     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    [junit]     at org.mortbay.jetty.Server.handle(Server.java:326)
    [junit]     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    [junit]     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    [junit]     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    [junit]     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    [junit]     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    [junit]     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    [junit]     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    [junit] 2011-1-13 3:35:19 org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine cluster
    [junit] SEVERE: Carrot2 clustering failed
    [junit] java.lang.IndexOutOfBoundsException
    [junit]     at java.io.StringReader.read(StringReader.java:76)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespaceTokenizerImpl.java:557)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:754)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
    [junit]     at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
    [junit]     at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
    [junit]     at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
    [junit]     at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:347)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:239)
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
    [junit]     at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
    [junit]     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
    [junit]     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    [junit]     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    [junit]     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    [junit]     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    [junit]     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    [junit]     at org.mortbay.jetty.Server.handle(Server.java:326)
    [junit]     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    [junit]     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    [junit]     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    [junit]     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    [junit]     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    [junit]     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    [junit]     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    [junit] 2011-1-13 3:35:19 org.apache.solr.common.SolrException log
    [junit] SEVERE: org.apache.solr.common.SolrException: Carrot2 clustering failed
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:110)
    [junit]     at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
    [junit]     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
    [junit]     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    [junit]     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    [junit]     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    [junit]     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    [junit]     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    [junit]     at org.mortbay.jetty.Server.handle(Server.java:326)
    [junit]     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    [junit]     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    [junit]     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    [junit]     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    [junit]     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    [junit]     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    [junit]     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    [junit] Caused by: java.lang.IndexOutOfBoundsException
    [junit]     at java.io.StringReader.read(StringReader.java:76)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespaceTokenizerImpl.java:557)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:754)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
    [junit]     at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
    [junit]     at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
    [junit]     at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
    [junit]     at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:347)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:239)
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
    [junit]     ... 19 more
    [junit]
    [junit] 2011-1-13 3:35:19 org.apache.solr.common.SolrException log
    [junit] SEVERE: org.apache.solr.common.SolrException: Carrot2 clustering failed
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:110)
    [junit]     at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
    [junit]     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
    [junit]     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    [junit]     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    [junit]     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    [junit]     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    [junit]     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    [junit]     at org.mortbay.jetty.Server.handle(Server.java:326)
    [junit]     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    [junit]     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    [junit]     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    [junit]     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    [junit]     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    [junit]     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    [junit]     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    [junit] Caused by: java.lang.IndexOutOfBoundsException
    [junit]     at java.io.StringReader.read(StringReader.java:76)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespaceTokenizerImpl.java:557)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:754)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
    [junit]     at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
    [junit]     at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
    [junit]     at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
    [junit]     at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:347)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:239)
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
    [junit]     ... 19 more
    [junit]
    [junit] NOTE: reproduce with: ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=8909233178291932652:-4859244606911873252
    [junit] The following exceptions were thrown by threads:
    [junit] *** Thread: Thread-29 ***
    [junit] java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:333)
    [junit] Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
    [junit]     at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:328)
    [junit] Caused by: org.apache.solr.common.SolrException: Carrot2 clustering failed  org.apache.solr.common.SolrExcep
tion: Carrot2 clustering failed         at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(Car
rotClusteringEngine.java:110)   at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponen
t.java:167)     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)    at org.a
pache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)        at org.apache.solr.core.SolrCore
.execute(SolrCore.java:1296)    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)     at org.mortbay.jetty.servlet.Ser
vletHandler$CachedChain.doFilter(ServletHandler.java:1212)      at org.mortbay.jetty.servlet.ServletHandler.handle(Servl
etHandler.java:399)     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)     at org.mortbay.j
etty.handler.ContextHandler.handle(ContextHandler.java:766)     at org.mortbay.jetty.handler.HandlerWrapper.handle(Handl
erWrapper.java:152)     at org.mortbay.jetty.Server.handle(Server.java:326)     at org.mortbay.jetty.HttpConnection.hand
leRequest(HttpConnection.java:542)      at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection
.java:928)      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)          at org.mortbay.jetty.HttpParser.
parseAvailable(HttpParser.java:212)     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)     at org.m
ortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)       at org.mortbay.thread.QueuedThreadPool$P
oolThread.run(QueuedThreadPool.java:582)  Caused by: java.lang.IndexOutOfBoundsException        at java.io.StringReader.
read(StringReader.java:76)      at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespace
TokenizerImpl.java:557)         at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhites
paceTokenizerImpl
    [junit]
    [junit] Carrot2 clustering failed  org.apache.solr.common.SolrException: Carrot2 clustering failed          at org.a
pache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:110)   at org.apache.so
lr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)     at org.apache.solr.handler.compo
nent.SearchHandler.handleRequestBody(SearchHandler.java:336)    at org.apache.solr.handler.RequestHandlerBase.handleRequ
est(RequestHandlerBase.java:129)        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)    at org.apache.so
lr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)      at org.apache.solr.servlet.SolrDispatchFilter.do
Filter(SolrDispatchFilter.java:240)     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.
java:1212)      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)     at org.mortbay.jetty.ser
vlet.SessionHandler.handle(SessionHandler.java:182)     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandle
r.java:766)     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)     at org.mortbay.jetty.Ser
ver.handle(Server.java:326)     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)      at org.m
ortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)      at org.mortbay.jetty.HttpParser.
parseNext(HttpParser.java:549)          at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)     at org.m
ortbay.jetty.HttpConnection.handle(HttpConnection.java:404)     at org.mortbay.jetty.bio.SocketConnector$Connection.run(
SocketConnector.java:228)       at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)  Caused
 by: java.lang.IndexOutOfBoundsException        at java.io.StringReader.read(StringReader.java:76)      at org.carrot2.t
ext.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespaceTokenizerImpl.java:557)     at org.carrot2.text.
analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl
    [junit]
    [junit] request: http://localhost:65510/solr/select?clustering=true&q=*:*&sort=id desc&clustering.results=true&shards=localhost:65509/solr,[::1]:33332/solr|localhost:65510/solr&wt=javabin&version=2
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    [junit]     ... 2 more
    [junit] *** Thread: Thread-27 ***
    [junit] java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:333)
    [junit] Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
    [junit]     at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:328)
    [junit] Caused by: org.apache.solr.common.SolrException: Carrot2 clustering failed  org.apache.solr.common.SolrExcep
tion: Carrot2 clustering failed         at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(Car
rotClusteringEngine.java:110)   at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponen
t.java:167)     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)    at org.a
pache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)        at org.apache.solr.core.SolrCore
.execute(SolrCore.java:1296)    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)     at org.mortbay.jetty.servlet.Ser
vletHandler$CachedChain.doFilter(ServletHandler.java:1212)      at org.mortbay.jetty.servlet.ServletHandler.handle(Servl
etHandler.java:399)     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)     at org.mortbay.j
etty.handler.ContextHandler.handle(ContextHandler.java:766)     at org.mortbay.jetty.handler.HandlerWrapper.handle(Handl
erWrapper.java:152)     at org.mortbay.jetty.Server.handle(Server.java:326)     at org.mortbay.jetty.HttpConnection.hand
leRequest(HttpConnection.java:542)      at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection
.java:928)      at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)          at org.mortbay.jetty.HttpParser.
parseAvailable(HttpParser.java:212)     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)     at org.m
ortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)       at org.mortbay.thread.QueuedThreadPool$P
oolThread.run(QueuedThreadPool.java:582)  Caused by: java.lang.IndexOutOfBoundsException        at java.io.StringReader.
read(StringReader.java:76)      at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespace
TokenizerImpl.java:557)         at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhites
paceTokenizerImpl
    [junit]
    [junit] Carrot2 clustering failed  org.apache.solr.common.SolrException: Carrot2 clustering failed          at org.a
pache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:110)   at org.apache.so
lr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)     at org.apache.solr.handler.compo
nent.SearchHandler.handleRequestBody(SearchHandler.java:336)    at org.apache.solr.handler.RequestHandlerBase.handleRequ
est(RequestHandlerBase.java:129)        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)    at org.apache.so
lr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)      at org.apache.solr.servlet.SolrDispatchFilter.do
Filter(SolrDispatchFilter.java:240)     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.
java:1212)      at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)     at org.mortbay.jetty.ser
vlet.SessionHandler.handle(SessionHandler.java:182)     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandle
r.java:766)     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)     at org.mortbay.jetty.Ser
ver.handle(Server.java:326)     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)      at org.m
ortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)      at org.mortbay.jetty.HttpParser.
parseNext(HttpParser.java:549)          at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)     at org.m
ortbay.jetty.HttpConnection.handle(HttpConnection.java:404)     at org.mortbay.jetty.bio.SocketConnector$Connection.run(
SocketConnector.java:228)       at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)  Caused
 by: java.lang.IndexOutOfBoundsException        at java.io.StringReader.read(StringReader.java:76)      at org.carrot2.t
ext.analysis.ExtendedWhitespaceTokenizerImpl.zzRefill(ExtendedWhitespaceTokenizerImpl.java:557)     at org.carrot2.text.
analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl
    [junit]
    [junit] request: http://localhost:65510/solr/select?clustering=true&q=*:*&sort=id desc&clustering.results=true&shards=localhost:65509/solr,[::1]:33332/solr|localhost:65510/solr&wt=javabin&version=2
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    [junit]     ... 2 more
    [junit] WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-5-thread-1,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-5-thread-2,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-5-thread-3,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-6-thread-1,5,main]
    [junit] RESOURCE LEAK: test class left 5 thread(s) running
    [junit] NOTE: test params are: codec=PreFlex, locale=zh, timezone=America/Indiana/Vincennes
    [junit] NOTE: all tests run in this JVM:
    [junit] [DistributedClusteringComponentTest]
    [junit] NOTE: Windows Vista 6.0 x86/Sun Microsystems Inc. 1.6.0_23 (32-bit)/cpus=4,threads=4,free=5267640,total=16384000
    [junit] ------------- ---------------- ---------------
    [junit]
    [junit] Testcase: testDistribSearch took 6.559 sec
    [junit]     FAILED
    [junit] Some threads threw uncaught exceptions!
    [junit] junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1127)
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1059)
    [junit]     at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:511)
    [junit]     at org.apache.solr.SolrTestCaseJ4.tearDown(SolrTestCaseJ4.java:81)
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase.tearDown(BaseDistributedSearchTestCase.java:153)
    [junit]
    [junit] Test org.apache.solr.handler.clustering.DistributedClusteringComponentTest FAILED

BUILD FAILED
{noformat}

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980881#action_12980881 ] 

Stanislaw Osinski commented on SOLR-2282:
-----------------------------------------

Sure, I'll take a look at it tomorrow morning. 

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Resolved: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi resolved SOLR-2282.
----------------------------------

    Resolution: Fixed

Thanks everyone!

trunk: Committed revision 1059426.
3x: Committed revision 1059428.


> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-concurrency-branch_3x.patch, SOLR-2282-concurrency-trunk.patch, SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Issue Comment Edited: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981168#action_12981168 ] 

Stanislaw Osinski edited comment on SOLR-2282 at 1/13/11 3:19 AM:
------------------------------------------------------------------

Hi Robert,

What's the configuration (OS / JVM) on which the test is failing for you? I can't get it to fail on my machines (Win 7 64-bit with Sun JVM 1.6.0_20 Client VM and Oracle 1.6.0_23 Server VM, Ubuntu 64-bit with Sun JVM 1.6.0_20 Server VM). I'm running the test using the command I found in Hudson logs (ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=412049972111174180:6405396687385598457 -Dtests.multiplier=3).

S.

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979555#action_12979555 ] 

Stanislaw Osinski commented on SOLR-2282:
-----------------------------------------

Thanks for committing SOLR-2296, Koji! When I now run the test, I'm getting a different exception, which looks like some misconfiguration of the test itself:

{code}
test:
    [junit] Testsuite: org.apache.solr.handler.clustering.DistributedClusteringComponentTest
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 4,539 sec
    [junit] ------------- Standard Error -----------------
    [junit] 10 Januari 2011 2:59:26 PM org.apache.solr.common.SolrException log
    [junit] SEVERE: org.apache.solr.common.SolrException: ERROR:unknown field 'url'
    [junit]     at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:321)
    [junit]     at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
    [junit]     at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:119)
    [junit]     at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    [junit]     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    [junit]     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    [junit]     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    [junit]     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    [junit]     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    [junit]     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    [junit]     at org.mortbay.jetty.Server.handle(Server.java:326)
    [junit]     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    [junit]     at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
    [junit]     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:843)
    [junit]     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    [junit]     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    [junit]     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    [junit]     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    [junit]
    [junit] NOTE: reproduce with: ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=412049972111174180:6405396687385598457 -Dtests.multiplier=3
{code}

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980741#action_12980741 ] 

Koji Sekiguchi commented on SOLR-2282:
--------------------------------------

I've committed the fix for "unknown field 'url'".

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-2282:
------------------------------

    Attachment: SOLR-2282_test.patch

Here's a patch to fix BaseDistributedSearchTestCase so that clustering and other contribs can set their own home and use it.

This fixes the unknown field problem, but I'm still seeing the zzBuffer array index out of bounds exception. Perhaps my checkout is somehow out of date; maybe you can test the patch?


> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stanislaw Osinski updated SOLR-2282:
------------------------------------

    Attachment: SOLR-2282-concurrency-branch_3x.patch
                SOLR-2282-concurrency-trunk.patch

Thanks for debugging this, Dawid! I think solution 2) you suggested would be the best because it applies both to version 3.4.2 of Carrot2 (currently used by Solr) and the 3.5.0 version (not yet released).

I'm attaching patches for Solr trunk and branch_3x that fix the concurrency issue and correct a typo in a log message output by {{LuceneLanguageModelFactory}}.
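
For readers without the patches at hand, here is a rough sketch of one common way to keep a stateful scanner safe under concurrent requests: give every thread its own instance. It does not reproduce the attached patches, and "solution 2)" is not spelled out in this message, so this may not be what they do. {{StatefulScanner}} and {{PerThreadTokenizerSketch}} are hypothetical names made up for illustration; they are not Carrot2 or Solr classes.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a generated scanner with internal state; not a Carrot2 class. */
final class StatefulScanner {
    // The buffer is reused between calls, so one instance must not be used by two threads at once.
    private final char[] buffer = new char[16];
    private Reader input;

    void reset(Reader newInput) {
        this.input = newInput;
    }

    /** Splits the current input on whitespace. */
    List<String> tokens() throws IOException {
        List<String> out = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        int n;
        while ((n = input.read(buffer, 0, buffer.length)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buffer[i];
                if (Character.isWhitespace(c)) {
                    if (current.length() > 0) {
                        out.add(current.toString());
                        current.setLength(0);
                    }
                } else {
                    current.append(c);
                }
            }
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }
}

/** Assumed illustration only: per-thread scanner instances via ThreadLocal. */
final class PerThreadTokenizerSketch {
    // Each thread lazily gets its own scanner, so no buffer is shared between requests.
    private static final ThreadLocal<StatefulScanner> SCANNER = new ThreadLocal<StatefulScanner>() {
        @Override
        protected StatefulScanner initialValue() {
            return new StatefulScanner();
        }
    };

    static List<String> tokenize(String text) throws IOException {
        StatefulScanner scanner = SCANNER.get();
        scanner.reset(new StringReader(text));
        return scanner.tokens();
    }

    /** Tokenizes the same text from several threads and returns every thread's result. */
    static List<List<String>> tokenizeConcurrently(final String text, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<Future<List<String>>>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(new Callable<List<String>>() {
                    public List<String> call() throws IOException {
                        return tokenize(text);
                    }
                }));
            }
            List<List<String>> results = new ArrayList<List<String>>();
            for (Future<List<String>> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tokenizeConcurrently("distributed search result clustering", 4));
    }
}
```

The trade-off in this sketch is memory for safety: one scanner per worker thread instead of one shared instance, with no locking on the request path.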

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-concurrency-branch_3x.patch, SOLR-2282-concurrency-trunk.patch, SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981453#action_12981453 ] 

Dawid Weiss commented on SOLR-2282:
-----------------------------------

I confirm this must be something related to concurrency, although from whitebox code review I have no clue how this can happen. Seems like a long, fascinating weekend is waiting for me (I am busy tomorrow and won't be able to look into it). What is weird is that we're running this code on our demo server, we do have parallel stress tests and still this happens only here. Life.

{noformat}
test:
    [junit] Testsuite: org.apache.solr.handler.clustering.DistributedClusteringComponentTest
    [junit] Tests run: 1, Failures: 1, Errors: 0, Time elapsed: 12.311 sec
    [junit] ------------- Standard Error -----------------
    [junit] 2011-1-13 20:05:39 org.apache.solr.common.SolrException log
    [junit] SEVERE: java.lang.Error: Error: could not match input
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzScanError(ExtendedWhitespaceTokenizerImpl.java:687)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:836)
    [junit]     at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
    [junit]     at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
    [junit]     at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
    [junit]     at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
    [junit]     at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
    [junit]     at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
    [junit]     at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:347)
    [junit]     at org.carrot2.core.Controller.process(Controller.java:239)
    [junit]     at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
    [junit]     at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
    [junit]     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
    [junit]     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    [junit]     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    [junit]     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:240)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    [junit]     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    [junit]     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    [junit]     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    [junit]     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    [junit]     at org.mortbay.jetty.Server.handle(Server.java:326)
    [junit]     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    [junit]     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    [junit]     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    [junit]     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    [junit]     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    [junit]     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    [junit]     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    [junit]
    [junit] NOTE: reproduce with: ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=8909233178291932652:-4859244606911873252
    [junit] The following exceptions were thrown by threads:
    [junit] *** Thread: Thread-28 ***
    [junit] junit.framework.AssertionFailedError: .clusters.length:4!=5
    [junit]     at junit.framework.Assert.fail(Assert.java:47)
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase.compareResponses(BaseDistributedSearchTestCase.java:562)
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:330)
    [junit] *** Thread: Thread-27 ***
    [junit] java.lang.RuntimeException: org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:333)
    [junit] Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
    [junit]     at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
    [junit]     at org.apache.solr.BaseDistributedSearchTestCase$5.run(BaseDistributedSearchTestCase.java:328)
    [junit] Caused by: org.apache.solr.common.SolrException: Error: could not match input  java.lang.Error: Error: could not match input
        at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzScanError(ExtendedWhitespaceTokenizerImpl.java:687)
        at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:836)
        at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
        at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
        at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
        at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
        at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
        at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
        at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
        at org.carrot2.core.Controller.process(Controller.java:347)
        at org.carrot2.core.Controller.process(Controller.java:239)
        at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
        at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
        at org.apache.solr.servle
    [junit]
    [junit] Error: could not match input  java.lang.Error: Error: could not match input
        at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.zzScanError(ExtendedWhitespaceTokenizerImpl.java:687)
        at org.carrot2.text.analysis.ExtendedWhitespaceTokenizerImpl.getNextToken(ExtendedWhitespaceTokenizerImpl.java:836)
        at org.carrot2.text.analysis.ExtendedWhitespaceTokenizer.nextToken(ExtendedWhitespaceTokenizer.java:46)
        at org.carrot2.text.preprocessing.Tokenizer.tokenize(Tokenizer.java:147)
        at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.preprocess(CompletePreprocessingPipeline.java:54)
        at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.preprocess(BasicPreprocessingPipeline.java:92)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.cluster(LingoClusteringAlgorithm.java:198)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.access$000(LingoClusteringAlgorithm.java:43)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm$1.process(LingoClusteringAlgorithm.java:177)
        at org.carrot2.text.clustering.MultilingualClustering.clusterByLanguage(MultilingualClustering.java:223)
        at org.carrot2.text.clustering.MultilingualClustering.process(MultilingualClustering.java:111)
        at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.process(LingoClusteringAlgorithm.java:170)
        at org.carrot2.core.ControllerUtils.performProcessing(ControllerUtils.java:102)
        at org.carrot2.core.Controller.process(Controller.java:347)
        at org.carrot2.core.Controller.process(Controller.java:239)
        at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:106)
        at org.apache.solr.handler.clustering.ClusteringComponent.finishStage(ClusteringComponent.java:167)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:336)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1296)
        at org.apache.solr.servle
    [junit]
    [junit] request: http://localhost:1550/solr/select?clustering=true&q=*:*&sort=id desc&clustering.results=true&shards=localhost:1549/solr,[::1]:33332/solr|localhost:1550/solr&wt=javabin&version=2
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435)
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    [junit]     ... 2 more
    [junit] WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-5-thread-1,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-5-thread-2,5,main]
    [junit] WARNING: test class left thread running: Thread[pool-6-thread-1,5,main]
    [junit] RESOURCE LEAK: test class left 4 thread(s) running
    [junit] NOTE: test params are: codec=PreFlex, locale=zh, timezone=Europe/London
    [junit] NOTE: all tests run in this JVM:
    [junit] [DistributedClusteringComponentTest]
    [junit] NOTE: Windows 7 6.1 amd64/Sun Microsystems Inc. 1.6.0_20 (64-bit)/cpus=8,threads=4,free=129024280,total=219873280
{noformat}

> Distributed Support for Search Result Clustering
> ------------------------------------------------
>
>                 Key: SOLR-2282
>                 URL: https://issues.apache.org/jira/browse/SOLR-2282
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - Clustering
>    Affects Versions: 1.4, 1.4.1
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2282-diagnostics.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282.patch, SOLR-2282_test.patch
>
>
> Brad Giaccio contributed a patch for this in SOLR-769. I'd like to incorporate it.



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981651#action_12981651 ] 

Dawid Weiss commented on SOLR-2282:
-----------------------------------

This tests.iter is exactly what I will need :) I'll most likely weave a runtime aspect into the code to detect when two threads enter the same critical section. Again, from a whitebox review it seems impossible, but then actually detecting and fixing impossible things is what we love in our profession...
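
A minimal sketch of the kind of check described above, under stated assumptions: instead of a woven aspect it uses a plain guard object called by hand at the entry and exit of the suspect section. The class name CriticalSectionGuard and its methods are hypothetical and are not part of Solr or Carrot2; the guard is not reentrant-aware, so a nested entry by the same thread would also be counted.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical diagnostic helper: counts every attempt to enter a section
 * while another thread (or the same thread, nested) is still inside it.
 */
final class CriticalSectionGuard {
    // Thread currently inside the section, or null when it is free.
    private final AtomicReference<Thread> owner = new AtomicReference<>();
    private final AtomicInteger violations = new AtomicInteger();

    /** Call on entry. Returns true if the section was free. */
    boolean enter() {
        if (owner.compareAndSet(null, Thread.currentThread())) {
            return true;
        }
        violations.incrementAndGet();
        return false;
    }

    /** Call on exit (in a finally block) with the value enter() returned. */
    void exit(boolean entered) {
        if (entered) {
            owner.compareAndSet(Thread.currentThread(), null);
        }
    }

    /** Number of overlapping entries observed so far. */
    int violations() {
        return violations.get();
    }
}
```

The intended use would be `boolean in = guard.enter(); try { ... } finally { guard.exit(in); }` around the code under suspicion, with the test asserting that `violations()` is still zero after many iterations.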



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Dawid Weiss (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980936#action_12980936 ] 

Dawid Weiss commented on SOLR-2282:
-----------------------------------

Robert, can you somehow check if it's the input that's causing these errors?

SEVERE: java.lang.Error: Error: could not match input

I don't have any idea when such an error could happen, but it doesn't seem to be related to concurrency (at first glance).
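
One way to separate the two hypotheses, sketched under assumptions: run the same analysis step over the test documents on a single thread and record which inputs fail. The class InputCheck and the `analyze` callback are hypothetical stand-ins for whatever code invokes the tokenizer; no Carrot2 API is used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical single-threaded harness; not part of Solr or Carrot2. */
final class InputCheck {
    /**
     * Applies analyze to each text on the calling thread and returns the
     * texts for which it threw. Error is caught as well as RuntimeException
     * because the failure in the log is a java.lang.Error.
     */
    static List<String> failingInputs(List<String> texts, Consumer<String> analyze) {
        List<String> failed = new ArrayList<>();
        for (String text : texts) {
            try {
                analyze.accept(text);
            } catch (RuntimeException | Error e) {
                failed.add(text);
            }
        }
        return failed;
    }
}
```

If the list comes back empty for the documents used by the failing test, the input alone probably does not trigger the error, which would point back towards shared state between threads.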



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980622#action_12980622 ] 

Koji Sekiguchi commented on SOLR-2282:
--------------------------------------

bq. When I now run the test, I'm getting a different exception, which looks like some misconfiguration of the test itself: 

Confirmed.

contrib/clustering/build.xml seems to have been changed in SOLR-2299, but I'm not sure what the cause of the failure is.



[jira] Updated: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2282:
---------------------------------

    Attachment: SOLR-2282.patch

Forgot to svn add a file...



[jira] Commented: (SOLR-2282) Distributed Support for Search Result Clustering

Posted by "Stanislaw Osinski (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981168#action_12981168 ] 

Stanislaw Osinski commented on SOLR-2282:
-----------------------------------------

Hi Robert,

What's the configuration (OS / JVM) on which the test is failing for you? I can't get it to fail on my machines (Win 7 64-bit with Sun JVM 1.6.0_20 and Oracle 1.6.0_23, Ubuntu 64-bit with Sun JVM 1.6.0_20). I'm running the test using the command I found in Hudson logs (ant test -Dtestcase=DistributedClusteringComponentTest -Dtestmethod=testDistribSearch -Dtests.seed=412049972111174180:6405396687385598457 -Dtests.multiplier=3).

S.
