Posted to user@mahout.apache.org by Grant Ingersoll <gs...@apache.org> on 2009/08/18 15:55:41 UTC

Re: Validating clustering output

On Jul 27, 2009, at 9:42 PM, Ted Dunning wrote:

> The other reference I am looking for may be in David MacKay's book.  The
> idea is that you measure the quality of the approximation by looking at the
> entropy in the cluster assignment relative to the residual required to
> precisely specify the original data relative to the quantized value.
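
One possible reading of this idea is a two-part description length: the bits needed to state each item's cluster, plus the bits needed to correct the cluster centroid back to the original value. The sketch below is only an illustration under stated assumptions, not the method from the book: it handles 1-D points, codes all residuals with a single Gaussian whose variance is estimated from the data, and records values to a fixed `precision`. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def assignment_bits(labels):
    """Bits needed to state every item's cluster: N * H(labels)."""
    n = len(labels)
    counts = Counter(labels)
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())
    return n * entropy


def residual_bits(points, labels, precision=0.1):
    """Bits needed to correct each centroid back to the original value.

    Assumption: residuals follow one Gaussian and are stored to `precision`.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for x, k in zip(points, labels):
        sums[k] += x
        counts[k] += 1
    centroids = {k: sums[k] / counts[k] for k in counts}
    residuals = [x - centroids[k] for x, k in zip(points, labels)]
    variance = sum(r * r for r in residuals) / len(residuals)
    # Floor the variance at the quantization noise level to avoid log(0).
    variance = max(variance, precision ** 2 / 12)
    per_point = 0.5 * math.log2(2 * math.pi * math.e * variance) - math.log2(precision)
    return len(points) * per_point


def description_length(points, labels, precision=0.1):
    """Total cost: cluster assignments plus residuals."""
    return assignment_bits(labels) + residual_bits(points, labels, precision)


points = [-1.0, 1.0, 9.0, 11.0]
print(description_length(points, [0, 0, 1, 1]))  # two clusters
print(description_length(points, [0, 0, 0, 0]))  # one cluster
```

For these four points the two-cluster labeling gives the shorter total: the 4 extra bits spent on assignments are more than repaid by the smaller residuals.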

Is the W. M. Rand paper in JSTOR ("Objective Criteria for the Evaluation of
Clustering Methods") worthwhile on this topic?  Basic searches for
"evaluating clustering" or "cluster evaluation" on Google Scholar turn
up very little.  The Rand paper is from 1971, but who knows...

Of course, I'd like something that doesn't require purchase (sigh.)

Re: Validating clustering output

Posted by Ted Dunning <te...@gmail.com>.
These all depend on gold standards.  If you have those, then it is easy to
evaluate a clustering.
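
With a gold standard in hand, one common score is the Rand index from the 1971 Rand paper mentioned above: the fraction of item pairs on which the gold labeling and the clustering agree (both put the pair together, or both keep it apart). A minimal sketch, with a hypothetical function name and toy labels:

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two labelings agree.

    A pair agrees when both labelings put the two items in the same
    cluster, or both put them in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


gold = ["a", "a", "a", "b", "b", "b"]   # toy gold-standard classes
found = [0, 0, 1, 1, 2, 2]              # toy clustering output
print(rand_index(gold, found))
```

In this toy case 10 of the 15 pairs agree. The score depends only on which items are grouped together, so renaming the cluster labels does not change it.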

What is hard is to evaluate a clustering without a standard.  I have done
this, somewhat, in the past by looking at stability over time in terms of
cluster size and membership.  I have also looked at the utility of cluster
membership in predicting objective attributes not used in the clustering.
The stability criteria might apply to some of our data sets.  The utility
measure only works in a modeling setting.
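
Stability of membership over time can be scored in many ways. The sketch below shows one of them and is not necessarily the measure used in the past work described here: each earlier cluster is matched to the later cluster it overlaps most (by Jaccard similarity), and the best overlaps are averaged. The function names and the toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def membership_stability(clusters_then, clusters_now):
    """Average best-match Jaccard overlap of earlier clusters with later ones.

    Each argument is a list of sets of item ids. The result lies in [0, 1];
    1.0 means every earlier cluster reappears with identical membership.
    """
    if not clusters_then or not clusters_now:
        raise ValueError("both clusterings must contain at least one cluster")
    best = [max(jaccard(c, d) for d in clusters_now) for c in clusters_then]
    return sum(best) / len(best)


last_week = [{1, 2, 3}, {4, 5}, {6, 7, 8, 9}]
this_week = [{1, 2}, {3, 4, 5}, {6, 7, 8, 9}]
print(membership_stability(last_week, this_week))
```

The score is not symmetric: it asks how well the earlier clusters survive, so swapping the arguments can give a different value.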

On Tue, Aug 18, 2009 at 7:32 AM, Grant Ingersoll <gs...@apache.org> wrote:

> Also found:
> http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
>
>
> On Aug 18, 2009, at 9:55 AM, Grant Ingersoll wrote:
>
>
>> On Jul 27, 2009, at 9:42 PM, Ted Dunning wrote:
>>
>>> The other reference I am looking for may be in David MacKay's book.  The
>>> idea is that you measure the quality of the approximation by looking at
>>> the entropy in the cluster assignment relative to the residual required to
>>> precisely specify the original data relative to the quantized value.
>>
>> Is the W. M. Rand paper in JSTOR ("Objective Criteria for the Evaluation of
>> Clustering Methods") worthwhile on this topic?  Basic searches for
>> "evaluating clustering" or "cluster evaluation" on Google Scholar turn up
>> very little.  The Rand paper is from 1971, but who knows...
>>
>> Of course, I'd like something that doesn't require purchase (sigh.)


-- 
Ted Dunning, CTO
DeepDyve

Re: AW: Validating clustering output

Posted by Grant Ingersoll <gs...@apache.org>.
Hi Benjamin,

Please start a separate thread with an appropriate subject, as you  
will be much more likely to get answers for your question.

-Grant

On Aug 18, 2009, at 11:37 AM, Benjamin Dageroth wrote:

> I just installed Mahout on my Windows machine and wanted to try out
> the Taste example with the GroupLens data. Although I seem to have
> done everything according to the instructions at
> http://lucene.apache.org/mahout/taste.html#demo, I cannot get the
> webapp running and get a 503 message: Service Unavailable. When
> starting Jetty, the servlet container accompanying the demo, it runs
> through and reports that it started the Jetty server, but during
> startup it logs an exception, which I suppose is the culprit.
>
> java.net.URISyntaxException: Illegal character in path at index 18:
> file:/C:/Dokumente und Einstellungen/bda/.m2/repository/org/mortbay/jetty/jetty-maven-plugin/7.0.0.1beta3/jetty-maven-plugin-7.0.0.1beta3.jar
> at java.net.URI$Parser.fail(URI.java:2809)
> at java.net.URI$Parser.checkChars(URI.java:2982)
> at java.net.URI$Parser.parseHierarchical(URI.java:3066)
> at java.net.URI$Parser.parse(URI.java:3014)
> at java.net.URI.<init>(URI.java:578)
> at java.net.URL.toURI(URL.java:918)
> ...
> Etc.
> The complete log of the startup process can be found further down. I
> would guess that the spaces in the path are the problem, but I am not
> sure what I can do about that, since Maven looks in the user's home
> directory, which is always located under
> C:\Dokumente und Einstellungen\.
>
> Any idea where I can change the path, in case this is indeed the
> problem? Otherwise, what is my problem? ;-)
>
> Thanks a lot,
>
> Benjamin
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


RE: AW: Validating clustering output

Posted by Jack Tanner <ih...@hotmail.com>.
As Grant said, please start new threads for new questions.

Aside from that, this is apparently a known issue in Maven/Jetty:

https://issues.sonatype.org/browse/MVNDEF-114
http://jira.codehaus.org/browse/JETTY-1063

One workaround is to set a localRepository path that contains no spaces, for
example via the <localRepository> element in Maven's settings.xml.



AW: Validating clustering output

Posted by Benjamin Dageroth <Be...@webtrekk.com>.
I just installed Mahout on my Windows machine and wanted to try out the Taste example with the GroupLens data. Although I seem to have done everything according to the instructions at http://lucene.apache.org/mahout/taste.html#demo, I cannot get the webapp running and get a 503 message: Service Unavailable. When starting Jetty, the servlet container accompanying the demo, it runs through and reports that it started the Jetty server, but during startup it logs an exception, which I suppose is the culprit.

java.net.URISyntaxException: Illegal character in path at index 18: file:/C:/Dokumente und Einstellungen/bda/.m2/repository/org/mortbay/jetty/jetty-maven-plugin/7.0.0.1beta3/jetty-maven-plugin-7.0.0.1beta3.jar
at java.net.URI$Parser.fail(URI.java:2809)
at java.net.URI$Parser.checkChars(URI.java:2982)
at java.net.URI$Parser.parseHierarchical(URI.java:3066)
at java.net.URI$Parser.parse(URI.java:3014)
at java.net.URI.<init>(URI.java:578)
at java.net.URL.toURI(URL.java:918)
...
Etc.
The complete log of the startup process can be found further down. I would guess that the spaces in the path are the problem, but I am not sure what I can do about that, since Maven looks in the user's home directory, which is always located under C:\Dokumente und Einstellungen\.
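
The "index 18" in the exception message can be checked by hand: it is the position of the first space in the string being parsed. The failing code is Java's java.net.URI parser, which does not accept an unencoded space in a path. Python is used below only to illustrate the arithmetic and what a percent-encoded form of the same location looks like. The helper names are hypothetical, and the path is shortened from the one in the log.

```python
from urllib.parse import quote


def first_space_index(uri_text):
    """Position of the first raw space, or -1 if there is none."""
    return uri_text.find(" ")


def percent_encode_file_uri(uri_text):
    """Percent-encode the path part of a 'file:' URI string.

    '/' and ':' are kept as-is so the drive letter and separators survive.
    """
    scheme, sep, path = uri_text.partition(":")
    return scheme + sep + quote(path, safe="/:")


raw = "file:/C:/Dokumente und Einstellungen/bda/.m2/repository"
print(first_space_index(raw))
print(percent_encode_file_uri(raw))
```

The encoded form uses %20 for each space, the same encoding that appears in the log for the workspace path (Mahout%20for%20Zanox).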

Any idea where I can change the path, in case this is indeed the problem? Otherwise, what is my problem? ;-)

Thanks a lot,

Benjamin

------------------------------------------------------------------
Complete Log:
$ /cygdrive/c/workspace/maven/apache-maven-2.2.0/bin/mvn jetty:run-war
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Building Mahout Taste Webapp
[INFO]    task-segment: [jetty:run-war]
[INFO] ------------------------------------------------------------------------
[INFO] Preparing jetty:run-war
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4 resources
[INFO] Copying 1 resource to c:\workspace\Mahout for Zanox\taste-web\target/mahout-taste-webapp-0.2-SNAPSHOT/WEB-INF/lib
[INFO] [resources:copy-resources {execution: copy-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 3 resources
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Nothing to compile - all classes are up to date
[INFO] [resources:testResources {execution: default-testResources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory c:\workspace\Mahout for Zanox\taste-web\src\test\resources
[INFO] [compiler:testCompile {execution: default-testCompile}]
[INFO] Nothing to compile - all classes are up to date
[INFO] [surefire:test {execution: default-test}]
[INFO] No tests to run.
[INFO] [war:war {execution: default-war}]
[INFO] Packaging webapp
[INFO] Assembling webapp[mahout-taste-webapp] in [c:\workspace\Mahout for Zanox\taste-web\target\mahout-taste-webapp-0.2-SNAPSHOT]
[INFO] Dependency[Dependency {groupId=org.apache.mahout, artifactId=mahout-core, version=0.2-SNAPSHOT, type=jar}] has changed (was Dependency {groupId=org.apache.mahout, artifactId=mahout-core, version=0.2-SNAPSHOT, type=jar}).
[INFO] Dependency[Dependency {groupId=axis, artifactId=axis, version=1.4, type=jar}] has changed (was Dependency {groupId=axis, artifactId=axis, version=1.4, type=jar}).
[INFO] Dependency[Dependency {groupId=javax.servlet, artifactId=servlet-api, version=2.4, type=jar}] has changed (was Dependency {groupId=javax.servlet, artifactId=servlet-api, version=2.4, type=jar}).
[INFO] Dependency[Dependency {groupId=org.slf4j, artifactId=slf4j-api, version=1.5.6, type=jar}] has changed (was Dependency {groupId=org.slf4j, artifactId=slf4j-api, version=1.5.6, type=jar}).
[INFO] Dependency[Dependency {groupId=org.slf4j, artifactId=slf4j-jcl, version=1.5.6, type=jar}] has changed (was Dependency {groupId=org.slf4j, artifactId=slf4j-jcl, version=1.5.6, type=jar}).
[INFO] Processing war project
[INFO] Copying webapp resources[c:\workspace\Mahout for Zanox\taste-web\src\main\webapp]
[INFO] Webapp assembled in[94 msecs]
[INFO] Building war: c:\workspace\Mahout for Zanox\taste-web\target\mahout-taste-webapp-0.2-SNAPSHOT.war
[INFO] [jetty:run-war {execution: default-cli}]
[INFO] Configuring Jetty for project: Mahout Taste Webapp
2009-08-18 17:29:38.216::INFO:  Logging to STDERR via org.eclipse.jetty.util.log.StdErrLog
[INFO] Context path = /
[INFO] Tmp directory = C:\workspace\Mahout for Zanox\taste-web\target\work
[INFO] Web defaults = org/eclipse/jetty/webapp/webdefault.xml
[INFO] Web overrides =  none
[INFO] Starting jetty 7.0.0.M4 ...
2009-08-18 17:29:38.247::INFO:  jetty-7.0.0.M4
2009-08-18 17:29:38.278::INFO:  Extract C:\workspace\Mahout for Zanox\taste-web\target\mahout-taste-webapp-0.2-SNAPSHOT.war to C:\workspace\Mahout for Zanox\taste-web\target\work\webapp
2009-08-18 17:29:41.106::WARN:  Failed startup of context JettyWebAppContext@4eb585@4eb585/,file:/C:/workspace/Mahout%20for%20Zanox/taste-web/target/work/webapp/,C:\workspace\Mahout for Zanox\taste-web\target\mahout-taste-webapp-0.2-SNAPSHOT.war
java.net.URISyntaxException: Illegal character in path at index 18: file:/C:/Dokumente und Einstellungen/bda/.m2/repository/org/mortbay/jetty/jetty-maven-plugin/7.0.0.1beta3/jetty-maven-plugin-7.0.0.1beta3.jar
        at java.net.URI$Parser.fail(URI.java:2809)
        at java.net.URI$Parser.checkChars(URI.java:2982)
        at java.net.URI$Parser.parseHierarchical(URI.java:3066)
        at java.net.URI$Parser.parse(URI.java:3014)
        at java.net.URI.<init>(URI.java:578)
        at java.net.URL.toURI(URL.java:918)
        at org.eclipse.jetty.webapp.WebInfConfiguration.preConfigure(WebInfConfiguration.java:79)
        at org.mortbay.jetty.plugin.MavenWebInfConfiguration.preConfigure(MavenWebInfConfiguration.java:39)
        at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:343)
        at org.mortbay.jetty.plugin.JettyWebAppContext.doStart(JettyWebAppContext.java:89)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:56)
        at org.eclipse.jetty.server.handler.HandlerCollection.doStart(HandlerCollection.java:164)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:56)
        at org.eclipse.jetty.server.handler.HandlerCollection.doStart(HandlerCollection.java:164)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:56)
        at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:92)
        at org.eclipse.jetty.server.Server.doStart(Server.java:225)
        at org.mortbay.jetty.plugin.JettyServer.doStart(JettyServer.java:69)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:56)
        at org.mortbay.jetty.plugin.AbstractJettyMojo.startJetty(AbstractJettyMojo.java:423)
        at org.mortbay.jetty.plugin.AbstractJettyMojo.execute(AbstractJettyMojo.java:366)
        at org.mortbay.jetty.plugin.JettyRunWarMojo.execute(JettyRunWarMojo.java:68)
        at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:483)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:678)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeStandaloneGoal(DefaultLifecycleExecutor.java:553)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:523)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:371)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:332)
        at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:181)
        at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:356)
        at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:137)
        at org.apache.maven.cli.MavenCli.main(MavenCli.java:362)
        at org.apache.maven.cli.compat.CompatibleMain.main(CompatibleMain.java:41)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
        at org.codehaus.classworlds.Launcher.launch(Launcher.java:255)
        at org.codehaus.classworlds.Launcher.mainWithExitCode(Launcher.java:430)
        at org.codehaus.classworlds.Launcher.main(Launcher.java:375)
2009-08-18 17:29:41.184::INFO:  Started SelectChannelConnector@0.0.0.0:8080
[INFO] Started Jetty Server

_______________________________________
Benjamin Dageroth, Key Account Manager / Software Developer
Webtrekk GmbH
Boxhagener Str. 76-78, 10245 Berlin
phone 030 - 755 415 - 360
fax 030 - 755 415 - 100
benjamin.dageroth@webtrekk.com
http://www.webtrekk.com
Berlin District Court (Amtsgericht Berlin), HRB 93435 B
Managing Director: Christian Sauer


Re: Validating clustering output

Posted by Grant Ingersoll <gs...@apache.org>.
Also found: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html

On Aug 18, 2009, at 9:55 AM, Grant Ingersoll wrote:

>
> On Jul 27, 2009, at 9:42 PM, Ted Dunning wrote:
>
>> The other reference I am looking for may be in David MacKay's book.  The
>> idea is that you measure the quality of the approximation by looking at the
>> entropy in the cluster assignment relative to the residual required to
>> precisely specify the original data relative to the quantized value.
>
> Is the W. M. Rand paper in JSTOR ("Objective Criteria for the Evaluation of
> Clustering Methods") worthwhile on this topic?  Basic searches for
> "evaluating clustering" or "cluster evaluation" on Google Scholar
> turn up very little.  The Rand paper is from 1971, but who knows...
>
> Of course, I'd like something that doesn't require purchase (sigh.)