Posted to solr-dev@lucene.apache.org by "Henri Biestro (JIRA)" <ji...@apache.org> on 2007/04/26 16:20:15 UTC

[jira] Created: (SOLR-215) Multiple Solr Cores

Multiple Solr Cores
-------------------

                 Key: SOLR-215
                 URL: https://issues.apache.org/jira/browse/SOLR-215
             Project: Solr
          Issue Type: Improvement
            Reporter: Henri Biestro
            Priority: Minor


Allow multiple cores in one web application (or one class loader):
This makes it possible to create multiple cores from different config & schema files in the same application.
A side effect is that this also allows multiple, separate indexes.


Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by Will Johnson <wj...@GETCONNECTED.COM>.
> One question I had was about backward compatibility... is there a way
> to register a null or default core that reverts to the original paths?
> Are there any other backward-compatibility gotchas (not related to custom
> Java code)?

I'm very excited about this patch as it would remove my current scheme
of running shell scripts to hot-deploy new Solr webapps on the fly.

Along with registering a default core so that all existing code/tests
continue to work, I think it would be nice to have the core name
specified as a CGI param instead of (or in addition to) a URL path.
Otherwise, large sections of client code (such as solrj/solr#) will need
to be changed.

For example:

http://localhost:8983/solr/select?q=foo&core=core1
http://localhost:8983/solr/update?core=core1 
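(Purely illustrative: a minimal Java sketch of a client issuing such a request; the 'core' parameter here is only the suggestion above, not an existing Solr parameter.)

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLEncoder;

public class CoreParamExample {
  public static void main(String[] args) throws Exception {
    //hypothetical: select the target core via a CGI parameter instead of a URL path
    String q = URLEncoder.encode("foo", "UTF-8");
    URL url = new URL("http://localhost:8983/solr/select?q=" + q + "&core=core1");
    BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line);   //raw XML response from the selected core
    }
    in.close();
  }
}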

- will

[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511454 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri:
I've finally started looking at this.  The latest version of the patch doesn't apply 100% cleanly (e.g. src/java/org/apache/solr/handler/StaxUpdateRequestHandler.java has been replaced by src/java/org/apache/solr/handler/XppUpdateRequestHandler.java) and thus 'ant compile' results in several compilation errors.  You can probably see the same locally, but just in case it makes things easier for you, here is how patching looks for me:

[otis@localhost trunk]$ patch -p0 < solr-215.patch
patching file src/test/org/apache/solr/update/AutoCommitTest.java
Hunk #1 FAILED at 55.
1 out of 2 hunks FAILED -- saving rejects to file src/test/org/apache/solr/update/AutoCommitTest.java.rej
patching file src/test/org/apache/solr/analysis/TestBufferedTokenStream.java
patching file src/test/org/apache/solr/analysis/TestPatternReplaceFilter.java
patching file src/test/org/apache/solr/analysis/TestPhoneticFilter.java
patching file src/test/org/apache/solr/analysis/AnalysisTestCase.java
patching file src/test/org/apache/solr/analysis/TestPatternTokenizerFactory.java
patching file src/test/org/apache/solr/analysis/TestRemoveDuplicatesTokenFilter.java
patching file src/test/org/apache/solr/analysis/TestKeepWordFilter.java
Hunk #1 FAILED at 27.
1 out of 3 hunks FAILED -- saving rejects to file src/test/org/apache/solr/analysis/TestKeepWordFilter.java.rej
patching file src/test/org/apache/solr/analysis/BaseTokenTestCase.java
patching file src/test/org/apache/solr/servlet/SolrRequestParserTest.java
patching file src/test/org/apache/solr/servlet/DirectSolrConnectionTest.java
patching file src/test/org/apache/solr/core/TestConfig.java
patching file src/test/org/apache/solr/core/SolrCoreTest.java
patching file src/test/org/apache/solr/core/RequestHandlersTest.java
patching file src/test/org/apache/solr/core/TestBadConfig.java
patching file src/test/org/apache/solr/schema/BadIndexSchemaTest.java
patching file src/test/org/apache/solr/schema/NotRequiredUniqueKeyTest.java
patching file src/test/org/apache/solr/schema/RequiredFieldsTest.java
patching file src/test/org/apache/solr/schema/IndexSchemaTest.java
patching file src/test/org/apache/solr/BasicFunctionalityTest.java
patching file src/test/org/apache/solr/handler/StandardRequestHandlerTest.java
patching file src/test/org/apache/solr/handler/XmlUpdateRequestHandlerTest.java
Hunk #2 FAILED at 13.
1 out of 2 hunks FAILED -- saving rejects to file src/test/org/apache/solr/handler/XmlUpdateRequestHandlerTest.java.rej
patching file src/test/org/apache/solr/handler/MoreLikeThisHandlerTest.java
patching file src/java/org/apache/solr/schema/IndexSchema.java
Hunk #2 succeeded at 57 (offset 1 line).
Hunk #4 succeeded at 294 (offset 1 line).
Hunk #5 FAILED at 303.
Hunk #6 succeeded at 314 with fuzz 2.
Hunk #7 FAILED at 327.
Hunk #8 succeeded at 458 (offset 3 lines).
Hunk #10 succeeded at 593 (offset 3 lines).
Hunk #12 succeeded at 617 (offset 3 lines).
2 out of 13 hunks FAILED -- saving rejects to file src/java/org/apache/solr/schema/IndexSchema.java.rej
patching file src/java/org/apache/solr/update/UpdateHandler.java
patching file src/java/org/apache/solr/update/DirectUpdateHandler2.java
Hunk #1 succeeded at 607 (offset 11 lines).
patching file src/java/org/apache/solr/update/SolrIndexConfig.java
patching file src/java/org/apache/solr/analysis/PatternTokenizerFactory.java
patching file src/java/org/apache/solr/analysis/TokenizerFactory.java
patching file src/java/org/apache/solr/analysis/PatternReplaceFilterFactory.java
patching file src/java/org/apache/solr/analysis/BaseTokenFilterFactory.java
patching file src/java/org/apache/solr/analysis/TrimFilterFactory.java
patching file src/java/org/apache/solr/analysis/KeepWordFilterFactory.java
patching file src/java/org/apache/solr/analysis/TokenFilterFactory.java
patching file src/java/org/apache/solr/analysis/EnglishPorterFilterFactory.java
Hunk #2 succeeded at 33 with fuzz 1.
patching file src/java/org/apache/solr/analysis/PhoneticFilterFactory.java
patching file src/java/org/apache/solr/analysis/WordDelimiterFilterFactory.java
patching file src/java/org/apache/solr/analysis/SynonymFilterFactory.java
Hunk #2 succeeded at 31 with fuzz 1.
patching file src/java/org/apache/solr/analysis/SnowballPorterFilterFactory.java
patching file src/java/org/apache/solr/analysis/EdgeNGramTokenizerFactory.java
patching file src/java/org/apache/solr/analysis/PhoneticFilter.java
Hunk #1 FAILED at 28.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/analysis/PhoneticFilter.java.rej
patching file src/java/org/apache/solr/analysis/LengthFilterFactory.java
patching file src/java/org/apache/solr/analysis/StopFilterFactory.java
Hunk #2 succeeded at 32 with fuzz 1.
patching file src/java/org/apache/solr/analysis/NGramTokenizerFactory.java
patching file src/java/org/apache/solr/analysis/BaseTokenizerFactory.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #10 FAILED at 261.
Hunk #11 succeeded at 329 (offset 1 line).
Hunk #13 succeeded at 589 (offset 1 line).
Hunk #14 succeeded at 979 (offset 1 line).
1 out of 14 hunks FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/search/CacheConfig.java
Hunk #1 succeeded at 37 with fuzz 2.
patching file src/java/org/apache/solr/search/DocSetHitCollector.java
patching file src/java/org/apache/solr/core/SolrInfoRegistry.java
patching file src/java/org/apache/solr/core/Config.java
Hunk #1 succeeded at 44 with fuzz 2.
patching file src/java/org/apache/solr/core/SolrConfig.java
Hunk #2 succeeded at 44 with fuzz 2.
patching file src/java/org/apache/solr/core/SolrCore.java
Hunk #2 FAILED at 75.
Hunk #3 succeeded at 144 (offset 2 lines).
Hunk #5 succeeded at 177 (offset 2 lines).
Hunk #6 succeeded at 185 with fuzz 2.
Hunk #7 succeeded at 349 (offset 6 lines).
Hunk #9 succeeded at 479 (offset 13 lines).
Hunk #11 succeeded at 629 (offset 13 lines).
Hunk #13 succeeded at 768 (offset 13 lines).
Hunk #15 succeeded at 876 (offset 13 lines).
Hunk #16 FAILED at 896.
Hunk #17 FAILED at 906.
3 out of 17 hunks FAILED -- saving rejects to file src/java/org/apache/solr/core/SolrCore.java.rej
patching file src/java/org/apache/solr/core/RequestHandlers.java
Hunk #1 FAILED at 45.
Hunk #3 FAILED at 128.
Hunk #4 FAILED at 153.
Hunk #5 FAILED at 193.
Hunk #6 succeeded at 201 with fuzz 1 (offset -23 lines).
4 out of 7 hunks FAILED -- saving rejects to file src/java/org/apache/solr/core/RequestHandlers.java.rej
patching file src/java/org/apache/solr/core/AbstractSolrEventListener.java
patching file src/java/org/apache/solr/core/QuerySenderListener.java
Hunk #1 succeeded at 31 with fuzz 1.
patching file src/java/org/apache/solr/core/RunExecutableListener.java
patching file src/java/org/apache/solr/request/XSLTResponseWriter.java
patching file src/java/org/apache/solr/request/StandardRequestHandler.java
patching file src/java/org/apache/solr/request/DisMaxRequestHandler.java
patching file src/java/org/apache/solr/tst/OldRequestHandler.java
patching file src/java/org/apache/solr/tst/TestRequestHandler.java
patching file src/java/org/apache/solr/handler/RequestHandlerBase.java
patching file src/java/org/apache/solr/handler/CSVRequestHandler.java
patching file src/java/org/apache/solr/handler/StandardRequestHandler.java
Hunk #1 succeeded at 61 (offset 1 line).
patching file src/java/org/apache/solr/handler/admin/PropertiesRequestHandler.java
patching file src/java/org/apache/solr/handler/admin/LukeRequestHandler.java
Hunk #1 succeeded at 43 (offset 1 line).
Hunk #2 succeeded at 85 (offset -1 lines).
patching file src/java/org/apache/solr/handler/admin/PluginInfoHandler.java
patching file src/java/org/apache/solr/handler/admin/ThreadDumpHandler.java
patching file src/java/org/apache/solr/handler/admin/SystemInfoHandler.java
can't find file to patch at input line 2931
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|Index: src/java/org/apache/solr/handler/StaxUpdateRequestHandler.java
|===================================================================
|--- src/java/org/apache/solr/handler/StaxUpdateRequestHandler.java     (revision 548274)
|+++ src/java/org/apache/solr/handler/StaxUpdateRequestHandler.java     (working copy)
--------------------------
File to patch: src/java/org/apache/solr/handler/StaxUpdateRequestHandler.java
src/java/org/apache/solr/handler/StaxUpdateRequestHandler.java: No such file or directory
Skip this patch? [y]
Skipping patch.
1 out of 1 hunk ignored
patching file src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java
Hunk #1 FAILED at 51.
Hunk #2 FAILED at 102.
2 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java.rej
patching file src/java/org/apache/solr/handler/SpellCheckerRequestHandler.java
patching file src/java/org/apache/solr/handler/DisMaxRequestHandler.java
Hunk #1 succeeded at 158 (offset 1 line).
patching file src/java/org/apache/solr/handler/DumpRequestHandler.java
patching file src/java/org/apache/solr/handler/MoreLikeThisHandler.java
Hunk #1 succeeded at 71 (offset 1 line).
patching file src/java/org/apache/solr/util/AbstractSolrTestCase.java
patching file src/java/org/apache/solr/util/TestHarness.java
Hunk #1 FAILED at 74.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/util/TestHarness.java.rej
patching file src/java/org/apache/solr/util/xslt/TransformerProvider.java
patching file src/webapp/WEB-INF/web.xml
patching file src/webapp/src/org/apache/solr/servlet/SolrInit.java
patching file src/webapp/src/org/apache/solr/servlet/SolrServlet.java
patching file src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java
patching file src/webapp/src/org/apache/solr/servlet/SolrUpdateServlet.java
patching file src/webapp/src/org/apache/solr/servlet/SolrRequestParsers.java
patching file src/webapp/src/org/apache/solr/servlet/DirectSolrConnection.java
patching file src/webapp/resources/admin/raw-schema.jsp
patching file src/webapp/resources/admin/_info.jsp
patching file src/webapp/resources/admin/get-file.jsp
patching file src/webapp/resources/admin/ping.jsp
patching file src/webapp/resources/admin/stats.jsp
patching file src/webapp/resources/admin/index.jsp
patching file example/solr/conf/core0.schema.xml
patching file example/solr/conf/core1.schema.xml
patching file example/solr/conf/core0.config.xml
patching file example/solr/conf/core1.config.xml
patching file client/java/solrj/test/org/apache/solr/client/solrj/embedded/TestEmbeddedSolrServer.java
patching file client/java/solrj/test/org/apache/solr/client/solrj/embedded/TestJettySolrRunner.java
patching file client/java/solrj/src/org/apache/solr/client/solrj/embedded/EmbeddedSolrServer.java



> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507344 ] 

Henri Biestro edited comment on SOLR-215 at 6/25/07 3:33 AM:
-------------------------------------------------------------

About solr-255, I've posted a small comment to Toru.
Seems to me that solr-255/solr-215 features are mostly orthogonal; solr-255 allows one core to use multiple indexes, solr-215 allows multiple cores in one instance.
But I like the idea of federated search (and federated indexing!).
I'm a bit worried though that adding a Lucene patch dependency & merging solr-215/solr-255 will make the commit occur even later... And I'm getting confused; how could this fusion help reduce the amount of effort to review the patch?
But I'll follow your lead; I'll try & see if I can merge.


 was:
About solr-255, I've posted a small comment to Toru.
Seems to me that solr-255/solr-215 features are mostly orthogonal; solr-255 allows one core to use mutliple indexes, solr-255 allows multiple cores in one instance.
But I like the idea of federated search (and federated indexing!).
I'm a bit worried though that adding a Lucene patch dependency & merging solr-215/solr-255 will make the commit occur even later...
But I'll follow your lead; I'll try & see if I can merge.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description: 
What
-------
As of Solr 1.2, Solr only instantiates one SolrCore, which handles one Lucene index. This patch is intended to allow multiple cores in Solr, which also brings multiple-index capability.

Why
------
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents when needed. If you believe you need multiple indexes, deploy multiple web applications.
There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) when you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

How
------
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly like the single core in 1.2; the various caches are per-core & so is the info-bean registry.
What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different URL.
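As a rough illustration (the filter class name comes from the patch's file list; the init-param name for selecting the core is made up here), the web.xml declarations for one core could look like:

<!-- one filter declaration + one filter mapping per exposed core;
     the init-param name below is hypothetical, the real one is defined by the patch -->
<filter>
  <filter-name>SolrRequestFilter-core0</filter-name>
  <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  <init-param>
    <param-name>solr-core-name</param-name>
    <param-value>core0</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>SolrRequestFilter-core0</filter-name>
  <url-pattern>/core0/*</url-pattern>
</filter-mapping>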

Details (per package)
-----------------------------
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null'-named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential URL & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables used to be static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig expose them as public final members.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
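For reference, embedded (Java) usage with the patch applied looks roughly like the snippet below (repeated from the usage notes attached to this patch; exact constructors may differ between patch revisions):

//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a named core from the two others
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//later, retrieve the same core from the static registry by name
SolrCore sameCore = SolrCore.getCore("core0");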

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory, and the config directory is available through the config. This is partially redundant with the argument map, though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page, servlet or JSP was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
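For instance, a servlet or JSP following this scheme retrieves its core roughly as follows (standard Servlet API; the attribute name is the one quoted above, the fallback is the legacy accessor):

//'request' is the HttpServletRequest passed to the servlet/JSP
SolrCore core = (SolrCore) request.getAttribute("org.apache.solr.SolrCore");
if (core == null) {
  //no dispatch filter set the attribute: fall back to the legacy accessor,
  //which now returns the 'null'-named (default) core
  core = SolrCore.getCore();
}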

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

Replication
----------------
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

Future
---------
Uploading new schema/conf files would be nice, allowing Solr to create cores dynamically; besides the upload mechanism itself, which should be easy, the servlet filter would have to be modified.
Having replication embedded in the Solr application itself, using an HTTP-based version of the rsync algorithm; some of the core code of jarsync might be handy.

Misc
-------
The patch production process (not as easy as I thought with a Windows/NetBeans/cygwin/TortoiseSVN setup).
0/ The starting point is to have the modified code running in a local patch branch, with all tests passing.
1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation clutter that results in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore whitespace et al.) and could not find a way to get the NetBeans SVN support to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing.
3/ Apply the patch to the 'clean trunk'.
You can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
Alternatively, use TortoiseSVN's 'apply patch' command, since the patch format is 'unified diff'.


  was:
Allow multiple cores in one web-application (or one class-loader):
This allows to have multiple cores created from different config & schema in the same application.
The side effect is that this also allows different indexes.

Implementation notes for the patch:
The patch allows to have multiple 'named' cores in the same application.
The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).

A few classes were only existing as singletons and have thus been refactored.
The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
The SolrCore is built from a SolrConfig & an IndexSchema.

The creation of a core has become:
//create a configuration
SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the 2 other.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0");


There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.

Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355


Patch can now be installed on a clean trunk.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> What
> -------
> As of Solr 1.2, Solr only instantiates one SolrCore which handles one Lucene index. This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> Why
> ------
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents when needed. If you believe you need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and you functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355
> How
> ------
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> Details (per package)
> -----------------------------
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> Replication
> ----------------
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> Future
> ---------
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; besides the upload mechanism itself which should be easy, the servlet filter would have to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> Misc
> -------
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing.
> 3/ Apply the patch to the 'clean trunk'.
> You can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> Alternatively, TortoiseSVN 'apply patch' command since the patch format is 'unified diff'.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525941 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

I just committed a HUGE patch that removes the SolrCore static singleton.

This does not yet support configuring and using multiple cores.  For clarity, I think that should get its own new issue while we figure out the best interface.  Let's continue to use this issue to resolve any problems that may occur from the core changes.

Henri - thanks for your patience and stamina!

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see the MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

updated for trunk 557340

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr, which also brings multiple-index capability.
> The patch file to grab is solr-215.patch.zip (see the MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are, however, some use cases where having multiple indexes or multiple cores within Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) where you functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is that instead of deploying multiple web-applications, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to the index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the two others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null'-named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config, which has been modified; Config is now more strictly a dom document (filled from some resource) plus methods to evaluate xpath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory, and the config dir is available from the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation clutter that results in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN's 'apply patch' command only understands 'unified diff', hence the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch (solr-215.patch.zip, same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz
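
For illustration, the request-attribute lookup described under org.apache.solr.servlet above might look roughly like this from downstream servlet or JSP code; the helper class is hypothetical, and only the attribute name and the 'null'-named compatibility core come from the patch description:

import javax.servlet.ServletRequest;
import org.apache.solr.core.SolrCore;

// Hypothetical helper sketching the lookup contract described above: the dispatch filter
// stores the selected SolrCore under the request attribute, and downstream code reads it back,
// falling back to the legacy 'null'-named core (the 1.2-style singleton) when it is absent.
public final class CoreLookup {
  public static SolrCore resolve(ServletRequest request) {
    SolrCore core = (SolrCore) request.getAttribute("org.apache.solr.SolrCore");
    return (core != null) ? core : SolrCore.getCore();
  }
}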



[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506920 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri, question about this:

FUTURE:
Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.

Wouldn't one simple way to get dynamic SolrCore creation be via custom/specific admin request handlers that create a core with the given name by adding it to that static map of cores that you've created?

e.g.
/admin/coremanager?cmd=add&name=foo
/admin/coremanager?cmd=del&name=foo

Maybe a naming convention could be used to figure out which schema.xml + solrconfig.xml to use for the newly added core.  e.g. foo-schema.xml and foo-solrconfig.xml.  The assumption would be that when the new core is added, the needed config files are already in place and ready to be loaded.

Thoughts anyone?
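
A rough sketch of that idea, using only the constructors from this patch's USAGE section - the CoreManager class, the per-core file names and the data directory layout are hypothetical, and the patch does not yet document a way to unregister a core:

import org.apache.solr.core.SolrConfig;
import org.apache.solr.core.SolrCore;
import org.apache.solr.schema.IndexSchema;

// Hypothetical backing logic for /admin/coremanager?cmd=add&name=foo: load
// <name>-solrconfig.xml and <name>-schema.xml (assumed to already be in place, per the
// naming convention above) and construct the core, which registers it in the static map.
public final class CoreManager {
  public static SolrCore addCore(String name, String dataDir) throws Exception {
    SolrConfig config = new SolrConfig(name + "-solrconfig.xml");
    IndexSchema schema = new IndexSchema(config, name + "-schema.xml");
    return new SolrCore(name, dataDir, config, schema);
  }
}

// e.g. CoreManager.addCore("foo", "/path/to/foo-data");  ...later: SolrCore.getCore("foo");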





[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513946 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

 
> My suggestion is that this be added in phase 2, after Henri's initial changes are committed.
> Does this sound reasonable?
> 

Yes - perhaps getting this checked in without touching handlers or the web-app side is a good idea.  It is a little weird since the multi-core aspect would only be usable programmatically, but that will make it possible to easily bat around a 'core manager' and http design.

The one big question is what to do with the TokenizerFactory API.  

Yonik, how do you suggest upgrading an interface?  The only clean way I can think of is to extend the TokenizerFactory interface with a 'MultiCoreTokenizerFactory' adding an additional argument.  I don't like it, but I don't know the API compatibility rules well enough to know whether that is required or whether it is ok to change the API.
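
For illustration, that kind of extension might look like the sketch below; the interface name and init signature are hypothetical, patterned on the 1.2-era init(Map) contract and the SolrConfig parameter this patch introduces:

import java.util.Map;
import org.apache.solr.analysis.TokenizerFactory;
import org.apache.solr.core.SolrConfig;

// Hypothetical bridge interface: existing user factories keep implementing TokenizerFactory
// unchanged, while config-aware factories also implement the richer init that receives the
// owning core's SolrConfig.
public interface MultiCoreTokenizerFactory extends TokenizerFactory {
  void init(SolrConfig config, Map<String, String> args);
}

// The factory loader could then dispatch without breaking old implementations, e.g.:
//   if (f instanceof MultiCoreTokenizerFactory) ((MultiCoreTokenizerFactory) f).init(config, args);
//   else f.init(args);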

----

Will - as is, this patch does not let you dynamically change the cores; they are statically defined in web.xml.  This will change.




[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515888 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Ryan & Henri:

1. Re TokenizerFactory - what will break with this change?  Is the fear that people implemented this interface in their Solr apps and this change will break their implementations, or something else?

2. So can SolrUpdateServlet get axed, so that SolrInit can be eliminated?

If we can resolve these two, it sounds like we can commit this patch and then work on the rest.





[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description: 
Allow multiple cores in one web-application (or one class-loader):
This allows multiple cores, created from different configs & schemas, to coexist in the same application.
The side effect is that this also allows different indexes.

Implementation notes for the patch:
The patch allows multiple 'named' cores in the same application.
The current single-core behavior has been retained - as the core named 'null' - but the code could not be kept 100% compatible. (In particular, SolrConfig.config is gone; SolrCore.getCore() is still here though).

A few classes existed only as singletons and have thus been refactored.
The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
The IndexSchema class uses a SolrConfig instance; a few configuration parameters that pertain to indexing were needed there.
The SolrCore is built from a SolrConfig & an IndexSchema.

The creation of a core has become:
//create a configuration
SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the two others.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0");
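
As a sketch of where this leads - file and path names are hypothetical, and only the constructors above are used - two cores with different schemas can then live side by side in the same application:

//sketch: two named cores, each with its own config, schema and index directory
SolrConfig config0 = SolrConfig.createConfiguration("solrconfig0.xml");
SolrCore core0 = new SolrCore("core0", "/path/to/index0", config0, new IndexSchema(config0, "schema0.xml"));
SolrConfig config1 = SolrConfig.createConfiguration("solrconfig1.xml");
SolrCore core1 = new SolrCore("core1", "/path/to/index1", config1, new IndexSchema(config1, "schema1.xml"));
//later, anywhere in the same class-loader:
SolrCore found = SolrCore.getCore("core1");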


There are a few other changes, mainly related to passing the SolrCore/SolrConfig used through constructors.

Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

  was:
Allow multiple cores in one web-application (or one class-loader):
This allows to have multiple cores created from different config & schema in the same application.
The side effect is that this also allows different indexes.


Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355





[jira] Reopened: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic reopened SOLR-215:
-----------------------------------


I think Henri accidentally resolved this.  Reopening.
Btw. I'm *very* interested in serving multiple indices under a single Solr instance, possibly even embedded as described on the wiki or in LUCENE-212.  I may not find the time to look at the patch before next week, though.




[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

updated for trunk 557340 (ASF inclusion...)




[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511796 ] 

Henri Biestro commented on SOLR-215:
------------------------------------

Otis,
You need to grab the 'zipped' version aka solr-215.patch.zip (since June 23).
I was trying to be space & bandwidth friendly...
Sorry I did not make it more obvious in some previous comments.
Henri

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are, however, some use cases where having multiple indexes or multiple cores within Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499919 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri - I'm starting to look at this.  I see a lot of space changes in the patch.  Could you please generate a patch that doesn't have all those space changes?
When you generate a diff file for the patch, these may be handy parameters to use (I'm assuming you're going to work under some kind of UNIX); see the combined command sketched after the list:

       -E  --ignore-tab-expansion
              Ignore changes due to tab expansion.

       -b  --ignore-space-change
              Ignore changes in the amount of white space.

       -w  --ignore-all-space
              Ignore all white space.

       -B  --ignore-blank-lines
              Ignore changes whose lines are all blank.
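
For instance, combined into a single command (a sketch only; this mirrors the command Henri describes later in this issue, and assumes GNU diff is installed as /usr/bin/diff):

svn diff --diff-cmd /usr/bin/diff -x "-u -w -b -B -E" > solr-215.patch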

Thanks!
I just skimmed the patch and didn't see where the name of the index/core gets passed in the request.  Can you please point me to the right place to look?


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows having multiple cores created from different configs & schemas in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows having multiple 'named' cores in the same application.
> The current single-core behavior has been retained - the core named 'null' - but the code could not be kept 100% compatible. (In particular, SolrConfig.config is gone; SolrCore.getCore() is still here though).
> A few classes existed only as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513789 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

I just took a quick (well, not so quick in applying the patch) look at this.  I have not run it or tried to use it; I'm just following the changes.

Here are a couple of concerns:

1. TokenizerFactory breaks API compatibility:
-  public void init(Map<String,String> args);
+  public void init(SolrConfig solrConfig, Map<String,String> args);

I'm not sure what the best fix is, and I understand Yonik's aversion to interfaces.


2.  Why do RequestHandlers all need to know what core they are from?  The core is (and has been) passed along with the request.  It looks like the only place it is used is in the @deprecated SolrUpdateServlet.

If we only support the multi-core stuff from the dispatch filter, we don't need to augment every request handler with the core that created it.

The 'SolrInit' stuff that works for filters or servlets is nice, but is more confusing than it needs to be if multi-core support were only available from the filter.


3.  I'm not sure I like that you have to create a new filter and edit web.xml for each core.  If that's the case, why not run multiple web apps?  Perhaps the RequestDispatcher could accept ?core=name or look for a path that starts with /core:name/ to choose the core (a rough sketch follows this list).  It would store the 'null' core to avoid a map lookup in the default case.


4. Rather than having each core configured in web.xml, perhaps there should be a core.xml or core.properties file that sits in solr home?
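
To illustrate point 3, here is a minimal sketch (not the patch's actual code) of choosing the core from a ?core= request parameter inside a dispatch-style filter; SolrCore.getCore("name") and the no-argument default-core accessor are the methods described in this issue, everything else is illustrative:

import java.io.IOException;
import javax.servlet.*;
import org.apache.solr.core.SolrCore;

/** Sketch only: pick the core from a ?core= parameter, fall back to the default ('null') core. */
public class CoreSelectingFilterSketch implements Filter {
  public void init(FilterConfig config) {}
  public void destroy() {}
  public void doFilter(ServletRequest req, ServletResponse rsp, FilterChain chain)
      throws IOException, ServletException {
    String name = req.getParameter("core");
    SolrCore core = (name == null || name.length() == 0)
        ? SolrCore.getCore()        // default core, no map lookup
        : SolrCore.getCore(name);   // named core from the static map
    req.setAttribute("org.apache.solr.SolrCore", core);
    chain.doFilter(req, rsp);
  }
}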

- - - - - -

As for committing this soon: if 1 & 2 are dealt with before committing, I'm for it.  It will be easier to push improvements around with smaller, manageable patches.





> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

updated for trunk 555252; should apply cleanly now (a few fuzzies but no rejects).

Yonik,
About backwards compatibility & named cores: the 'null' core (i.e. the core named 'null') is equivalent to the (non-solr215-patched) original version; SolrCore.getSolrCore() returns that core.
Besides the obvious SolrConfig.config that has been removed, I don't (fore)see any other non-compatible behaviors.
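
For reference, the two access styles side by side (a sketch; method names as they appear in this issue and in this comment):

//legacy single-core access: returns the core named 'null', i.e. the pre-patch behavior
SolrCore legacyCore = SolrCore.getSolrCore();
//named-core access introduced by the patch
SolrCore core0 = SolrCore.getCore("core0");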
Henri

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

update to current trunk; patch generated from a Solaris Express 10 box.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12506717 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri, I think Toru is doing something useful in SOLR-255 - FederatedSearch over RMI + support for multiple local indices.  I think your work is overlapping a lot and you two need to sync, either working on a single patch or on multiple smaller patches with serial dependency.


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-533775.patch

The patch as it stands still requires some refactoring 'above' the Java core.
Although the 'single core' behavior has been retained (via the static SolrCore.getCore()), SolrConfig.config could not be kept; the admin servlet has been modified accordingly.
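
A rough sketch of the kind of change this implies on the admin side (using the 'org.apache.solr.SolrCore' request attribute described in this issue; this is illustrative rather than the literal patch code):

//inside an admin JSP/servlet handling an HttpServletRequest 'request'
SolrCore core = (SolrCore) request.getAttribute("org.apache.solr.SolrCore");
if (core == null) {
  core = SolrCore.getCore();  //fall back to the default ('null') core
}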

Updated patch based on latest trunk.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows having multiple cores created from different configs & schemas in the same application.
> The side effect is that this also allows different indexes.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492743 ] 

Hoss Man commented on SOLR-215:
-------------------------------

I'm confused ... why is this issue Resolved:Fixed ?

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows having multiple cores created from different configs & schemas in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows having multiple 'named' cores in the same application.
> The current single-core behavior has been retained - the core named 'null' - but the code could not be kept 100% compatible. (In particular, SolrConfig.config is gone; SolrCore.getCore() is still here though).
> A few classes existed only as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-542847.patch

A revised version of the patch based on revision 542847.

The patch was produced with the following command run from the trunk directory:
svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N" > solr-trunk-542847.patch
This should take care of the whitespace changes as well as the inclusion of new files.
All unit tests behave as in the single core version; 133 tests, 5 failures, 0 errors

The content of the patch also includes modifications to the admin, servlet & filters to accommodate the declaration & handling of multiple cores. The example conf & web.xml have been modified to declare 2 other cores (besides the default) named 'core0' and 'core1'; a sketch of what such declarations might look like follows below.
The filter itself forwards to the proper servlet if no specific handler exists in the core configuration.
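
For illustration, per-core declarations in web.xml might look roughly like this (the init-param key used to hand the core name to the filter is a guess here, not taken from the patch):

  <filter>
    <filter-name>SolrDispatchFilter-core0</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <init-param>
      <!-- hypothetical parameter name; the patch may use a different key -->
      <param-name>solr-core-name</param-name>
      <param-value>core0</param-value>
    </init-param>
  </filter>
  <filter-mapping>
    <filter-name>SolrDispatchFilter-core0</filter-name>
    <url-pattern>/core0/*</url-pattern>
  </filter-mapping>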
Example:
Step0
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
running queries from the admin interface, you can verify the indexes have different content; for instance:
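(Example queries against the standard select handler of each core; the exact hits depend on the example documents indexed above.)
http://localhost:8983/solr/core0/select?q=ipod
http://localhost:8983/solr/core1/select?q=monitor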

Comments & advice welcome.



> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows having multiple cores created from different configs & schemas in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows having multiple 'named' cores in the same application.
> The current single-core behavior has been retained - the core named 'null' - but the code could not be kept 100% compatible. (In particular, SolrConfig.config is gone; SolrCore.getCore() is still here though).
> A few classes existed only as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534437 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

Ok, I looked... fixing this is not hard.

Deprecation support was already baked into IndexSchema:

    TokenFilterFactory tfac = (TokenFilterFactory)solrConfig.newInstance(className);
    if (tfac instanceof SolrConfig.Initializable)
      ((SolrConfig.Initializable)tfac).init(solrConfig, DOMUtil.toMapExcept(attrs,"class"));
    else
      tfac.init(DOMUtil.toMapExcept(attrs,"class"));

The problem is that BaseTokenizerFactory and BaseTokenFilterFactory both implement SolrConfig.Initializable, so IndexSchema assumes they are using the new interface.  If someone extends one of these Base classes and only overrides the old init(args), it is never called.

The fix is simply to call init( args ) from within init( config, args ) -- I'll remove the warning message since it will be called by default now.
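
A minimal sketch of that fix (class and method names as they appear in this comment; the committed change may differ):

  // In BaseTokenFilterFactory (and likewise BaseTokenizerFactory):
  public void init(SolrConfig solrConfig, Map<String,String> args) {
    // Delegate to the old signature so subclasses that only override init(args) still get initialized.
    init(args);
  }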

ryan

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows classes located in the Solr installation directory to be easily instantiated.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that the various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation patch clutter. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch (solr-215.patch.zip, same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

Thanks Mike for your useful advice;
I've corrected the (modified) tests so they now behave as the non-patched version does (i.e. no errors or failures, 133 tests); some of them were still using the 'unnamed/null' core. My bad, thanks again for pointing it out.
The 'superseding' patch is now called solr-215.patch so JIRA should take care of keeping only the latest version (all others can be ignored & deleted).
This drop is based on svn revision 543145.


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are a few other changes, mainly related to passing the SolrCore/SolrConfig in use through constructors.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Mike Klaas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500216 ] 

Mike Klaas commented on SOLR-215:
---------------------------------

I haven't looked at the patch, but:

  - there are no current failures on trunk, save for a sporadic AutoCommitTest failure if the machine is heavily loaded.  Are you testing this patch in the context of other local changes?
  - if you maintain the same name for subsequent versions of the patch, JIRA automatically keeps track of the most recent for you
  - personally, I find it helpful to check out a fresh copy of trunk, apply my patch, and run the tests there.  It helps ferret out problematic issues and oversights.

cheers

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516718 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri:
Typically a contributor will try to push a patch as far as he/she can based on committers' comments and suggestions.  It sounds like we are all in agreement here and you know exactly what to do, so we'll wait for your next patch and hopefully that will be the one that we can commit.  Thank you very much for your patience with this - I'm impressed.

From what I can tell, this is what is left:
1. Remove anything in this patch that touches o.a.s.webapp.* and o.a.s.handler.* 
2. Deprecate factory interfaces and add abstract factory base classes as Hoss described above

Once that's in the repository, I think you/we can take the description you wrote (all the way at the top of this issue) and turn it into a Wiki page (anyone can add/edit pages, just create an account on the Wiki).

I'm eager to try the new patch!


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516192 ] 

Hoss Man commented on SOLR-215:
-------------------------------

For the record, I have not looked at the most recent version of the patch -- I don't think I've ever had a chance to look at any version of this patch, actually -- but since my fiance is currently reading Harry Potter, I figured this was a good day to at least try to catch up on all the issue comments.

So far, I like what I'm reading -- I think the plan to first commit the "framework" code so that multiple cores can be programmatically created, then tackle the syntax for defining/creating/querying multiple cores via config files and/or http params, makes sense.

As far as the backwards compatibility issues go with things like the Token(izer|Filter)Factory APIs, I think it's safe to say that people who want to use multiple cores can be required to make minor modifications to custom plugins they may have written in order to get them to work correctly with those multiple cores.  

What we have to watch out for is people who don't care about multiple cores but have written custom plugins; things should continue to work for those people.

In the case of the token-blah-factories, a simple way to go (which can also help us move away from the interface headache) might be to deprecate the current factory interfaces and add new abstract factory base classes which implement those interfaces and are multi-core aware ... the initialization code can first check to see if the class in question extends the new abstract base class -- if so then jolly good, if not then fall back to the legacy behavior and init the class without any info about its core.

The kind of situation I do worry about, however, is along the lines of a comment Henri pointed out (very early on, evidently; not sure how I missed it back then)...

> Although the 'single core' feature has been retained (aka the static SolrCore.getCore), 
> the SolrConfig.config could not; 

...this is a little alarming, because there *may* be custom plugins that use SolrConfig.config to get arbitrary configuration information from solrconfig.xml ... I say *may* because we've never exactly advertised that as a recommended technique, but that doesn't mean people aren't doing it.  At a minimum we need a well-documented replacement (hopefully something like SolrCore.getSolrCore(null).getSolrConfig() works), but the question that immediately popped into my head was: "if SolrCore.getCore(null) can return 'the null core', why can't SolrConfig.config be assigned the config from the null core?"
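
To make the compatibility concern concrete, a hedged before/after sketch for such a plugin (the per-core accessors and the "myPlugin/someSetting" xpath are illustrative assumptions following the suggestion above, not a settled API):

    // pre-patch: a custom plugin reads arbitrary settings through the static singleton
    String val = SolrConfig.config.get("myPlugin/someSetting", "default");

    // post-patch: the equivalent lookup would have to go through the default ('null') core
    SolrConfig cfg = SolrCore.getCore(null).getSolrConfig();
    String val2 = cfg.get("myPlugin/someSetting", "default");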

In general, this is the kind of thing i worry about: making sure that any and all custom plugin code that may exist right now can continue to exist and function using a single core even after the multi-core functionality is committed.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Walter Ferrara (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510723 ] 

Walter Ferrara commented on SOLR-215:
-------------------------------------

With the patch applied (assuming I'm using it correctly), it seems that Solr is no longer able to load my handlers, which reside in a jar under the solr/lib dir. The exception I get is (handler class name censored):

GRAVE: org.apache.solr.common.SolrException: Error loading class 'com.******.******'
        at org.apache.solr.core.Config.findClass(Config.java:295)
[..]
Caused by: java.lang.ClassNotFoundException: com.******.******
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
[..]
(full stack trace available if needed)

The problem arises in both patched trunks I've tested (550264 with the previous patch, and 552910 with the latest patch); I've been compiling with Netbeans 5.5 and Java 1.6 on Windows.
To resolve the issue, I modified Config.java a bit. Now it works fine and loads all the jars, but the full implications of the change have yet to be determined.

Here is the modification I made to the patched (org.apache.solr.core) Config.java (working Config.java versus the original solr-215 "Config_origSolr215.java"):

*** Config.java
--- Config_origSolr215.java
***************
*** 393,399 ****
            SolrException.log(log,"Can't construct solr lib class loader", e);
          }
        }
!       if (null == classLoader) classLoader = loader;
      }
      return classLoader;
    }
--- 393,399 ----
            SolrException.log(log,"Can't construct solr lib class loader", e);
          }
        }
!       classLoader = loader;
      }
      return classLoader;
    }
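
For readers following along, here is a hypothetical reconstruction of the surrounding getClassLoader() method, inferred only from the hunk above and the stack trace (field and helper names are guesses, not the actual patched source). One plausible reading is that the unconditional assignment clobbered the freshly built lib-aware URLClassLoader with the plain parent loader, so classes in solr/lib jars could never be found; the guarded fallback keeps the lib-aware loader once it exists.

    // imports assumed: java.io.File, java.net.URL, java.net.URLClassLoader
    private static ClassLoader classLoader = null;

    static ClassLoader getClassLoader() {
      if (classLoader == null) {
        ClassLoader loader = Config.class.getClassLoader();
        File[] jars = new File(getInstanceDir(), "lib").listFiles();  // getInstanceDir(): hypothetical helper
        if (jars != null && jars.length > 0) {
          try {
            URL[] urls = new URL[jars.length];
            for (int i = 0; i < jars.length; i++) {
              urls[i] = jars[i].toURI().toURL();
            }
            classLoader = URLClassLoader.newInstance(urls, loader);  // lib-aware loader
          } catch (Exception e) {
            SolrException.log(log, "Can't construct solr lib class loader", e);
          }
        }
        if (null == classLoader) classLoader = loader;  // guarded fallback (the change above)
      }
      return classLoader;
    }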


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

Sorry for the long delay in looking at this...  

I got the patch applied to trunk and it appears to be working.  I removed the servlet configuration stuff and think we should revisit that in a different patch as soon as this monster is committed.

Otis, if you have time to check this over before committing, that would be great -- otherwise, I'll add it in a few days.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525054 ] 

Henri Biestro commented on SOLR-215:
------------------------------------


Rakesh,
The patch needs to be applied to the Solr source 1.3 dev trunk.
Getting the source is described in http://lucene.apache.org/solr/version_control.html (and I suggest you also read the FAQ here: http://wiki.apache.org/solr/FAQ ).
Instructions to apply the patch are described in the Jira issue (as well as a description of its applicability & usefulness; are you sure you need this patch?)
Regards
Henri

Quoted from: 
http://www.nabble.com/-jira--Created%3A-%28SOLR-215%29-Multiple-Solr-Cores-tf3651963.html#a12487432



> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally needing to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows you to verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534433 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

Mike Klass points out a BIG BAD problem with this patch:
http://www.nabble.com/Deprecations-and-SolrConfig-patch-tf4611038.html

The token filter interface keeps:
@Deprecated
  public void init(Map<String,String> args) {
    log.warning("calling the deprecated form of init; should be calling init(SolrConfig solrConfig, Map<String,String> args)");
    this.args=args;
  } 

but this is never called, so it only tricks us into thinking it is backwards compatible. 

Options:
1. Break the API -- at least no one would get fooled into thinking it works

2. Add some hacky bits to IndexSchema's readTokenFilterFactory that first calls the deprecated init, then calls the 'real' one -- make some clear statements somewhere about how this works and how it will go away. (A rough sketch of this option follows below.)
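
A minimal sketch of what option 2 could look like -- only the two init signatures and the readTokenFilterFactory name come from the discussion above; the parameters and the newInstance call are illustrative, not the actual patch code:

// Illustrative sketch only, not the actual patch code.
// Call the deprecated init(Map) first so 1.2-era factories that only
// override that form keep working, then call the new init(SolrConfig, Map).
// The warning in the deprecated default implementation would have to be
// dropped or moved, since this path now calls it for every factory.
private TokenFilterFactory readTokenFilterFactory(SolrConfig solrConfig,
                                                  String className,
                                                  Map<String,String> args) {
  TokenFilterFactory factory = (TokenFilterFactory) solrConfig.newInstance(className);
  factory.init(args);             // deprecated form, for backward compatibility
  factory.init(solrConfig, args); // 'real' form introduced by this patch
  return factory;
}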

I don't have time to look at this for another week or so, but it is very important.  Henri, if you have some time, it would be great if you could take a look at some options.

ryan


> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally needing to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows you to verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch

New version of the patch that should be easier to verify.
Created with: svn diff  --diff-cmd /usr/bin/diff -x "-w -B -b -E -N -u" > ~/solr-215.patch
Verified it can be applied on clean trunk through: patch -u -p0 < ~/solr-215.patch

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

updated for trunk revision 55291

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally needing to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows you to verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516387 ] 

Henri Biestro commented on SOLR-215:
------------------------------------


About the SolrConfig.config, the replacement is indeed SolrCore.getSolrCore(null).getSolrConfig().
What I meant about SolrConfig.config not being retained was that there is no way to access the configuration of an arbitrary core through SolrConfig.config;
we can easily 'reinstate' SolrConfig.config by assigning it the 'null' core's config as a compatibility (deprecated?) feature; I'd advocate for stating this very clearly (to avoid unexpected side effects in the multi-core case).
This should allow a 'default' deployment to work as it was (without the patch).
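
A minimal sketch of that compatibility shim, assuming the SolrConfig/SolrCore described in this patch (the exact wiring is illustrative):

// Sketch only: re-expose the old static field, pointed at the default ('null') core's config.
public class SolrConfig extends Config {
  /** @deprecated use SolrCore.getSolrCore(null).getSolrConfig() instead */
  @Deprecated
  public static SolrConfig config;
  // ... rest of the class as in the patch ...
}

// During default-core initialization (for instance in the dispatch filter):
SolrConfig.config = SolrCore.getSolrCore(null).getSolrConfig();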

I like the plan; is there anything expected/needed that I can/should do? 'process-wise', I'm a little confused about the patch status; should I create/upload a new version of the patch with the described modifications or is this taken care of by the committer? (this sounds like a stupid question, my apologies if it is; just let me know).

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally needing to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows you to verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492836 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

I think Henri accidentally resolved this. Reopening.
Btw. I'm *very* interested in serving multiple indices under a single Solr instance, possibly even embedded as described on the wiki or in SOLR-212. I may not find the time to look at the patch before next week, though.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-215:
-------------------------------

    Summary: Multiple Solr Cores - remove static singleton  (was: Multiple Solr Cores)

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally needing to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different url.
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows you to verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513595 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri - is SolrInit something that you added in this patch or something that Solr once had?  I don't recall seeing SolrInit.java before, so I'm guessing you added SolrInit.java in your patch.  However, your patch does not contain SolrInit.java (forgot to svn add it?), so things don't compile even when using the latest .zip (.gz really) and the correct Solr revision:

compile:
    [javac] Compiling 5 source files to /home/otis/dev/repos/lucene/solr/foo/trunk/build/core
    [javac] /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java:69: cannot find symbol
    [javac] symbol  : class SolrInit
    [javac] location: class org.apache.solr.servlet.SolrDispatchFilter
    [javac]       SolrInit solrInit = new SolrInit(log) {
    [javac]       ^
    [javac] /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrDispatchFilter.java:69: cannot find symbol
    [javac] symbol  : class SolrInit
    [javac] location: class org.apache.solr.servlet.SolrDispatchFilter
    [javac]       SolrInit solrInit = new SolrInit(log) {
    [javac]                               ^
    [javac] /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrServlet.java:49: cannot find symbol
    [javac] symbol  : class SolrInit
    [javac] location: class org.apache.solr.servlet.SolrServlet
    [javac]       SolrInit solrInit = new SolrInit(log) {
    [javac]       ^
    [javac] /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrServlet.java:49: cannot find symbol
    [javac] symbol  : class SolrInit
    [javac] location: class org.apache.solr.servlet.SolrServlet
    [javac]       SolrInit solrInit = new SolrInit(log) {
    [javac]                               ^
    [javac] /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrUpdateServlet.java:48: cannot find symbol
    [javac] symbol  : class SolrInit
    [javac] location: class org.apache.solr.servlet.SolrUpdateServlet
    [javac]       SolrInit solrInit = new SolrInit(log) {
    [javac]       ^
    [javac] /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrUpdateServlet.java:48: cannot find symbol
    [javac] symbol  : class SolrInit
    [javac] location: class org.apache.solr.servlet.SolrUpdateServlet
    [javac]       SolrInit solrInit = new SolrInit(log) {
    [javac]                               ^
    [javac] Note: /home/otis/dev/repos/lucene/solr/foo/trunk/src/webapp/src/org/apache/solr/servlet/SolrUpdateServlet.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] 6 errors

BUILD FAILED


Could you please add SolrInit.java to the patch?  I'd like to give this a go as soon as possible, actually.
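
For what it's worth, the compiler output above only pins down an abstract class whose constructor takes a Logger and that the servlets subclass anonymously. A purely hypothetical placeholder along those lines (every member except the constructor is a guess -- the real SolrInit.java is exactly what's missing):

// Hypothetical placeholder only; the actual SolrInit.java is the missing file.
public abstract class SolrInit {
  protected final java.util.logging.Logger log;

  public SolrInit(java.util.logging.Logger log) {
    this.log = log;
  }

  // Guessed hook: each servlet/filter would supply its core lookup/creation here.
  public abstract SolrCore initialize() throws javax.servlet.ServletException;
}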


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC section below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally needing to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-applications, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different URL.
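> As an illustration only (the filter names and the init-param below are hypothetical, not taken from the patch), the web.xml for one of the cores might look roughly like:
>   <filter>
>     <filter-name>SolrRequestFilter-core0</filter-name>
>     <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
>     <init-param><param-name>solr-core</param-name><param-value>core0</param-value></init-param>
>   </filter>
>   <filter-mapping>
>     <filter-name>SolrRequestFilter-core0</filter-name>
>     <url-pattern>/core0/*</url-pattern>
>   </filter-mapping>
>   <!-- and a second filter/filter-mapping pair for core1, mapped to /core1/* -->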
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds ipod*.xml to the index of core0 and mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the two others.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential URL & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate xpath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
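> As a rough sketch of the core-map pattern described above (method names other than getCore are illustrative, not necessarily the patch's exact API):
>   private static final Map<String, SolrCore> cores = new HashMap<String, SolrCore>();
>   public static synchronized void registerCore(String name, SolrCore core) { cores.put(name, core); }
>   public static synchronized SolrCore getCore(String name) { return cores.get(name); }
>   public static SolrCore getCore() { return getCore(null); } // the 'null' core preserves the 1.2 single-core behavior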
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory, and the config dir is available from the config. This is partially redundant with the argument map, though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin servlet or page was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
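> For instance, a servlet or admin page that used to call SolrCore.getCore() directly would now do something along these lines (sketch only; the null-check fallback is an assumption):
>   SolrCore core = (SolrCore) request.getAttribute("org.apache.solr.SolrCore");
>   if (core == null) {
>     core = SolrCore.getCore(); // fall back to the default/'null' core
>   }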
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http-based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought with a Windows/Netbeans/cygwin/TortoiseSVN setup).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that the various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513829 ] 

Henri Biestro commented on SOLR-215:
------------------------------------

On Otis's comments:
1 & 2- static initializers for lock-related values: you are correct, the code was most likely lost in some merge - my bad.
3- SolrInfoRegistry deprecated: you are correct, the functionality is replaced by SolrCore.getSolrCore().getInfoRegistry().
4- classLoader not assigned: not sure why it happens but this fixes it...
5- checkName is not subtle: I had the idea of "normalizing" the core name (URL-like normalization, for instance) but did not pursue it since it might make the replication scripts more complex to modify (aka the normalization code would need to be duplicated in the script). And since the Solaris scripts were not completely functional (my dev machine being Solaris), I've postponed the task... (I was also "dreaming" about being able to derive from SolrCore to benefit from the static map, implement a naming policy that would encompass the config & schema name generation, etc...). Anyhow, this can indeed be simplified with a regexp match; see the sketch after this list.
6- finalize(): no, I believe finalizing one core should just ensure that this core is shut down. This is only for completeness though, since I can't see how a core could be gc-ed & finalized before it actually gets shut down & removed from the map of cores.
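On point 5, a regexp-based check could be as simple as the following sketch (purely illustrative, not the patch's code; java.util.regex.Pattern assumed):
  private static final Pattern LEGAL_CORE_NAME = Pattern.compile("[^/\\\\]*");
  static void checkName(String name) {
    // null stays allowed: it designates the default core
    if (name != null && !LEGAL_CORE_NAME.matcher(name).matches())
      throw new IllegalArgumentException("Invalid core name: " + name);
  }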

On Ryan's comments:
1- factory/init interface compatibility break: I'll look into other ways if this is a blocker (ctor, setter or wrap/delegate...). 
2- RequestHandlers know core: SolrUpdateServlet is deprecated but is still there; I was just trying to ensure correct/compatible behavior. I agree SolrInit is more clutter than necessity but can be dropped easily if there is no need to support the SolrUpdateServlet.
3- I do agree that there must be an easier & more functional way to declare and access a core than the current one. I'll try the route you describe.
4- Having core "descriptors" (config/schema) as explicit files in a $solrhome/cores directory; might use some naming convention to derive the core name from them (related to uploading/dynamic creation of cores).
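On point 4, the layout could be something like this (purely hypothetical, just to illustrate the naming-convention idea):
  $solrhome/cores/core0/solrconfig.xml
  $solrhome/cores/core0/schema.xml
  $solrhome/cores/core1/solrconfig.xml
  $solrhome/cores/core1/schema.xml
where the directory name under cores/ would become the core name.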

I'm mostly "off the grid" today but I'll try my best on Friday.


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-src.patch

The patch that allows multiple cores/indexes

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511551 ] 

Yonik Seeley commented on SOLR-215:
-----------------------------------

I don't know if we should make Henri keep his patch up to date with the trunk (since it's likely to continue evolving right now) until he's received more feedback about the approach and we are ready to commit it.

One question I had was about backward compatibility... is there a way to register a null or default core that reverts to the original paths? Are there any other backward-compatibility gotchas (not related to custom java code)?

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510853 ] 

Henri Biestro commented on SOLR-215:
------------------------------------

Thanks Walter.

I've been "fighting" a bit with this code in the same kind of environment (NB5.5 / JVM 1.5).
The static classLoader was not assigned correctly and I already had to modify the original code to workaround it.
Looks like the JVM 1.6 reintroduces the issue. I don't understand why this happens - maybe class loading through NB...
The fix you propose seems totally harmless; I'll check against a 1.5 JVM & introduce it in the next upload.

Using the patch should be straightforward, apart from handler classes needing a constructor that takes a SolrCore.
Let me know how it goes.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529625 ] 

Henri Biestro commented on SOLR-215:
------------------------------------

Replacing the line
          SolrEventListener listener = (SolrEventListener)solrConfig.newInstance(className);
With
          SolrEventListener listener = createEventListener(className);
should fix it.

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515912 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------


> 
> 1. Re TokenizerFactory - what will break with this change?  

Personally, I don't have any problem with it. But it is an API-breaking change (a custom 1.2 TokenizerFactory would not work with 1.3).

I am fine with noting that in CHANGES.txt, but we should make sure more people are aware of this change.  
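For reference, the shape of the break as described in the issue (indicative signatures only, simplified):
  // 1.2-style factory init
  public void init(Map<String,String> args);
  // with the patch, the SolrConfig is passed in as well
  public void init(SolrConfig config, Map<String,String> args);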


> 2. So can SolrUpdateServlet  get axed, so SolrInit can be eliminated?
> 

Let's not axe SolrUpdateServlet just yet -- this patch does not need to touch SolrUpdateServlet, and SolrInit can be removed.


> If we can resolve these two, it sounds like we can commit this patch and then work on the rest.
>

+1

For now, I think we should remove anything in this patch that touches o.a.s.webapp.* and o.a.s.handler.*

With Multiple Solr Cores working, we can bat around the best interface to accessing/modifying them.



> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description: 
WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

WHY:
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
There are, however, some use cases where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) where you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW:
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly like the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows each core to be accessed through a different URL.
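As a purely illustrative sketch of that web.xml wiring (the filter name, the 'core-name' init parameter and the URL pattern are assumptions for this example, not taken from the patch):
<filter>
  <filter-name>SolrCore0</filter-name>
  <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  <init-param>
    <param-name>core-name</param-name>
    <param-value>core0</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>SolrCore0</filter-name>
  <url-pattern>/core0/*</url-pattern>
</filter-mapping>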

USAGE (example web deployment, patch installed):
Step0:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml documents to the index of core0 and the mon*.xml documents to the index of core1;
running queries from the admin interface, you can verify the indexes have different content.
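As an illustration (not part of the original steps), querying each core separately should then show the disjoint content, e.g.:
http://localhost:8983/solr/core0/select?q=ipod
http://localhost:8983/solr/core1/select?q=monitor
The exact hits depend on the example schema, but core0 should only know about the ipod documents and core1 only about the monitor documents.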

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the two others.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0"); 
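As a sketch of how a second core could be created alongside the first (names, paths and the schema file are illustrative, following the constructors shown above):
//create a second schema from the same configuration
IndexSchema schema1 = new IndexSchema(config, "schema1.xml");
//create a second core with its own index directory
SolrCore core1 = new SolrCore("core1", "/path/to/index1", config, schema1);
//both cores are now registered by name and can be retrieved independently
SolrCore byName = SolrCore.getCore("core1");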

PATCH MODIFICATIONS DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', which avoids potential URL & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
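A short sketch of the compatibility behavior described above (inferred from this description, not verified against the patch):
//legacy, singleton-style access maps to the 'null' named core
SolrCore legacyCore = SolrCore.getCore();
//named access for the multi-core case
SolrCore core0 = SolrCore.getCore("core0");
//core names may not contain '/' or '\' (see the constraint above)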

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.
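A hypothetical handler illustrating the constructor change (only the SolrCore constructor parameter comes from the description above; the rest is an assumption, not code from the patch):
public class MyRequestHandler extends RequestHandlerBase {
  public MyRequestHandler(SolrCore core) {
    super(core); //assumed: the base class keeps a reference to the core it serves
  }
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    rsp.add("status", "handled by a per-core handler"); //illustrative only
  }
  //remaining SolrInfoMBean methods (getDescription, getVersion, ...) omitted for brevity
}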

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

REPLICATION:
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

FUTURE:
Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.

MISC:
The patch production process (not as easy as I thought it would be with a Windows/Netbeans/cygwin/TortoiseSVN setup).
0/ The starting point is to have the modified code running in a local patch branch, with all tests ok.
1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow up with: svn status | grep '?' | awk '{print $2}' | xargs svn add
3/ Apply the patch to the 'clean trunk'.
TortoiseSVN's 'apply patch' command only understands 'unified diff', hence the '-u' option.
Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.


  was:
What
-------
As of Solr 1.2, Solr only instantiates one SolrCore which handles one Lucene index. This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

Why
------
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents when needed. If you believe you need multiple indexes, deploy multiple web applications.
There are, however, some use cases where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) where you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
Some background on the 'whys':
http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

How
------
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly like the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton with a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows each core to be accessed through a different URL.

Details (per package)
-----------------------------
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', which avoids potential URL & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

Replication
----------------
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

Future
---------
Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; besides the upload mechanism itself which should be easy, the servlet filter would have to be modified.
Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.

Misc
-------
The patch production process (not as easy as I thought it would be with a Windows/Netbeans/cygwin/TortoiseSVN setup).
0/ The starting point is to have the modified code running in a local patch branch, with all tests ok.
1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing.
3/ Apply the patch to the 'clean trunk'.
You can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
Alternatively, use TortoiseSVN's 'apply patch' command, since the patch format is 'unified diff'.



Forgot usage example in description

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529668 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

fixed the SolrEventListener issue in rev578451


> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529509 ] 

Yonik Seeley commented on SOLR-215:
-----------------------------------

FYI, firstSearcher/newSearcher hooks are now broken because the constructor of AbstractSolrEventListener was changed to take a SolrCore, and the code in SolrCore that creates event listeners does a simple newInstance().
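In sketch form (the reflection calls are standard Java; the surrounding variable names are illustrative, not the actual SolrCore code):
//hypothetical listener class lookup; the current code assumes a public no-arg constructor
Class clazz = Class.forName(listenerClassName);
SolrEventListener viaNoArg = (SolrEventListener) clazz.newInstance();
//with the new AbstractSolrEventListener(SolrCore) constructor, the core must be passed explicitly
SolrEventListener viaCore = (SolrEventListener) clazz.getConstructor(SolrCore.class).newInstance(core);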

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

updated for revision 566269;
added back SolrConfig.config;
cleaned-up o.a.s.handler*

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro resolved SOLR-215.
--------------------------------

    Resolution: Fixed

junits & admin servlet (single core) test ok

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are a few other changes, mainly related to passing the SolrCore/SolrConfig used through the constructors.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley resolved SOLR-215.
--------------------------------

    Resolution: Fixed
      Assignee: Ryan McKinley

This was committed a while ago.  If it causes any problems, we should open a new issue to track progress.

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Assignee: Ryan McKinley
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependent.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading a new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an HTTP-based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought with a Windows/Netbeans/cygwin/TortoiseSVN setup).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still pass from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore whitespace et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN's 'apply patch' command only understands 'unified diff', hence the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch (solr-215.patch.zip, same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic updated SOLR-215:
----------------------------------

    Comment: was deleted

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows having multiple cores created from different configs & schemas in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows having multiple 'named' cores in the same application.
> The current single-core behavior has been retained - the core named 'null' - but the code could not be kept 100% compatible. (In particular, SolrConfig.config is gone; SolrCore.getCore() is still here though.)
> A few classes existed only as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511783 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Henri, I'm starting to suspect I'm doing something wrong here:

svn co -r 555252 https://svn.apache.org/repos/asf/lucene/solr/trunk
cd trunk
svn info
  Path: .
  URL: https://svn.apache.org/repos/asf/lucene/solr/trunk
  Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
  Revision: 555252
  Node Kind: directory
  Schedule: normal
  Last Changed Author: ryan
  Last Changed Rev: 554915
  Last Changed Date: 2007-07-10 13:57:36 +0200 (Tue, 10 Jul 2007)
  Properties Last Updated: 2007-07-11 17:48:55 +0200 (Wed, 11 Jul 2007)

wget https://issues.apache.org/jira/secure/attachment/12360039/solr-215.patch
patch -p0 < solr-215.patch &> patch.out

$ grep .rej$ patch.out
1 out of 2 hunks FAILED -- saving rejects to file src/test/org/apache/solr/update/AutoCommitTest.java.rej
1 out of 3 hunks FAILED -- saving rejects to file src/test/org/apache/solr/analysis/TestKeepWordFilter.java.rej
1 out of 2 hunks FAILED -- saving rejects to file src/test/org/apache/solr/handler/XmlUpdateRequestHandlerTest.java.rej
2 out of 13 hunks FAILED -- saving rejects to file src/java/org/apache/solr/schema/IndexSchema.java.rej
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/analysis/PhoneticFilter.java.rej
1 out of 14 hunks FAILED -- saving rejects to file src/java/org/apache/solr/search/SolrIndexSearcher.java.rej
3 out of 17 hunks FAILED -- saving rejects to file src/java/org/apache/solr/core/SolrCore.java.rej
4 out of 7 hunks FAILED -- saving rejects to file src/java/org/apache/solr/core/RequestHandlers.java.rej
2 out of 2 hunks FAILED -- saving rejects to file src/java/org/apache/solr/handler/XmlUpdateRequestHandler.java.rej
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/solr/util/TestHarness.java.rej


Am I doing something wrong?


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516415 ] 

Ryan McKinley commented on SOLR-215:
------------------------------------

> we can easily 'reinstate' SolrConfig.config by assigning it the 'null' core config as a compatibility (deprecated?) 

yes.  that is good.  

> 
>  should I create/upload a new version of the patch with the described modifications or is this taken care of by the committer? (this sounds like a stupid question, my apologies if it is; just let me know).
> 

whatever happens first ;)

If you have time, can you make the modifications? That will make it easier.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507189 ] 

Henri Biestro commented on SOLR-215:
------------------------------------

I like the suggestion. Thanks Otis.
The specific admin handler is definitely a good idea to handle cores (no need to modify the servlet dispatch filter).

Could definitely use a naming convention and/or specify schema & config as parameters as in:
/admin/coremanager?cmd=add&name=foo&schema=foo-schema.xml&config=foo-config.xml

It does not preclude being able to upload the schema & config files (so the files don't have to be there beforehand).
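
As a rough illustration of what such a core-manager endpoint might look like (written as a plain servlet; the class, the parameter handling and the index path are hypothetical, and only the SolrConfig/IndexSchema/SolrCore constructors come from the patch itself):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.solr.core.SolrConfig;
import org.apache.solr.core.SolrCore;
import org.apache.solr.schema.IndexSchema;

//hypothetical /admin/coremanager?cmd=add&name=foo&schema=foo-schema.xml&config=foo-config.xml
public class CoreManagerSketch extends HttpServlet {
  protected void doGet(HttpServletRequest req, HttpServletResponse rsp)
      throws ServletException, IOException {
    if (!"add".equals(req.getParameter("cmd"))) {
      rsp.sendError(HttpServletResponse.SC_BAD_REQUEST, "unknown cmd");
      return;
    }
    String name = req.getParameter("name");
    try {
      //config & schema files are assumed to already be on disk (no upload mechanism yet)
      SolrConfig config = new SolrConfig(req.getParameter("config"));
      IndexSchema schema = new IndexSchema(config, req.getParameter("schema"));
      //creating the core registers it under its name (see the patch description)
      new SolrCore(name, "/path/to/" + name + "/index", config, schema);
      rsp.getWriter().println("created core " + name);
    } catch (Exception e) {
      throw new ServletException(e);
    }
  }
}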

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Rakesh (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524891 ] 

Rakesh commented on SOLR-215:
-----------------------------

Hi --

   Currently I am using the SOLR 1.2.0 stable build, and this version does not have this feature (support for multiple SOLR cores). How do I get this feature? I tried to open the .patch file but I could not understand it. Is there any particular version of SOLR in which I can get this? I also looked into the file https://svn.apache.org/repos/asf/lucene/solr/trunk/src/java/org/apache/solr/core/SolrCore.java, but it does not contain any of the changes this feature describes, such as the introduction of the new SolrCore constructor.

 If possible, could you please point me to instructions on how to check out this feature or the latest source and build the SOLR binary?

Thanks in advance.
Rakesh

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description: 
WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
The patch file to grab is solr-215.patch.zip (see the MISC section below).

WHY:
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) with a functional need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW:
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean registry.
What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different URL.
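
For illustration, the web.xml fragment for two cores could look roughly like the following; the filter class name is the one mentioned in the servlet notes below, but the init-param name and url-patterns are assumptions, not copied from the patch:

<!-- one filter declaration + one filter mapping per exposed core (sketch only) -->
<filter>
  <filter-name>solr-core0</filter-name>
  <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
  <init-param>
    <param-name>core-name</param-name>
    <param-value>core0</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>solr-core0</filter-name>
  <url-pattern>/core0/*</url-pattern>
</filter-mapping>
<!-- repeat the same filter/filter-mapping pair with core1 and /core1/* for the second core -->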

USAGE (example web deployment, patch installed):
Step0
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
running queries from the admin interface, you can verify indexes have different content. 

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the 2 others.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0"); 
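
For illustration, the same API extends naturally to several cores living side by side; it uses only the constructors and the static accessor shown above, and the names, paths and file names are purely illustrative:

//two cores with their own config, schema and index directory
SolrConfig enConfig = new SolrConfig("solrconfig-en.xml");
SolrCore enCore = new SolrCore("en", "/path/to/en/index", enConfig, new IndexSchema(enConfig, "schema-en.xml"));
SolrConfig frConfig = new SolrConfig("solrconfig-fr.xml");
SolrCore frCore = new SolrCore("fr", "/path/to/fr/index", frConfig, new IndexSchema(frConfig, "schema-fr.xml"));
//each constructor registers its core under its name, so any code in the same class-loader can later do:
SolrCore english = SolrCore.getCore("en");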

PATCH MODIFICATIONS DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null'-named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential URL & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
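
To make the singleton-to-map move concrete, the registry pattern described here boils down to something like the following sketch (illustration only, not the patch code):

import java.util.HashMap;
import java.util.Map;

public final class CoreRegistrySketch {
  //static map of cores keyed by name, replacing the old singleton
  private static final Map<String, Object> cores = new HashMap<String, Object>();

  public static synchronized void register(String name, Object core) {
    //names may not contain '/' or '\' to avoid URL & file path problems
    if (name != null && (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0))
      throw new IllegalArgumentException("invalid core name: " + name);
    cores.put(name, core);
  }

  //the 'null'-named core plays the role of the old singleton
  public static synchronized Object getCore() {
    return cores.get(null);
  }

  public static synchronized Object getCore(String name) {
    return cores.get(name);
  }
}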

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
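
For illustration, a factory could then resolve a resource relative to the configuration along these lines; the getConfigDir() accessor and the file handling are assumptions made for the sketch, and only the idea of passing the SolrConfig to init comes from the patch description:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import org.apache.solr.core.SolrConfig;

//hypothetical factory initialization reading a word list from the config directory
public class KeepWordsFactorySketch {
  private Set<String> keepWords;

  public void init(SolrConfig config, Map<String, String> args) throws IOException {
    File words = new File(config.getConfigDir(), args.get("words")); //getConfigDir() is assumed
    keepWords = new HashSet<String>();
    BufferedReader in = new BufferedReader(new FileReader(words));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        keepWords.add(line.trim());
      }
    } finally {
      in.close();
    }
  }
}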

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
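
In other words, code that used to grab the singleton now resolves the core from the request, roughly like this (illustrative helper, not part of the patch; the attribute name is the one given above):

import javax.servlet.http.HttpServletRequest;
import org.apache.solr.core.SolrCore;

public final class RequestCoreSketch {
  public static SolrCore currentCore(HttpServletRequest request) {
    //the dispatch filter sets this attribute before forwarding the request
    SolrCore core = (SolrCore) request.getAttribute("org.apache.solr.SolrCore");
    //fall back to the 'null'-named core (the old singleton behavior) if the attribute is absent
    return core != null ? core : SolrCore.getCore();
  }
}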

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

REPLICATION:
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

FUTURE:
Uploading a new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
Having replication embedded in the Solr application itself using an HTTP-based version of the rsync algorithm; some of the core code of jarsync might be handy.

MISC:
The patch production process (not as easy as I thought with a Windows/Netbeans/cygwin/TortoiseSVN setup).
0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still pass from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore whitespace et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
3/ Apply the patch to the 'clean trunk'.
TortoiseSVN's 'apply patch' command only understands 'unified diff', hence the '-u' option.
Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch (solr-215.patch.zip, same patch production method).
For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz


  was:
WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

WHY:
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) with a functional need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW:
The best analogy is to consider that instead of deploying multiple web applications, you can have one web application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean registry.
What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes it easy to manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy access to each core through a different URL.

USAGE (example web deployment, patch installed):
Step0
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
running queries from the admin interface, you can verify indexes have different content. 

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the 2 others.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0"); 

PATCH MODIFICATIONS DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential URL & file path problems.
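To make the registry idea concrete, a minimal sketch of what such a name-keyed map could look like (illustrative only, not the patch's actual code; the field and the register method are assumptions, java.util.Map/HashMap assumed imported):
//minimal sketch of a name-keyed registry inside SolrCore; illustrative, not the patch's actual code
private static final Map<String, SolrCore> cores = new HashMap<String, SolrCore>();
public static synchronized SolrCore getCore(String name) {
  return cores.get(name);                 //a named core
}
public static SolrCore getCore() {
  return getCore(null);                   //the 'null' named core keeps the 1.2 singleton behaviour
}
private static synchronized void register(String name, SolrCore core) {
  if (name != null && (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0))
    throw new IllegalArgumentException("core names can not contain '/' or '\\'");
  cores.put(name, core);
}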
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate XPath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.
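For readers wondering what 'evaluating XPath expressions' over the config document amounts to, a hedged sketch using the standard javax.xml APIs (the actual Config method names and the example expression are assumptions, not the patch's API):
//sketch of the kind of lookup Config performs; exception handling omitted, expression is illustrative
//imports: javax.xml.parsers.*, javax.xml.xpath.*, org.w3c.dom.Document, java.io.File
Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("solrconfig.xml"));
XPath xpath = XPathFactory.newInstance().newXPath();
String cacheClass = xpath.evaluate("/config/query/filterCache/@class", dom);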

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory, and the config dir is available from the config. This is partially redundant with the argument map though.
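A hedged sketch of how a filter factory might use the extra parameter; the init signature follows the description above, while the config-directory accessor and the word-list loading are assumptions:
//illustrative only; assumes the init(SolrConfig, Map) signature described above
public void init(SolrConfig config, Map<String, String> args) {
  String words = args.get("words");                        //resource name from the argument map
  //the config dir can be reached through the config, e.g. to load the word list:
  //loadWordList(new File(config.getConfigDir(), words));  //getConfigDir() is an assumed accessor
}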

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever an admin page or servlet was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
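To make the wiring concrete, a hedged sketch of the dispatch and lookup pattern described above; only the attribute key 'org.apache.solr.SolrCore' comes from the description, the init-param and field names are assumptions:
//web.xml would carry one <filter> + <filter-mapping> pair per exposed core, e.g. a filter named
//'core0' mapped to /core0/* with an init-param naming the core it should serve (illustrative).
//sketch of the filter side:
public void doFilter(ServletRequest req, ServletResponse rsp, FilterChain chain)
    throws IOException, ServletException {
  //'core' was resolved in init() from the per-filter init-param
  req.setAttribute("org.apache.solr.SolrCore", core);
  chain.doFilter(req, rsp);
}
//sketch of the consuming side (admin pages, servlets):
SolrCore core = (SolrCore) request.getAttribute("org.apache.solr.SolrCore");
if (core == null) core = SolrCore.getCore();   //fall back to the 'null' named core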

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

REPLICATION:
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

FUTURE:
Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
Having replication embedded in the Solr application itself, using an HTTP-based version of the rsync algorithm; some of the core code of jarsync might be handy.

MISC:
The patch production process (not as easy as I thought with a Windows/NetBeans/cygwin/TortoiseSVN setup).
0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
1/ Have one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation patch clutter. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore whitespace et al.) and could not find a way to get the NetBeans SVN integration to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
3/ Apply the patch to the 'clean trunk'.
TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz



> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

Previous patch version was missing SolrInit & AnalysisTestCase; sorry Otis.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507344 ] 

Henri Biestro commented on SOLR-215:
------------------------------------

About solr-255, I've posted a small comment to Toru.
Seems to me that the solr-255/solr-215 features are mostly orthogonal; solr-255 allows one core to use multiple indexes, solr-215 allows multiple cores in one instance.
But I like the idea of federated search (and federated indexing!).
I'm a bit worried though that adding a Lucene patch dependency & merging solr-215/solr-255 will make the commit occur even later...
But I'll follow your lead; I'll try & see if I can merge.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-538091.patch

Updated for revision 538091

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-trunk-542847-1.patch

Supersedes previous patches (including solr-trunk-542847.patch); all other attached patches should be ignored (& removed by anyone with proper permissions?).

Forgot to svn add some new files before creating the patch;
fixed a stupid logic error in SolrInit when parameters were missing;
added a way to get to the config & schema file names from a configured core.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> Allow multiple cores in one web-application (or one class-loader):
> This allows to have multiple cores created from different config & schema in the same application.
> The side effect is that this also allows different indexes.
> Implementation notes for the patch:
> The patch allows to have multiple 'named' cores in the same application.
> The current single core behavior has been retained  - the core named 'null' - but code could not be kept 100% compatible. (In particular, Solrconfig.config is gone; SolrCore.getCore() is still here though).
> A few classes were only existing as singletons and have thus been refactored.
> The Config class feature-set has been narrowed to class loading relative to the installation (lib) directory;
> The SolrConfig class feature-set has evolved towards the 'solr config' part, caching frequently accessed parameters;
> The IndexSchema class uses a SolrConfig instance; there are a few parameters in the configuration that pertain to indexing that were needed.
> The SolrCore is built from a SolrConfig & an IndexSchema.
> The creation of a core has become:
> //create a configuration
> SolrConfig config = SolrConfig.createConfiguration("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0");
> There are few other changes mainly related to passing through constructors the SolrCore/SolrConfig used.
> Some background on the 'whys':
> http://www.nabble.com/Multiple-Solr-Cores-tf3608399.html#a10082201
> http://www.nabble.com/Embedding-Solr-vs-Lucene%2C-multiple-Solr-cores--tf3572324.html#a9981355

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Will Johnson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513912 ] 

Will Johnson commented on SOLR-215:
-----------------------------------

Did anything ever get baked into the patch for handling the core name as a CGI param instead of as a URL path element? The email thread we had going didn't seem to come to any hard conclusions, but I'd like to lobby for it as part of the spec. I read through the patch but couldn't quite follow things enough to tell.

> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores - remove static singleton

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-215:
--------------------------

    Fix Version/s: 1.3

Marking Fixed in 1.3.

(I believe Ryan left this open to track any potential issues... if nothing else, this way we'll remember to resolve it before releasing.)

> Multiple Solr Cores - remove static singleton
> ---------------------------------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512038 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Just a quick comment - the .zip version of the patch is really a gzipped file:

$ wget --quiet https://issues.apache.org/jira/secure/attachment/12361583/solr-215.patch.zip
$ file solr-215.patch.zip
solr-215.patch.zip: gzip compressed data, was "solr-215.patch", from Unix



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507591 ] 

Henri Biestro edited comment on SOLR-215 at 6/25/07 3:37 AM:
-------------------------------------------------------------

updated for trunk 550264
patch is zipped; solr-215.patch.zip


 was:
updated for trunk 550028;
patch is zipped; solr-215.patch.zip


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513896 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

I didn't even realize this patch would still require cores to be declared a priori in static files such as web.xml. 

I think this new multi-core functionality should come with the "core manager" handler, as we said here:
https://issues.apache.org/jira/browse/SOLR-215#action_12506920
https://issues.apache.org/jira/browse/SOLR-215#action_12507189

So, something like:
/admin/coremanager?cmd=add&name=foo&schema=foo-schema.xml&config=foo-solrconfig.xml
(this assumes that foo-schema.xml and foo-solrconfig.xml already exist in conf/ dir)

One could also POST this and *include* the *content* of the 2 .xml files.  In that case the core manager would be the one writing their content to disk in conf/ dir prior to starting the given core.

My suggestion is that this be added in phase 2, after Henri's initial changes are committed.
Does this sound reasonable?
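
For illustration only (none of this exists yet; the handler path and parameter names are simply the ones suggested above), the two variants might look like this from the command line:

curl 'http://localhost:8983/solr/admin/coremanager?cmd=add&name=foo&schema=foo-schema.xml&config=foo-solrconfig.xml'

curl -F schema=@foo-schema.xml -F config=@foo-solrconfig.xml 'http://localhost:8983/solr/admin/coremanager?cmd=add&name=foo'

The first form assumes the two files already sit in conf/; the second posts their content and relies on the core manager writing them to conf/ before starting the core.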



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12507207 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

Excellent.  I'll assume you'll add something like this to your patch, then.
Any thoughts on SOLR-255 and ensuring you and Toru don't step on each other's toes?



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Attachment: solr-215.patch.zip

updated for trunk 550028;
patch is zipped; solr-215.patch.zip
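
For anyone wanting to try this against the same revision, something along these lines should work (the repository URL is from memory, and the attachment needs to be uncompressed to a plain solr-215.patch first):

svn checkout -r 550028 http://svn.apache.org/repos/asf/lucene/solr/trunk solr-trunk-550028
cd solr-trunk-550028
patch -p0 -u < /path/to/solr-215.patch
ant test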


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-215) Multiple Solr Cores

Posted by "Henri Biestro (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Biestro updated SOLR-215:
-------------------------------

    Description: 
WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr, which also brings multiple-index capability.

WHY:
The current Solr practical wisdom is that one schema - thus one index - is most likely to accommodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
There are some use cases, however, where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) where you functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependent.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW:
The best analogy is to consider that instead of deploying multiple web-applications, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) makes them easy to manage.
You declare one servlet filter mapping per core you want to expose in the web.xml; this makes each core accessible through a different url. 

USAGE (example web deployment, patch installed):
Step0:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.xml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml documents to the index of core0 and the mon*.xml documents to the index of core1;
running queries from the admin interface, you can verify the indexes have different content.
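For instance (assuming the select servlet is reachable under each core's path and the example schema's default search field is used), a query such as
http://localhost:8983/solr/core0/select?q=ipod
should return hits, while the same query against core1
http://localhost:8983/solr/core1/select?q=ipod
should return none.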

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the two others.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0"); 

PATCH MODIFICATION DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by name, plus assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on core names is that they can't contain '/' or '\', avoiding potential url & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static, like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config, which has been modified; Config is now more strictly a DOM document (filled from some resource) plus methods to evaluate xpath expressions. Config also continues to be the classloader singleton that makes it easy to instantiate classes located in the Solr installation directory.

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory, and the config dir is available from the config. This is partially redundant with the argument map, though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter now instantiates a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose. Wherever some admin servlet or page was referring to the SolrCore singleton or SolrConfig, it now checks for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

REPLICATION:
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

FUTURE:
Uploading a new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.

MISC:
The patch production process (not as easy as I thought it would be with a Windows/Netbeans/cygwin/TortoiseSVN setup).
0/ The starting point is to have the modified code running in a local patch branch, all tests ok.
1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still work from there. Creating the patch is key.
2/ If you used some IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get NetbeansSVN to generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" lets you verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
3/ Apply the patch to the 'clean trunk'.
TortoiseSVN's 'apply patch' command only understands 'unified diff', hence the '-u' option.
Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
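With GNU patch, a dry run first avoids ending up with a half-patched tree if the patch no longer applies cleanly:
patch -p0 -u --dry-run < solr-215.patch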

For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz
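If you do end up using those sunfreeware packages, a minimal install sketch (assuming root and the stock Solaris package tools) is:
gunzip patch-2.5.4-sol10-x86-local.gz
pkgadd -d patch-2.5.4-sol10-x86-local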


  was:
WHAT:
As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.

WHY:
The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
Multiple cores:
Deployment issues within some organizations where IT will resist deploying multiple web applications.
Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
Multiple indexes:
Multiple language collections where each document exists in different languages, analysis being language dependant.
Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.

HOW:
The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
Each core is 'named' and a static map (keyed by name) allows to easily manage them.
You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 

USAGE (example web deployment, patch installed):
Step0
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
Will index the 2 documents in solr.xml & monitor.xml
Step1:
http://localhost:8983/solr/core0/admin/stats.jsp
Will produce the statistics page from the admin servlet on core0 index; 2 documents
Step2:
http://localhost:8983/solr/core1/admin/stats.jsp
Will produce the statistics page from the admin servlet on core1 index; no documents
Step3:
java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
running queries from the admin interface, you can verify indexes have different content. 

USAGE (Java code):
//create a configuration
SolrConfig config = new SolrConfig("solrconfig.xml");
//create a schema
IndexSchema schema = new IndexSchema(config, "schema0.xml");
//create a core from the 2 other.
SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
//Accessing a core:
SolrCore core = SolrCore.getCore("core0"); 

PATCH MODIFICATIONS DETAILS (per package):
org.apache.solr.core:
The heaviest modifications are in SolrCore & SolrConfig.
SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.

org.apache.solr.analysis:
TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.

org.apache.solr.handler:
RequestHandlerBase takes the core as a constructor parameter.

org.apache.solr.util:
The test harness has been modified to expose the core it instantiates.

org.apache.solr.servlet:
SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.

Admin/servlet:
Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.

REPLICATION:
The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 

FUTURE:
Uploading new schema/conf files would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, but the servlet dispatch filter needs to be modified.
Embedding replication in the Solr application itself, using an HTTP-based version of the rsync algorithm, would also help; some of the core code of jarsync might be handy.

MISC:
The patch production process (not as easy as I thought it would be with a Windows/NetBeans/cygwin/TortoiseSVN setup).
0/ The starting point is to have the modified code running in a local patch branch, with all tests passing.
1/ Keep one 'clean version' of the trunk alongside the local patch branch; you'll need to verify that your patch can be applied to the latest clean trunk version and that the various tests still pass from there. Creating the patch is the key step.
2/ If you used an IDE and forgot to set the auto-indentation correctly, you will most likely need to work around the resulting space/indentation clutter in the patch. I could not find a way to get TortoiseSVN to create a patch with the proper options (ignore spaces et al.) and could not find a way to get the NetBeans SVN support to generate one either. So I create the patch from the local trunk root through cygwin (with svn+patchutils): svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
Before generating the patch, it is important to issue an 'svn add ...' for each file you may have added; a quick "svn status | grep '?'" lets you verify that nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
3/ Apply the patch to the 'clean trunk'.
TortoiseSVN's 'apply patch' command only understands 'unified diff', hence the '-u' option.
Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-215) Multiple Solr Cores

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513744 ] 

Otis Gospodnetic commented on SOLR-215:
---------------------------------------

I reviewed about 43% of this looooooooooooooooong patch (up to RequestHandlers.java).  Everything seems pretty clear so far; the changes are limited to SolrCore and SolrConfig.  Everything compiles and all tests pass - good!

I will review the rest of the patch tomorrow and if there are no surprises, I hope to commit this tomorrow or Friday.

NOTE: If anyone does NOT want this committed this week, please shout!

Here are some comments about the things I saw in the patch so far:

1. src/java/org/apache/solr/update/SolrIndexConfig.java

-    if (writeLockTimeout != -1) IndexWriter.WRITE_LOCK_TIMEOUT=writeLockTimeout;
-    if (commitLockTimeout != -1) IndexWriter.COMMIT_LOCK_TIMEOUT=commitLockTimeout;

I think the above got lost, but maybe I missed where the timeouts are set now.

2. src/java/org/apache/solr/core/SolrCore.java

- if (mainIndexConfig.writeLockTimeout != -1) IndexWriter.setDefaultWriteLockTimeout(mainIndexConfig.writeLockTimeout);

Same as above - this might have gotten lost.

3. Why is SolrInfoRegistry deprecated?  Because it is no longer really needed and its functionality is replaced by SolrCore.getSolrCore().getInfoRegistry()?  Just checking.

4. src/java/org/apache/solr/core/Config.java

-      classLoader = Thread.currentThread().getContextClassLoader();
+      // NB5.5/win32/1.5_10: need to go thru local var or classLoader is not set!
+      ClassLoader loader = Thread.currentThread().getContextClassLoader();

Ah, NetBeans problem that you mentioned earlier.  This is just a local var being set, looks fine to me.

5. src/java/org/apache/solr/core/SolrCore.java

private static String checkName(String name) 

Couldn't the implementation of this checkName(name) method be simpler?  Aren't there String methods that will let you look for '/' or any other unwanted string/pattern?
Also, why does this method return a name when it doesn't modify it?  Wouldn't void or boolean without the exception be more straightforward?
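
e.g. something along these lines - just a sketch of the suggestion, not code from the patch:

// illustrative only: validate a core name without returning it
private static void checkName(String name) {
  if (name == null) return;  // the 'null' name is the legacy/default core
  if (name.indexOf('/') >= 0 || name.indexOf('\\') >= 0) {
    throw new IllegalArgumentException("Invalid core name: " + name);
  }
}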

   @Override
   protected void finalize() { close(); }

Shouldn't this finalize() method call shutdown() in order to close *all* cores?
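
In other words, something along the lines of this sketch (the field and method names are assumed for illustration; only the idea of closing every registered core comes from the question above):

import java.util.HashMap;
import java.util.Map;
import org.apache.solr.core.SolrCore;

class ShutdownSketch {
  // stands in for the static name->core map kept by SolrCore
  static final Map<String, SolrCore> cores = new HashMap<String, SolrCore>();

  // closes *all* registered cores, not just one instance
  static synchronized void shutdown() {
    for (SolrCore core : cores.values()) {
      core.close();
    }
    cores.clear();
  }
}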


> Multiple Solr Cores
> -------------------
>
>                 Key: SOLR-215
>                 URL: https://issues.apache.org/jira/browse/SOLR-215
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Henri Biestro
>            Priority: Minor
>         Attachments: solr-215.patch, solr-215.patch, solr-215.patch, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-215.patch.zip, solr-trunk-533775.patch, solr-trunk-538091.patch, solr-trunk-542847-1.patch, solr-trunk-542847.patch, solr-trunk-src.patch
>
>
> WHAT:
> As of 1.2, Solr only instantiates one SolrCore which handles one Lucene index.
> This patch is intended to allow multiple cores in Solr which also brings multiple indexes capability.
> The patch file to grab is solr-215.patch.zip (see MISC session below).
> WHY:
> The current Solr practical wisdom is that one schema - thus one index - is most likely to accomodate your indexing needs, using a filter to segregate documents if needed. If you really need multiple indexes, deploy multiple web applications.
> There are a some use cases however where having multiple indexes or multiple cores through Solr itself may make sense.
> Multiple cores:
> Deployment issues within some organizations where IT will resist deploying multiple web applications.
> Seamless schema update where you can create a new core and switch to it without starting/stopping servers.
> Embedding Solr in your own application (instead of 'raw' Lucene) and functionally need to segregate schemas & collections.
> Multiple indexes:
> Multiple language collections where each document exists in different languages, analysis being language dependant.
> Having document types that have nothing (or very little) in common with respect to their schema, their lifetime/update frequencies or even collection sizes.
> HOW:
> The best analogy is to consider that instead of deploying multiple web-application, you can have one web-application that hosts more than one Solr core. The patch does not change any of the core logic (nor the core code); each core is configured & behaves exactly as the one core in 1.2; the various caches are per-core & so is the info-bean-registry.
> What the patch does is replace the SolrCore singleton by a collection of cores; all the code modifications are driven by the removal of the different singletons (the config, the schema & the core).
> Each core is 'named' and a static map (keyed by name) allows to easily manage them.
> You declare one servlet filter mapping per core you want to expose in the web.xml; this allows easy to access each core through a different url. 
> USAGE (example web deployment, patch installed):
> Step0
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar solr.xml monitor.ml
> Will index the 2 documents in solr.xml & monitor.xml
> Step1:
> http://localhost:8983/solr/core0/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core0 index; 2 documents
> Step2:
> http://localhost:8983/solr/core1/admin/stats.jsp
> Will produce the statistics page from the admin servlet on core1 index; no documents
> Step3:
> java -Durl='http://localhost:8983/solr/core0/update' -jar post.jar ipod*.xml
> java -Durl='http://localhost:8983/solr/core1/update' -jar post.jar mon*.xml
> Adds the ipod*.xml to index of core0 and the mon*.xml to the index of core1;
> running queries from the admin interface, you can verify indexes have different content. 
> USAGE (Java code):
> //create a configuration
> SolrConfig config = new SolrConfig("solrconfig.xml");
> //create a schema
> IndexSchema schema = new IndexSchema(config, "schema0.xml");
> //create a core from the 2 other.
> SolrCore core = new SolrCore("core0", "/path/to/index", config, schema);
> //Accessing a core:
> SolrCore core = SolrCore.getCore("core0"); 
> PATCH MODIFICATIONS DETAILS (per package):
> org.apache.solr.core:
> The heaviest modifications are in SolrCore & SolrConfig.
> SolrCore is the most obvious modification; instead of a singleton, there is a static map of cores keyed by names and assorted methods. To retain some compatibility, the 'null' named core replaces the singleton for the relevant methods, for instance SolrCore.getCore(). One small constraint on the core name is they can't contain '/' or '\' avoiding potential url & file path problems.
> SolrConfig (& SolrIndexConfig) are now used to persist all configuration options that need to be quickly accessible to the various components. Most of these variables were static like those found in SolrIndexSearcher. Mimicking the intent of these static variables, SolrConfig & SolrIndexConfig use public final members to expose them.
> SolrConfig inherits from Config which has been modified; Config is now more strictly a dom document (filled from some resource) and methods to evaluate xpath expressions. Config also continues to be the classloader singleton that allows to easily instantiate classes located in the Solr installation directory.
> org.apache.solr.analysis:
> TokenizerFactory & FilterFactory now get the SolrConfig passed as a parameter to init; one might want to read some resources to initialize the factory and the config dir is in the config. This is partially redundant with the argument map though.
> org.apache.solr.handler:
> RequestHandlerBase takes the core as a constructor parameter.
> org.apache.solr.util:
> The test harness has been modified to expose the core it instantiates.
> org.apache.solr.servlet:
> SolrDispatchFilter is now instantiating a core configured at init time; the web.xml must contain one filter declaration and one filter mapping per core you want to expose.  Wherever some admin or servlet or page was referring to the SolrCore singleton or SolrConfig, they now check for the request attribute 'org.apache.solr.SolrCore' first; the filters set this attribute before forwarding to the other parts.
> Admin/servlet:
> Has been modified to use the core exposed through the request attribute 'org.apache.solr.SolrCore'.
> REPLICATION:
> The feature has not been implemented yet; the starting point is that instead of having just one index directory 'index/', the naming scheme for the index data directories is 'index*/'. Have to investigate. 
> FUTURE:
> Uploading new schema/conf would be nice, allowing Solr to create cores dynamically; the upload mechanism itself is easy, the servlet dispatch filter needs to be modified.
> Having replication embedded in the Solr application itself using an http based version of the rsync algorithm; some of the core code of jarsync might be handy.
> MISC:
> The patch production process (not as easy as I thought it was with a Windows/Netbeans/cygwin/TortoiseSVN).
> 0/ Initial point is to have the modified code running in a local patch branch, all tests ok.
> 1/ Have one 'clean version' of the trunk aside the local patch branch; you'll need to verify that your patch can be applied to the last clean trunk version and that various tests still work from there. Creating the patch is key.
> 2/ If you used some IDE and forgot to set the auto-indentation corrrectly, you most likely need working around the space/indentation patch clutter that results. I could not find a way to get TortoiseSVN create a patch with the proper options (ignore spaces & al) and could not find a way to get NetbeansSVN generate one either. Thus I create the patch from the local trunk root through cygwin (with svn+patchutils); svn diff --diff-cmd /usr/bin/diff -x "-w -B -b -E -d -N -u" > ~/solr-215.patch.
> Before generating the patch, it is important to issue an 'svn add ...' for each file you might have added; a quick "svn status | grep '?'" allows to verify nothing will be missing. Not elegant, but you can even follow with: svn status | grep '?' | awk '{print $2}' | xargs svn add
> 3/ Apply the patch to the 'clean trunk'.
> TortoiseSVN 'apply patch' command only understands 'unified diff' thus the '-u' option.
> Alternatively, you can apply the patch through cygwin: patch -p0 -u < solr-215.patch.
> I've updated the 'dev' environment to an x86 Solaris 10 box which now generates the zipped patch( solr-215.patch.zip , same patch production method).
> For Solaris 10 users, patch must be "gnu" patch: /usr/local/bin/patch is its usual location (not to be confused with /bin/patch...)
> For x86, you can find it at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/patch-2.5.4-sol10-x86-local.gz ; I don't know about diff but I'm using the version located at ftp://ftp.sunfreeware.com/pub/freeware/intel/10/diffutils-2.8.1-sol10-intel-local.gz

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.