Posted to solr-dev@lucene.apache.org by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2007/12/08 07:56:43 UTC

[jira] Created: (SOLR-433) MultiCore and SpellChecker replication

MultiCore and SpellChecker replication
--------------------------------------

                 Key: SOLR-433
                 URL: https://issues.apache.org/jira/browse/SOLR-433
             Project: Solr
          Issue Type: Improvement
          Components: replication, spellchecker
    Affects Versions: 1.3
            Reporter: Otis Gospodnetic
             Fix For: 1.3


With MultiCore functionality coming along, it looks like we'll need to be able to:
  A) snapshot each core's index directory, and
  B) replicate any and all cores' complete data directories, not just their index directories.

Pulled from the "spellchecker and multi-core index replication" thread - http://markmail.org/message/pj2rjzegifd6zm7m

Otis:

I think that makes sense - distribute everything for a given core, not just its index.  And the spellchecker could then also have its data dir (and only index/ underneath really) and be replicated in the same fashion.

Right?

Ryan:

Yes, that was my thought.  If an arbitrary directory could be distributed, then you could have

  /path/to/dist/index/...
  /path/to/dist/spelling-index/...
  /path/to/dist/foo

and that would all get put into a snapshot.  This would also let you put multiple cores within a single distribution:

  /path/to/dist/core0/index/...
  /path/to/dist/core0/spelling-index/...
  /path/to/dist/core0/foo
  /path/to/dist/core1/index/...
  /path/to/dist/core1/spelling-index/...
  /path/to/dist/core1/foo
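
A minimal sketch of the idea above, assuming a temporary directory stands in for /path/to/dist and that the snapshot is a plain recursive copy of the whole tree. The snapshot name and timestamp are illustrative and not taken from the existing scripts:

```shell
# Stand-in for /path/to/dist, populated with two hypothetical cores.
dist=$(mktemp -d)
mkdir -p "$dist/core0/index" "$dist/core0/spelling-index" \
         "$dist/core1/index" "$dist/core1/spelling-index"

# Fixed, illustrative timestamp; a real script would generate one.
stamp=20071208075643
snap="$dist.snapshot.$stamp"

# Snapshot the whole distribution directory, not just index/.
cp -r "$dist" "$snap"

# List the per-core directories that ended up in the snapshot.
snap_dirs=$(cd "$snap" && find . -mindepth 2 -maxdepth 2 -type d | sort)
echo "$snap_dirs"

rm -rf "$dist" "$snap"
```

Every directory under each core ends up in the single snapshot, which is the behavior the layout above is asking for.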


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jonathan Lee (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620430#action_12620430 ] 

Jonathan Lee commented on SOLR-433:
-----------------------------------

I'm interested in using this patch to replicate the spell index, but I am not using multiple cores.  However, the scripts in the patch do not work for single-core setups, since they assume that ${core} contains a valid value.  Even specifying -c "" will not work, since the variable is used in paths.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12571501#action_12571501 ] 

Otis Gospodnetic commented on SOLR-433:
---------------------------------------

You definitely want to put this in a separate JIRA issue and in Lucene's JIRA where you'll find a few similar issues already.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3
>
>         Attachments: RunExecutableListener.patch, solr-433.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jonathan Lee (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Lee updated SOLR-433:
------------------------------

    Attachment: SOLR-433.patch

(patch from comment https://issues.apache.org/jira/browse/SOLR-433?focusedCommentId=12620430#action_12620430)

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Stephane Bailliez (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12636273#action_12636273 ] 

Stephane Bailliez commented on SOLR-433:
----------------------------------------

Note that in the latest patches, the grep used to retrieve the snapshot is incorrect:

{noformat}
ls ${data_dir}|grep "${snap_prefix}\."|grep -v wip|sort -r|head -1
{noformat}

This always retrieves the latest entry in the ls output, whatever its prefix. The grep needs an anchor on the prefix; otherwise the index snapshot will never be updated (since 'snapshot' is present in the name of every snapshot, the unanchored pattern matches them all):

{noformat}
ls ${data_dir}|grep "^${snap_prefix}\."|grep -v wip|sort -r|head -1
{noformat}

This should be changed in snappuller and snapinstaller.
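
A small sketch of the difference, assuming hypothetical snapshot names (the "spell." name below is illustrative and not necessarily what the patch produces):

```shell
# Hypothetical data directory with an index snapshot, a snapshot of
# another index whose name merely contains the prefix, and a wip dir.
data_dir=$(mktemp -d)
snap_prefix=snapshot
mkdir "$data_dir/snapshot.20080101000000" \
      "$data_dir/spell.snapshot.20080202000000" \
      "$data_dir/snapshot.20080303000000-wip"

# Unanchored: also matches names where the prefix appears mid-name.
unanchored=$(ls "$data_dir" | grep "${snap_prefix}\." | grep -v wip | sort -r | head -1)

# Anchored: only names that start with the prefix are considered.
anchored=$(ls "$data_dir" | grep "^${snap_prefix}\." | grep -v wip | sort -r | head -1)

echo "unanchored: $unanchored"
echo "anchored:   $anchored"

rm -rf "$data_dir"
```

With these names the unanchored pipeline picks the "spell." directory, because it sorts last, while the anchored one picks the index snapshot.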

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.4
>
>         Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Doug Steigerwald (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Steigerwald updated SOLR-433:
----------------------------------

    Attachment: RunExecutableListener.patch

Should work.  Had to pull changes out of a separate class I wrote to create this patch.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3
>
>         Attachments: RunExecutableListener.patch, solr-433.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Mike Klaas (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas updated SOLR-433:
----------------------------

    Fix Version/s:     (was: 1.3)

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, solr-433.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jonathan Lee (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Lee updated SOLR-433:
------------------------------

    Attachment: SOLR-433.patch

Includes Stephane's fixes for snappuller & snapinstaller and some minor edits 

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.4
>
>         Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558064#action_12558064 ] 

Otis Gospodnetic commented on SOLR-433:
---------------------------------------

I don't think so, Doug.  If you are asking if you should start working on this and contribute a patch, by all means, please! :)


> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3


[jira] Issue Comment Edited: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jonathan Lee (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620430#action_12620430 ] 

jonjlee edited comment on SOLR-433 at 8/7/08 9:01 AM:
-----------------------------------------------------------

I'm interested in using this patch to replicate the spell index, but I am not using multiple cores.  The scripts in the patch do not work for single-core setups, since they assume that the core name will appear in the rsync path.

I am attaching another version of the patch which makes a few changes:
* snapcleaner, snapshooter, and snappuller are backwards compatible with the naming convention used by the current scripts
* Fixed a few small bugs where static strings were used instead of the corresponding variables
* Avoid assuming that an executable will accept a '-c' parameter to specify the core name.  Instead, allow a RunExecutableListener to be conditionally executed for specified cores. RunExecutableListener now accepts an <arr name="cores"> parameter. You would need to add a listener for each core. For example:
{code}
<listener event="postOptimize" class="core.RunExecutableListener">
    <arr name="cores"> <str>core0</str> </arr>
    <str name="exe">solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <arr name="args"> <str>-d /usr/local/solr/core0/data</str> </arr>
</listener>
<listener event="postOptimize" class="core.RunExecutableListener">
    <arr name="cores"> <str>core1</str> </arr>
    <str name="exe">solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <arr name="args"> <str>-d /usr/local/solr/core1/data</str> </arr>
</listener>
{code}
(Another reasonable alternative to this might also be to accept variables like ${SOLR_CORE} in <args> and <env> which are resolved by RunExecutableListener.)
* Removed the '-c' parameter from snappuller, replacing it instead with the '-r' option to specify an rsync module. This allows us to not assume the location of the core's data path. Instead we would add a new module to rsyncd.conf for each core. e.g.
{code}
...
[core0]
    path = /usr/local/solr/core0/data
    comment = core0
[core1]
    path = /usr/local/solr/core1/data
    comment = core1
...
{code}
then use:
{code}
./snappuller -r core0
{code}
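
A rough sketch of how a module name could map to an rsync source, assuming hypothetical host, port, and snapshot names (the real snappuller takes its settings from its own configuration and options; this only prints the command rather than running it):

```shell
# All values below are illustrative assumptions.
master_host=master.example.com
rsyncd_port=18983
module=core0                      # the value passed via -r
snapshot=snapshot.20080807090100  # hypothetical snapshot name

# With an rsync daemon, the first path component after host:port is
# the module name from rsyncd.conf, so no core data path is assumed.
src="rsync://${master_host}:${rsyncd_port}/${module}/${snapshot}/"

echo "rsync -a ${src} ./data/${snapshot}-wip/"
```

The slave never needs to know where core0's data directory lives on the master; the [core0] module in rsyncd.conf resolves it.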



      was (Author: jonjlee):
    I'm interested in using this patch to replicate the spell index, but I am not using multiple cores.  However, the scripts in the patch do not work for single-core setups, since they assume that ${core} contains a valid value.  Even specifying -c "" will not work, since the variable is used in paths.
  
> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Doug Steigerwald (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Steigerwald updated SOLR-433:
----------------------------------

    Attachment: spellindexfix.patch

This may be better suited as another bug, or even posted in the lucene project, but I also made a small patch for the lucene spellchecker.

Hitting the spellcheck request handler with a reopen command wasn't working after we installed a new spell index snapshot.  Searcher was being created for the new index, but not a reader.

Also, if you rebuild the spell index after it has already been built, it cleans out the index.  You then have to send it a rebuild again to actually rebuild the index.  Frequency of words in the spell index seemed to remain constant when rebuilding the spell index multiple times.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3
>
>         Attachments: RunExecutableListener.patch, solr-433.patch, spellindexfix.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558528#action_12558528 ] 

Hoss Man commented on SOLR-433:
-------------------------------

as i recall, most of the scripts currently take a "-d data_dir" option ... and then make the following assumptions...
  1) ${data_dir}/index is what gets snapshooted/snapinstalled
  2) ${data_dir}/snapshot* is the pattern for naming snapshots.

a good way to evolve the scripts would probably be to have an alternate set of options ... maybe "-D dir" and "-S snapshots" that assume ${dir} is where *all* the data to be snapshooted/snapinstalled lives, and ${snapshots} is where *all* snapshots live.

(except i think some of the scripts already have a -D .. snappuller or snapcleaner maybe?)

alternate idea: we don't have to add a command line option at all ... support for something like this could require special scripts.conf options.
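
a rough sketch of the "-D dir" / "-S snapshots" idea, assuming the option letters are free to use (as noted above, some scripts may already use -D, so treat the letters and the function name as hypothetical):

```shell
# Hypothetical option handling; -D and -S are only the proposal above,
# not flags the existing scripts are known to accept in this form.
parse_opts() {
  OPTIND=1
  dir=""
  snapshots=""
  while getopts "D:S:" opt "$@"; do
    case "$opt" in
      D) dir=$OPTARG ;;        # where all data to be snapshotted lives
      S) snapshots=$OPTARG ;;  # where all snapshots live
      *) echo "usage: -D dir [-S snapshots]" >&2; return 1 ;;
    esac
  done
  # Default: keep snapshots next to the data, mirroring the
  # ${data_dir}/snapshot* convention described above.
  : "${snapshots:=$dir}"
}

parse_opts -D /usr/local/solr/core0/data -S /usr/local/solr/core0/snapshots
echo "data=$dir snapshots=$snapshots"
```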

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Nicolas Lalevée (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Lalevée updated SOLR-433:
---------------------------------

    Attachment: SOLR-433-r698590.patch

The patch that works the best for me is the last one, Jonathan's one, as I run Solr with only one core but with two spellchecker indexes.
* In that patch I have fixed snapcleaner when it is called to explicitly clean the main index ({{./snapcleaner -i index}}).
* I also fixed every "USAGE" message.
* I have included the patch for {{RunExecutableListener}} and simplified it a little (it can now access the Solr core instance).


> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.4
>
>         Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic updated SOLR-433:
----------------------------------

    Fix Version/s: 1.4

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.4
>
>         Attachments: RunExecutableListener.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Doug Steigerwald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558031#action_12558031 ] 

Doug Steigerwald commented on SOLR-433:
---------------------------------------

Is anyone looking into this?

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jason Rutherglen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787155#action_12787155 ] 

Jason Rutherglen commented on SOLR-433:
---------------------------------------

Are the existing patches for multiple cores or only for spellchecking?

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (scripts), spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.5
>
>         Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jeremy Hinegardner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617694#action_12617694 ] 

Jeremy Hinegardner commented on SOLR-433:
-----------------------------------------

If this patch works for folks, I would like to see it committed and put in the nightly snapshot.  Or at least the RunExecutableListener.patch and solr-433.patch.

If there is any more work here, I'd be happy to work on it.  

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, solr-433.patch, spellindexfix.patch


[jira] Issue Comment Edited: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jeremy Hinegardner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619391#action_12619391 ] 

jjh edited comment on SOLR-433 at 8/3/08 5:29 PM:
-----------------------------------------------------------------

I've combined solr-433.patch and RunExecutableListener.patch into a single patch against svn trunk.  This also fixes a syntax bug in RunExecutableListener where super(core); was not invoked.

      was (Author: jjh):
    A single patch file that combines the original solr-433.patch and RunExecutableListener.patch into a single patch against svn trunk
  
> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Doug Steigerwald (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Steigerwald updated SOLR-433:
----------------------------------

    Attachment: solr-433.patch

We have a ctl script that controls all of the functions and makes sure we don't run some things on the slaves (i.e., snappuller, snapinstaller, rsyncd stuff).

We pass it a core and an index type:
{noformat}
./snapctl -a [rsyncd-enable/snappuller/snapinstaller/etc] -c core0 -i spell
{noformat}
spell is the name of our spellcheck index.  Default index is 'index'.
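
A minimal sketch of how a wrapper like this could parse those options, assuming POSIX {{getopts}}. The flag letters and the defaults ('null' for the core, 'index' for the index) come from this comment; the function name {{parse_snapctl_args}} and the variable names are hypothetical and are not taken from the actual snapctl script.

```shell
#!/bin/sh
# Hypothetical option parsing for a snapctl-style wrapper.
# -a selects the action, -c the core, -i the index name.
parse_snapctl_args() {
  ACTION=""
  CORE="null"     # default when -c is not given
  INDEX="index"   # default when -i is not given
  OPTIND=1        # reset so the function can be called more than once

  while getopts "a:c:i:" opt; do
    case "$opt" in
      a) ACTION=$OPTARG ;;
      c) CORE=$OPTARG ;;
      i) INDEX=$OPTARG ;;
      *) echo "usage: snapctl -a action [-c core] [-i index]" >&2
         return 1 ;;
    esac
  done

  # An action is the only mandatory option in this sketch.
  if [ -z "$ACTION" ]; then
    echo "snapctl: -a action is required" >&2
    return 1
  fi
}
```

Calling {{parse_snapctl_args -a snappuller -c core0 -i spell}} would leave the three values in {{ACTION}}, {{CORE}} and {{INDEX}} for the rest of the script to use.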

This is testing well in QA right now.  Hopefully this will help others out and maybe we'll have something similar committed soon.

Here are the basics of our snapctl script.  It's really tailored to our environment, so I probably won't post it as is.
{noformat}
SOLR_BIN=/home/dsteiger/apps/solr/solr/bin
CORES_PATH=/home/dsteiger/local/solr/cores
CORE=core0   # from -c arg, default is 'null'
INDEX=spell  # from -i arg, default is 'index'
SOLR_LOGS=/home/dsteiger/apps/solr/logs  # symlinked to somewhere else
# $MASTER_HOST is determined from scripts.conf based on environment (devel/qa/prod)

$SOLR_BIN/rsyncd-enable

$SOLR_BIN/rsyncd-disable

$SOLR_BIN/rsyncd-start -d $CORES_PATH

$SOLR_BIN/rsyncd-stop

$SOLR_BIN/snappuller-enable

$SOLR_BIN/snappuller-disable

$SOLR_BIN/snapshooter -d $CORES_PATH/$CORE/data -i $INDEX

$SOLR_BIN/snappuller -M $MASTER_HOST -S $SOLR_LOGS/clients -D $CORES_PATH/$CORE/data -d $CORES_PATH/$CORE/data -z -c $CORE -i $INDEX

$SOLR_BIN/snapinstaller -M $MASTER_HOST -S $SOLR_LOGS/clients -d $CORES_PATH/$CORE/data -c $CORE -i $INDEX

$SOLR_BIN/snapcleaner -D 1 -d $CORES_PATH/$CORE/data -i $INDEX
{noformat}
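
The listing above can be read as a dispatch table from an action name to a command line. A rough sketch of that idea follows, under stated assumptions: {{snapctl_command}} is a hypothetical helper, the default paths and host name are placeholders, and the function only prints the command so it can be inspected before anything is run. The flags for each command are copied from the listing.

```shell
#!/bin/sh
# Sketch only: maps an action name to one of the command lines listed above.
# snapctl_command is a hypothetical helper; it prints the command and does not run it.

# Placeholder locations; a real script would read these from its own config.
: "${SOLR_BIN:=/path/to/solr/bin}"
: "${CORES_PATH:=/path/to/solr/cores}"
: "${SOLR_LOGS:=/path/to/solr/logs}"
: "${MASTER_HOST:=master.example.com}"

snapctl_command() {
  action=$1
  core=$2
  index=${3:-index}                    # 'index' is the default index name
  data_dir="$CORES_PATH/$core/data"    # per-core data directory

  case "$action" in
    snapshooter)
      echo "$SOLR_BIN/snapshooter -d $data_dir -i $index" ;;
    snappuller)
      echo "$SOLR_BIN/snappuller -M $MASTER_HOST -S $SOLR_LOGS/clients -D $data_dir -d $data_dir -z -c $core -i $index" ;;
    snapinstaller)
      echo "$SOLR_BIN/snapinstaller -M $MASTER_HOST -S $SOLR_LOGS/clients -d $data_dir -c $core -i $index" ;;
    snapcleaner)
      echo "$SOLR_BIN/snapcleaner -D 1 -d $data_dir -i $index" ;;
    *)
      echo "unknown action: $action" >&2
      return 1 ;;
  esac
}
```

Printing rather than executing keeps the sketch safe to try on a box that has no Solr scripts installed; a real wrapper would run the command instead.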


We also modified core.RunExecutableListener to be able to pass the core name to our snapctl script.

{code:xml}
    <listener event="postCommit" class="core.RunExecutableListener">
        <str name="exe">./solr/bin/snapctl</str>
        <str name="dir">.</str>
        <bool name="wait">true</bool>
        <bool name="coreName">true</bool>
        <arr name="args"><str>-a snapshooter</str><str>-i index</str></arr>
    </listener>
    <listener event="postOptimize" class="core.RunExecutableListener">
        <str name="exe">./solr/bin/snapctl</str>
        <str name="dir">.</str>
        <bool name="wait">true</bool>
        <bool name="coreName">true</bool>
        <arr name="args"> <str>-a snapshooter</str> </arr>
    </listener>
{code}

I think this is all I changed for RunExecutableListener:

{code:title=core.RunExecutableListener.java|borderStyle=solid}
  // in init()
  // Add the core name to the command.
  if ("true".equals(args.get("coreName")) || Boolean.TRUE.equals(args.get("coreName"))) {
    cmdlist.add("-c " + core.getName());
  }
{code} 
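
One detail worth noting about the snippet above: {{cmdlist.add("-c " + core.getName())}} adds the flag and the core name as a single list entry, and the {{args}} entries in the config (for example {{-a snapshooter}}) are written the same way. Assuming each entry reaches the script as one argument word (an assumption about how the listener launches the process, not something confirmed in this thread), the receiving script has to split the word itself. A minimal sketch with a hypothetical helper, {{split_arg_word}}:

```shell
#!/bin/sh
# Split one "flag value" word (e.g. "-c core0") into FLAG and VALUE.
# Hypothetical helper; not part of the attached patch.
split_arg_word() {
  word=$1
  FLAG=${word%% *}          # text before the first space
  VALUE=${word#"$FLAG"}     # remainder, may start with spaces
  while [ "${VALUE# }" != "$VALUE" ]; do
    VALUE=${VALUE# }        # strip one leading space per pass
  done
}
```

A word without a space, such as {{-z}}, leaves {{VALUE}} empty, so the same helper covers flags that take no argument.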

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3
>
>         Attachments: solr-433.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jeremy Hinegardner (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Hinegardner updated SOLR-433:
------------------------------------

    Attachment: SOLR-433.patch

I've updated the patch to apply cleanly against trunk.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (scripts), spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.5
>
>         Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Issue Comment Edited: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Doug Steigerwald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570712#action_12570712 ] 

dsteigerwald edited comment on SOLR-433 at 2/20/08 7:19 AM:
----------------------------------------------------------------

We have a ctl script that controls all of the functions and makes sure we don't run some things on the slaves (i.e., snappuller, snapinstaller, rsyncd stuff).

We pass it a core and an index type:
{noformat}
./snapctl -a [rsyncd-enable/snappuller/snapinstaller/etc] -c core0 -i spell
{noformat}
spell is the name of our spellcheck index.  Default index is 'index'.

This is testing well in QA right now.  Hopefully this will help others out and maybe we'll have something similar committed soon.

Here are the basics of our snapctl script.  It's really tailored to our environment, so I probably won't post it as is.
{noformat}
SOLR_BIN=/home/dsteiger/apps/solr/solr/bin
CORES_PATH=/home/dsteiger/local/solr/cores
CORE=core0   # from -c arg, default is 'null'
INDEX=spell  # from -i arg, default is 'index'
SOLR_LOGS=/home/dsteiger/apps/solr/logs  # symlinked to somewhere else
# $MASTER_HOST is determined from scripts.conf based on environment (devel/qa/prod)

$SOLR_BIN/rsyncd-enable

$SOLR_BIN/rsyncd-disable

$SOLR_BIN/rsyncd-start -d $CORES_PATH

$SOLR_BIN/rsyncd-stop

$SOLR_BIN/snappuller-enable

$SOLR_BIN/snappuller-disable

$SOLR_BIN/snapshooter -d $CORES_PATH/$CORE/data -i $INDEX

$SOLR_BIN/snappuller -M $MASTER_HOST -S $SOLR_LOGS/clients -D $CORES_PATH/$CORE/data -d $CORES_PATH/$CORE/data -z -c $CORE -i $INDEX

$SOLR_BIN/snapinstaller -M $MASTER_HOST -S $SOLR_LOGS/clients -d $CORES_PATH/$CORE/data -c $CORE -i $INDEX

$SOLR_BIN/snapcleaner -D 1 -d $CORES_PATH/$CORE/data -i $INDEX
{noformat}


We also modified core.RunExecutableListener to be able to pass the core name to our snapctl script.

{code:xml}
    <listener event="postCommit" class="core.RunExecutableListener">
        <str name="exe">./solr/bin/snapctl</str>
        <str name="dir">.</str>
        <bool name="wait">true</bool>
        <bool name="coreName">true</bool>
        <arr name="args"><str>-a snapshooter</str><str>-i index</str></arr>
    </listener>
    <listener event="postOptimize" class="core.RunExecutableListener">
        <str name="exe">./solr/bin/snapctl</str>
        <str name="dir">.</str>
        <bool name="wait">true</bool>
        <bool name="coreName">true</bool>
        <arr name="args"> <str>-a snapshooter</str> </arr>
    </listener>
{code}

I'm going to attach the patch to RunExecutableListener that we're using.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3
>
>         Attachments: solr-433.patch


[jira] Commented: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Doug Steigerwald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12558068#action_12558068 ] 

Doug Steigerwald commented on SOLR-433:
---------------------------------------

It's on our TODO list here if no one is working on it yet.  I'll see what we can get done based on the comments in the description.

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.3


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Chris Haggstrom (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Haggstrom updated SOLR-433:
---------------------------------

    Attachment: SOLR-433.patch

I've been using the patch submitted by Jonathan Lee on 10-02-08 for replicating a spelling directory in addition to the index, and it works very well for that purpose.

I'm attaching a slightly modified patch that allows the snapshooter "-c" option to work with an index or spelling directory that is not named "index".



> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication (scripts), spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>             Fix For: 1.4
>
>         Attachments: RunExecutableListener.patch, SOLR-433-r698590.patch, SOLR-433.patch, SOLR-433.patch, SOLR-433.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch


[jira] Updated: (SOLR-433) MultiCore and SpellChecker replication

Posted by "Jeremy Hinegardner (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy Hinegardner updated SOLR-433:
------------------------------------

    Attachment: SOLR-433_unified.patch

A single patch file that combines the original solr-433.patch and RunExecutableListener.patch into a single patch against svn trunk

> MultiCore and SpellChecker replication
> --------------------------------------
>
>                 Key: SOLR-433
>                 URL: https://issues.apache.org/jira/browse/SOLR-433
>             Project: Solr
>          Issue Type: Improvement
>          Components: replication, spellchecker
>    Affects Versions: 1.3
>            Reporter: Otis Gospodnetic
>         Attachments: RunExecutableListener.patch, solr-433.patch, SOLR-433_unified.patch, spellindexfix.patch