You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2013/06/08 00:52:19 UTC

[jira] [Created] (SOLR-4910) solr.xml persistence is completely broken

Erick Erickson created SOLR-4910:
------------------------------------

             Summary: solr.xml persistence is completely broken
                 Key: SOLR-4910
                 URL: https://issues.apache.org/jira/browse/SOLR-4910
             Project: Solr
          Issue Type: Bug
    Affects Versions: 5.0, 4.4
            Reporter: Erick Erickson
            Assignee: Erick Erickson
            Priority: Blocker


I'm working on SOLR-4862 (persisting a created core doesn't preserve some values) and at least compared to 4.3 code, persisting to solr.xml is completely broken.

I learned to hate persistence while working on SOLR-4196 & etc. and I'm glad it's going away. I frequently got lost in implicit properties (they're easy to persist and shouldn't be), what should/shouldn't be persisted (e.g. the translated ${var:default} or the original), and it was a monster, so don't think I'm nostalgic for the historical behavior.

Before I dive back in I want to get some idea whether or not the current behavior was intentional or not, I don't want to go back into that junk only to undo someone else's work.

Creating a new core (collection2 in my example) with persistence turned on in solr.xml for instance changes the original definition for collection1 (stock 4.x as of tonight) from this:

<core name="collection1" instanceDir="collection1" shard="${shard:}" collection="${collection:collection1}" config="${solrconfig:solrconfig.xml}" schema="${schema:schema.xml}"
          coreNodeName="${coreNodeName:}"/>
to this:


  <core loadOnStartup="true" shard="${shard:}" instanceDir="collection1/" transient="false" name="collection1" dataDir="data/" collection="${collection:collection1}">
      <property name="name" value="collection1"/>
      <property name="config" value="solrconfig.xml"/>
      <property name="solr.core.instanceDir" value="solr/collection1/"/>
      <property name="transient" value="false"/>
      <property name="schema" value="schema.xml"/>
      <property name="loadOnStartup" value="true"/>
      <property name="solr.core.schemaName" value="schema.xml"/>
      <property name="solr.core.name" value="collection1"/>
      <property name="solr.core.dataDir" value="data/"/>
      <property name="instanceDir" value="collection1/"/>
      <property name="solr.core.configName" value="solrconfig.xml"/>
    </core>



So, there are two questions:
1> what is correct for 4.x?
2> do we care at all about 5.x?

As much as I hate to say it, I think that we need to go back to the 4.3 behavior. It might be as simple as not persisting in the <property> tags anything already in the original definition. Not quite sure what to put where in the newly-created core though, I suspect that the compact <core + attribs> would be best (assuming there's no <property> tag already in the definition. I really hate the mix of attributes on the <core> tag and <property> tags, wish we had one or the other....


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: [jira] [Created] (SOLR-4910) solr.xml persistence is completely broken

Posted by Mark Miller <ma...@gmail.com>.
I think in the past, it simply tried to write out what it read in. That should be the end goal, and AFAIK things pretty much worked like that in the late 3's or early 4's.

I'm pretty sure for ${} sys prop notation, 4.3 broke the <cores section, but now it sounds like the <core part is broken. Just from memory though - I know something wasn't working in 4.3 along those lines. So the real issue seems to be the current testing...

We have a some pretty simple tests for this (eg, they are easy to add too), so it sounds like we just need to beef them up - the changes necessary are easy from there.

- Mark

On Jun 8, 2013, at 10:04 AM, Erick Erickson <er...@gmail.com> wrote:

> I'd post this on the JIRA, but it's undergoing maintenance....
> 
> I'm thinking that the right thing to do is establish rules that are
> enforced at persist time rather than what used to happen, which was
> all sorts of gyrations trying to remember what was added when. I
> hacked together a quick PoC patch that I'll attach to the JIRA when
> it's back up and we can hack from there. It enforces two simple rules
> when persisting solr.xml
> 
> 1> if its already an attribute, don't add it as a <property> tag in a core.
> 2> if it's an implicit property, don't add it as a <property> tag in a core.
> 
> Or decide on another approach.
> 
> here's the meat of the patch, I've left out some bookeeping (creating
> and use of implicitPropertiesMap in CoreDescriptor):
> 
> Index: solr/core/src/java/org/apache/solr/core/SolrXMLSerializer.java
> IDEA additional info:
> Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
> <+>UTF-8
> ===================================================================
> --- solr/core/src/java/org/apache/solr/core/SolrXMLSerializer.java
> (revision 1490849)
> +++ solr/core/src/java/org/apache/solr/core/SolrXMLSerializer.java (revision )
> @@ -86,8 +86,14 @@
>     for (String key : keys) {
>       writeAttribute(w, key, coreDef.coreAttribs.get(key));
>     }
> -    Properties properties = coreDef.coreProperties;
> -    if (properties == null || properties.isEmpty()) w.write("/>\n"); // core
> +    // We should not be writing out implicit properties nor
> duplicating as properties things that are already attribs
> +    Properties properties = new Properties();
> +    for (String s : coreDef.coreProperties.stringPropertyNames()) {
> +      if (coreDef.coreAttribs.keySet().contains(s)) continue;
> +      if (CoreDescriptor.implicitPropertiesMap.containsKey(s)) continue;
> +    }
> +
> +    if (properties.isEmpty()) w.write("/>\n"); // core
>     else {
>       w.write(">\n");
>       writeProperties(w, properties, "      ");
> 
> On Fri, Jun 7, 2013 at 6:52 PM, Erick Erickson (JIRA) <ji...@apache.org> wrote:
>> Erick Erickson created SOLR-4910:
>> ------------------------------------
>> 
>>             Summary: solr.xml persistence is completely broken
>>                 Key: SOLR-4910
>>                 URL: https://issues.apache.org/jira/browse/SOLR-4910
>>             Project: Solr
>>          Issue Type: Bug
>>    Affects Versions: 5.0, 4.4
>>            Reporter: Erick Erickson
>>            Assignee: Erick Erickson
>>            Priority: Blocker
>> 
>> 
>> I'm working on SOLR-4862 (persisting a created core doesn't preserve some values) and at least compared to 4.3 code, persisting to solr.xml is completely broken.
>> 
>> I learned to hate persistence while working on SOLR-4196 & etc. and I'm glad it's going away. I frequently got lost in implicit properties (they're easy to persist and shouldn't be), what should/shouldn't be persisted (e.g. the translated ${var:default} or the original), and it was a monster, so don't think I'm nostalgic for the historical behavior.
>> 
>> Before I dive back in I want to get some idea whether or not the current behavior was intentional or not, I don't want to go back into that junk only to undo someone else's work.
>> 
>> Creating a new core (collection2 in my example) with persistence turned on in solr.xml for instance changes the original definition for collection1 (stock 4.x as of tonight) from this:
>> 
>> <core name="collection1" instanceDir="collection1" shard="${shard:}" collection="${collection:collection1}" config="${solrconfig:solrconfig.xml}" schema="${schema:schema.xml}"
>>          coreNodeName="${coreNodeName:}"/>
>> to this:
>> 
>> 
>>  <core loadOnStartup="true" shard="${shard:}" instanceDir="collection1/" transient="false" name="collection1" dataDir="data/" collection="${collection:collection1}">
>>      <property name="name" value="collection1"/>
>>      <property name="config" value="solrconfig.xml"/>
>>      <property name="solr.core.instanceDir" value="solr/collection1/"/>
>>      <property name="transient" value="false"/>
>>      <property name="schema" value="schema.xml"/>
>>      <property name="loadOnStartup" value="true"/>
>>      <property name="solr.core.schemaName" value="schema.xml"/>
>>      <property name="solr.core.name" value="collection1"/>
>>      <property name="solr.core.dataDir" value="data/"/>
>>      <property name="instanceDir" value="collection1/"/>
>>      <property name="solr.core.configName" value="solrconfig.xml"/>
>>    </core>
>> 
>> 
>> 
>> So, there are two questions:
>> 1> what is correct for 4.x?
>> 2> do we care at all about 5.x?
>> 
>> As much as I hate to say it, I think that we need to go back to the 4.3 behavior. It might be as simple as not persisting in the <property> tags anything already in the original definition. Not quite sure what to put where in the newly-created core though, I suspect that the compact <core + attribs> would be best (assuming there's no <property> tag already in the definition. I really hate the mix of attributes on the <core> tag and <property> tags, wish we had one or the other....
>> 
>> 
>> --
>> This message is automatically generated by JIRA.
>> If you think it was sent incorrectly, please contact your JIRA administrators
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: [jira] [Created] (SOLR-4910) solr.xml persistence is completely broken

Posted by Erick Erickson <er...@gmail.com>.
I'd post this on the JIRA, but it's undergoing maintenance....

I'm thinking that the right thing to do is establish rules that are
enforced at persist time rather than what used to happen, which was
all sorts of gyrations trying to remember what was added when. I
hacked together a quick PoC patch that I'll attach to the JIRA when
it's back up and we can hack from there. It enforces two simple rules
when persisting solr.xml

1> if its already an attribute, don't add it as a <property> tag in a core.
2> if it's an implicit property, don't add it as a <property> tag in a core.

Or decide on another approach.

here's the meat of the patch, I've left out some bookeeping (creating
and use of implicitPropertiesMap in CoreDescriptor):

Index: solr/core/src/java/org/apache/solr/core/SolrXMLSerializer.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- solr/core/src/java/org/apache/solr/core/SolrXMLSerializer.java
(revision 1490849)
+++ solr/core/src/java/org/apache/solr/core/SolrXMLSerializer.java (revision )
@@ -86,8 +86,14 @@
     for (String key : keys) {
       writeAttribute(w, key, coreDef.coreAttribs.get(key));
     }
-    Properties properties = coreDef.coreProperties;
-    if (properties == null || properties.isEmpty()) w.write("/>\n"); // core
+    // We should not be writing out implicit properties nor
duplicating as properties things that are already attribs
+    Properties properties = new Properties();
+    for (String s : coreDef.coreProperties.stringPropertyNames()) {
+      if (coreDef.coreAttribs.keySet().contains(s)) continue;
+      if (CoreDescriptor.implicitPropertiesMap.containsKey(s)) continue;
+    }
+
+    if (properties.isEmpty()) w.write("/>\n"); // core
     else {
       w.write(">\n");
       writeProperties(w, properties, "      ");

On Fri, Jun 7, 2013 at 6:52 PM, Erick Erickson (JIRA) <ji...@apache.org> wrote:
> Erick Erickson created SOLR-4910:
> ------------------------------------
>
>              Summary: solr.xml persistence is completely broken
>                  Key: SOLR-4910
>                  URL: https://issues.apache.org/jira/browse/SOLR-4910
>              Project: Solr
>           Issue Type: Bug
>     Affects Versions: 5.0, 4.4
>             Reporter: Erick Erickson
>             Assignee: Erick Erickson
>             Priority: Blocker
>
>
> I'm working on SOLR-4862 (persisting a created core doesn't preserve some values) and at least compared to 4.3 code, persisting to solr.xml is completely broken.
>
> I learned to hate persistence while working on SOLR-4196 & etc. and I'm glad it's going away. I frequently got lost in implicit properties (they're easy to persist and shouldn't be), what should/shouldn't be persisted (e.g. the translated ${var:default} or the original), and it was a monster, so don't think I'm nostalgic for the historical behavior.
>
> Before I dive back in I want to get some idea whether or not the current behavior was intentional or not, I don't want to go back into that junk only to undo someone else's work.
>
> Creating a new core (collection2 in my example) with persistence turned on in solr.xml for instance changes the original definition for collection1 (stock 4.x as of tonight) from this:
>
> <core name="collection1" instanceDir="collection1" shard="${shard:}" collection="${collection:collection1}" config="${solrconfig:solrconfig.xml}" schema="${schema:schema.xml}"
>           coreNodeName="${coreNodeName:}"/>
> to this:
>
>
>   <core loadOnStartup="true" shard="${shard:}" instanceDir="collection1/" transient="false" name="collection1" dataDir="data/" collection="${collection:collection1}">
>       <property name="name" value="collection1"/>
>       <property name="config" value="solrconfig.xml"/>
>       <property name="solr.core.instanceDir" value="solr/collection1/"/>
>       <property name="transient" value="false"/>
>       <property name="schema" value="schema.xml"/>
>       <property name="loadOnStartup" value="true"/>
>       <property name="solr.core.schemaName" value="schema.xml"/>
>       <property name="solr.core.name" value="collection1"/>
>       <property name="solr.core.dataDir" value="data/"/>
>       <property name="instanceDir" value="collection1/"/>
>       <property name="solr.core.configName" value="solrconfig.xml"/>
>     </core>
>
>
>
> So, there are two questions:
> 1> what is correct for 4.x?
> 2> do we care at all about 5.x?
>
> As much as I hate to say it, I think that we need to go back to the 4.3 behavior. It might be as simple as not persisting in the <property> tags anything already in the original definition. Not quite sure what to put where in the newly-created core though, I suspect that the compact <core + attribs> would be best (assuming there's no <property> tag already in the definition. I really hate the mix of attributes on the <core> tag and <property> tags, wish we had one or the other....
>
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org