Posted to solr-user@lucene.apache.org by Stephen Duncan Jr <st...@gmail.com> on 2011/07/14 01:13:30 UTC

XInclude Multiple Elements

I've spent some time looking at various conversations on this problem,
but I can't find a solution that works.  XInclude has to point to a valid
XML document, with a single root element.  It should be possible to
use xpointer to specify children elements to include, but as far as I
can tell, the xpointer support doesn't include any scheme complex
enough to express "all the child elements of a given element", which
is what I would like.

So, here's what I have (it's more complicated than necessary for this
example because I also want it to support includes for both the root
level and a sub-level to use when doing schema, as I want to do
particular fields, but not all of them in the include file):

<<includes.dtd>>

<!ELEMENT includes (include+)>
<!ELEMENT include ANY>
<!ATTLIST include id ID #REQUIRED>

<<solrconfigIncludes.xml>>

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE includes SYSTEM "includes.dtd">
<includes>
	<include id="root">
...several elements to be included at the root of a solrconfig.xml file...
	</include>
</includes>

<<solrconfig.xml>>

<?xml version="1.0" encoding="UTF-8" ?>
<config>
	<xi:include href="../../conf/solrconfigIncludes.xml" xpointer="root"
xmlns:xi="http://www.w3.org/2001/XInclude" />

	<requestHandler name="dismax" class="solr.SearchHandler">
		<lst name="defaults">
			<str name="defType">dismax</str>
			<str name="echoParams">explicit</str>
			<float name="tie">0.01</float>
			<str name="qf">all_text</str>
			<str name="pf">all_text</str>
			<str name="bf">recip(ms(NOW,dateoccurredboost),3.16e-11,1,1)</str>
			<str name="fl"></str>
			<str name="mm">1&lt;-1 4&lt;-2</str>
			<int name="ps">100</int>
			<str name="q.alt">*:*</str>
			<str name="hl.fl">highlight</str>
		</lst>

		<arr name="last-components">
			<str>spellcheck</str>
		</arr>
	</requestHandler>
</config>

So, that xpointer="root", unfortunately, grabs the <include id="root">
element, but what I need, of course, is the contents of that element
(the children).

I see this post:
http://lucene.472066.n3.nabble.com/including-external-files-in-config-by-corename-td698324.html
that implies you can use #xpointer(/*/node()) to get all elements of
the root node (like if I changed my example to only have one include,
and just used multiple files, which is fine if it works), however my
testing gave this error: ERROR org.apache.solr.core.CoreContainer -
org.xml.sax.SAXParseException: Fragment identifiers must not be used.
The 'href' attribute value
'../../conf/solrconfigIncludes.xml#xpointer(root/node())' is not
permitted.  I tried several other variations of trying to come up with
pointers using node() or *, none of which worked.

And I also see this post:
http://lucene.472066.n3.nabble.com/solrconfig-xml-and-xinclude-td984058.html
that shows off a cumbersome way to list out each child element by
index number using the element scheme, which I assume works, but is
way too cumbersome to use.
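
A rough sketch of that per-child approach, for reference. It assumes the
<include id="root"> element in solrconfigIncludes.xml has exactly three
child elements (a made-up count), and it puts the pointer in the xpointer
attribute rather than in an href fragment, since the fragment form is what
triggered the error above. In the element() scheme, /1/1/2 reads as: the
root element, its first child element, then that element's second child
element.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<config xmlns:xi="http://www.w3.org/2001/XInclude">
	<!-- One xi:include per child of <include id="root">; three children is an assumption. -->
	<xi:include href="../../conf/solrconfigIncludes.xml" xpointer="element(/1/1/1)" />
	<xi:include href="../../conf/solrconfigIncludes.xml" xpointer="element(/1/1/2)" />
	<xi:include href="../../conf/solrconfigIncludes.xml" xpointer="element(/1/1/3)" />
	<!-- Core-specific elements, such as the dismax requestHandler, would follow here. -->
</config>
```

If the id attribute is recognised as an ID (which needs the DTD to be
read), element(root/1) and so on should address the same children by ID
plus position. Either way, the list has to be kept in sync by hand
whenever the include file gains or loses an element.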

Has anyone had any success using XInclude to include more than one
element?  I'm open to any and all approaches to having
partially-common configuration between cores.

Thanks,
Stephen

--
Stephen Duncan Jr
www.stephenduncanjr.com

Re: XInclude Multiple Elements

Posted by Stephen Duncan Jr <st...@gmail.com>.
Does anyone use XInclude?  I'd like to hear about any successful usage at all.

Stephen Duncan Jr
www.stephenduncanjr.com

Re: XInclude Multiple Elements

Posted by Stephen Duncan Jr <st...@gmail.com>.
I'm also open to ideas on useful ways to use XInclude while only
targeting a single element.  I haven't been able to come up with a way
that that's useful enough for me, but maybe I'm not thinking about the
problem from the proper point of view.  For the schema, I could have
the types as a single include, but for solrconfig.xml I'd want to
share almost all of the configuration except for a few customizations,
and all the elements are within the root.  I can't see how to use
XInclude to be useful for this case.
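
A sketch of the single-element case for the schema, under some
assumptions: an xi:include with no xpointer attribute pulls in the whole
root element of the target file, so a shared file whose root is <types>
can stand in for the <types> section. The file name schemaTypes.xml and
the field type and field shown are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Hypothetical shared file schemaTypes.xml: its single root element is the whole types section. -->
<types>
	<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
</types>
```

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.4" xmlns:xi="http://www.w3.org/2001/XInclude">
	<!-- Replaced at parse time by the root element of schemaTypes.xml. -->
	<xi:include href="schemaTypes.xml" />
	<fields>
		<field name="id" type="string" indexed="true" stored="true" />
	</fields>
	<uniqueKey>id</uniqueKey>
</schema>
```

This only helps where a whole section is shared; it does not cover the
solrconfig.xml case, where the shared elements sit directly under the
root.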

I'd appreciate any examples people have of how they're using XInclude
for configuration sharing.

Thanks,
Stephen

Stephen Duncan Jr
www.stephenduncanjr.com



Re: XInclude Multiple Elements

Posted by Amit Nithian <an...@gmail.com>.
Way back when I opened an issue about using XML entity includes in
Solr as a way to break up the config. I have found problems with
XInclude having multiple elements to include because the file is not
well formed. From what I have read, if you make this well formed, you
end up with a document that's not what you expect.

For example:
my schema.xml has
<fields>
...
<xinclude href="more_fields.xml" .../>
</fields>

more_fields.xml
<field name="..."...

which isn't well formed. You could make it well formed:
<fields>
<field name ="..."...
</fields>
but then I think you end up with a nested <fields> element, which doesn't
work (and btw I still keep getting the blasted failed to parse error
which isn't very helpful). Looking at this made me wonder if entity
includes work with Solr 4 and indeed they do! They aren't as flexible
as XIncludes but for the purpose of breaking up an XML file into
smaller pieces, it works beautifully and as you would expect.

You can simply declare your entities at the top as shown in the
earlier thread and then include them where you need. I've been using
this for years and it works fairly well.
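
A minimal sketch of that for the schema.xml case above, assuming
more_fields.xml sits next to schema.xml and holds bare <field> elements
with no root element. The entity name morefields and the field
definition are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE schema [
	<!-- External parsed entity; the name and path are illustrative. -->
	<!ENTITY morefields SYSTEM "more_fields.xml">
]>
<schema name="example" version="1.4">
	<fields>
		<field name="id" type="string" indexed="true" stored="true" />
		<!-- Expands in place to the contents of more_fields.xml, so there is no nested fields element. -->
		&morefields;
	</fields>
</schema>
```

Because the entity is expanded as raw content, the included file does
not have to be a well-formed document on its own, which is what gets
around the single-root restriction.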

Cheers!
Amit



Re: XInclude Multiple Elements

Posted by Bogdan Nicolau <bo...@gmail.com>.
I've also tried a lot of tricks to get xpointer working with multiple child
elements, with no success.
In the end, I've resorted to a less pretty, other-way-around solution. I do
something like this:
solrconfig_common.xml -> no xml declaration, no root tag, no nothing
<etc></etc>
<etc2></etc2>
...
For each file that needs the common stuff, I do something like this:
solrconfig_master.xml/solrconfig_slave.xml/etc.
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE config [
        <!ENTITY solrconfigcommon SYSTEM "solrconfig_common.xml">
        ]>

<config>
    &solrconfigcommon;

</config>

Solr starts with 0 warnings, the configuration is properly loaded, etc.
Property substitution also works, including inside the
solrconfig_common.xml. Hope it helps anyone.

--
View this message in context: http://lucene.472066.n3.nabble.com/XInclude-Multiple-Elements-tp3167658p3987029.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: XInclude Multiple Elements

Posted by Michael Sokolov <so...@ifactory.com>.
The various XInclude specs were never really fully implemented by XML 
parsers.   IMO it's really best for including whole XML files. If I 
remember right, the situation is that the xpointer() scheme (the most 
flexible) wasn't implemented.  There are two other schemes for 
addressing content within a document.  One of them (the sort of 
"default" scheme - see 
http://www.w3.org/TR/2003/REC-xptr-framework-20030325/) relies on 
identifiers in the document, but to get these identifiers "identified", 
you have to run the document through a DTD or Schema validation step as 
part of parsing.  I never did get that to work, but if you're diligent 
it seems possible.  The other scheme (the element() scheme) allows you 
to address child nodes by simple paths - like XPath, but much more 
limited syntax (see http://www.w3.org/TR/2003/REC-xptr-element-20030325/)

-Mike


Re: XInclude Multiple Elements

Posted by Stephen Duncan Jr <st...@gmail.com>.
On Mon, Jul 18, 2011 at 8:06 PM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> Can you post the details of your JVM / ServletContainer and the full stack
> trace of the exception?  My understanding is that fragment identifiers are
> a mandatory part of the xinclude/xpointer specs.
>
> It would also be good to know if you tried the explicit "xpointer"
> attribute approach on the xinclude syntax also mentioned in that thread...
>
> I think it would be something like...
>
> <xi:include href="solrconfigIncludes.xml" xpointer="xpointer(//requestHandler)" />
>
>
> In general, Solr really isn't doing anything special with XInclude ...
> it's all just delegated to the XML Libraries.  You might want to start by
> ignoring solr, and reading up on XInclude/XPointer tutorials in general,
> and experimenting with command line xml tools to figure out the syntax you
> need to get the "final" xml structures you want -- then apply that
> knowledge to the solr config files.
>
>
> -Hoss

This is running on java 1.6.0_26, and jetty 7.4.4.v20110707.

The stack trace in the case of the use of the fragment is:

2011-07-13 18:52:42,953 [main] ERROR org.apache.solr.core.Config -
Exception during parsing file:
solrconfig.xml:org.xml.sax.SAXParseException: Fragment identifiers
must not be used. The 'href' attribute value
'../../conf/solrconfigIncludes.xml#xpointer(root/node())' is not
permitted.
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
	at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(Unknown Source)
	at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.reportError(Unknown Source)
	at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.reportFatalError(Unknown Source)
	at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.handleIncludeElement(Unknown Source)
	at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.emptyElement(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
	at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
	at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
	at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
	at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
	at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
	at org.apache.solr.core.Config.<init>(Config.java:159)
	at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:131)
	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:435)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
	at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
	at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
	at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:742)
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:245)
	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1208)
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:586)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:449)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
	at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:89)
	at org.eclipse.jetty.server.Server.doStart(Server.java:258)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:58)
	at com.issinc.cidne.solr.App.main(App.java:41)


I did attempt the xpointer="xpointer(//requestHandler)" syntax, and
received this error: 2011-07-13 18:49:06,640 [main] WARN
org.apache.solr.core.Config - XML parse warning in
"solrres:/solrconfig.xml", line 3, column 133: SchemeUnsupported: The
XPointer scheme 'xpointer' is not supported. This matches what the
wiki page indicates, and the Xerces FAQ confirms, which is that Xerces
does not support the xpointer() scheme.  I was not able to find any
indication that there were any Java libraries available that do support
the xpointer scheme.  If anyone knows of one, and how to configure
Solr to use it, that would almost certainly fix my problem.

I do understand that Solr isn't doing anything special with XInclude.
But from my attempts to understand the state of XInclude on the JVM, I
am unable to identify a useful technique for taking advantage of it to
share configuration in Solr.  My hope was that someone who had used it
successfully could indicate either something I missed about how to
make it work, or a useful pattern for working within the limitations
of the available functionality.

Stephen Duncan Jr
www.stephenduncanjr.com

Re: XInclude Multiple Elements

Posted by Chris Hostetter <ho...@fucit.org>.
: I see this post:
: http://lucene.472066.n3.nabble.com/including-external-files-in-config-by-corename-td698324.html
: that implies you can use #xpointer(/*/node()) to get all elements of
: the root node (like if I changed my example to only have one include,
: and just used multiple files, which is fine if it works), however my
: testing gave this error: ERROR org.apache.solr.core.CoreContainer -
: org.xml.sax.SAXParseException: Fragment identifiers must not be used.
: The 'href' attribute value
: '../../conf/solrconfigIncludes.xml#xpointer(root/node())' is not
: permitted.  I tried several other variations of trying to come up with
: pointers using node() or *, none of which worked.

Can you post the details of your JVM / ServletContainer and the full stack 
trace of the exception?  My understanding is that fragment identifiers are 
a mandatory part of the xinclude/xpointer specs.

It would also be good to know if you tried the explicit "xpointer" 
attribute approach on the xinclude syntax also mentioned in that thread...

I think it would be something like...

<xi:include href="solrconfigIncludes.xml" xpointer="xpointer(//requestHandler)" /> 


In general, Solr really isn't doing anything special with XInclude ... 
it's all just delegated to the XML Libraries.  You might want to start by 
ignoring solr, and reading up on XInclude/XPointer tutorials in general, 
and experimenting with command line xml tools to figure out the syntax you 
need to get the "final" xml structures you want -- then apply that 
knowledge to the solr config files.


-Hoss