Posted to solr-user@lucene.apache.org by Ryan McKinley <ry...@gmail.com> on 2008/02/04 23:58:39 UTC

Re: For an "XML" fieldtype

Depends what you are trying to do.

Is there anything wrong with just using string or text fieldType?

If you use the XML writer, the value will be returned XML-encoded (> becomes 
&gt; etc.).  I think if you use the JSON writer, it is only escaped for JSON.
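
To make the difference concrete, here is a minimal, self-contained sketch of 
the two kinds of escaping. It does not use Solr's classes; EscapingDemo, 
escapeXml and escapeJson are hypothetical names chosen for illustration, and 
Solr's own writers may escape a slightly different set of characters.

```java
final class EscapingDemo {
    // Escape the characters that are significant in XML text and attribute values.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Escape a value for use inside a JSON string literal.
    // Angle brackets and ampersands are left alone, so stored markup stays readable.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a \\uXXXX escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = "<p class=\"a\">x > y</p>";
        System.out.println(escapeXml(stored));
        System.out.println(escapeJson(stored));
    }
}
```

With XML escaping the client has to decode the entities before it can parse 
the stored value as markup; with JSON escaping only quotes, backslashes and 
control characters are touched.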

what is missing?  what problem are you hitting?

ryan


Frédéric Glorieux wrote:
> Hi all,
> 
> Sorry to repost on this issue.
> Is there a standard way to use a field to store the XML source of a document? 
> If not, is a fieldType the solution?
> 
> Or is this a "solr-user" question?
> 
> Sorry if I have posted in the wrong place.
> 


Re: For an "XML" fieldtype

Posted by "Frédéric Glorieux (École nationale des chartes)" <fr...@enc.sorbonne.fr>.
Thanks Chris,

> this idea has been discussed before, most notably in this thread...
> 
> http://www.nabble.com/Indexing-XML-files-to7705775.html
> ...as discussed there, the crux of the issue is not a special fieldtype, 
> but a custom ResponseWriter that outputs the XML you want, and leaves any 
> field values you want unescaped (assuming you trust them to be well-formed).  
> How you decide which field values to leave unescaped could either be 
> hardcoded, or driven by the FieldType of each field (in which case you 
> might write an XmlField that subclasses StrField, but you wouldn't need to 
> override any methods -- just see that the FieldType is XmlField and use 
> that as your guide).


Sorry for not having found this link. I discovered that I have done exactly 
the same as mirko-9
<http://www.nabble.com/Re%3A-Indexing-XML-files-p7742668.html>
xmlWriter.writePrim("xml", name, f.stringValue(), false);

So this is a good way to implement what we need, but there are good reasons 
not to commit it to Solr core: it changes the XmlResponseWriter output 
schema, and it carries code injection risks. Such prudence makes us very 
confident in Solr.

> : I would be glad if this class could be committed, so that I do not need to
> : keep it up to date with future Solr releases.
> 
> as long as you stick to the contracts of FieldType and/or ResponseWriter 
> you don't need to worry -- these are published SolrPlugin APIs that Solr 
> won't break ... we expect people to implement them, and people can expect 
> their plugins to work when they upgrade Solr.



--
Frédéric Glorieux

Re: For an "XML" fieldtype

Posted by Chris Hostetter <ho...@fucit.org>.
: > Is there anything wrong with just using string or text fieldType?
: > If you use the XML writer, the value will be returned XML-encoded (> becomes &gt;
: > etc.).
: 
: This is pretty much the only change I made to StrField, so I get back the original
: XML string as stored, and can transform it directly with XSL.

this idea has been discussed before, most notably in this thread...

http://www.nabble.com/Indexing-XML-files-to7705775.html

...as discussed there, the crux of the issue is not a special fieldtype, 
but a custom ResponseWriter that outputs the XML you want, and leaves any 
field values you want unescaped (assuming you trust them to be well-formed).  
How you decide which field values to leave unescaped could either be 
hardcoded, or driven by the FieldType of each field (in which case you 
might write an XmlField that subclasses StrField, but you wouldn't need to 
override any methods -- just see that the FieldType is XmlField and use 
that as your guide).
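
A rough sketch of that decision, for illustration only: the StrField and 
XmlField classes below are hypothetical stand-ins declared inside the example, 
not Solr's real classes, and writeField is an invented helper rather than part 
of the ResponseWriter API. The "xml" tag name for raw values follows the 
writePrim call from the Nabble thread; everything else is an assumption.

```java
final class RawXmlWriterSketch {
    // Hypothetical stand-ins for the FieldType hierarchy (not Solr's classes).
    static class StrField {}
    // Marker subclass: overrides nothing, only signals "emit this value raw".
    static class XmlField extends StrField {}

    // Minimal XML escaping for text and attribute values.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Write one stored field. The field type alone decides whether to escape.
    static String writeField(String name, String value, StrField type) {
        boolean raw = type instanceof XmlField;
        String tag = raw ? "xml" : "str";
        // Raw values are trusted to be well-formed; nothing here checks that.
        String body = raw ? value : escapeXml(value);
        return "<" + tag + " name=\"" + escapeXml(name) + "\">" + body + "</" + tag + ">";
    }

    public static void main(String[] args) {
        String stored = "<a>b</a>";
        System.out.println(writeField("body", stored, new StrField()));
        System.out.println(writeField("body", stored, new XmlField()));
    }
}
```

The point of the marker subclass is that the schema, not the writer's code, 
says which fields are safe to emit raw. A value that is not well-formed will 
break the whole response, which is why this only makes sense for trusted input.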

: I would be glad if this class could be committed, so that I do not need to
: keep it up to date with future Solr releases.

as long as you stick to the contracts of FieldType and/or ResponseWriter 
you don't need to worry -- these are published SolrPlugin APIs that Solr 
won't break ... we expect people to implement them, and people can expect 
their plugins to work when they upgrade Solr.

http://wiki.apache.org/solr/SolrPlugins


-Hoss


Re: For an "XML" fieldtype

Posted by Frédéric Glorieux <fr...@enc.sorbonne.fr>.
Hi Ryan

Thanks for the answer,

> Depends what you are trying to do.
> 
> Is there anything wrong with just using string or text fieldType?
> If you use the XML writer, the value will be returned XML-encoded (> becomes 
> &gt; etc.).

This is pretty much the only change I made to StrField, so I get back the 
original XML string as stored, and can transform it directly with XSL.

> I think if you use the JSON writer, it is only escaped for json.

I haven't tested the JSON writer, but I can verify it before proposing the class.

> what is missing?  what problem are you hitting?

I would be glad if this class could be committed, so that I do not need 
to keep it up to date with future Solr releases.

-- 
Frédéric Glorieux
École nationale des chartes
Direction des nouvelles technologies et de l'informatique