Posted to solr-user@lucene.apache.org by jmuguruza <jm...@gmail.com> on 2012/01/12 21:19:39 UTC

a way to marshall xml doc into a SolrInputDocument

If I have individual files in the expected Solr format (having just ONE doc
per file):

<add>
  <doc>
    <field name="id">GB18030TEST</field>
    <field name="name">Test with some GB18030 encoded characters</field>
    <field name="features">No accents here</field>
    <field name="features">这是一个功能</field>
    <field name="price">0</field>
  </doc>
</add>

Isn't there an easy way to marshal that file into a SolrInputDocument? Do
I have to do the parsing myself?

I need them as Java POJOs because I want to modify some fields before
indexing. I would think that is possible with built-in methods in Solr, but
I cannot find a way.

thanks

--
View this message in context: http://lucene.472066.n3.nabble.com/a-way-to-marshall-xml-doc-into-a-SolrInputDocument-tp3654777p3654777.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: a way to marshall xml doc into a SolrInputDocument

Posted by jmuguruza <jm...@gmail.com>.
Even if they could (I am not sure they could be done there, since they
involve reformatting some fields, e.g. putting dates in the correct format,
and the format may be validated first), I would prefer to do it on the SolrJ
side, as the code will be much simpler for me.

thanks
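
As an illustration of the kind of client-side field fix-up mentioned above,
here is a minimal sketch of turning a plain date into the ISO 8601 UTC form
that Solr date fields expect (for example 1995-12-31T23:59:59Z). The thread
does not say what the incoming dates look like, so the yyyy-MM-dd input
format, the choice of midnight UTC, and the names SolrDateFormat and
toSolrDate are all assumptions made for the example.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Solr or SolrJ.
class SolrDateFormat {

    /** Converts a date written as yyyy-MM-dd to midnight UTC in Solr's date syntax. */
    static String toSolrDate(String isoLocalDate) {
        // Assumption: the source value is a bare calendar date with no time zone.
        LocalDate date = LocalDate.parse(isoLocalDate);
        // ISO_INSTANT renders an Instant as e.g. 2012-01-12T00:00:00Z.
        return DateTimeFormatter.ISO_INSTANT.format(
                date.atStartOfDay(ZoneOffset.UTC).toInstant());
    }

    public static void main(String[] args) {
        System.out.println(toSolrDate("2012-01-12"));
    }
}
```

If the source files carry times or local time zones, the parsing step would
need to change accordingly; only the output format is fixed by Solr.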


Re: a way to marshall xml doc into a SolrInputDocument

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Can those modifications be made on the server side? If so, you could create
an UpdateRequestProcessor. See
http://wiki.apache.org/solr/UpdateRequestProcessor

On Thu, Jan 12, 2012 at 5:19 PM, jmuguruza <jm...@gmail.com> wrote:

> If I have individual files in the expected Solr format (having just ONE doc
> per file):
>
> <add>
>  <doc>
>    <field name="id">GB18030TEST</field>
>    <field name="name">Test with some GB18030 encoded characters</field>
>    <field name="features">No accents here</field>
>    <field name="features">这是一个功能</field>
>    <field name="price">0</field>
>  </doc>
> </add>
>
> Is not there a way to easily marshal that file into a SolrInputDocument? Do
> I have to do the parsing myself?
>
> I need them in java pojo cause I want to modify some fields before
> indexing.
> I would think that is possible with built in methods in Solr but cannot
> find
> a way.
>
> thanks

Re: a way to marshall xml doc into a SolrInputDocument

Posted by Chris Hostetter <ho...@fucit.org>.
: Anyway thanks, seems I'll have to code it myself, not hard, just tedious. 

you could probably re-use a *lot* of what's in XMLLoader -- certainly
easier than starting from scratch -- i just don't know if you'll be able
to drop it in and use the API as is.


-Hoss

Re: a way to marshall xml doc into a SolrInputDocument

Posted by jmuguruza <jm...@gmail.com>.
Chris Hostetter-3 wrote
> 
> but you're the first person i've ever seen ask about
> serializing to Solr's XML format on the client, then parsing it again, then
> sending the SolrInputDocument to Solr (seems like a lot of
> gratuitous serialize/deserialize/serialize/etc...)
> -Hoss
> 

Yes, but I am not doing the first serialization: the XML files are my
starting point, and someone else (with their own tools) has generated them.
That is why I need to parse them.

Anyway thanks, seems I'll have to code it myself, not hard, just tedious. 


Re: a way to marshall xml doc into a SolrInputDocument

Posted by Chris Hostetter <ho...@fucit.org>.
: Is not there a way to easily marshal that file into a SolrInputDocument? Do
: I have to do the parsing myself?
:  
: I need them in java pojo cause I want to modify some fields before indexing.
: I would think that is possible with built in methods in Solr but cannot find
: a way.

the class that does this in Solr is the XMLLoader, but it's not really
designed to be used client side.

There are SolrJ methods to serialize *to* XML from a POJO, and sometimes
clients use external systems to generate the XML and cache it, and then
stream it to Solr, but you're the first person i've ever seen ask about
serializing to Solr's XML format on the client, then parsing it again, then
sending the SolrInputDocument to Solr (seems like a lot of
gratuitous serialize/deserialize/serialize/etc...)


-Hoss
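
For readers who land on this thread with the same problem, the client-side
parsing discussed above can be sketched with nothing but the JDK's DOM
parser. This is an illustration under stated assumptions, not the XMLLoader
code and not an official SolrJ facility: the class name SolrAddXmlReader and
the method readFields are hypothetical, only the first <doc> in the file is
read, and attributes such as boost are ignored. To keep the example free of
external dependencies it collects the fields into a map; with SolrJ on the
classpath each name/value pair would instead be passed to
SolrInputDocument.addField(name, value).

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr or SolrJ.
class SolrAddXmlReader {

    /**
     * Reads the <field> elements of the first <doc> into a map of
     * field name -> values. Repeated names (multi-valued fields) keep
     * their values in document order.
     */
    static Map<String, List<String>> readFields(InputStream xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(xml).getDocumentElement();

            NodeList docs = root.getElementsByTagName("doc");
            if (docs.getLength() == 0) {
                throw new IllegalArgumentException("no <doc> element found");
            }
            Element doc = (Element) docs.item(0);

            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList fields = doc.getElementsByTagName("field");
            for (int i = 0; i < fields.getLength(); i++) {
                Element field = (Element) fields.item(i);
                result.computeIfAbsent(field.getAttribute("name"),
                                       k -> new ArrayList<>())
                      .add(field.getTextContent());
            }
            return result;
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            // Wrap parser/IO failures so callers need no checked exceptions.
            throw new IllegalArgumentException("could not parse Solr add XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<add><doc>"
                + "<field name=\"id\">GB18030TEST</field>"
                + "<field name=\"features\">No accents here</field>"
                + "<field name=\"features\">second value</field>"
                + "<field name=\"price\">0</field>"
                + "</doc></add>";
        Map<String, List<String>> fields = readFields(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Modify values here, then copy them into a SolrInputDocument
        // via addField(name, value) before sending to Solr.
        fields.forEach((name, values) -> System.out.println(name + " = " + values));
    }
}
```

Reading from an InputStream rather than a String lets the XML parser honour
the encoding declared in each file, which matters for inputs like the
GB18030 example in the original question.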