Posted to solr-user@lucene.apache.org by sc...@asia.com on 2010/06/23 13:59:00 UTC
Import XML files different format?
Hi,
I'm new to Solr. It looks great.
I would like to add an XML document in the following format to Solr:
<?xml version="1.0" encoding="utf-8"?>
<race>
<go>
<id><![CDATA[...]]></id>
<title><![CDATA[...]]></title>
<url><![CDATA[...]]></url>
<content><![CDATA[...]]></content>
<city><![CDATA[...]]></city>
<postcode><![CDATA[...]]></postcode>
<contract><![CDATA[...]]></contract>
<category><![CDATA[...]]></category>
<date><![CDATA[...]]></date>
<time><![CDATA[...]]></time>
</go>
etc...
</race>
Is there a way to do this? If so, how?
Or do I need to convert it with a script to this:
<add>
<doc>
<field name="authors">Patrick Eagar</field>
<field name="subject">Sports</field>
etc...
Thanks for your help
Regards
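
For readers wondering what the script conversion asked about above might look like, here is a minimal sketch in Python using only the standard library. It assumes every child element of <go> maps to a Solr field of the same name, and that the Solr schema defines those fields. The function name race_to_solr_add and the sample values are hypothetical, not part of the original post.

```python
import xml.etree.ElementTree as ET


def race_to_solr_add(race_xml: str) -> str:
    """Convert <race><go>...</go></race> into Solr's <add><doc>...</doc></add> form."""
    race = ET.fromstring(race_xml)
    add = ET.Element("add")
    for go in race.findall("go"):
        doc = ET.SubElement(add, "doc")
        for child in go:
            # Assumption: each child tag (id, title, ...) is also the Solr field name.
            # The parser has already unwrapped any CDATA section into child.text.
            field = ET.SubElement(doc, "field", name=child.tag)
            field.text = child.text or ""
    return ET.tostring(add, encoding="unicode")


# Hypothetical sample data in the shape described in the post.
sample = """<race>
  <go>
    <id><![CDATA[1]]></id>
    <title><![CDATA[Example title]]></title>
    <city><![CDATA[Example city]]></city>
  </go>
</race>"""

solr_xml = race_to_solr_add(sample)
print(solr_xml)
```

For a file on disk, ET.parse(path).getroot() could replace ET.fromstring. The resulting string is in the <add><doc> shape that Solr's XML update handler accepts, provided the field names match the schema.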
Re: Import XML files different format?
Posted by sc...@asia.com.
Thanks Erik for your answer.
I'll try using DIH via data-config.xml, as I might index other content with a different XML structure in the future.
Will I need a different data-config for each XML structure? And then manually switch between them?
-----Original Message-----
From: Erik Hatcher <er...@gmail.com>
To: solr-user@lucene.apache.org
Sent: Wed, Jun 23, 2010 2:19 pm
Subject: Re: Import XML files different format?
You can use DataImportHandler's XML/XPath capabilities to do this:
<http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource>
or you could, of course, convert your XML to Solr's XML format.
Another fine option, given what this data looks like, is CSV format.
I'd imagine you have the original data in a relational database though?
Erik
Re: Import XML files different format?
Posted by Erik Hatcher <er...@gmail.com>.
You can use DataImportHandler's XML/XPath capabilities to do this:
<http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource>
or you could, of course, convert your XML to Solr's XML format.
Another fine option, given what this data looks like, is CSV format.
I'd imagine you have the original data in a relational database though?
Erik
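
The CSV option mentioned above could be sketched roughly as follows, again in Python with the standard library only. The column list is taken from the element names in the original post. The function name race_to_csv and the sample values are hypothetical, and missing elements are assumed to become empty columns.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Column names taken from the <go> children shown in the original post.
FIELDS = ["id", "title", "url", "content", "city",
          "postcode", "contract", "category", "date", "time"]


def race_to_csv(race_xml: str) -> str:
    """Flatten each <go> element into one CSV row, with a header line."""
    race = ET.fromstring(race_xml)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for go in race.findall("go"):
        # Assumption: a missing child element is written as an empty column.
        writer.writerow({name: go.findtext(name, default="") for name in FIELDS})
    return out.getvalue()


# Hypothetical sample; the comma in the title shows that quoting is handled.
sample = """<race>
  <go>
    <id><![CDATA[1]]></id>
    <title><![CDATA[Hello, world]]></title>
    <city><![CDATA[Example city]]></city>
  </go>
</race>"""

csv_text = race_to_csv(sample)
print(csv_text)
```

The header line names the fields, which is the form a CSV loader such as Solr's CSV update handler would typically expect. Whether the field names match the schema is something to check on the Solr side.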