Posted to solr-user@lucene.apache.org by Nathan Adams <na...@umich.edu> on 2009/01/29 00:34:19 UTC
DIH handling of missing files
I am constructing documents from a JDBC datasource and an HTTP datasource
(see the data-config file below). My problem is that I cannot know whether a
particular HTTP URL will be available at index time, so I need DIH to
continue processing even if the HTTP location returns a 404.
onError="continue" does not appear to help in this case. Should it?
<dataConfig>
  <dataSource type="JdbcDataSource" name="db"
              driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@?????"
              user="???" password="???"/>
  <dataSource type="HttpDataSource" name="http"/>
  <document name="resources">
    <entity name="metadata" dataSource="db" pk="RESOURCEID"
            query="select * from ????" onError="continue">
      <entity name="xmltext"
              url="http://???.com/${metadata.RESOURCEID}.xml" forEach="/content"
              dataSource="http" processor="XPathEntityProcessor" onError="continue">
        <field column="FULLTEXT" xpath="/content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
Thanks,
Nathan
RE: DIH handling of missing files
Posted by Nathan Adams <na...@umich.edu>.
That example appears to be Solr 1.3, which explains the problem. Thanks!
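For reference, here is a minimal sketch of the inner entity from the original post with the possible onError values spelled out. It assumes Solr 1.4 or later, since onError is a 1.4 feature. The entity, field, and data source names are copied from the config above; the comments summarize the behavior of each value as I understand it from the DIH wiki, and none of this has been run against a live index.

```xml
<!-- Sketch only: assumes Solr 1.4 or later, where DIH entities accept onError. -->
<entity name="xmltext"
        url="http://???.com/${metadata.RESOURCEID}.xml"
        forEach="/content"
        dataSource="http"
        processor="XPathEntityProcessor"
        onError="continue">
  <!-- onError="abort"    : the default; the import stops on the first error. -->
  <!-- onError="skip"     : the current document is dropped and the import moves on. -->
  <!-- onError="continue" : the import carries on as if the error had not happened,
       so the intent is that the parent "metadata" row is still indexed,
       just without a FULLTEXT value. -->
  <field column="FULLTEXT" xpath="/content"/>
</entity>
```

With "skip" a missing XML file would presumably drop the whole document, so "continue" looks like the better fit when the JDBC metadata should be indexed regardless of whether the HTTP fetch succeeds.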
________________________________
From: Nathan Adams [mailto:natad@umich.edu]
Sent: Thu 01/29/2009 8:28 AM
To: solr-user@lucene.apache.org
Subject: RE: DIH handling of missing files
I'm running the example from the DIH wiki page:
http://wiki.apache.org/solr-data/attachments/DataImportHandler/attachments/example-solr-home.jar
-Nathan
________________________________
From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.paul@gmail.com]
Sent: Wed 01/28/2009 11:32 PM
To: solr-user@lucene.apache.org
Subject: Re: DIH handling of missing files
onError="continue" should help.
Which version of DIH are you using? onError is a Solr 1.4 feature.
--Noble
RE: DIH handling of missing files
Posted by Nathan Adams <na...@umich.edu>.
I'm running the example from the DIH wiki page:
http://wiki.apache.org/solr-data/attachments/DataImportHandler/attachments/example-solr-home.jar
-Nathan
________________________________
From: Noble Paul നോബിള് नोब्ळ् [mailto:noble.paul@gmail.com]
Sent: Wed 01/28/2009 11:32 PM
To: solr-user@lucene.apache.org
Subject: Re: DIH handling of missing files
onError="continue" should help.
Which version of DIH are you using? onError is a Solr 1.4 feature.
--Noble
Re: DIH handling of missing files
Posted by Noble Paul നോബിള് नोब्ळ् <no...@gmail.com>.
onError="continue" should help.
Which version of DIH are you using? onError is a Solr 1.4 feature.
--Noble
--
--Noble Paul