Posted to solr-user@lucene.apache.org by Sanjeet Kumar <sa...@gmail.com> on 2018/01/28 05:23:27 UTC
Facing issue while writing more than one DIH for a core.
Hi All,
Below are the DIH configurations for the data import handlers of a core.
<dataConfig>
<!-- For DIH-1: -->
<dataSource type="URLDataSource"/>
<entity
name="feed_import"
url="https://stackoverflow.com/feeds/tag/solr"
processor="XPathEntityProcessor"
dataSource="URLDataSource"
forEach="/feed|/feed/entry"
transformer="HTMLStripTransformer,RegexTransformer">
<!-- want to set static value "Feed" of all documents for this DIH -->
<field name="dih_type" value="Feed"/>
<!-- Keep only the final numeric part of the URL -->
<field name="id" column="id" xpath="/feed/entry/id" regex=".*/" replaceWith=""
prefix="feed."/>
<field name="title" column="title" xpath="/feed/entry/title"/>
<!-- Use transformers to convert HTML into plain text.
There is also an UpdateRequestProcessor to trim remaining spaces.
-->
<field name="body" column="body" xpath="/feed/entry/summary" stripHTML="true"
regex="( |\n)+" replaceWith=" "/>
<field name="source" column="source" xpath="/feed/entry/link[@rel='alternate']/@href"/>
<field name="created_on" column="created_on" xpath="/feed/entry/published"/>
<field name="updated_on" column="updated_on" xpath="/feed/entry/updated"/>
</entity>
<!-- For DIH-2: -->
<entity
name="solr_import"
processor="SolrEntityProcessor"
url="http://127.0.0.1:9983/solr/briefs2"
query="*:*"
fl="id,title,lead,d_company,d_industry,d_location,d_created_on,d_updated_on">
<!-- want to set static value "Solr" of all documents for this DIH -->
<field name="dih_type" value="Solr"/>
<field name="id" column="id" prefix="solr_" />
<field name="title" column="title" />
<field name="body" column="body" />
<field name="d_company" column="d_company" />
<field name="d_industry" column="d_industry" />
<field name="d_location" column="d_location" />
<field name="created_on" column="updated_on" />
<field name="updated_on" column="updated_on" />
</entity>
</dataConfig>
*The problems I am facing are as follows:*
1. *I am not able to set a field without a column attribute:*
   <field name="dih_type" value="Feed"/>
   <field name="dih_type" value="Solr"/>
   *Is there any other way to do this?*
2. *How can I set authentication details for both data import handlers?*
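For point 1, one option that may apply is DIH's TemplateTransformer, which fills a column from a `template` attribute instead of reading it from the source data. A minimal sketch, assuming the `dih_type` field exists in the schema and reusing the feed entity above (only the transformer list and the `dih_type` field differ from the original; the other fields are omitted for brevity):

```xml
<entity
    name="feed_import"
    url="https://stackoverflow.com/feeds/tag/solr"
    processor="XPathEntityProcessor"
    dataSource="URLDataSource"
    forEach="/feed|/feed/entry"
    transformer="TemplateTransformer,HTMLStripTransformer,RegexTransformer">
  <!-- Assumption: TemplateTransformer writes the literal template text
       into the dih_type column for every row this entity produces. -->
  <field column="dih_type" template="Feed"/>
  <field name="title" column="title" xpath="/feed/entry/title"/>
</entity>
```

The same pattern with `template="Solr"` would presumably work for the `solr_import` entity, provided `TemplateTransformer` is added to a `transformer` attribute on that entity as well.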
Regards,
Sanjeet.