Posted to solr-user@lucene.apache.org by Sanjeet Kumar <sa...@gmail.com> on 2018/01/28 05:23:27 UTC

Facing an issue while writing more than one DIH for a core.

Hi All,

Below are the DIH configurations for the data import handlers of a core.

<dataConfig>


*For DIH-1:*

<dataSource type="URLDataSource"/>

<entity
  name="feed_import"
  url="https://stackoverflow.com/feeds/tag/solr"
  processor="XPathEntityProcessor"
  dataSource="URLDataSource"
  forEach="/feed|/feed/entry"
  transformer="HTMLStripTransformer,RegexTransformer">

  <!-- want to set static value "Feed" of all documents for this DIH -->

  <field name="dih_type" value="Feed"/>

<!-- Keep only the final numeric part of the URL -->
<field name="id" column="id" xpath="/feed/entry/id" regex=".*/" replaceWith=""
prefix="feed."/>

<field name="title" column="title" xpath="/feed/entry/title"/>

<!-- Use transformers to convert HTML into plain text.
There is also an UpdateRequestProcessor to trim remaining spaces.
-->
<field name="body" column="body" xpath="/feed/entry/summary" stripHTML="true"
regex="( |\n)+" replaceWith=" "/>

<field name="source" column="source" xpath="/feed/entry/link[@rel='alternate']/@href"/>

<field name="created_on" column="created_on" xpath="/feed/entry/published"/>
<field name="updated_on" column="updated_on" xpath="/feed/entry/updated"/>
</entity>

*For DIH-2:*

<entity
  name="solr_import"
  processor="SolrEntityProcessor"
  url="http://127.0.0.1:9983/solr/briefs2"
  query="*:*"
  fl="id,title,lead,d_company,d_industry,d_location,d_created_on,d_updated_on">

<!-- want to set static value "Solr" of all documents for this DIH -->

  <field name="dih_type" value="Solr"/>


<field name="id" column="id" prefix="solr_" />
<field name="title" column="title" />
<field name="body" column="body" />

<field name="d_company" column="d_company" />
<field name="d_industry" column="d_industry" />
<field name="d_location" column="d_location" />

<field name="created_on" column="updated_on" />
<field name="updated_on" column="updated_on" />
</entity>

</dataConfig>


*The problems I am facing are as follows:*

1. *I am not able to set a field without a column attribute:*

       <field name="dih_type" value="Feed"/>
       <field name="dih_type" value="Solr"/>

   *Is there any other way to do this?*

2. *How can I set authentication details for both data import handlers?*
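
For the first question, one possible approach is DIH's TemplateTransformer, which fills a column from a `template` attribute instead of reading it from the source. The fragment below is only a sketch and has not been tested against this setup. It assumes TemplateTransformer is added to the entity's transformer list and that the schema defines a `dih_type` field. The same pattern with template="Solr" would apply to the solr_import entity.

```xml
<!-- Sketch only: assumes TemplateTransformer is listed on the entity
     and that the schema defines a dih_type field. -->
<entity
  name="feed_import"
  url="https://stackoverflow.com/feeds/tag/solr"
  processor="XPathEntityProcessor"
  dataSource="URLDataSource"
  forEach="/feed|/feed/entry"
  transformer="TemplateTransformer,HTMLStripTransformer,RegexTransformer">

  <!-- A template without any ${...} placeholder yields the same constant for every row -->
  <field column="dih_type" template="Feed"/>

  <!-- remaining field definitions unchanged -->
</entity>
```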


Regards,

Sanjeet.