Posted to solr-user@lucene.apache.org by Pranav Prakash <pr...@gmail.com> on 2012/06/12 21:47:52 UTC

Unexpected DIH behavior for onError attribute

It seems that with onError=skip set, the DIH does not proceed to the next
records in the db; only the entries that precede the error-causing record
are updated/added.

My db has 70K records, of which record #17188 is illegal. When I set
onError=abort, the entire DIH operation was rolled back and nothing was
added/updated. With onError=skip, only 17187 records were
added/updated. With onError=continue, again only 17187 records were
added/updated.

Am I missing something, or is this expected behavior?

*Pranav Prakash*

"temet nosce"

Re: Unexpected DIH behavior for onError attribute

Posted by Jack Krupansky <ja...@basetechnology.com>.
Make sure you have the onError=skip on the proper entity.

-- Jack Krupansky

Re: Unexpected DIH behavior for onError attribute

Posted by Gora Mohanty <go...@mimirtech.com>.
On 13 June 2012 10:45, Pranav Prakash <pr...@gmail.com> wrote:
> My DIH config file is as follows. We have two db hosts, one of which
> contains blocks of content and the other contains transcripts of those
> content blocks. The makeDynamicTranscript function is used to create column
> names like transcript_en, transcript_es and so on, which are dynamic fields
> in Solr with appropriate tokenizers.
[...]

This looks fine. Have you looked in the Solr logs
for more information? Is it possible that the error
is causing some connection issue? What is the
error exactly, and is it happening on the SELECT
in the inner entity, or on the outer one?

Regards,
Gora
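
One way to answer the inner-versus-outer question is to import only the
suspect row with a temporary copy of the configuration. The fragment below
is a sketch under stated assumptions, not something posted in this thread:
it assumes content_blocks has an id column to order by (the inner query's
${document.id} suggests it does), that "record #17188" means the 17188th
row in that order, and that MySQL's LIMIT ... OFFSET syntax is available.
The dataSource and script definitions are assumed to stay exactly as in the
configuration posted in this thread.

```xml
<!-- Hypothetical debugging variant of the outer entity: fetch only the
     suspect row so any stack trace in the Solr log is easy to attribute. -->
<entity name="document"
  dataSource="metadata"
  onError="skip"
  transformer="RegexTransformer"
  query="SELECT * FROM content_blocks ORDER BY id LIMIT 1 OFFSET 17187">

  <!-- Field mappings unchanged from the posted config. -->

  <!-- Nested entity unchanged. Commenting it out for one run shows
       whether the error still occurs without the inner SELECT. -->
  <entity name="transcript"
    dataSource="transcript"
    transformer="script:makeDynamicTranscript"
    onError="skip"
    query="SELECT text, '${document.language}' AS language
           FROM transcripts
           WHERE content_id = '${document.id}'">
  </entity>
</entity>
```

If the single-row import fails with the nested entity present but succeeds
without it, the inner SELECT or its transformer is the likelier source of
the error.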

Re: Unexpected DIH behavior for onError attribute

Posted by Pranav Prakash <pr...@gmail.com>.
My DIH config file is as follows. We have two db hosts, one of which
contains blocks of content and the other contains transcripts of those
content blocks. The makeDynamicTranscript function is used to create column
names like transcript_en, transcript_es and so on, which are dynamic fields
in Solr with appropriate tokenizers.

<dataConfig>

  <script>
    <![CDATA[
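    // Moves the 'text' column into a language-specific dynamic field.
    function makeDynamicTranscript(row){
      var lang = row.get('language');
      if(lang != '**' && lang != '!!'){
        // Known language: e.g. transcript_en, transcript_es
        row.put('transcript_'+lang, row.get('text'));
      }
      else{
        // Placeholder language code: use the generic field
        row.put('transcript', row.get('text'));
      }
      row.remove('text');
      return row;
    }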
    ]]>
  </script>

  <dataSource type="JdbcDataSource"
    driver="com.mysql.jdbc.Driver"
    url="jdbc:mysql://localhost/solr_devel"
    user="username"
    password="password"
    name="metadata" />

  <dataSource type="JdbcDataSource"
    driver="com.mysql.jdbc.Driver"
    url="jdbc:mysql://transcripts/transcripts_db"
    user="username"
    password="password"
    convertType="true"
    name="transcript" />

  <document>

    <entity name="document"
      dataSource="metadata"
      onError="skip"
      transformer="RegexTransformer"
      query="SELECT * FROM content_blocks">

      <!-- explicitly defining only those fields which do
           not match the db column name -->
      <field column="user_tag_ids" name="user_tag_ids"
        splitBy="," sourceColName="user_tag_ids"/>
      <field column="community_tag_ids" name="community_tag_ids" />
      <field column="user_tags" name="user_tags"
        splitBy="\|\|" sourceColName="user_tags"/>
      <field column="community_tags" name="community_tags"
        splitBy="\|\|" sourceColName="community_tags"/>

      <entity name="transcript"
        dataSource="transcript"
        transformer="script:makeDynamicTranscript"
        onError="skip"
        query="SELECT text , '${document.language}' as language
        FROM transcripts
        WHERE content_id = '${document.id}'">
      </entity>

    </entity>
  </document>
</dataConfig>


What I am expecting is that any illegal record would be skipped and the
next records would be imported. This, however, does not happen.
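
The thread does not establish what makes the record illegal. If the failure
turns out to originate in the script transformer, for example because
language or text is missing for some row, a more defensive version of
makeDynamicTranscript might look like the sketch below. The null and
empty-string checks are assumptions about a possible cause, not a confirmed
fix; the column names are those used in the config above.

```xml
<script>
  <![CDATA[
  // Sketch only: guards against missing values before building the
  // dynamic field name. Assumes NULL columns arrive as null in the row map.
  function makeDynamicTranscript(row) {
    var lang = row.get('language');
    var text = row.get('text');
    row.remove('text');
    if (text == null) {
      // Nothing to index for this transcript row.
      return row;
    }
    if (lang == null || lang == '' || lang == '**' || lang == '!!') {
      // Unknown or placeholder language: use the generic field.
      row.put('transcript', text);
    } else {
      row.put('transcript_' + lang, text);
    }
    return row;
  }
  ]]>
</script>
```

Without such a guard, a null language would be concatenated into the field
name by the original function, which may or may not match a dynamic field
in the schema.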

*Pranav Prakash*

"temet nosce"




Re: Unexpected DIH behavior for onError attribute

Posted by Gora Mohanty <go...@mimirtech.com>.
On 13 June 2012 01:17, Pranav Prakash <pr...@gmail.com> wrote:
> It seems that with onError=skip set, the DIH does not proceed to the next
> records in the db; only the entries that precede the error-causing record
> are updated/added.
[...]

Please show us your DIH configuration file,
remembering to sanitise usernames/passwords
used for database access.

Also, you might want to look into the Solr
log files to see if there are any errors
reported there.

Regards,
Gora