Posted to solr-user@lucene.apache.org by Chris Masters <ro...@yahoo.com> on 2009/07/14 13:16:06 UTC

Data Import ID Problem

Hi All,

I have a problem when importing data using the data import handler. I import documents from multiple tables so table.id is not unique - to get round this I concatenate the type like this:

<snip>
    <entity name="mydoc" query="select CONCAT(THING.ID,TYPE) AS INDEX_ID,THING.ID AS THING_ID,TYPE,TITLE,SUMMARY FROM THING">
                <field column="INDEX_ID" name="id" />
                <field column="THING_ID" name="dbid" />
</snip>

When searching, it seems the CONCATted string is turned into some sort of character array(?):

<snip>
<doc>
  <str name="dbid">1</str>
  <str name="id">[B@108759d</str>
</snip>

Everything is OK if I add a document via SolrJ:

<snip>
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", myThing.getId() + TCSearch.SEARCH_TYPE_THING);
doc.addField("dbid", myThing.getId());
</snip>

Obviously this will cause problems, as I remove documents by constructing the ID and using deleteById. Any ideas?

   Thanks, rotis
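
The deleteById concern above comes down to building the same ID string in code that ends up in the index. A minimal sketch of a shared helper, assuming the indexed id is the decimal database ID followed by the type string. The class name, the method name, and the "THING" value are hypothetical; the real value of TCSearch.SEARCH_TYPE_THING is not shown in the thread.

```java
public class IndexIds {

    // Hypothetical type suffix; stands in for TCSearch.SEARCH_TYPE_THING,
    // whose actual value does not appear in the thread.
    static final String TYPE_THING = "THING";

    // Builds the ID as decimal database ID followed by the type,
    // which is what the SQL CONCAT is meant to produce on the import side.
    static String indexId(long dbId, String type) {
        return Long.toString(dbId) + type;
    }

    public static void main(String[] args) {
        // The same string would be used for the "id" field and for deleteById.
        System.out.println(indexId(1L, TYPE_THING)); // prints 1THING
    }
}
```

Using one helper for both adding and deleting keeps the two code paths from drifting apart, but it only helps if the import side produces the same text.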


      

Re: Data Import ID Problem

Posted by Chris Masters <ro...@yahoo.com>.
Sorry - the SolrJ snippet should read:

<snip>
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", myThing.getId() + TCSearch.SEARCH_TYPE_THING);
doc.addField("dbid", myThing.getId());
</snip>



      

Re: Data Import ID Problem

Posted by Chris Masters <ro...@yahoo.com>.
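I'm using MySQL with com.mysql.jdbc.Driver (mysql-connector-java-5.1.7.jar).

MySQL CONCAT is documented here: http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_concat

According to that page, a numeric argument to CONCAT is converted to its binary string form, so the result came back as a byte[]. The fix is to CAST the ID to CHAR first, like this:

SELECT CONCAT(CAST(THING.ID AS CHAR),TYPE) AS INDEX_ID...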

Thanks for the nudge 'Noble Paul'!
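
For anyone wondering where a value like [B@108759d comes from: it is what Java prints when toString() is called on a byte[] (the type tag "[B" plus an identity hash) instead of decoding the bytes. The sketch below reproduces that outside Solr. It is presumably what happened to the id field during import, though the exact DIH code path is not shown in this thread. The class and method names are made up for illustration, and UTF-8 is an assumed encoding.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayIdDemo {

    // Treats any value as text via toString(), which for a byte[]
    // yields "[B@<hash>" rather than the characters it holds.
    static String naiveToString(Object value) {
        return String.valueOf(value);
    }

    // Decodes byte[] values explicitly (UTF-8 is an assumption here);
    // everything else falls back to toString().
    static String decode(Object value) {
        if (value instanceof byte[]) {
            return new String((byte[]) value, StandardCharsets.UTF_8);
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        // Stand-in for what a JDBC driver may return for a binary string column.
        byte[] fromDriver = "1THING".getBytes(StandardCharsets.UTF_8);
        System.out.println(naiveToString(fromDriver)); // something like [B@108759d, hash varies
        System.out.println(decode(fromDriver));        // prints 1THING
    }
}
```

The CAST in the query avoids the problem at the source, so no decoding step is needed on the Java side.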



----- Original Message ----
From: Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>
To: solr-user@lucene.apache.org
Sent: Tuesday, July 14, 2009 3:53:44 PM
Subject: Re: Data Import ID Problem

DIH is getting the field as a byte[]? Which DB and which driver
are you using?




-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com



      

Re: Data Import ID Problem

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>.
DIH is getting the field as a byte[]? Which DB and which driver
are you using?




-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com