Posted to solr-user@lucene.apache.org by Chris Masters <ro...@yahoo.com> on 2009/07/14 13:16:06 UTC
Data Import ID Problem
Hi All,
I have a problem when importing data using the data import handler. I import documents from multiple tables so table.id is not unique - to get round this I concatenate the type like this:
<snip>
<entity name="mydoc" query="select CONCAT(THING.ID,TYPE) AS INDEX_ID,THING.ID AS THING_ID,TYPE,TITLE,SUMMARY FROM THING">
<field column="INDEX_ID" name="id" />
<field column="THING_ID" name="dbid" />
</snip>
When searching, it seems the CONCATted string is turned into some sort of character array(?):
<snip>
<doc>
<str name="dbid">1</str>
<str name="id">[B@108759d</str>
</snip>
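For readers wondering about the `[B@108759d` value: it has the shape of Java's default `toString()` output for a `byte[]` (`[B` is the JVM's name for a byte array, followed by a hash code). The sketch below only illustrates that behaviour in plain Java; the class and method names are hypothetical, and it makes no claim about how the data import handler converts values internally.

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayIdDemo {
    // Stringifies a value the naive way. For a byte[] this yields the
    // default array representation ("[B@" plus a hash), not its contents.
    static String naive(Object columnValue) {
        return String.valueOf(columnValue);
    }

    // Decodes byte[] values as UTF-8 text; anything else falls back to toString().
    // UTF-8 is an assumption here; the real column encoding may differ.
    static String decoded(Object columnValue) {
        if (columnValue instanceof byte[]) {
            return new String((byte[]) columnValue, StandardCharsets.UTF_8);
        }
        return String.valueOf(columnValue);
    }

    public static void main(String[] args) {
        // Stand-in for a concatenated ID that arrives as raw bytes.
        byte[] fromDriver = "1THING".getBytes(StandardCharsets.UTF_8);
        System.out.println(naive(fromDriver));   // prints the array's default toString
        System.out.println(decoded(fromDriver)); // prints the decoded text
    }
}
```

If the stored id looks like the first line of output, the value most likely reached the indexer as bytes rather than as a string.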
Everything is OK if I add a document via SolrJ:
<snip>
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", myThing.getId() + TCSearch.SEARCH_TYPE_THING);
doc.addField("dbid", myThing.getId());
</snip>
Obviously this will cause problems, as I remove documents by constructing the ID and using deleteById. Any ideas?
Thanks, rotis
Re: Data Import ID Problem
Posted by Chris Masters <ro...@yahoo.com>.
Sorry - the SolrJ snippet should read:
<snip>
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", myThing.getId() + TCSearch.SEARCH_TYPE_THING);
doc.addField("dbid", myThing.getId());
</snip>
Re: Data Import ID Problem
Posted by Chris Masters <ro...@yahoo.com>.
MySQL -> com.mysql.jdbc.Driver (mysql-connector-java-5.1.7.jar).
mysql concat -> http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_concat
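Per the MySQL docs linked above, CONCAT converts a numeric argument to its binary string form, so the result comes back as a byte[]. The fix is to CAST the ID to CHAR first, like: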
SELECT CONCAT(CAST(THING.ID AS CHAR),TYPE) AS INDEX_ID...
Thanks for the nudge 'Noble Paul'!
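With the CAST in place, the indexed id should be the decimal ID followed directly by the type, so the delete side can rebuild the same string. A minimal sketch of such a helper follows; `IndexIds`, `indexId`, and the `"THING"` value are hypothetical (the real value of `TCSearch.SEARCH_TYPE_THING` is not shown in this thread).

```java
public class IndexIds {
    // Hypothetical stand-in for TCSearch.SEARCH_TYPE_THING; the actual value is an assumption.
    static final String SEARCH_TYPE_THING = "THING";

    // Mirrors CONCAT(CAST(THING.ID AS CHAR), TYPE): the ID's decimal digits, then the type.
    static String indexId(long dbId, String type) {
        return Long.toString(dbId) + type;
    }

    public static void main(String[] args) {
        // Build the same id the import query would produce for row 1.
        System.out.println(indexId(1L, SEARCH_TYPE_THING));
    }
}
```

The resulting string is what would be handed to SolrJ's `deleteById(String)`. Keeping the construction in one place helps the import query and the delete path stay in agreement.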
----- Original Message ----
From: Noble Paul നോബിള് नोब्ळ् <no...@corp.aol.com>
To: solr-user@lucene.apache.org
Sent: Tuesday, July 14, 2009 3:53:44 PM
Subject: Re: Data Import ID Problem
Is DIH getting the field as a byte[]? Which DB and which driver
are you using?
Re: Data Import ID Problem
Posted by Noble Paul നോബിള് नोब्ळ् <no...@corp.aol.com>.
Is DIH getting the field as a byte[]? Which DB and which driver
are you using?
--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com