You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by geeky2 <ge...@hotmail.com> on 2012/01/13 20:16:41 UTC

linking query in DIH fails with sql syntax error when specific fields contain bad data

hello all,


some of my records contain bad data i the orb_itm_id column.

example:

select * from prtxtps_prt_summ where orb_itm_id like '''%';

prd_gro_id         spp_id  orb_itm_id         ds_tx     rnk_no
0022       335         ' LONG. (TERMINAL ATTACH       )' LONG. (TERMINAL
ATTACH)   0
0042       596         ', FAN MOTOR CAPACITOR S       TRAP     0


this is causing the indexing process to fail on the bad records - with a sql
syntax error


is there a way i can trap for this and cleans the "'" before the sql is
constructed?

mark

<dataConfig>
    <dataSource driver="com.informix.jdbc.IfxDriver"
url="jdbc:informix-sqli://hawaii.sears.com:1525/lis0d000d:informixserver=ifmx00289_tcp"
user="pdrp" password="parts234" batchSize="10"/>
    <document>
        <entity transformer="TemplateTransformer" name="core1-parts"
query="select *, 1 as item_type from prtxtps_prt_summ">
            <field column ="id" name="id"
template="${core1-parts.prd_gro_id},${core1-parts.spp_id},${core1-parts.orb_itm_id}"/>
            <field column="orb_itm_id"     name="itemNo"/>
            <field column="prd_gro_id"     name="groupId"/>
            <field column="ds_tx"          name="itemDesc"/>
            <field column="spp_id"         name="supplierId"/>
            <field column="rnk_no"         name="rankNo"/>
            <field column="item_type"      name="itemType"/>
            <entity name="partAttributes" query="SELECT pa.uom_hi,
pa.att_val_hi, pa.uom_low, pa.att_val_low, a.att_nm FROM prtxtpa_att_val pa,
prtxtat_att a WHERE pa.att_id = a.att_id and pa.orb_itm_id =
'${core1-parts.orb_itm_id}' and pa.spp_id = '${core1-parts.spp_id}' and
pa.prd_gro_id = '${core1-parts.prd_gro_id}' and pa.att_val_hi is not NULL">
                <field column="uom_hi" name="partAttrib_uomHigh"/>
                <field column="att_val_hi" name="partAttrib_highValue"/>
                <field column="uom_low" name="partAttrib_uomLow"/>
                <field column="att_val_low" name="partAttrib_lowValue"/>
                <field column="att_nm" name="partAttrib_attributeName"/>
            </entity>
        </entity>
   </document>
</dataConfig>





2012-01-13 12:27:38,912 SEVERE
[org.apache.solr.handler.dataimport.DataImporter] (Thread-27) Full Import
failed:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
to execute query: SELECT pa.uom_hi, pa.att_val_hi, pa.uom_low,
pa.att_val_low, a.att_nm FROM prtxtpa_att_val pa, prtxtat_att a WHERE
pa.att_id = a.att_id and pa.orb_itm_id = '' LONG. (TERMINAL ATTACH' and
pa.spp_id = '335' and pa.prd_gro_id = '0022' and pa.att_val_hi is not NULL
Processing Document # 119
    at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
    at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
    at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
    at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
    at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
    at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
    at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
    at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
    at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:617)
    at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
    at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
    at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
    at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
    at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
Caused by: java.sql.SQLException: A syntax error has occurred.







--
View this message in context: http://lucene.472066.n3.nabble.com/linking-query-in-DIH-fails-with-sql-syntax-error-when-specific-fields-contain-bad-data-tp3657482p3657482.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: linking query in DIH fails with sql syntax error when specific fields contain bad data

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
One more idea. If your data fits into heap you can join data by DIH.
see Example 2 in
http://wiki.apache.org/solr/DataImportHandler#CachedSqlEntityProcessor

but be aware that  where="xid=x.id" were broken in trunk around r1201784.
see
https://issues.apache.org/jira/browse/SOLR-2933
but it works in 3.x.

Regards

On Sat, Jan 14, 2012 at 12:45 AM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> Hello,
>
> I'm afraid you can only vote
> https://issues.apache.org/jira/browse/SOLR-1262
>
> Regards
>
>
> On Fri, Jan 13, 2012 at 11:16 PM, geeky2 <ge...@hotmail.com> wrote:
>
>>
>> hello all,
>>
>>
>> some of my records contain bad data i the orb_itm_id column.
>>
>> example:
>>
>> select * from prtxtps_prt_summ where orb_itm_id like '''%';
>>
>> prd_gro_id         spp_id  orb_itm_id         ds_tx     rnk_no
>> 0022       335         ' LONG. (TERMINAL ATTACH       )' LONG. (TERMINAL
>> ATTACH)   0
>> 0042       596         ', FAN MOTOR CAPACITOR S       TRAP     0
>>
>>
>> this is causing the indexing process to fail on the bad records - with a
>> sql
>> syntax error
>>
>>
>> is there a way i can trap for this and cleans the "'" before the sql is
>> constructed?
>>
>> mark
>>
>> <dataConfig>
>>    <dataSource driver="com.informix.jdbc.IfxDriver"
>> url="jdbc:informix-sqli://
>> hawaii.sears.com:1525/lis0d000d:informixserver=ifmx00289_tcp"
>> user="pdrp" password="parts234" batchSize="10"/>
>>    <document>
>>        <entity transformer="TemplateTransformer" name="core1-parts"
>> query="select *, 1 as item_type from prtxtps_prt_summ">
>>            <field column ="id" name="id"
>>
>> template="${core1-parts.prd_gro_id},${core1-parts.spp_id},${core1-parts.orb_itm_id}"/>
>>            <field column="orb_itm_id"     name="itemNo"/>
>>            <field column="prd_gro_id"     name="groupId"/>
>>            <field column="ds_tx"          name="itemDesc"/>
>>            <field column="spp_id"         name="supplierId"/>
>>            <field column="rnk_no"         name="rankNo"/>
>>            <field column="item_type"      name="itemType"/>
>>            <entity name="partAttributes" query="SELECT pa.uom_hi,
>> pa.att_val_hi, pa.uom_low, pa.att_val_low, a.att_nm FROM prtxtpa_att_val
>> pa,
>> prtxtat_att a WHERE pa.att_id = a.att_id and pa.orb_itm_id =
>> '${core1-parts.orb_itm_id}' and pa.spp_id = '${core1-parts.spp_id}' and
>> pa.prd_gro_id = '${core1-parts.prd_gro_id}' and pa.att_val_hi is not
>> NULL">
>>                <field column="uom_hi" name="partAttrib_uomHigh"/>
>>                <field column="att_val_hi" name="partAttrib_highValue"/>
>>                <field column="uom_low" name="partAttrib_uomLow"/>
>>                <field column="att_val_low" name="partAttrib_lowValue"/>
>>                <field column="att_nm" name="partAttrib_attributeName"/>
>>            </entity>
>>        </entity>
>>   </document>
>> </dataConfig>
>>
>>
>>
>>
>>
>> 2012-01-13 12:27:38,912 SEVERE
>> [org.apache.solr.handler.dataimport.DataImporter] (Thread-27) Full Import
>> failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
>> Unable
>> to execute query: SELECT pa.uom_hi, pa.att_val_hi, pa.uom_low,
>> pa.att_val_low, a.att_nm FROM prtxtpa_att_val pa, prtxtat_att a WHERE
>> pa.att_id = a.att_id and pa.orb_itm_id = '' LONG. (TERMINAL ATTACH' and
>> pa.spp_id = '335' and pa.prd_gro_id = '0022' and pa.att_val_hi is not NULL
>> Processing Document # 119
>>    at
>>
>> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>>    at
>>
>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
>>    at
>>
>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
>>    at
>>
>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
>>    at
>>
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>>    at
>>
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>>    at
>>
>> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
>>    at
>>
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
>>    at
>>
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:617)
>>    at
>>
>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
>>    at
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
>>    at
>>
>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
>>    at
>>
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
>>    at
>>
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
>> Caused by: java.sql.SQLException: A syntax error has occurred.
>>
>>
>>
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/linking-query-in-DIH-fails-with-sql-syntax-error-when-specific-fields-contain-bad-data-tp3657482p3657482.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Lucid Certified
> Apache Lucene/Solr Developer
> Grid Dynamics
>
> <http://www.griddynamics.com>
>  <mk...@griddynamics.com>
>
>


-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>

Re: linking query in DIH fails with sql syntax error when specific fields contain bad data

Posted by Mikhail Khludnev <mk...@griddynamics.com>.
Hello,

I'm afraid you can only vote https://issues.apache.org/jira/browse/SOLR-1262

Regards

On Fri, Jan 13, 2012 at 11:16 PM, geeky2 <ge...@hotmail.com> wrote:

>
> hello all,
>
>
> some of my records contain bad data i the orb_itm_id column.
>
> example:
>
> select * from prtxtps_prt_summ where orb_itm_id like '''%';
>
> prd_gro_id         spp_id  orb_itm_id         ds_tx     rnk_no
> 0022       335         ' LONG. (TERMINAL ATTACH       )' LONG. (TERMINAL
> ATTACH)   0
> 0042       596         ', FAN MOTOR CAPACITOR S       TRAP     0
>
>
> this is causing the indexing process to fail on the bad records - with a
> sql
> syntax error
>
>
> is there a way i can trap for this and cleans the "'" before the sql is
> constructed?
>
> mark
>
> <dataConfig>
>    <dataSource driver="com.informix.jdbc.IfxDriver"
> url="jdbc:informix-sqli://
> hawaii.sears.com:1525/lis0d000d:informixserver=ifmx00289_tcp"
> user="pdrp" password="parts234" batchSize="10"/>
>    <document>
>        <entity transformer="TemplateTransformer" name="core1-parts"
> query="select *, 1 as item_type from prtxtps_prt_summ">
>            <field column ="id" name="id"
>
> template="${core1-parts.prd_gro_id},${core1-parts.spp_id},${core1-parts.orb_itm_id}"/>
>            <field column="orb_itm_id"     name="itemNo"/>
>            <field column="prd_gro_id"     name="groupId"/>
>            <field column="ds_tx"          name="itemDesc"/>
>            <field column="spp_id"         name="supplierId"/>
>            <field column="rnk_no"         name="rankNo"/>
>            <field column="item_type"      name="itemType"/>
>            <entity name="partAttributes" query="SELECT pa.uom_hi,
> pa.att_val_hi, pa.uom_low, pa.att_val_low, a.att_nm FROM prtxtpa_att_val
> pa,
> prtxtat_att a WHERE pa.att_id = a.att_id and pa.orb_itm_id =
> '${core1-parts.orb_itm_id}' and pa.spp_id = '${core1-parts.spp_id}' and
> pa.prd_gro_id = '${core1-parts.prd_gro_id}' and pa.att_val_hi is not NULL">
>                <field column="uom_hi" name="partAttrib_uomHigh"/>
>                <field column="att_val_hi" name="partAttrib_highValue"/>
>                <field column="uom_low" name="partAttrib_uomLow"/>
>                <field column="att_val_low" name="partAttrib_lowValue"/>
>                <field column="att_nm" name="partAttrib_attributeName"/>
>            </entity>
>        </entity>
>   </document>
> </dataConfig>
>
>
>
>
>
> 2012-01-13 12:27:38,912 SEVERE
> [org.apache.solr.handler.dataimport.DataImporter] (Thread-27) Full Import
> failed:org.apache.solr.handler.dataimport.DataImportHandlerException:
> Unable
> to execute query: SELECT pa.uom_hi, pa.att_val_hi, pa.uom_low,
> pa.att_val_low, a.att_nm FROM prtxtpa_att_val pa, prtxtat_att a WHERE
> pa.att_id = a.att_id and pa.orb_itm_id = '' LONG. (TERMINAL ATTACH' and
> pa.spp_id = '335' and pa.prd_gro_id = '0022' and pa.att_val_hi is not NULL
> Processing Document # 119
>    at
>
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
>    at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
>    at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
>    at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
>    at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>    at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>    at
>
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:238)
>    at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:591)
>    at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:617)
>    at
>
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:267)
>    at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:186)
>    at
>
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:359)
>    at
>
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:427)
>    at
>
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:408)
> Caused by: java.sql.SQLException: A syntax error has occurred.
>
>
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/linking-query-in-DIH-fails-with-sql-syntax-error-when-specific-fields-contain-bad-data-tp3657482p3657482.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

<http://www.griddynamics.com>
 <mk...@griddynamics.com>