You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Rafael Taboada <ka...@gmail.com> on 2012/06/04 18:13:10 UTC

Can't index sub-entitties in DIH

Hi folks,

I'm using DIH in order to index my Oracle database but I'm having some
issues.

Right now, I can't index fields from sub-entities. I have in my
db-data-config.xml:

<dataConfig>
   <dataSource type="JdbcDataSource" name="jdbc"
driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:@localhost:1521:mydata"
user="oracle" password="oracle" />
   <document>
      <entity name="documento" query="SELECT
iddocumento,nrodocumento,tipodocumento FROM documento">
         <field column="iddocumento" name="iddocumento" />
         <field column="nrodocumento" name="nrodocumento" />
         <field column="tipodocumento" name="tipodocumento" />
         <entity name="tipodocumento" query="SELECT nombre FROM
tipodocumento where idtipodocumento = '${documento.tipodocumento}'">
            <field column="nombre" name="nombre" />
         </entity>
      </entity>
   </document>
</dataConfig>

My schema.xml is:

<field name="iddocumento" type="string" indexed="true" stored="true"
required="true" />
      <field name="nrodocumento" type="string" indexed="true" stored="true"
/>
      <field name="tipodocumento" type="string" indexed="true"
stored="true" />
      <field name="nombre" type="string" indexed="true" stored="true" />

I can't index NOMBRE field. Is this because it belongs to a sub-entity?

Thanks for your help

-- 
Rafael Taboada

Re: Can't index sub-entitties in DIH

Posted by Rahul Warawdekar <ra...@gmail.com>.
Hi,

One of the possibilities for this kind of issue to occur may be the case
sensitivity of column names in Oracle.
Can you apply a transformer and check the entity map which actually
contains the keys and their values ?
Also, please try specifying upper case field names for Oracle and try if
that works.
something like

        <entity name="tipodocumento" query="SELECT *NOMBRE* FROM
 tipodocumento where *IDTIPODOCUMENTO* = '${documento.*TIPODOCUMENTO*}'">
            <field column="*NOMBRE*" name="nombre" />
         </entity>

On Tue, Jun 5, 2012 at 9:57 AM, Rafael Taboada <ka...@gmail.com>wrote:

> Hi Gora,
>
>
> > Your configuration files look fine. It would seem that something
> > is going wrong with the SELECT in Oracle, or with the JDBC
> > driver used to access Oracle. Could you try:
>
> * Manually doing the SELECT for the entity, and sub-entity
> >  to ensure that things are working.
> >
>
> The SELECTs are working OK.
>
>
>
> > * Check the JDBC settings.
> >
>
> I'm using tha last version of jdbc6.jar for Oracle 11g. It seems JDBC
> setting is OK because solr brings data.
>
>
>
> > Sorry, I do not have access to Oracle so that I cannot try this
> > out myself.
> >
> > Also, have you checked the Solr logs for any error messages?
> > Finally, I just noticed that you have extra quotes in:
> > ...where usuario_idusuario = '${usuario.idusuario}'"
> > I doubt that is the cause of your problem, but you could try
> > removing them.
> >
>
> If I remove quotes, there is an error about this:
>
> SEVERE: Full Import failed:java.lang.RuntimeException:
> java.lang.RuntimeException:
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
>  Processing Document # 1
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
> at
>
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
> Caused by: java.lang.RuntimeException:
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
>  Processing Document # 1
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
> ... 3 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
> Unable to execute query: SELECT nombre FROM tipodocumento WHERE
> idtipodocumento =  Processing Document # 1
> at
>
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
> at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
> at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
> at
>
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)
> at
>
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
> ... 5 more
> Caused by: java.sql.SQLSyntaxErrorException: ORA-00936: missing expression
>
> at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
> at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
> at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
> at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
> at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
> at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
> at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:193)
> at
> oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:873)
> at
>
> oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
> at
>
> oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
> at
>
> oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1909)
> at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1871)
> at
>
> oracle.jdbc.driver.OracleStatementWrapper.execute(OracleStatementWrapper.java:318)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246
> My config files using Oracle are:
>
>
> db-data-config.xml
> <dataConfig>
>    <dataSource driver="oracle.jdbc.OracleDriver"
> url="jdbc:oracle:thin:@localhost:1521:solr" user="solr" password="solr" />
>    <document>
>        <entity name="documento" query="SELECT
> iddocumento,nrodocumento,asunto,autor,tipodocumento FROM documento">
>             <field column="iddocumento" name="iddocumento" />
>            <field column="nrodocumento" name="nrodocumento" />
>             <field column="asunto" name="asuntodocumento" />
>            <field column="autor" name="autor" />
>             <field column="tipodocumento" name="tipodocumento" />
>             <entity name="tipodocumento1" query="SELECT nombre FROM
> tipodocumento WHERE idtipodocumento = '${documento.tipodocumento}'">
>               <field column="nombre" name="nombre" />
>             </entity>
>        </entity>
>    </document>
> </dataConfig>
>
> schema.xml
> <?xml version="1.0" encoding="UTF-8" ?>
> <schema name="db" version="1.1">
>  <types>
>    <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> omitNorms="true"/>
>    <fieldType name="sint" class="solr.SortableIntField"
> sortMissingLast="true" omitNorms="true"/>
>  </types>
>
>  <fields>
>    <field name="iddocumento" type="sint" indexed="true" stored="true"
> required="true" />
>   <field name="nrodocumento" type="string" indexed="true" stored="true" />
>   <field name="asuntodocumento" type="string" indexed="true"
> stored="true"/>
>   <field name="autor" type="string" indexed="true" stored="true"/>
>   <field name="tipodocumento" type="string" indexed="true" stored="true"/>
>   <field name="nombre" type="string" indexed="true" stored="true"/>
>  </fields>
>
>  <!-- Field to use to determine and enforce document uniqueness.
>      Unless this field is marked with required="false", it will be a
> required field
>   -->
>  <uniqueKey>iddocumento</uniqueKey>
>
>  <!-- field for the QueryParser to use when an explicit fieldname is absent
> -->
>  <defaultSearchField>nrodocumento</defaultSearchField>
>
>  <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
>  <solrQueryParser defaultOperator="OR"/>
>
> </schema>
>
> solrconfig.xml
> <?xml version="1.0" encoding="UTF-8" ?>
> <config>
>  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
>  <!--  The DirectoryFactory to use for indexes.
>        solr.StandardDirectoryFactory, the default, is filesystem based.
>        solr.RAMDirectoryFactory is memory based, not persistent, and
> doesn't work with replication. -->
>  <directoryFactory name="DirectoryFactory"
> class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
>
>
>  <updateHandler class="solr.DirectUpdateHandler2">
>    <autoCommit>
>         <maxTime>30000</maxTime>
>       </autoCommit>
>  </updateHandler>
>
>  <requestDispatcher handleSelect="true" >
>    <requestParsers enableRemoteStreaming="false"
> multipartUploadLimitInKB="2048" />
>  </requestDispatcher>
>
>  <requestHandler name="standard" class="solr.StandardRequestHandler"
> default="true" />
>  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
>  <requestHandler name="/admin/"
> class="org.apache.solr.handler.admin.AdminHandlers" />
>  <requestHandler name="/dataimport"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
>    <lst name="defaults">
>    <str name="config">db-data-config.xml</str>
>    </lst>
>  </requestHandler>
>  <!-- config for the admin interface -->
>  <admin>
>    <defaultQuery>solr</defaultQuery>
>  </admin>
>
> </config>
>
> Thanks for your help.
>
> --
> Rafael Taboada
>
> /*
>  * Phone >> 992 741 026
>  */
>



-- 
Thanks and Regards
Rahul A. Warawdekar

Re: Can't index sub-entitties in DIH

Posted by Gora Mohanty <go...@mimirtech.com>.
On 5 June 2012 20:05, Rafael Taboada <ka...@gmail.com> wrote:
> Hi James.
>
> Thanks for your advice.
>
> As I said, alias works for me. I use joins instead of sub-entities...
> Heavily...
> These config files work for me...
[...]

How about NULL values in the column that
you are doing a left outer join on? Cannot
test this right now, but I believe that a left
outer join behaves differently from a DIH
entity/sub-entity when it comes to NULLs.

Regards,
Gora

Re: Can't index sub-entitties in DIH

Posted by Rafael Taboada <ka...@gmail.com>.
Hi James.

Thanks for your advice.

As I said, alias works for me. I use joins instead of sub-entities...
Heavily...
These config files work for me...

db-data-config.xml
<dataConfig>
   <dataSource type="JdbcDataSource" name="jdbc"
driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:@localhost:1521:solr"
user="solr" password="solr" />
   <document>
      <entity name="documento" query="SELECT
d.iddocumento,d.nrodocumento,d.asunto AS asuntodocumento,d.autor,d.estado
AS estadodocumento,d.fechacreacion AS
fechacreaciondocumento,td.idtipodocumento,td.nombre AS
nombretipodocumento,e.idexpediente,e.nroexpediente,e.nrointerno,e.asunto AS
asuntoexpediente,e.clienterazonsocial,e.clienteapellidomaterno,e.clienteapellidopaterno,e.clientenombres,e.clientedireccionprincipal,e.estado
AS estadoexpediente,e.fechacreacion AS
fechacreacionexpediente,p.idproceso,p.nombre AS nombreproceso,o.idusuario
AS idpropietario,o.nombres AS nombrespropietario,o.apellidos AS
apellidospropietario,u.idunidad,u.nombre AS nombreunidad FROM documento d
LEFT OUTER JOIN usuario o ON (d.propietario = o.idusuario) LEFT OUTER JOIN
unidad u ON (o.idunidad = u.idunidad) LEFT OUTER JOIN tipodocumento td ON
(d.tipodocumento = td.idtipodocumento) LEFT OUTER JOIN expediente e ON
(d.expediente = e.idexpediente) LEFT OUTER JOIN proceso p ON (e.proceso =
p.idproceso)">
         <field column="iddocumento" name="iddocumento" />
         <field column="nrodocumento" name="nrodocumento" />
         <field column="asuntodocumento" name="asuntodocumento" />
         <field column="autor" name="autor" />
         <field column="estadodocumento" name="estadodocumento" />
         <field column="fechacreaciondocumento"
name="fechacreaciondocumento" />
         <field column="idtipodocumento" name="idtipodocumento" />
         <field column="nombretipodocumento" name="nombretipodocumento" />
         <field column="idexpediente" name="idexpediente" />
         <field column="nroexpediente" name="nroexpediente" />
         <field column="nrointerno" name="nrointerno" />
         <field column="asuntoexpediente" name="asuntoexpediente" />
         <field column="clienterazonsocial" name="clienterazonsocial" />
         <field column="clienteapellidomaterno"
name="clienteapellidomaterno" />
         <field column="clienteapellidopaterno"
name="clienteapellidopaterno" />
         <field column="clientenombres" name="clientenombres" />
         <field column="clientedireccionprincipal"
name="clientedireccionprincipal" />
         <field column="estadoexpediente" name="estadoexpediente" />
         <field column="fechacreacionexpediente"
name="fechacreacionexpediente" />
         <field column="idproceso" name="idproceso" />
         <field column="nombreproceso" name="nombreproceso" />
         <field column="idpropietario" name="idpropietario" />
         <field column="nombrespropietario" name="nombrespropietario" />
         <field column="apellidospropietario" name="apellidospropietario" />
         <field column="idunidad" name="idunidad" />
         <field column="nombreunidad" name="nombreunidad" />
      </entity>
   </document>
</dataConfig>

schema.xml
<?xml version="1.0" ?>
<schema name="siged" version="1.1">
   <types>
      <fieldtype name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true" />
      <fieldType name="tint" class="solr.TrieIntField" precisionStep="8"
positionIncrementGap="0" />
      <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6"
positionIncrementGap="0" />
      <fieldType name="text_general" class="solr.TextField"
positionIncrementGap="100">
         <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
            <filter class="solr.LowerCaseFilterFactory"/>
         </analyzer>
         <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
            <filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
         </analyzer>
      </fieldType>
   </types>

   <fields>
      <!-- SIGED Documentos -->
      <field name="iddocumento" type="tint" indexed="true" stored="true"
required="true" />
      <field name="nrodocumento" type="text_general" indexed="true"
stored="true" />
      <field name="asuntodocumento" type="text_general" indexed="true"
stored="true" />
      <field name="autor" type="text_general" indexed="true" stored="true"
/>
      <field name="estadodocumento" type="string" indexed="true"
stored="true" />
      <field name="fechacreaciondocumento" type="string" indexed="true"
stored="true" />
      <field name="idtipodocumento" type="tint" indexed="true"
stored="true" />
      <field name="nombretipodocumento" type="text_general" indexed="true"
stored="true" />
      <field name="idexpediente" type="tint" indexed="true" stored="true" />
      <field name="nroexpediente" type="text_general" indexed="true"
stored="true" />
      <field name="nrointerno" type="text_general" indexed="true"
stored="true" />
      <field name="asuntoexpediente" type="text_general" indexed="true"
stored="true" />
      <field name="clienterazonsocial" type="text_general" indexed="true"
stored="true" />
      <field name="clienteapellidomaterno" type="text_general"
indexed="true" stored="true" />
      <field name="clienteapellidopaterno" type="text_general"
indexed="true" stored="true" />
      <field name="clientenombres" type="text_general" indexed="true"
stored="true" />
      <field name="clientedireccionprincipal" type="text_general"
indexed="true" stored="true" />
      <field name="estadoexpediente" type="string" indexed="true"
stored="true" />
      <field name="fechacreacionexpediente" type="string" indexed="true"
stored="true" />
      <field name="idproceso" type="tint" indexed="true" stored="true" />
      <field name="nombreproceso" type="text_general" indexed="true"
stored="true" />
      <field name="idpropietario" type="tint" indexed="true" stored="true"
/>
      <field name="nombrespropietario" type="text_general" indexed="true"
stored="true" />
      <field name="apellidospropietario" type="text_general" indexed="true"
stored="true" />
      <field name="idunidad" type="tint" indexed="true" stored="true" />
      <field name="nombreunidad" type="text_general" indexed="true"
stored="true" />
   </fields>

   <!-- field to use to determine and enforce document uniqueness. -->
   <uniqueKey>iddocumento</uniqueKey>

   <!-- field for the QueryParser to use when an explicit fieldname is
absent -->
   <defaultSearchField>nroexpediente</defaultSearchField>

   <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
   <solrQueryParser defaultOperator="OR"/>
</schema>


solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
  <!--  The DirectoryFactory to use for indexes.
        solr.StandardDirectoryFactory, the default, is filesystem based.
        solr.RAMDirectoryFactory is memory based, not persistent, and
doesn't work with replication. -->
  <directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>


  <updateHandler class="solr.DirectUpdateHandler2">
  </updateHandler>

  <requestDispatcher handleSelect="true" >
    <requestParsers enableRemoteStreaming="false"
multipartUploadLimitInKB="2048" />
  </requestDispatcher>

  <requestHandler name="standard" class="solr.StandardRequestHandler"
default="true" />
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
  <requestHandler name="/admin/"
class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>
  <!-- config for the admin interface -->
  <admin>
    <defaultQuery>solr</defaultQuery>
  </admin>

</config>



On Tue, Jun 5, 2012 at 9:22 AM, Dyer, James <Ja...@ingrambook.com>wrote:

> I sucessfully use Oracle with DIH although none of my imports have
> sub-entities.  (slight difference, I'm on ojdbc5.jar w/10g...).  It may be
> you have a driver that doesn't play well with DIH in some cases.  You might
> want to try these possible workarounds:
>
> - rename the columns in SELECT with "AS" clauses.
> - in cases the columns are the same in SELECT as what you have in
> schema.xml, omit the <field /> tags  (see
> http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config)
>
> These are shot-in-the-dark guesses.  I wouldn't expect this to matter but
> you might as well try it.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311
>
>
> -----Original Message-----
> From: Rafael Taboada [mailto:kaliman.forever@gmail.com]
> Sent: Tuesday, June 05, 2012 8:58 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Can't index sub-entitties in DIH
>
> Hi Gora,
>
>
> > Your configuration files look fine. It would seem that something
> > is going wrong with the SELECT in Oracle, or with the JDBC
> > driver used to access Oracle. Could you try:
>
> * Manually doing the SELECT for the entity, and sub-entity
> >  to ensure that things are working.
> >
>
> The SELECTs are working OK.
>
>
>
> > * Check the JDBC settings.
> >
>
> I'm using tha last version of jdbc6.jar for Oracle 11g. It seems JDBC
> setting is OK because solr brings data.
>
>
>
> > Sorry, I do not have access to Oracle so that I cannot try this
> > out myself.
> >
> > Also, have you checked the Solr logs for any error messages?
> > Finally, I just noticed that you have extra quotes in:
> > ...where usuario_idusuario = '${usuario.idusuario}'"
> > I doubt that is the cause of your problem, but you could try
> > removing them.
> >
>
> If I remove quotes, there is an error about this:
>
> SEVERE: Full Import failed:java.lang.RuntimeException:
> java.lang.RuntimeException:
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
>  Processing Document # 1
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
> at
>
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
> Caused by: java.lang.RuntimeException:
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
>  Processing Document # 1
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
> ... 3 more
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
> Unable to execute query: SELECT nombre FROM tipodocumento WHERE
> idtipodocumento =  Processing Document # 1
> at
>
> org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
> at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
> at
>
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
> at
>
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)
> at
>
> org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
> at
>
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
> ... 5 more
> Caused by: java.sql.SQLSyntaxErrorException: ORA-00936: missing expression
>
> at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
> at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
> at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
> at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
> at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
> at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
> at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:193)
> at
> oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:873)
> at
>
> oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
> at
>
> oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
> at
>
> oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1909)
> at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1871)
> at
>
> oracle.jdbc.driver.OracleStatementWrapper.execute(OracleStatementWrapper.java:318)
> at
>
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246
> My config files using Oracle are:
>
>
> db-data-config.xml
> <dataConfig>
>    <dataSource driver="oracle.jdbc.OracleDriver"
> url="jdbc:oracle:thin:@localhost:1521:solr" user="solr" password="solr" />
>    <document>
>        <entity name="documento" query="SELECT
> iddocumento,nrodocumento,asunto,autor,tipodocumento FROM documento">
>            <field column="iddocumento" name="iddocumento" />
>            <field column="nrodocumento" name="nrodocumento" />
>            <field column="asunto" name="asuntodocumento" />
>            <field column="autor" name="autor" />
>            <field column="tipodocumento" name="tipodocumento" />
>            <entity name="tipodocumento1" query="SELECT nombre FROM
> tipodocumento WHERE idtipodocumento = '${documento.tipodocumento}'">
>               <field column="nombre" name="nombre" />
>            </entity>
>        </entity>
>    </document>
> </dataConfig>
>
> schema.xml
> <?xml version="1.0" encoding="UTF-8" ?>
> <schema name="db" version="1.1">
>  <types>
>    <fieldType name="string" class="solr.StrField" sortMissingLast="true"
> omitNorms="true"/>
>    <fieldType name="sint" class="solr.SortableIntField"
> sortMissingLast="true" omitNorms="true"/>
>  </types>
>
>  <fields>
>   <field name="iddocumento" type="sint" indexed="true" stored="true"
> required="true" />
>   <field name="nrodocumento" type="string" indexed="true" stored="true" />
>   <field name="asuntodocumento" type="string" indexed="true"
> stored="true"/>
>   <field name="autor" type="string" indexed="true" stored="true"/>
>   <field name="tipodocumento" type="string" indexed="true" stored="true"/>
>   <field name="nombre" type="string" indexed="true" stored="true"/>
>  </fields>
>
>  <!-- Field to use to determine and enforce document uniqueness.
>      Unless this field is marked with required="false", it will be a
> required field
>   -->
>  <uniqueKey>iddocumento</uniqueKey>
>
>  <!-- field for the QueryParser to use when an explicit fieldname is absent
> -->
>  <defaultSearchField>nrodocumento</defaultSearchField>
>
>  <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
>  <solrQueryParser defaultOperator="OR"/>
>
> </schema>
>
> solrconfig.xml
> <?xml version="1.0" encoding="UTF-8" ?>
> <config>
>  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
>  <!--  The DirectoryFactory to use for indexes.
>        solr.StandardDirectoryFactory, the default, is filesystem based.
>        solr.RAMDirectoryFactory is memory based, not persistent, and
> doesn't work with replication. -->
>  <directoryFactory name="DirectoryFactory"
> class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
>
>
>  <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>         <maxTime>30000</maxTime>
>       </autoCommit>
>  </updateHandler>
>
>  <requestDispatcher handleSelect="true" >
>    <requestParsers enableRemoteStreaming="false"
> multipartUploadLimitInKB="2048" />
>  </requestDispatcher>
>
>  <requestHandler name="standard" class="solr.StandardRequestHandler"
> default="true" />
>  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
>  <requestHandler name="/admin/"
> class="org.apache.solr.handler.admin.AdminHandlers" />
>  <requestHandler name="/dataimport"
> class="org.apache.solr.handler.dataimport.DataImportHandler">
>    <lst name="defaults">
>    <str name="config">db-data-config.xml</str>
>    </lst>
>  </requestHandler>
>  <!-- config for the admin interface -->
>  <admin>
>    <defaultQuery>solr</defaultQuery>
>  </admin>
>
> </config>
>
> Thanks for your help.
>
> --
> Rafael Taboada
>
> /*
>  * Phone >> 992 741 026
>  */
>



-- 
Rafael Taboada

/*
 * Phone >> 992 741 026
 */

RE: Can't index sub-entitties in DIH

Posted by "Dyer, James" <Ja...@ingrambook.com>.
I sucessfully use Oracle with DIH although none of my imports have sub-entities.  (slight difference, I'm on ojdbc5.jar w/10g...).  It may be you have a driver that doesn't play well with DIH in some cases.  You might want to try these possible workarounds:

- rename the columns in SELECT with "AS" clauses.
- in cases the columns are the same in SELECT as what you have in schema.xml, omit the <field /> tags  (see http://wiki.apache.org/solr/DataImportHandler#A_shorter_data-config)

These are shot-in-the-dark guesses.  I wouldn't expect this to matter but you might as well try it.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Rafael Taboada [mailto:kaliman.forever@gmail.com] 
Sent: Tuesday, June 05, 2012 8:58 AM
To: solr-user@lucene.apache.org
Subject: Re: Can't index sub-entitties in DIH

Hi Gora,


> Your configuration files look fine. It would seem that something
> is going wrong with the SELECT in Oracle, or with the JDBC
> driver used to access Oracle. Could you try:

* Manually doing the SELECT for the entity, and sub-entity
>  to ensure that things are working.
>

The SELECTs are working OK.



> * Check the JDBC settings.
>

I'm using tha last version of jdbc6.jar for Oracle 11g. It seems JDBC
setting is OK because solr brings data.



> Sorry, I do not have access to Oracle so that I cannot try this
> out myself.
>
> Also, have you checked the Solr logs for any error messages?
> Finally, I just noticed that you have extra quotes in:
> ...where usuario_idusuario = '${usuario.idusuario}'"
> I doubt that is the cause of your problem, but you could try
> removing them.
>

If I remove quotes, there is an error about this:

SEVERE: Full Import failed:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
 Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
 Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to execute query: SELECT nombre FROM tipodocumento WHERE
idtipodocumento =  Processing Document # 1
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
... 5 more
Caused by: java.sql.SQLSyntaxErrorException: ORA-00936: missing expression

at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:193)
at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:873)
at
oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
at
oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
at
oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1909)
at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1871)
at
oracle.jdbc.driver.OracleStatementWrapper.execute(OracleStatementWrapper.java:318)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246
My config files using Oracle are:


db-data-config.xml
<dataConfig>
    <dataSource driver="oracle.jdbc.OracleDriver"
url="jdbc:oracle:thin:@localhost:1521:solr" user="solr" password="solr" />
    <document>
        <entity name="documento" query="SELECT
iddocumento,nrodocumento,asunto,autor,tipodocumento FROM documento">
            <field column="iddocumento" name="iddocumento" />
            <field column="nrodocumento" name="nrodocumento" />
            <field column="asunto" name="asuntodocumento" />
            <field column="autor" name="autor" />
            <field column="tipodocumento" name="tipodocumento" />
            <entity name="tipodocumento1" query="SELECT nombre FROM
tipodocumento WHERE idtipodocumento = '${documento.tipodocumento}'">
               <field column="nombre" name="nombre" />
            </entity>
        </entity>
    </document>
</dataConfig>

schema.xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="db" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>
    <fieldType name="sint" class="solr.SortableIntField"
sortMissingLast="true" omitNorms="true"/>
 </types>

 <fields>
   <field name="iddocumento" type="sint" indexed="true" stored="true"
required="true" />
   <field name="nrodocumento" type="string" indexed="true" stored="true" />
   <field name="asuntodocumento" type="string" indexed="true"
stored="true"/>
   <field name="autor" type="string" indexed="true" stored="true"/>
   <field name="tipodocumento" type="string" indexed="true" stored="true"/>
   <field name="nombre" type="string" indexed="true" stored="true"/>
 </fields>

 <!-- Field to use to determine and enforce document uniqueness.
      Unless this field is marked with required="false", it will be a
required field
   -->
 <uniqueKey>iddocumento</uniqueKey>

 <!-- field for the QueryParser to use when an explicit fieldname is absent
-->
 <defaultSearchField>nrodocumento</defaultSearchField>

 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>

</schema>

solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
  <!--  The DirectoryFactory to use for indexes.
        solr.StandardDirectoryFactory, the default, is filesystem based.
        solr.RAMDirectoryFactory is memory based, not persistent, and
doesn't work with replication. -->
  <directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>


  <updateHandler class="solr.DirectUpdateHandler2">
   <autoCommit>
         <maxTime>30000</maxTime>
       </autoCommit>
  </updateHandler>

  <requestDispatcher handleSelect="true" >
    <requestParsers enableRemoteStreaming="false"
multipartUploadLimitInKB="2048" />
  </requestDispatcher>

  <requestHandler name="standard" class="solr.StandardRequestHandler"
default="true" />
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
  <requestHandler name="/admin/"
class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>
  <!-- config for the admin interface -->
  <admin>
    <defaultQuery>solr</defaultQuery>
  </admin>

</config>

Thanks for your help.

-- 
Rafael Taboada

/*
 * Phone >> 992 741 026
 */

Re: Can't index sub-entitties in DIH

Posted by Gora Mohanty <go...@mimirtech.com>.
On 5 June 2012 20:08, Rafael Taboada <ka...@gmail.com> wrote:
> Hi Gora,
>
> Yes, I restart Solr for each change I do.
>
> Thanks for your help...
>
> An small question.... Is DIH work well with Oracle database? Using all the
> features It can do?

Unfortunately, I have never used DIH with Oracle. However,
this should be a simple enough use case that it should just
work. I think that we must be missing something obvious.

For the sub-entity with Oracle case, what message do you
get when the data-import concludes? Is the number of
indexed documents correct? Are there any relevant
messages in the Solr log files?

Regards,
Gora

Re: Can't index sub-entitties in DIH

Posted by Rafael Taboada <ka...@gmail.com>.
Hi Gora,

Yes, I restart Solr for each change I do.

Thanks for your help...

An small question.... Is DIH work well with Oracle database? Using all the
features It can do?


On Tue, Jun 5, 2012 at 9:32 AM, Gora Mohanty <go...@mimirtech.com> wrote:

> Hi,
>
> Sorry, I am stumped, and cannot help further without
> access to Oracle. Please disregard the bit about the
> quotes: I was reading a single quote followed by a
> double quote as three single quotes. There was no
> issue there.
>
> Since your configurations for Oracle, and mysql are
> different, are you using different Solr cores/instances,
> or making sure to restart Solr in between configuration
> changes?
>
> Regards,
> Gora
>



-- 
Rafael Taboada

/*
 * Phone >> 992 741 026
 */

Re: Can't index sub-entitties in DIH

Posted by Gora Mohanty <go...@mimirtech.com>.
Hi,

Sorry, I am stumped, and cannot help further without
access to Oracle. Please disregard the bit about the
quotes: I was reading a single quote followed by a
double quote as three single quotes. There was no
issue there.

Since your configurations for Oracle, and mysql are
different, are you using different Solr cores/instances,
or making sure to restart Solr in between configuration
changes?

Regards,
Gora

Re: Can't index sub-entitties in DIH

Posted by Rafael Taboada <ka...@gmail.com>.
Hi Gora,


> Your configuration files look fine. It would seem that something
> is going wrong with the SELECT in Oracle, or with the JDBC
> driver used to access Oracle. Could you try:

* Manually doing the SELECT for the entity, and sub-entity
>  to ensure that things are working.
>

The SELECTs are working OK.



> * Check the JDBC settings.
>

I'm using tha last version of jdbc6.jar for Oracle 11g. It seems JDBC
setting is OK because solr brings data.



> Sorry, I do not have access to Oracle so that I cannot try this
> out myself.
>
> Also, have you checked the Solr logs for any error messages?
> Finally, I just noticed that you have extra quotes in:
> ...where usuario_idusuario = '${usuario.idusuario}'"
> I doubt that is the cause of your problem, but you could try
> removing them.
>

If I remove quotes, there is an error about this:

SEVERE: Full Import failed:java.lang.RuntimeException:
java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
 Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:264)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:375)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:445)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:426)
Caused by: java.lang.RuntimeException:
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
execute query: SELECT nombre FROM tipodocumento WHERE idtipodocumento =
 Processing Document # 1
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:621)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:327)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
Unable to execute query: SELECT nombre FROM tipodocumento WHERE
idtipodocumento =  Processing Document # 1
at
org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.pullRow(EntityProcessorWrapper.java:330)
at
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:296)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:683)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:709)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:619)
... 5 more
Caused by: java.sql.SQLSyntaxErrorException: ORA-00936: missing expression

at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:445)
at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:879)
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:450)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:192)
at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:193)
at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:873)
at
oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1167)
at
oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1289)
at
oracle.jdbc.driver.OracleStatement.executeInternal(OracleStatement.java:1909)
at oracle.jdbc.driver.OracleStatement.execute(OracleStatement.java:1871)
at
oracle.jdbc.driver.OracleStatementWrapper.execute(OracleStatementWrapper.java:318)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246
My config files using Oracle are:


db-data-config.xml
<dataConfig>
    <dataSource driver="oracle.jdbc.OracleDriver"
url="jdbc:oracle:thin:@localhost:1521:solr" user="solr" password="solr" />
    <document>
        <entity name="documento" query="SELECT
iddocumento,nrodocumento,asunto,autor,tipodocumento FROM documento">
            <field column="iddocumento" name="iddocumento" />
            <field column="nrodocumento" name="nrodocumento" />
            <field column="asunto" name="asuntodocumento" />
            <field column="autor" name="autor" />
            <field column="tipodocumento" name="tipodocumento" />
            <entity name="tipodocumento1" query="SELECT nombre FROM
tipodocumento WHERE idtipodocumento = '${documento.tipodocumento}'">
               <field column="nombre" name="nombre" />
            </entity>
        </entity>
    </document>
</dataConfig>

schema.xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="db" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>
    <fieldType name="sint" class="solr.SortableIntField"
sortMissingLast="true" omitNorms="true"/>
 </types>

 <fields>
   <field name="iddocumento" type="sint" indexed="true" stored="true"
required="true" />
   <field name="nrodocumento" type="string" indexed="true" stored="true" />
   <field name="asuntodocumento" type="string" indexed="true"
stored="true"/>
   <field name="autor" type="string" indexed="true" stored="true"/>
   <field name="tipodocumento" type="string" indexed="true" stored="true"/>
   <field name="nombre" type="string" indexed="true" stored="true"/>
 </fields>

 <!-- Field to use to determine and enforce document uniqueness.
      Unless this field is marked with required="false", it will be a
required field
   -->
 <uniqueKey>iddocumento</uniqueKey>

 <!-- field for the QueryParser to use when an explicit fieldname is absent
-->
 <defaultSearchField>nrodocumento</defaultSearchField>

 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>

</schema>

solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
  <!--  The DirectoryFactory to use for indexes.
        solr.StandardDirectoryFactory, the default, is filesystem based.
        solr.RAMDirectoryFactory is memory based, not persistent, and
doesn't work with replication. -->
  <directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>


  <updateHandler class="solr.DirectUpdateHandler2">
   <autoCommit>
         <maxTime>30000</maxTime>
       </autoCommit>
  </updateHandler>

  <requestDispatcher handleSelect="true" >
    <requestParsers enableRemoteStreaming="false"
multipartUploadLimitInKB="2048" />
  </requestDispatcher>

  <requestHandler name="standard" class="solr.StandardRequestHandler"
default="true" />
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
  <requestHandler name="/admin/"
class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>
  <!-- config for the admin interface -->
  <admin>
    <defaultQuery>solr</defaultQuery>
  </admin>

</config>

Thanks for your help.

-- 
Rafael Taboada

/*
 * Phone >> 992 741 026
 */

Re: Can't index sub-entitties in DIH

Posted by Gora Mohanty <go...@mimirtech.com>.
On 5 June 2012 08:26, Rafael Taboada <ka...@gmail.com> wrote:
> Hi Gora.
>
> My apologizes for this subject. It wasn't my intention to be rude or
> something like that.

No need to apologize. I was hoping to make it easier
for people to help you.

> As I said about mapping database columns with solr fields, it works in
> mysql. Maybe I'm wrong with some config file because I'm new with Solr. By
> the way, it is a great product.
>
> But I've just tried about sub-entities using mysql. And all work perfect.
> It shows me that I am doing something wrong with Oracle database at work.
>
> I attach with this mail files I used with mysql. Mapping columns is OK,
> sub-entities is OK.
[...]

Your configuration files look fine. It would seem that something
is going wrong with the SELECT in Oracle, or with the JDBC
driver used to access Oracle. Could you try:
* Manually doing the SELECT for the entity, and sub-entity
  to ensure that things are working.
* Check the JDBC settings.
Sorry, I do not have access to Oracle so that I cannot try this
out myself.

Also, have you checked the Solr logs for any error messages?
Finally, I just noticed that you have extra quotes in:
...where usuario_idusuario = '${usuario.idusuario}'"
I doubt that is the cause of your problem, but you could try
removing them.

Regards,
Gora

Re: Can't index sub-entitties in DIH

Posted by Rafael Taboada <ka...@gmail.com>.
Hi Gora.

My apologizes for this subject. It wasn't my intention to be rude or
something like that.

As I said about mapping database columns with solr fields, it works in
mysql. Maybe I'm wrong with some config file because I'm new with Solr. By
the way, it is a great product.

But I've just tried about sub-entities using mysql. And all work perfect.
It shows me that I am doing something wrong with Oracle database at work.

I attach with this mail files I used with mysql. Mapping columns is OK,
sub-entities is OK.

Thanks so much for your opinion.

Kinds regards

Rafael

db-data-config.xml
<dataConfig>
    <dataSource driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/solr" user="solr" password="solr" />
    <document>
        <entity name="usuario" query="select idusuario,nombres,apellidos
from usuario">
            <field column="idusuario" name="idusuario" />
            <field column="nombres" name="nombres111" />
            <field column="apellidos" name="apellidos111" />
               <entity name="documento" query="select iddocumento,nombre
from documento where usuario_idusuario = '${usuario.idusuario}'">
                  <field column="iddocumento" name="iddocumento" />
                  <field column="nombre" name="nombredocumento" />
               </entity>
        </entity>
    </document>
</dataConfig>

schema.xml
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="db" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>
    <fieldType name="sint" class="solr.SortableIntField"
sortMissingLast="true" omitNorms="true"/>
 </types>

 <fields>
   <field name="idusuario" type="sint" indexed="true" stored="true"
required="true" />
   <field name="nombres111" type="string" indexed="true" stored="true" />
   <field name="apellidos111" type="string" indexed="true" stored="true"/>
   <field name="iddocumento" type="string" indexed="true" stored="true"/>
   <field name="nombredocumento" type="string" indexed="true"
stored="true"/>
 </fields>

 <!-- Field to use to determine and enforce document uniqueness.
      Unless this field is marked with required="false", it will be a
required field
   -->
 <uniqueKey>idusuario</uniqueKey>

 <!-- field for the QueryParser to use when an explicit fieldname is absent
-->
 <defaultSearchField>nombres111</defaultSearchField>

 <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
 <solrQueryParser defaultOperator="OR"/>

</schema>

solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_36</luceneMatchVersion>
  <!--  The DirectoryFactory to use for indexes.
        solr.StandardDirectoryFactory, the default, is filesystem based.
        solr.RAMDirectoryFactory is memory based, not persistent, and
doesn't work with replication. -->
  <directoryFactory name="DirectoryFactory"
class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>


  <updateHandler class="solr.DirectUpdateHandler2" />

  <requestDispatcher handleSelect="true" >
    <requestParsers enableRemoteStreaming="false"
multipartUploadLimitInKB="2048" />
  </requestDispatcher>

  <requestHandler name="standard" class="solr.StandardRequestHandler"
default="true" />
  <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
  <requestHandler name="/admin/"
class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
    <str name="config">db-data-config.xml</str>
    </lst>
  </requestHandler>
  <!-- config for the admin interface -->
  <admin>
    <defaultQuery>solr</defaultQuery>
  </admin>

</config>



On Mon, Jun 4, 2012 at 12:31 PM, Gora Mohanty <go...@mimirtech.com> wrote:

> On 4 June 2012 22:27, Rafael Taboada <ka...@gmail.com> wrote:
> >
> > Hi folks,
> >
> > I've just solved using outer joins like this:
> >
> > <entity name="documento" query="SELECT
> > d.iddocumento,d.nrodocumento,td.idtipodocumento,td.nombre AS
> > nombretipodocumento FROM documento d LEFT OUTER JOIN tipodocumento td ON
> > (d.tipodocumento = td.idtipodocumento)">
> >         <field column="iddocumento" name="iddocumento" />
> >         <field column="nrodocumento" name="nrodocumento" />
> >         <field column="idtipodocumento" name="idtipodocumento" />
> >         <field column="nombretipodocumento" name="nombretipodocumento" />
> >
> > Any idea why I can't index using sub-entities?
> [...]
>
> Dear Rafael,
>
> No offence meant, but please take a look at
> http://wiki.apache.org/solr/UsingMailingLists
> This is by no means meant to discourage you
> from asking questions on this list, but to
> request due diligence from your side so that
> people can help you better.
>
> In this particular case, given that you have
> changed your top-level SELECT, you might
> wish to inform people about whether you have
> made corresponding changes for sub-entities.
>
> Have you tried out the various SELECT
> statements in the entity, and sub-entity,
> outside of Solr, to ensure that they provide
> results?
>
> Again, with due respect, it is unlikely that
> a now-mature product like Solr has issues
> like what you have been asking about, e.g.:
> - Mapping column names to Solr field
>  names
> - Having sub-entities indexed properly.
> At least from personal experience, we have
> worked with both of these use cases without
> any issues. Thankfully, Oracle was not
> involved in our processes, but it would be
> difficult to imagine that there is an issue
> with Oracle in this regard.
>
> Regards,
> Gora
>



-- 
Rafael Taboada

/*
 * Phone >> 992 741 026
 */

Re: Can't index sub-entitties in DIH

Posted by Gora Mohanty <go...@mimirtech.com>.
On 4 June 2012 22:27, Rafael Taboada <ka...@gmail.com> wrote:
>
> Hi folks,
>
> I've just solved using outer joins like this:
>
> <entity name="documento" query="SELECT
> d.iddocumento,d.nrodocumento,td.idtipodocumento,td.nombre AS
> nombretipodocumento FROM documento d LEFT OUTER JOIN tipodocumento td ON
> (d.tipodocumento = td.idtipodocumento)">
>         <field column="iddocumento" name="iddocumento" />
>         <field column="nrodocumento" name="nrodocumento" />
>         <field column="idtipodocumento" name="idtipodocumento" />
>         <field column="nombretipodocumento" name="nombretipodocumento" />
>
> Any idea why I can't index using sub-entities?
[...]

Dear Rafael,

No offence meant, but please take a look at
http://wiki.apache.org/solr/UsingMailingLists
This is by no means meant to discourage you
from asking questions on this list, but to
request due diligence from your side so that
people can help you better.

In this particular case, given that you have
changed your top-level SELECT, you might
wish to inform people about whether you have
made corresponding changes for sub-entities.

Have you tried out the various SELECT
statements in the entity, and sub-entity,
outside of Solr, to ensure that they provide
results?

Again, with due respect, it is unlikely that
a now-mature product like Solr has issues
like what you have been asking about, e.g.:
- Mapping column names to Solr field
  names
- Having sub-entities indexed properly.
At least from personal experience, we have
worked with both of these use cases without
any issues. Thankfully, Oracle was not
involved in our processes, but it would be
difficult to imagine that there is an issue
with Oracle in this regard.

Regards,
Gora

Re: Can't index sub-entitties in DIH

Posted by Rafael Taboada <ka...@gmail.com>.
Hi folks,

I've just solved using outer joins like this:

<entity name="documento" query="SELECT
d.iddocumento,d.nrodocumento,td.idtipodocumento,td.nombre AS
nombretipodocumento FROM documento d LEFT OUTER JOIN tipodocumento td ON
(d.tipodocumento = td.idtipodocumento)">
         <field column="iddocumento" name="iddocumento" />
         <field column="nrodocumento" name="nrodocumento" />
         <field column="idtipodocumento" name="idtipodocumento" />
         <field column="nombretipodocumento" name="nombretipodocumento" />

Any idea why I can't index using sub-entities?

Thanks in advance

On Mon, Jun 4, 2012 at 11:13 AM, Rafael Taboada
<ka...@gmail.com>wrote:

> Hi folks,
>
> I'm using DIH in order to index my Oracle database but I'm having some
> issues.
>
> Right now, I can't index fields from sub-entities. I have in my
> db-data-config.xml:
>
> <dataConfig>
>    <dataSource type="JdbcDataSource" name="jdbc"
> driver="oracle.jdbc.OracleDriver" url="jdbc:oracle:thin:@localhost:1521:mydata"
> user="oracle" password="oracle" />
>    <document>
>       <entity name="documento" query="SELECT
> iddocumento,nrodocumento,tipodocumento FROM documento">
>          <field column="iddocumento" name="iddocumento" />
>          <field column="nrodocumento" name="nrodocumento" />
>          <field column="tipodocumento" name="tipodocumento" />
>          <entity name="tipodocumento" query="SELECT nombre FROM
> tipodocumento where idtipodocumento = '${documento.tipodocumento}'">
>             <field column="nombre" name="nombre" />
>          </entity>
>       </entity>
>    </document>
> </dataConfig>
>
> My schema.xml is:
>
> <field name="iddocumento" type="string" indexed="true" stored="true"
> required="true" />
>       <field name="nrodocumento" type="string" indexed="true"
> stored="true" />
>       <field name="tipodocumento" type="string" indexed="true"
> stored="true" />
>       <field name="nombre" type="string" indexed="true" stored="true" />
>
> I can't index NOMBRE field. Is this because it belongs to a sub-entity?
>
> Thanks for your help
>
> --
> Rafael Taboada
>
>


-- 
Rafael Taboada

/*
 * Phone >> 992 741 026
 */