Posted to solr-user@lucene.apache.org by Eugeny Balakhonov <c0...@gmail.com> on 2011/08/10 22:52:57 UTC
Solr 3.3: DIH configuration for Oracle
Hello, all!
I want to create a good DIH configuration with delta support for my Oracle
database. Unfortunately, I have not been able to get it working because DIH
has some strange restrictions.
I will explain the problem with a simple example. In reality, my database
has a much more complicated structure.
Initial conditions: two tables with the following simple structure:
Table1
- ID_RECORD (Primary key)
- DATA_FIELD1
- ..
- DATA_FIELD2
- LAST_CHANGE_TIME
Table2
- ID_RECORD (Primary key)
- PARENT_ID_RECORD (Foreign key to Table1.ID_RECORD)
- DATA_FIELD1
- ..
- DATA_FIELD2
- LAST_CHANGE_TIME
For performance reasons, both tables must be selected with a single query
(via inner join).
My db-data-config.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource jndiName="jdbc/DB1" type="JdbcDataSource" user="" password=""/>
  <document>
    <entity name="ent" pk="T1_ID_RECORD, T2_ID_RECORD"
            query="select * from TABLE1 t1 inner join TABLE2 t2
                   on t1.ID_RECORD = t2.PARENT_ID_RECORD"
            deltaQuery="select t1.ID_RECORD T1_ID_RECORD, t1.ID_RECORD T2_ID_RECORD
                        from TABLE1 t1 inner join TABLE2 t2
                        on t1.ID_RECORD = t2.PARENT_ID_RECORD
                        where TABLE1.LAST_CHANGE_TIME >
                        to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')
                        or TABLE2.LAST_CHANGE_TIME >
                        to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
            deltaImportQuery="select * from TABLE1 t1 inner join TABLE2 t2
                              on t1.ID_RECORD = t2.PARENT_ID_RECORD
                              where t1.ID_RECORD = ${dataimporter.delta.T1_ID_RECORD} and
                              t2.ID_RECORD = ${dataimporter.delta.T2_ID_RECORD}"
    />
  </document>
</dataConfig>
As a result, I get the following error:
java.lang.IllegalArgumentException: deltaQuery has no column to resolve to
declared primary key pk='T1_ID_RECORD, T2_ID_RECORD'
I have analyzed the DIH source code. In the DocBuilder class, the
collectDelta() method treats the value of the entity's "pk" attribute as a
single string, but in my case it is a list of two values: T1_ID_RECORD,
T2_ID_RECORD.
What am I doing wrong?
Thanks,
Eugeny
Re: Solr 3.3: DIH configuration for Oracle
Posted by Shawn Heisey <so...@elyograg.org>.
On 8/10/2011 2:52 PM, Eugeny Balakhonov wrote:
> java.lang.IllegalArgumentException: deltaQuery has no column to resolve to
> declared primary key pk='T1_ID_RECORD, T2_ID_RECORD'
>
> I have analyzed the source code of DIH. I found that in the DocBuilder class
> collectDelta() method works with value of entity attribute "pk" as with
> simple string. But in my case this is array with two values: T1_ID_RECORD,
> T2_ID_RECORD
Whatever you declare as the DIH primary key must exist as a field name
in the result set, or Solr will complain. I had a perfectly working
config in 1.4.1, with identical text in query and deltaImportQuery. It
didn't work when I tried to upgrade to 3.1. The problem was that I was
using a deltaQuery that just returned MAX(did), to tell Solr that
something needed to be done. I had to add "AS did" to the deltaQuery so
that it matched my primary key. I am controlling the delta-import from
outside Solr, so I do not need to use the result set from deltaQuery.
The point is to pick something that will exist in all of your result
sets. You might need to include an "AS xxx" (with something you choose
for xxx) in your queries and use the xxx value as your pk. Because you
have only provided a simple example, I can't really tell you what you
should use.
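
A minimal sketch of the aliasing described above, not a tested configuration.
Only the column name "did" and the "AS did" alias come from this message; the
entity name and the table name "docs" are hypothetical placeholders. The point
it illustrates is that the deltaQuery result set contains a column whose name
matches the declared pk:

```xml
<!-- Sketch only: entity name "doc" and table name "docs" are hypothetical. -->
<entity name="doc" pk="did"
        query="SELECT * FROM docs"
        deltaQuery="SELECT MAX(did) AS did FROM docs"
        deltaImportQuery="SELECT * FROM docs"/>
<!-- query and deltaImportQuery are identical here because the delta-import
     is driven from outside Solr, so the deltaQuery result is not used to
     select rows; it only has to expose a column named like the pk. -->
```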
The pk value is only used to coordinate your queries. It only has
meaning in the DIH, not the Solr index. Uniqueness in the Solr index is
controlled by the uniqueKey value in schema.xml. In my case, pk and
uniqueKey are not the same field.
Side note: I'm not much of an expert, so I can't guarantee I can help
further. I will give it a try, though.
Thanks,
Shawn
Re: Solr 3.3: DIH configuration for Oracle
Posted by Alexey Serba <as...@gmail.com>.
Why do you need to collect both primary keys, T1_ID_RECORD and
T2_ID_RECORD, in your delta query? Isn't the T2_ID_RECORD primary key value
enough to get all the data from both tables? (The table1-table2 relation
is 1-N, right?)
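
If the relation really is 1-N, the entity from the original question could be
reduced to a single-column pk roughly as follows. This is an untested sketch
under that assumption: it reuses the table, column, and JNDI names from the
question, declares only T2_ID_RECORD as pk, and refers to the tables through
their t1/t2 aliases in the where clause.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource jndiName="jdbc/DB1" type="JdbcDataSource" user="" password=""/>
  <document>
    <!-- Sketch only: assumes each joined row is identified by TABLE2.ID_RECORD -->
    <entity name="ent" pk="T2_ID_RECORD"
            query="select * from TABLE1 t1 inner join TABLE2 t2
                   on t1.ID_RECORD = t2.PARENT_ID_RECORD"
            deltaQuery="select t2.ID_RECORD T2_ID_RECORD
                        from TABLE1 t1 inner join TABLE2 t2
                        on t1.ID_RECORD = t2.PARENT_ID_RECORD
                        where t1.LAST_CHANGE_TIME >
                        to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')
                        or t2.LAST_CHANGE_TIME >
                        to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
            deltaImportQuery="select * from TABLE1 t1 inner join TABLE2 t2
                              on t1.ID_RECORD = t2.PARENT_ID_RECORD
                              where t2.ID_RECORD = ${dataimporter.delta.T2_ID_RECORD}"
    />
  </document>
</dataConfig>
```

The deltaQuery returns exactly one column, named like the declared pk, and the
deltaImportQuery uses that value to fetch the joined row.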