Posted to solr-user@lucene.apache.org by Joel Nylund <jn...@yahoo.com> on 2009/11/23 20:49:21 UTC
help with dataimport delta query
Hi, I have Solr all working nicely, except that I'm trying to get deltas to
work on my data import handler.
Here is a simplification of my data import config. I have a table
called "Book" which has categories; I'm doing subqueries for the
category info and calling a JavaScript helper. This all works
perfectly for the regular query.
I added these lines for the delta stuff:
deltaImportQuery="SELECT f.id,f.title
FROM Book f
f.id='${dataimporter.delta.job_jobs_id}'"
deltaQuery="SELECT id FROM `Book` WHERE fm.inMyList=1 AND
lastModifiedDate > '${dataimporter.last_index_time}'" >
Basically, I'm trying to get the rows whose lastModifiedDate is newer than
the last index (or delta index) time.
I run:
http://localhost:8983/solr/dataimport?command=delta-import
And it says in logs:
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
INFO: Starting Delta Import
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
Nov 23, 2009 2:33:02 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=delta-import} status=0 QTime=0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: category
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: category rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: category rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: category
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: item
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: item rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: item rows obtained : 0
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: item
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Delta Import completed successfully
Nov 23, 2009 2:33:02 PM org.apache.solr.handler.dataimport.DocBuilder execute
INFO: Time taken = 0:0:0.21
But the browser says no documents were added/modified (even though one
record in the db is a match).
Is there a way to turn on debugging so I can see the queries the DIH is
sending to the db?
Any other ideas about what I could be doing wrong?
thanks
Joel
<document name="doc">
  <entity name="item"
          query="SELECT f.id, f.title
                 FROM Book f
                 WHERE f.inMyList=1"
          deltaImportQuery="SELECT f.id,f.title
                            FROM Book f
                            f.id='${dataimporter.delta.job_jobs_id}'"
          deltaQuery="SELECT id FROM `Book` WHERE fm.inMyList=1 AND
                      lastModifiedDate > '${dataimporter.last_index_time}'" >
    <field column="id" name="id" />
    <field column="title" name="title" />
    <entity name="category"
            transformer="script:SplitAndPrettyCategory"
            query="select fc.bookId,
                   group_concat(cr.name) as categoryName,
                   from BookCat fc
                   where fc.bookId = '${item.id}' AND
                   group by fc.bookId">
      <field column="categoryType" name="categoryType" />
    </entity>
  </entity>
</document>
Re: help with dataimport delta query
Posted by Joel Nylund <jn...@yahoo.com>.
Got to love it when Yahoo thinks your own mail is spam. Does anyone have
any ideas how to get logging to work with 1.4?
I went to the admin panel and set all logging to finest.
In my Jetty stdout I see no SQL for any of the dataimport handler
runs. I see:
Nov 23, 2009 9:26:27 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 6
Nov 23, 2009 9:26:32 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity category with URL: jdbc:mysql://localhost/feeddb
Nov 23, 2009 9:26:32 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 5
But no SQL. From looking at the source, it looks like it should be
logging the SQL if I'm in debug mode.
Any ideas? I think I am losing my mind.
My full import works, but the delta does nothing.
thanks
Joel
Re: help with dataimport delta query
Posted by Joel Nylund <jn...@yahoo.com>.
Thanks, that was it. Well, really this part:
${dataimporter.delta.job_jobs_id}
I thought the job_jobs_id was part of the DIH, but I guess it was just the example, duh!
thanks
Joel
--- On Tue, 11/24/09, Noble Paul നോബിള് नोब्ळ् <no...@corp.aol.com> wrote:
> From: Noble Paul നോബിള് नोब्ळ् <no...@corp.aol.com>
> Subject: Re: help with dataimport delta query
> To: solr-user@lucene.apache.org
> Date: Tuesday, November 24, 2009, 12:15 AM
> I guess the field names do not match.
> In the deltaQuery you are selecting the field id,
> and in the deltaImportQuery you use the field as
> ${dataimporter.delta.job_jobs_id}.
> I guess it should be ${dataimporter.delta.id}
Re: help with dataimport delta query
Posted by Noble Paul നോബിള് नोब्ळ् <no...@corp.aol.com>.
I guess the field names do not match.
In the deltaQuery you are selecting the field id,
and in the deltaImportQuery you use the field as
${dataimporter.delta.job_jobs_id}.
I guess it should be ${dataimporter.delta.id}
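To illustrate the point about matching names, here is a minimal sketch of the "item" entity with the placeholder changed to ${dataimporter.delta.id}, so that it refers to the id column the deltaQuery selects. The WHERE keyword in deltaImportQuery and the unprefixed inMyList column in deltaQuery are assumptions filled in for illustration, since the posted config is a simplification; the nested "category" entity is left out.

```xml
<!-- Sketch only: the name after "dataimporter.delta." must match a column
     returned by deltaQuery (here: id).
     Assumptions: the WHERE keyword in deltaImportQuery and dropping the
     undefined "fm." prefix in deltaQuery are not from the original post. -->
<entity name="item"
        query="SELECT f.id, f.title
               FROM Book f
               WHERE f.inMyList=1"
        deltaQuery="SELECT id FROM `Book`
                    WHERE inMyList=1
                    AND lastModifiedDate > '${dataimporter.last_index_time}'"
        deltaImportQuery="SELECT f.id, f.title
                          FROM Book f
                          WHERE f.id='${dataimporter.delta.id}'">
  <field column="id" name="id" />
  <field column="title" name="title" />
</entity>
```

If the deltaQuery selected the key under a different column name, the placeholder in deltaImportQuery would need to use that name instead.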
--
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com