Posted to solr-user@lucene.apache.org by Florian Aumeier <fa...@mediaventures.de> on 2008/10/14 13:35:17 UTC

error with delta import

Hi,

I am having problems with delta-import. Here is the information I have.

The result from the web API; apparently everything is fine:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">db-psql-data-config.xml</str>
    </lst>
  </lst>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Time Elapsed">0:29:30.615</str>
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">16194</str>
    <str name="Total Documents Processed">0</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Delta Dump started">2008-10-14 11:23:31</str>
    <str name="Identifying Delta">2008-10-14 11:23:31</str>
    <str name="Deltas Obtained">2008-10-14 11:32:16</str>
    <str name="Building documents">2008-10-14 11:32:16</str>
    <str name="Total Changed Documents">16194</str>
  </lst>
  <str name="WARNING">
    This response format is experimental. It is likely to change in the future.
  </str>
</response>

From the log:
INFO: Starting Delta Import
Oct 14, 2008 11:23:31 AM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/dataimport params={command=delta-import} status=0 QTime=1
Oct 14, 2008 11:23:31 AM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO: Read dataimport.properties
Oct 14, 2008 11:23:31 AM org.apache.solr.handler.dataimport.DocBuilder doDelta
INFO: Starting delta collection.
Oct 14, 2008 11:23:31 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running ModifiedRowKey() for Entity: articles
Oct 14, 2008 11:23:31 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity articles with URL: jdbc:postgresql://bm02:5432/bm
Oct 14, 2008 11:23:35 AM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 3694
Oct 14, 2008 11:29:16 AM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
Oct 14, 2008 11:32:16 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed ModifiedRowKey for Entity: articles rows obtained : 16194
Oct 14, 2008 11:32:16 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Running DeletedRowKey() for Entity: articles
Oct 14, 2008 11:32:16 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: articles rows obtained : 0
Oct 14, 2008 11:32:16 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed parentDeltaQuery for Entity: articles
Oct 14, 2008 11:32:16 AM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
SEVERE: Delta Import Failed
java.lang.NullPointerException
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:136)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
    at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
    at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)


Any help or hints are appreciated.
Florian
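
[Editor's sketch] The status response above reports "idle" with status 0 even though the log shows the delta import failed. One rough way to catch this from the status output alone is to compare "Total Changed Documents" against "Total Documents Processed". The Python below is only a sketch under assumptions: the function names (summarize_status, looks_stalled) are hypothetical, the sample is a trimmed copy of the response above, and the heuristic itself is an assumption, so the server log remains the authoritative place to look.

```python
import xml.etree.ElementTree as ET


def summarize_status(xml_text):
    """Collect the <str> entries under <lst name="statusMessages"> into a dict."""
    root = ET.fromstring(xml_text)
    messages = {}
    for lst in root.iter("lst"):
        if lst.get("name") == "statusMessages":
            for item in lst.findall("str"):
                messages[item.get("name")] = (item.text or "").strip()
    return messages


def looks_stalled(messages):
    """Heuristic (an assumption): deltas were found but no documents were built."""
    changed = int(messages.get("Total Changed Documents", "0"))
    processed = int(messages.get("Total Documents Processed", "0"))
    return changed > 0 and processed == 0


# Trimmed copy of the status response quoted in the post above.
SAMPLE = """<response>
<str name="status">idle</str>
<lst name="statusMessages">
<str name="Total Rows Fetched">16194</str>
<str name="Total Documents Processed">0</str>
<str name="Total Changed Documents">16194</str>
</lst>
</response>"""

if __name__ == "__main__":
    status = summarize_status(SAMPLE)
    print("possible silent failure:", looks_stalled(status))
```

A missing "Total Documents Processed" entry is treated as zero here, which is a design choice for the sketch rather than documented handler behaviour.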



Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Lance Norskog schrieb:
> If you make a database view with the query, it is easy to examine the data you want to index. Then, your solr import query would just pull the view.  The Solr setup file is much simpler this way.
>   
I will try and let you know.


RE: error with delta import

Posted by Lance Norskog <go...@gmail.com>.
If you make a database view with the query, it is easy to examine the data you want to index. Then, your solr import query would just pull the view.  The Solr setup file is much simpler this way.

-----Original Message-----
From: Noble Paul നോബിള്‍ नोब्ळ् [mailto:noble.paul@gmail.com] 
Sent: Wednesday, October 15, 2008 2:46 AM
To: solr-user@lucene.apache.org
Subject: Re: error with delta import

The delta implementation in DIH is a bit fragile for complex queries.

I recommend you do the delta-import using a full-import.

.................
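
[Editor's sketch] The "delta via full-import" suggestion amounts to calling the handler with command=full-import, naming the entity that holds the delta query, and passing clean=false so existing documents are kept. The parameters command, entity, and clean all appear elsewhere in this thread; the host, port, and core path below are assumptions for illustration, and dataimport_url is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode


def dataimport_url(base_url, entity, clean=False):
    """Build a DataImportHandler request URL for a full-import of one entity."""
    params = {
        "command": "full-import",
        "entity": entity,
        # clean=false keeps the documents already in the index
        "clean": "true" if clean else "false",
    }
    return f"{base_url}/dataimport?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical base URL; adjust to the actual host, port, and core.
    print(dataimport_url("http://localhost:8983/solr/db", "articles-delta"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client.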


Re: where's the bottleneck

Posted by Yonik Seeley <yo...@apache.org>.
On Thu, Oct 30, 2008 at 1:02 AM, Barnett, Jeffrey
<je...@yale.edu> wrote:
> I thought it was turned off already.  ( Lucene vs Solr ?) Where do I make this change?

Comment out this part in your solrconfig.xml:

<autoCommit>
  <maxDocs>20000</maxDocs>
  <maxTime>40000</maxTime>
</autoCommit>

-Yonik
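
[Editor's sketch] To confirm whether autocommit is really active, one option is to parse solrconfig.xml and look for an autoCommit element. The Python below is a sketch only: autocommit_settings is a hypothetical helper, and the surrounding <config>/<updateHandler> layout in the samples is assumed for illustration. It relies on the standard-library parser discarding XML comments, so a commented-out block is reported as absent.

```python
import xml.etree.ElementTree as ET


def autocommit_settings(solrconfig_xml):
    """Return the active <autoCommit> settings as a dict, or None if there are none."""
    root = ET.fromstring(solrconfig_xml)
    node = root.find(".//autoCommit")
    if node is None:
        return None
    return {child.tag: (child.text or "").strip() for child in node}


# Assumed sample layout with autocommit enabled.
ENABLED = """<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>20000</maxDocs>
      <maxTime>40000</maxTime>
    </autoCommit>
  </updateHandler>
</config>"""

# The same layout with the block commented out, as suggested above.
DISABLED = """<config>
  <updateHandler class="solr.DirectUpdateHandler2">
    <!--
    <autoCommit>
      <maxDocs>20000</maxDocs>
      <maxTime>40000</maxTime>
    </autoCommit>
    -->
  </updateHandler>
</config>"""

if __name__ == "__main__":
    print(autocommit_settings(ENABLED))
    print(autocommit_settings(DISABLED))
```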

> -----Original Message-----
> From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
> Sent: Wednesday, October 29, 2008 11:28 PM
> To: solr-user@lucene.apache.org
> Subject: Re: where's the bottleneck
>
> On Wed, Oct 29, 2008 at 9:48 PM, Barnett, Jeffrey
> <je...@yale.edu> wrote:
>> Reported import rates start at 70 docs per second, and decrease as more records are added.
>
> It might just be segment merges (which take more time as segments grow in size).
> From the solrconfig.xml I see you have autocommit turned on... try
> with it off and see if it helps.
>
> -Yonik
>

RE: where's the bottleneck

Posted by "Barnett, Jeffrey" <je...@yale.edu>.
I thought it was turned off already. (Lucene vs. Solr?) Where do I make this change?

-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: Wednesday, October 29, 2008 11:28 PM
To: solr-user@lucene.apache.org
Subject: Re: where's the bottleneck

On Wed, Oct 29, 2008 at 9:48 PM, Barnett, Jeffrey
<je...@yale.edu> wrote:
> Reported import rates start at 70 docs per second, and decrease as more records are added.

It might just be segment merges (which take more time as segments grow in size).
From the solrconfig.xml I see you have autocommit turned on... try
with it off and see if it helps.

-Yonik

Re: where's the bottleneck

Posted by Yonik Seeley <yo...@apache.org>.
On Wed, Oct 29, 2008 at 9:48 PM, Barnett, Jeffrey
<je...@yale.edu> wrote:
> Reported import rates start at 70 docs per second, and decrease as more records are added.

It might just be segment merges (which take more time as segments grow in size).
From the solrconfig.xml I see you have autocommit turned on... try
with it off and see if it helps.

-Yonik

where's the bottleneck

Posted by "Barnett, Jeffrey" <je...@yale.edu>.
I saw a similar subject posted earlier.  This is not a continuation of that thread, but the problem is similar.  I have a large, fast, dedicated machine that, despite boosting various parameters in solrconfig.xml (attached) and in the JVM, utilizes at most 10% of the CPU while importing (from top):

5817 vufind    46  17    4 4721M 4691M cpu/35  18.8H  8.85% /usr/jdk/instances/jdk1.6.0/bin/sparcv9/java -Xms4096m -Xmx4096m -Xmn2g -XX:+UseParallelGC -XX:+AggressiveOpts

There is 0.0% reported iowait time, 32GB real memory, and virtually no other processes running.  The index is relatively large (8M docs, 30GB), but not extreme by the standards of others I see on this list.  Reported import rates start at 70 docs per second, and decrease as more records are added.  Why is the program not using the resources it has been given?

OS: Solaris 10
Java 1.6
Solr: 1.3

Re: error with delta import

Posted by Chris Hostetter <ho...@fucit.org>.
: The case in point is DIH. DIH uses the standard DOM parser that comes
: w/ JDK. If it reads the xml properly do we need to complain?.  I guess
: that data-config.xml may not be used for any other purposes.

That's a vague statement as well... there is no such thing as "the 
standard DOM parser that comes w/ JDK"... that's an implementation detail 
of the JRE, and different JRE providers might use different parsers in 
their DocumentBuilders, some of which might be stricter than others.

*AND* even the choice of DocumentBuilder and DocumentBuilderFactory can 
be decided at runtime -- so even if someone uses the same JRE as you, 
their servlet container might be registering its own 
DocumentBuilderFactory.

So it's not safe to assume that just because the 
javax.xml.parsers.DocumentBuilder used in one Solr deployment cleanly 
parses a malformed XML file, it will work on any other machine.

: 
: 
: On Wed, Oct 22, 2008 at 10:10 PM, Walter Underwood
: <wu...@netflix.com> wrote:
: > On 10/22/08 8:57 AM, "Steven A Rowe" <sa...@syr.edu> wrote:
: >
: >> Telling people that it's not a problem (or required!) to write non-well-formed
: >> XML, because a particular XML parser can't accept well-formed XML is kind of
: >> insidious.
: >
: > I'm with you all the way on this.
: >
: > A parser which accepts non-well-formed XML is not an XML parser, since the
: > XML spec requires reporting a fatal error.
: >
: > It is really easy to test these things. Modern browsers have good XML
: > parsers, so put your test case in a "test.xml" file and open it in a
: > browser. If it isn't well-formed, you'll get an error.
: >
: > Here is my test XML:
: >
: > <root attribute="<"/>
: >
: > Here is what Firefox 3.0.3 says about that:
: >
: > XML Parsing Error: not well-formed
: > Location: file:///Users/wunderwood/Desktop/test.xml
: > Line Number 1, Column 18:
: >
: > <root attribute="<"/>
: > -----------------^
: >
: > wunder
: >
: >
: 
: 
: 
: -- 
: --Noble Paul
: 



-Hoss


Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
The case in point is DIH. DIH uses the standard DOM parser that comes
w/ JDK. If it reads the XML properly, do we need to complain? I guess
that data-config.xml may not be used for any other purposes.


On Wed, Oct 22, 2008 at 10:10 PM, Walter Underwood
<wu...@netflix.com> wrote:
> On 10/22/08 8:57 AM, "Steven A Rowe" <sa...@syr.edu> wrote:
>
>> Telling people that it's not a problem (or required!) to write non-well-formed
>> XML, because a particular XML parser can't accept well-formed XML is kind of
>> insidious.
>
> I'm with you all the way on this.
>
> A parser which accepts non-well-formed XML is not an XML parser, since the
> XML spec requires reporting a fatal error.
>
> It is really easy to test these things. Modern browsers have good XML
> parsers, so put your test case in a "test.xml" file and open it in a
> browser. If it isn't well-formed, you'll get an error.
>
> Here is my test XML:
>
> <root attribute="<"/>
>
> Here is what Firefox 3.0.3 says about that:
>
> XML Parsing Error: not well-formed
> Location: file:///Users/wunderwood/Desktop/test.xml
> Line Number 1, Column 18:
>
> <root attribute="<"/>
> -----------------^
>
> wunder
>
>



-- 
--Noble Paul

Re: error with delta import

Posted by Walter Underwood <wu...@netflix.com>.
On 10/22/08 8:57 AM, "Steven A Rowe" <sa...@syr.edu> wrote:

> Telling people that it's not a problem (or required!) to write non-well-formed
> XML, because a particular XML parser can't accept well-formed XML is kind of
> insidious.

I'm with you all the way on this.

A parser which accepts non-well-formed XML is not an XML parser, since the
XML spec requires reporting a fatal error.

It is really easy to test these things. Modern browsers have good XML
parsers, so put your test case in a "test.xml" file and open it in a
browser. If it isn't well-formed, you'll get an error.

Here is my test XML:

<root attribute="<"/>

Here is what Firefox 3.0.3 says about that:

XML Parsing Error: not well-formed
Location: file:///Users/wunderwood/Desktop/test.xml
Line Number 1, Column 18:

<root attribute="<"/>
-----------------^

wunder
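
[Editor's sketch] The same well-formedness test can be scripted instead of opened in a browser. The Python below uses the standard-library ElementTree parser (backed by expat) as a stand-in for "a conformant parser"; is_well_formed is a hypothetical helper name, and the second test string with the escaped &lt; is an addition for contrast.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # The test case from the message above: a raw '<' inside an attribute value.
    print(is_well_formed('<root attribute="<"/>'))
    # The same attribute with the character escaped.
    print(is_well_formed('<root attribute="&lt;"/>'))
```

The first call reports a parse failure and the second succeeds, which matches the Firefox result quoted above.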


RE: error with delta import

Posted by Steven A Rowe <sa...@syr.edu>.
Hi Shalin,

I wasn't talking about the behavior of parsers in the wild, but rather about the XML specification (paraphrasing):

1. An XML document is not well-formed unless it matches the production labeled document.
2. Violations of well-formedness constraints are fatal errors.
3. Once a fatal error is detected, an XML parser MUST NOT continue normal processing.

So although there are undoubtedly parsers that will parse '<' in attribute values, in so doing, these parsers are non-conformant with the XML specification.  This is important only to the extent that people who create documents that target non-conforming features of parsers can't reliably expect these documents to be parsed by conformant parsers; XML's write-once-parse-anywhere promise thereby inexorably evaporates.

Telling people that it's not a problem (or required!) to write non-well-formed XML, because a particular XML parser can't accept well-formed XML is kind of insidious.  I for one will not stand idly by and permit this outrage to remain unchallenged!!!

:)

Steve

On 10/22/2008 at 4:01 AM, Shalin Shekhar Mangar wrote:
> Actually, most XML parsers don't require you to escape such
> characters in attributes. You are welcome to try this out,
> just look at the example-DIH :)
> 
> On Tue, Oct 21, 2008 at 11:11 PM, Steven A Rowe
> <sa...@syr.edu> wrote:
> 
> > Wow, I really should read more closely before I respond - I see now,
> > Noble, that you were talking about DIH's ability to parse escaped '<'s
> > in attribute values, rather than about whether '<' was an acceptable
> > character in attribute values.
> > 
> > I should repurpose my remarks to note to Shalin, though, that all
> > (conformant) XML parsers have to be able to handle escaped '<'s in
> > attribute values, since an XML document with a '<' in an attribute
> > value is not well-formed.
> > 
> > Steve
> > 
> > On 10/21/2008 at 1:10 PM, Steven A Rowe wrote:
> > > On 10/21/2008 at 12:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> > > > On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
> > > <sh...@gmail.com> wrote:
> > > > > Your data-config looks fine except for one thing --
> you do not need
> > to
> > > > > escape '<' character in an XML attribute. It maybe throwing off the
> > > > > parsing code in DataImportHandler.
> > > > 
> > > > not really '<' is fine in attribute
> > > 
> > > Noble, I think you're wrong - AFAICT from the XML spec., '<' is *not*
> > > fine in an attribute value - from
> > > <http://www.w3.org/TR/REC-xml/#NT-AttValue>:
> > > 
> > >   [10]  AttValue ::= '"' ([^<&"] | Reference)* '"'
> > >                  |   "'" ([^<&'] | Reference)* "'"
> > > 
> > > where an attribute <http://www.w3.org/TR/REC-xml/#dt-stag> is:
> > > 
> > >   [41] Attribute ::= Name Eq AttValue
> > > 
> > > Steve

Re: error with delta import

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Actually, most XML parsers don't require you to escape such characters in
attributes. You are welcome to try this out, just look at the example-DIH :)

On Tue, Oct 21, 2008 at 11:11 PM, Steven A Rowe <sa...@syr.edu> wrote:

> Wow, I really should read more closely before I respond - I see now, Noble,
> that you were talking about DIH's ability to parse escaped '<'s in attribute
> values, rather than about whether '<' was an acceptable character in
> attribute values.
>
> I should repurpose my remarks to note to Shalin, though, that all
> (conformant) XML parsers have to be able to handle escaped '<'s in attribute
> values, since an XML document with a '<' in an attribute value is not
> well-formed.
>
> Steve
>
> On 10/21/2008 at 1:10 PM, Steven A Rowe wrote:
> > On 10/21/2008 at 12:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> > > On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
> > <sh...@gmail.com> wrote:
> > > > Your data-config looks fine except for one thing -- you do not need
> to
> > > > escape '<' character in an XML attribute. It maybe throwing off the
> > > > parsing code in DataImportHandler.
> > >
> > > not really '<' is fine in attribute
> >
> > Noble, I think you're wrong - AFAICT from the XML spec., '<' is *not*
> > fine in an attribute value - from
> > <http://www.w3.org/TR/REC-xml/#NT-AttValue>:
> >
> >   [10]  AttValue ::= '"' ([^<&"] | Reference)* '"'
> >                  |   "'" ([^<&'] | Reference)* "'"
> >
> > where an attribute <http://www.w3.org/TR/REC-xml/#dt-stag> is:
> >
> >   [41] Attribute ::= Name Eq AttValue
> >
> > Steve
>



-- 
Regards,
Shalin Shekhar Mangar.

RE: error with delta import

Posted by Steven A Rowe <sa...@syr.edu>.
Wow, I really should read more closely before I respond - I see now, Noble, that you were talking about DIH's ability to parse escaped '<'s in attribute values, rather than about whether '<' was an acceptable character in attribute values.

I should repurpose my remarks to note to Shalin, though, that all (conformant) XML parsers have to be able to handle escaped '<'s in attribute values, since an XML document with a '<' in an attribute value is not well-formed.

Steve

On 10/21/2008 at 1:10 PM, Steven A Rowe wrote:
> On 10/21/2008 at 12:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> > On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
> <sh...@gmail.com> wrote:
> > > Your data-config looks fine except for one thing -- you do not need to
> > > escape '<' character in an XML attribute. It maybe throwing off the
> > > parsing code in DataImportHandler.
> > 
> > not really '<' is fine in attribute
> 
> Noble, I think you're wrong - AFAICT from the XML spec., '<' is *not*
> fine in an attribute value - from
> <http://www.w3.org/TR/REC-xml/#NT-AttValue>:
> 
>   [10]  AttValue ::= '"' ([^<&"] | Reference)* '"'
>                  |   "'" ([^<&'] | Reference)* "'"
> 
> where an attribute <http://www.w3.org/TR/REC-xml/#dt-stag> is:
> 
>   [41] Attribute ::= Name Eq AttValue
> 
> Steve

RE: error with delta import

Posted by Steven A Rowe <sa...@syr.edu>.
On 10/21/2008 at 12:14 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote:
> On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar <sh...@gmail.com> wrote:
> > Your data-config looks fine except for one thing -- you do not need to
> > escape '<' character in an XML attribute. It maybe throwing off the
> > parsing code in DataImportHandler.
>
> not really '<' is fine in attribute

Noble, I think you're wrong - AFAICT from the XML spec., '<' is *not* fine in an attribute value - from <http://www.w3.org/TR/REC-xml/#NT-AttValue>:

  [10]  AttValue ::= '"' ([^<&"] | Reference)* '"' 
                 |   "'" ([^<&'] | Reference)* "'"

where an attribute <http://www.w3.org/TR/REC-xml/#dt-stag> is:

  [41] Attribute ::= Name Eq AttValue

Steve
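
[Editor's sketch] Since the AttValue production excludes a literal '<', a SQL query placed in an attribute such as deltaQuery needs the character escaped as &lt;. One way to avoid doing this by hand is to let a library produce the quoted attribute value. The Python below uses xml.sax.saxutils.quoteattr from the standard library; entity_attribute is a hypothetical helper, and the query string is the one from the data-config quoted elsewhere in this thread.

```python
from xml.sax.saxutils import quoteattr


def entity_attribute(name, value):
    """Render name="value" with the value escaped and quoted for use in XML."""
    return f"{name}={quoteattr(value)}"


if __name__ == "__main__":
    query = "SELECT * FROM full_text_view WHERE article_id < 300"
    print(entity_attribute("deltaQuery", query))
```

This prints the attribute with `&lt;` in place of `<`, which is the form a conformant parser accepts.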

Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Tue, Oct 21, 2008 at 12:56 AM, Shalin Shekhar Mangar
<sh...@gmail.com> wrote:
> Your data-config looks fine except for one thing -- you do not need to
> escape '<' character in an XML attribute. It maybe throwing off the parsing
> code in DataImportHandler.
Not really; '<' is fine in an attribute.

>
> Another question, does the full-import work fine?
>
> On Mon, Oct 20, 2008 at 7:31 PM, Florian Aumeier
> <fa...@mediaventures.de>wrote:
>
>> sorry to bother you again, but the delta import still does not work for me
>> :-(
>>
>> We tried:
>> * delta-import by full-import
>>   <entity name="articles-delta rootEntity="false"
>> query="<your-delta-query-here>"> with entity=articles-delta&clean=false
>>
>> * delta-import by full-import with simplified query
>>
>> * delta-import with simplified query
>>       <entity name="articles-delta" pk="article_ref" deltaQuery="SELECT *
>> FROM full_text_view WHERE article_id &lt; 300">
>>
>> * replaced files below with files from nightly-build 15.10.08 and rerun the
>> delta and full imports as described above
>> dist/apache-solr-dataimporthandler-1.3.0.jar
>> dist/solrj-lib/slf4j-api-1.5.3.jar
>> dist/solrj-lib/slf4j-jdk14-1.5.3.jar
>>
>>
>> No matter what we do, we always end up in a situation, when the dataimport
>> status looks fine:
>>
>> <lst name="statusMessages">
>> <str name="Time Elapsed">0:0:8.442</str>
>> <str name="Total Requests made to DataSource">1</str>
>> <str name="Total Rows Fetched">218</str>
>> <str name="Total Documents Skipped">0</str>
>> <str name="Delta Dump started">2008-10-20 15:31:54</str>
>> <str name="Identifying Delta">2008-10-20 15:31:54</str>
>> <str name="Deltas Obtained">2008-10-20 15:31:57</str>
>> <str name="Building documents">2008-10-20 15:31:57</str>
>> <str name="Total Changed Documents">218</str>
>>
>> but the log reads:
>> Oct 20, 2008 3:56:44 PM org.apache.solr.core.SolrCore execute
>> INFO: [test] webapp=/solr path=/dataimport params={command=delta-import}
>> status=0 QTime=0
>> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DataImporter
>> doDeltaImport
>> INFO: Starting Delta Import
>> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.SolrWriter
>> readIndexerProperties
>> INFO: Read dataimport.properties
>> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
>> doDelta
>> INFO: Starting delta collection.
>> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Running ModifiedRowKey() for Entity: articles-full
>> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
>> call
>> INFO: Creating a connection for entity articles-full with URL:
>> jdbc:postgresql://blogmonitor02:5432/blogmonitor
>> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
>> call
>> INFO: Time taken for getConnection(): 5
>> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Completed ModifiedRowKey for Entity: articles-full rows obtained :
>> 218
>> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Running DeletedRowKey() for Entity: articles-full
>> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Completed DeletedRowKey for Entity: articles-full rows obtained : 0
>> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Completed parentDeltaQuery for Entity: articles-full
>> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DataImporter
>> doDeltaImport
>> SEVERE: Delta Import Failed
>> java.lang.NullPointerException
>>       at
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
>>       at
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
>>       at
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>>       at
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>       at
>> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>>       at
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>>       at
>> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>>       at
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>>       at
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>
>> here is the full data-config:
>>
>> <dataConfig>
>>  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
>>       url="jdbc:postgresql://bm02:5432/bm" user="bm" />
>>
>>  <document name="articles">
>>   <entity name="articles-full" pk="id" query="SELECT * FROM full_text_view
>> where article_id &lt; 200" deltaQuery="SELECT * FROM full_text_view WHERE
>> article_id &lt; 300">
>>     <field column="article_id" name="a_id" />
>>     <field column="normalized_text" name="norm_text" />
>>     <field column="article_ref" name="id" />
>>     <field column="article_stub" name="stub" />
>>     <field column="id_blogs" name="blog_id" />
>>     <field column="article_title" name="a_title" />
>>     <field column="article_url" name="article_url" />
>>     <field column="ts" name="ts" />
>>     <field column="rank" name="rank" />
>>       <field column="blog_ref" name="blog_ref" />
>>       <field column="blog_title" name="b_title" />
>>       <field column="blog_subtitle" name="subtitle" />
>>         <field column="blog_url" name="blog_url" />
>>     </entity>
>>
>>   </document>
>>
>> </dataConfig>
>>
>> what are we doing wrong?
>> Florian
>>
>>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>



-- 
--Noble Paul

Re: error with delta import

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
Your data-config looks fine except for one thing -- you do not need to
escape the '<' character in an XML attribute. It may be throwing off the parsing
code in DataImportHandler.

Another question, does the full-import work fine?

On Mon, Oct 20, 2008 at 7:31 PM, Florian Aumeier
<fa...@mediaventures.de> wrote:

> sorry to bother you again, but the delta import still does not work for me
> :-(
>
> We tried:
> * delta-import by full-import
>   <entity name="articles-delta rootEntity="false"
> query="<your-delta-query-here>"> with entity=articles-delta&clean=false
>
> * delta-import by full-import with simplified query
>
> * delta-import with simplified query
>       <entity name="articles-delta" pk="article_ref" deltaQuery="SELECT *
> FROM full_text_view WHERE article_id &lt; 300">
>
> * replaced files below with files from nightly-build 15.10.08 and rerun the
> delta and full imports as described above
> dist/apache-solr-dataimporthandler-1.3.0.jar
> dist/solrj-lib/slf4j-api-1.5.3.jar
> dist/solrj-lib/slf4j-jdk14-1.5.3.jar
>
>
> No matter what we do, we always end up in a situation, when the dataimport
> status looks fine:
>
> <lst name="statusMessages">
> <str name="Time Elapsed">0:0:8.442</str>
> <str name="Total Requests made to DataSource">1</str>
> <str name="Total Rows Fetched">218</str>
> <str name="Total Documents Skipped">0</str>
> <str name="Delta Dump started">2008-10-20 15:31:54</str>
> <str name="Identifying Delta">2008-10-20 15:31:54</str>
> <str name="Deltas Obtained">2008-10-20 15:31:57</str>
> <str name="Building documents">2008-10-20 15:31:57</str>
> <str name="Total Changed Documents">218</str>
>
> but the log reads:
> Oct 20, 2008 3:56:44 PM org.apache.solr.core.SolrCore execute
> INFO: [test] webapp=/solr path=/dataimport params={command=delta-import}
> status=0 QTime=0
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DataImporter
> doDeltaImport
> INFO: Starting Delta Import
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.SolrWriter
> readIndexerProperties
> INFO: Read dataimport.properties
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
> doDelta
> INFO: Starting delta collection.
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Running ModifiedRowKey() for Entity: articles-full
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity articles-full with URL:
> jdbc:postgresql://blogmonitor02:5432/blogmonitor
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Time taken for getConnection(): 5
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed ModifiedRowKey for Entity: articles-full rows obtained :
> 218
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Running DeletedRowKey() for Entity: articles-full
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed DeletedRowKey for Entity: articles-full rows obtained : 0
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed parentDeltaQuery for Entity: articles-full
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DataImporter
> doDeltaImport
> SEVERE: Delta Import Failed
> java.lang.NullPointerException
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>       at
> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>       at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>       at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>
> here is the full data-config:
>
> <dataConfig>
>  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
>       url="jdbc:postgresql://bm02:5432/bm" user="bm" />
>
>  <document name="articles">
>   <entity name="articles-full" pk="id" query="SELECT * FROM full_text_view
> where article_id &lt; 200" deltaQuery="SELECT * FROM full_text_view WHERE
> article_id &lt; 300">
>     <field column="article_id" name="a_id" />
>     <field column="normalized_text" name="norm_text" />
>     <field column="article_ref" name="id" />
>     <field column="article_stub" name="stub" />
>     <field column="id_blogs" name="blog_id" />
>     <field column="article_title" name="a_title" />
>     <field column="article_url" name="article_url" />
>     <field column="ts" name="ts" />
>     <field column="rank" name="rank" />
>       <field column="blog_ref" name="blog_ref" />
>       <field column="blog_title" name="b_title" />
>       <field column="blog_subtitle" name="subtitle" />
>         <field column="blog_url" name="blog_url" />
>     </entity>
>
>   </document>
>
> </dataConfig>
>
> what are we doing wrong?
> Florian
>
>


-- 
Regards,
Shalin Shekhar Mangar.

Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Hello everybody,

Thank you all for your help and ideas; it works now.
>> what are we doing wrong?
>> Florian
>>

Actually, I am not sure what we did wrong. After we started again
from scratch with the simplified query, it all worked as expected.

Regards
Florian

Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
You are still doing a delta import. With the modified data-config you
must run command=full-import.


On Mon, Oct 20, 2008 at 7:31 PM, Florian Aumeier
<fa...@mediaventures.de> wrote:
> sorry to bother you again, but the delta import still does not work for me
> :-(
>
> We tried:
> * delta-import by full-import
>   <entity name="articles-delta rootEntity="false"
> query="<your-delta-query-here>"> with entity=articles-delta&clean=false
>
> * delta-import by full-import with simplified query
>
> * delta-import with simplified query
>       <entity name="articles-delta" pk="article_ref" deltaQuery="SELECT *
> FROM full_text_view WHERE article_id &lt; 300">
>
> * replaced files below with files from nightly-build 15.10.08 and rerun the
> delta and full imports as described above
> dist/apache-solr-dataimporthandler-1.3.0.jar
> dist/solrj-lib/slf4j-api-1.5.3.jar
> dist/solrj-lib/slf4j-jdk14-1.5.3.jar
>
>
> No matter what we do, we always end up in a situation, when the dataimport
> status looks fine:
>
> <lst name="statusMessages">
> <str name="Time Elapsed">0:0:8.442</str>
> <str name="Total Requests made to DataSource">1</str>
> <str name="Total Rows Fetched">218</str>
> <str name="Total Documents Skipped">0</str>
> <str name="Delta Dump started">2008-10-20 15:31:54</str>
> <str name="Identifying Delta">2008-10-20 15:31:54</str>
> <str name="Deltas Obtained">2008-10-20 15:31:57</str>
> <str name="Building documents">2008-10-20 15:31:57</str>
> <str name="Total Changed Documents">218</str>
>
> but the log reads:
> Oct 20, 2008 3:56:44 PM org.apache.solr.core.SolrCore execute
> INFO: [test] webapp=/solr path=/dataimport params={command=delta-import}
> status=0 QTime=0
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DataImporter
> doDeltaImport
> INFO: Starting Delta Import
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.SolrWriter
> readIndexerProperties
> INFO: Read dataimport.properties
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
> doDelta
> INFO: Starting delta collection.
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Running ModifiedRowKey() for Entity: articles-full
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity articles-full with URL:
> jdbc:postgresql://blogmonitor02:5432/blogmonitor
> Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Time taken for getConnection(): 5
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed ModifiedRowKey for Entity: articles-full rows obtained : 218
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Running DeletedRowKey() for Entity: articles-full
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed DeletedRowKey for Entity: articles-full rows obtained : 0
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed parentDeltaQuery for Entity: articles-full
> Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DataImporter
> doDeltaImport
> SEVERE: Delta Import Failed
> java.lang.NullPointerException
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>       at
> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>       at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>       at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>
> here is the full data-config:
>
> <dataConfig>
>  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
>       url="jdbc:postgresql://bm02:5432/bm" user="bm" />
>
>  <document name="articles">
>   <entity name="articles-full" pk="id" query="SELECT * FROM full_text_view
> where article_id &lt; 200" deltaQuery="SELECT * FROM full_text_view WHERE
> article_id &lt; 300">
>     <field column="article_id" name="a_id" />
>     <field column="normalized_text" name="norm_text" />
>     <field column="article_ref" name="id" />
>     <field column="article_stub" name="stub" />
>     <field column="id_blogs" name="blog_id" />
>     <field column="article_title" name="a_title" />
>     <field column="article_url" name="article_url" />
>     <field column="ts" name="ts" />
>     <field column="rank" name="rank" />
>       <field column="blog_ref" name="blog_ref" />
>       <field column="blog_title" name="b_title" />
>       <field column="blog_subtitle" name="subtitle" />
>         <field column="blog_url" name="blog_url" />
>     </entity>
>
>   </document>
>
> </dataConfig>
>
> what are we doing wrong?
> Florian
>
>



-- 
--Noble Paul

Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
sorry to bother you again, but the delta import still does not work for 
me :-(

We tried:
* delta-import by full-import
    <entity name="articles-delta" rootEntity="false" 
query="<your-delta-query-here>"> with entity=articles-delta&clean=false

* delta-import by full-import with simplified query

* delta-import with simplified query
        <entity name="articles-delta" pk="article_ref" 
deltaQuery="SELECT * FROM full_text_view WHERE article_id &lt; 300">

* replaced files below with files from nightly-build 15.10.08 and rerun 
the delta and full imports as described above
dist/apache-solr-dataimporthandler-1.3.0.jar
dist/solrj-lib/slf4j-api-1.5.3.jar
dist/solrj-lib/slf4j-jdk14-1.5.3.jar


No matter what we do, we always end up in a situation where the 
dataimport status looks fine:

<lst name="statusMessages">
<str name="Time Elapsed">0:0:8.442</str>
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">218</str>
<str name="Total Documents Skipped">0</str>
<str name="Delta Dump started">2008-10-20 15:31:54</str>
<str name="Identifying Delta">2008-10-20 15:31:54</str>
<str name="Deltas Obtained">2008-10-20 15:31:57</str>
<str name="Building documents">2008-10-20 15:31:57</str>
<str name="Total Changed Documents">218</str>

but the log reads:
Oct 20, 2008 3:56:44 PM org.apache.solr.core.SolrCore execute
INFO: [test] webapp=/solr path=/dataimport params={command=delta-import} 
status=0 QTime=0
Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DataImporter 
doDeltaImport
INFO: Starting Delta Import
Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.SolrWriter 
readIndexerProperties
INFO: Read dataimport.properties
Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder 
doDelta
INFO: Starting delta collection.
Oct 20, 2008 3:56:44 PM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Running ModifiedRowKey() for Entity: articles-full
Oct 20, 2008 3:56:44 PM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity articles-full with URL: 
jdbc:postgresql://blogmonitor02:5432/blogmonitor
Oct 20, 2008 3:56:44 PM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 5
Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Completed ModifiedRowKey for Entity: articles-full rows obtained : 218
Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Running DeletedRowKey() for Entity: articles-full
Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Completed DeletedRowKey for Entity: articles-full rows obtained : 0
Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Completed parentDeltaQuery for Entity: articles-full
Oct 20, 2008 3:56:46 PM org.apache.solr.handler.dataimport.DataImporter 
doDeltaImport
SEVERE: Delta Import Failed
java.lang.NullPointerException
        at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
        at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
        at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
        at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
        at 
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
        at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
        at 
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
        at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
        at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)

here is the full data-config:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
        url="jdbc:postgresql://bm02:5432/bm" user="bm" />

  <document name="articles">
    <entity name="articles-full" pk="id" query="SELECT * FROM 
full_text_view where article_id &lt; 200" deltaQuery="SELECT * FROM 
full_text_view WHERE article_id &lt; 300">
      <field column="article_id" name="a_id" />
      <field column="normalized_text" name="norm_text" />
      <field column="article_ref" name="id" />
      <field column="article_stub" name="stub" />
      <field column="id_blogs" name="blog_id" />
      <field column="article_title" name="a_title" />
      <field column="article_url" name="article_url" />
      <field column="ts" name="ts" />
      <field column="rank" name="rank" />
        <field column="blog_ref" name="blog_ref" />
        <field column="blog_title" name="b_title" />
        <field column="blog_subtitle" name="subtitle" />
          <field column="blog_url" name="blog_url" />
      </entity>

    </document>

</dataConfig>

what are we doing wrong?
Florian
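
The mismatch described above (a status that looks fine while nothing gets indexed) can also be spotted by a script. The following is only a minimal sketch in Python: the function name import_looks_stalled and the sample XML are made up for illustration, and it assumes the status response contains the "Total Changed Documents" and "Total Documents Processed" entries seen earlier in this thread.

```python
import xml.etree.ElementTree as ET


def import_looks_stalled(status_xml):
    """Return True if deltas were found but no documents were processed.

    Hypothetical helper: it only inspects the <str name="..."> entries
    of a DIH status fragment and is not part of Solr itself.
    """
    root = ET.fromstring(status_xml)
    values = {s.get("name"): (s.text or "") for s in root.iter("str")}
    changed = int(values.get("Total Changed Documents") or 0)
    # The entry may be missing entirely, as in the status quoted above.
    processed = int(values.get("Total Documents Processed") or 0)
    return changed > 0 and processed == 0


# Illustrative sample modelled on the status output in this thread.
sample = """<lst name="statusMessages">
<str name="Total Rows Fetched">218</str>
<str name="Total Documents Skipped">0</str>
<str name="Total Changed Documents">218</str>
</lst>"""

print(import_looks_stalled(sample))
```

A True result only says that the server log is worth reading; the actual exception still has to be looked up there.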


Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Noble Paul നോബിള്‍ नोब्ळ् schrieb:
> the last-index_time is available only from second time onwards that is
> . It expects a full-import to be done first
> It knows that by the presence of dataimport.properties in the  config
> directory. Did you check if it is present?
>
>   
Yes, I checked and the file is still present. Is it the same file as 
used by the delta-import?

Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
The last_index_time is available only from the second import onwards;
a full-import must be done first.
DIH knows that by the presence of dataimport.properties in the config
directory. Did you check whether it is present?
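
A quick way to check more than the mere presence of the file is to read the stored timestamp. The sketch below is Python and makes a few assumptions: the key is called last_index_time, the file uses the standard Java properties format (which escapes ':' as '\:' in values), and the sample content is invented for illustration. The helper name read_last_index_time is hypothetical.

```python
def read_last_index_time(properties_text):
    """Return the last_index_time value from Java-properties text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "last_index_time":
            # java.util.Properties writes ':' as '\:' when storing values.
            return value.strip().replace("\\:", ":")
    return None


# Invented sample content, not copied from a real installation.
sample = "#Thu Oct 16 13:16:11 CEST 2008\nlast_index_time=2008-10-16 13\\:16\\:11\n"

print(read_last_index_time(sample))
print(read_last_index_time(""))
```

If the function returns None for the real file, the timestamp variable would presumably be empty in the query as well.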


On Thu, Oct 16, 2008 at 5:33 PM, Florian Aumeier
<fa...@mediaventures.de> wrote:
> Noble Paul നോബിള്‍ नोब्ळ् schrieb:
>>>
>>> Well, when doing the way you described below (full-import with the delta
>>> query), the '${dataimporter.last_index_time}' timestamp is empty:
>>>
>>
>> I guess this was fixed post 1.3 . probably you can take
>> dataimporthandler.jar from a nightly build (you may also need to add
>> slf4j.jar)
>>
>>>
> I replaced
> dist/apache-solr-dataimporthandler-1.3.0.jar
> dist/solrj-lib/slf4j-api-1.5.3.jar
> dist/solrj-lib/slf4j-jdk14-1.5.3.jar
>
> with their counterparts from the nightly build, but it did not help. Then I
> tried to enter the date kind of hard coded (now() - '12 hours'::interval).
> Everything looks fine, but there are no new documents in the index.
>
> here is the log:
>
> INFO: Starting Full Import
> Oct 16, 2008 1:07:08 PM org.apache.solr.core.SolrCore executeINFO: [test]
> webapp=/solr path=/dataimport
> params={command=full-import&clean=false&entity=articles-delta} status=0
> QTime=0
> Oct 16, 2008 1:07:08 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity articles-delta with URL:
> jdbc:postgresql://bm02:5432/bm
> Oct 16, 2008 1:07:08 PM org.apache.solr.handler.dataimport.JdbcDataSource$1
> callINFO: Time taken for getConnection(): 45
> Oct 16, 2008 1:14:53 PM org.apache.solr.core.SolrCore execute
> INFO: [test] webapp=/solr path=/dataimport params={} status=0 QTime=1
> Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.SolrWriter
> readIndexerPropertiesINFO: Read dataimport.properties
> Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.SolrWriter
> persistStartTime
> INFO: Wrote last indexed time to dataimport.properties
> Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.DocBuilder
> commitINFO: Full Import completed successfullyOct 16, 2008 1:16:11 PM
> org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)Oct 16,
> 2008 1:16:11 PM org.apache.solr.search.SolrIndexSearcher <init>INFO: Opening
> Searcher@3cd0d12e mainOct 16, 2008 1:16:11 PM
> org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> ... (autowarming)
> Oct 16, 2008 1:16:12 PM org.apache.solr.handler.dataimport.DocBuilder
> execute
> INFO: Time taken = 0:9:3.231
>
>



-- 
--Noble Paul

Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Noble Paul നോബിള്‍ नोब्ळ् schrieb:
>> Well, when doing the way you described below (full-import with the delta
>> query), the '${dataimporter.last_index_time}' timestamp is empty:
>>     
> I guess this was fixed post 1.3 . probably you can take
> dataimporthandler.jar from a nightly build (you may also need to add
> slf4j.jar)
>   
>>
I replaced
dist/apache-solr-dataimporthandler-1.3.0.jar
dist/solrj-lib/slf4j-api-1.5.3.jar
dist/solrj-lib/slf4j-jdk14-1.5.3.jar

with their counterparts from the nightly build, but it did not help. 
Then I tried hard-coding the date (now() - '12 hours'::interval).
Everything looks fine, but there are no new documents in the index.

here is the log:

INFO: Starting Full Import
Oct 16, 2008 1:07:08 PM org.apache.solr.core.SolrCore execute
INFO: [test] webapp=/solr path=/dataimport 
params={command=full-import&clean=false&entity=articles-delta} status=0 
QTime=0
Oct 16, 2008 1:07:08 PM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity articles-delta with URL: 
jdbc:postgresql://bm02:5432/bm
Oct 16, 2008 1:07:08 PM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 45
Oct 16, 2008 1:14:53 PM org.apache.solr.core.SolrCore execute
INFO: [test] webapp=/solr path=/dataimport params={} status=0 QTime=1
Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.SolrWriter 
readIndexerProperties
INFO: Read dataimport.properties
Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.SolrWriter 
persistStartTime
INFO: Wrote last indexed time to dataimport.properties
Oct 16, 2008 1:16:11 PM org.apache.solr.handler.dataimport.DocBuilder 
commit
INFO: Full Import completed successfully
Oct 16, 2008 1:16:11 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Oct 16, 2008 1:16:11 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@3cd0d12e main
Oct 16, 2008 1:16:11 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
... (autowarming)
Oct 16, 2008 1:16:12 PM org.apache.solr.handler.dataimport.DocBuilder 
execute
INFO: Time taken = 0:9:3.231


Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
On Thu, Oct 16, 2008 at 2:08 PM, Florian Aumeier
<fa...@mediaventures.de> wrote:
> Noble Paul നോബിള്‍ नोब्ळ् schrieb:
>>
>> The delta implementation is a bit fragile in DIH for complex queries
>>
>>
>
> that's too bad. It's a nice interface and less complex to configure than to
> go the XML /update way.
>
>
> Well, when doing the way you described below (full-import with the delta
> query), the '${dataimporter.last_index_time}' timestamp is empty:
I guess this was fixed after 1.3. You can probably take
dataimporthandler.jar from a nightly build (you may also need to add
slf4j.jar).
>
> Oct 16, 2008 10:14:53 AM org.apache.solr.handler.dataimport.DataImporter
> doFullImport
> SEVERE: Full Import failed
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> execute query: SELECT a.id AS article_id,a.stub AS article_stub,a.ref AS
> article_ref,a.id_blogs,a.title AS article_title, a.normalized_text, au.url
> AS article_url, bu.url AS blog_url, b.title AS blog_title,b.subtitle AS
> blog_subtitle, r.rank, coalesce(a.updated,a.published,a.added) as ts, a.stub
> as article_stub FROM articles a join blogs b on a.id_blogs = b.id join urls
> au on a.id_urls = au.id join urls bu on b.id_urls = bu.id LEFT OUTER JOIN
> ranks r on a.id = r.id_articles WHERE b.id_urls is not null AND a.hidden is
> false AND b.hidden is false AND a.ref is not null AND b.ref is not null and
> (rankid in (SELECT rankid FROM ranks order by rankid desc limit 1) OR rankid
> is null) AND coalesce(a.updated,a.published,a.added) > '' Processing
> Document # 1
>
> Regards
> Florian
>
>
>> I recommend you do delta-import using a full-import
>>
>> it can be done as follows
>> define a diffferent entity
>>
>> <dataConfig>
>> <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
>> url="jdbc:postgresql://bm02:5432/bm" user="user" />
>>
>> <document name="articles">
>>  <entity name="articles-full" ..>
>>  </entity>
>>
>>  <entity name="articles-delta rootEntity="false"
>> query="<your-delta-query-here>">
>>      <!-- this following entity can be a copy articles-full entity
>> without any delta query because rootEntity=false for
>>       articles-delta the following will be used for creating
>> documents. all other rules are same-->
>>       <entity name="anyname" ..>
>>       </entity>
>>  </entity>
>> </document>
>>
>> when you wish to do a full-import pass the request parameter
>> entity=articles-full
>>
>> for delta-import use the request parameter
>> entity=articles-delta&clean=false (command has to be full-import only)
>>
>>
>>
>> On Wed, Oct 15, 2008 at 1:42 PM, Florian Aumeier
>> <fa...@mediaventures.de> wrote:
>>
>>>
>>> Shalin Shekhar Mangar schrieb:
>>>
>>>>
>>>> You are missing the "pk" field (primary key). This is used for delta
>>>> imports.
>>>>
>>>>
>>>
>>> I added the pk field and rebuild the index yesterday. However, when I run
>>> the delta-import, I still have this error message in the log:
>>>
>>> INFO: Starting delta collection.
>>> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.DocBuilder
>>> collectDelta
>>> INFO: Running ModifiedRowKey() for Entity: articles
>>> Oct 15, 2008 9:37:27 AM
>>> org.apache.solr.handler.dataimport.JdbcDataSource$1
>>> call
>>> INFO: Creating a connection for entity articles with URL:
>>> jdbc:postgresql://bm02:5432/bm
>>> Oct 15, 2008 9:37:27 AM
>>> org.apache.solr.handler.dataimport.JdbcDataSource$1
>>> call
>>> INFO: Time taken for getConnection(): 43
>>> Oct 15, 2008 9:37:36 AM org.apache.solr.core.SolrCore execute
>>> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>>> Oct 15, 2008 9:44:51 AM org.apache.solr.core.SolrCore execute
>>> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>>> collectDelta
>>> INFO: Completed ModifiedRowKey for Entity: articles rows obtained : 4584
>>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>>> collectDelta
>>> INFO: Running DeletedRowKey() for Entity: articles
>>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>>> collectDelta
>>> INFO: Completed DeletedRowKey for Entity: articles rows obtained : 0
>>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>>> collectDelta
>>> INFO: Completed parentDeltaQuery for Entity: articles
>>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DataImporter
>>> doDeltaImport
>>> SEVERE: Delta Import Failed
>>> java.lang.NullPointerException
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>>>      at
>>>
>>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>> Oct 15, 2008 9:50:58 AM org.apache.solr.core.SolrCore execute
>>> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>>>
>>> Regards
>>> Florian
>>>
>>>
>>
>>
>>
>>
>
>
> --
> Media Ventures GmbH Entwicklung Blogmonitor.de
>
> Jabber-ID faumeier@mabber.de
> Telefon +49 (0) 2236 480 10 22
>
>



-- 
--Noble Paul

Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Noble Paul നോബിള്‍ नोब्ळ् schrieb:
> The delta implementation is a bit fragile in DIH for complex queries
>
>   
that's too bad. It's a nice interface and less complex to configure than 
to go the XML /update way.


Well, when I do it the way you described below (full-import with the 
delta query), the '${dataimporter.last_index_time}' timestamp is empty:

Oct 16, 2008 10:14:53 AM org.apache.solr.handler.dataimport.DataImporter 
doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to 
execute query: SELECT a.id AS article_id,a.stub AS article_stub,a.ref AS 
article_ref,a.id_blogs,a.title AS article_title, a.normalized_text, 
au.url AS article_url, bu.url AS blog_url, b.title AS 
blog_title,b.subtitle AS blog_subtitle, r.rank, 
coalesce(a.updated,a.published,a.added) as ts, a.stub as article_stub 
FROM articles a join blogs b on a.id_blogs = b.id join urls au on 
a.id_urls = au.id join urls bu on b.id_urls = bu.id LEFT OUTER JOIN 
ranks r on a.id = r.id_articles WHERE b.id_urls is not null AND a.hidden 
is false AND b.hidden is false AND a.ref is not null AND b.ref is not 
null and (rankid in (SELECT rankid FROM ranks order by rankid desc limit 
1) OR rankid is null) AND coalesce(a.updated,a.published,a.added) > '' 
Processing Document # 1

Regards
Florian
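
The failing query above ends in "> ''", i.e. the timestamp variable was substituted with an empty string. For anyone building the delta query in a script of their own, a guard along the following lines avoids that. This is only a sketch in Python: the function names and the epoch fallback are assumptions, and the fallback would re-select every row, so it is a safety net and not a real delta.

```python
def delta_cutoff(last_index_time, fallback="1970-01-01 00:00:00"):
    """Return a usable timestamp literal, falling back if the input is empty."""
    value = (last_index_time or "").strip()
    return value if value else fallback


def delta_where(last_index_time):
    """Build the timestamp condition used at the end of the query above.

    The value is expected to come from dataimport.properties, not from
    user input, so plain string formatting is used here.
    """
    return ("coalesce(a.updated,a.published,a.added) > '%s'"
            % delta_cutoff(last_index_time))


print(delta_where(""))
print(delta_where("2008-10-16 10:00:00"))
```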


> I recommend you do delta-import using a full-import
>
> it can be done as follows
> define a diffferent entity
>
> <dataConfig>
> <dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
> url="jdbc:postgresql://bm02:5432/bm" user="user" />
>
> <document name="articles">
>   <entity name="articles-full" ..>
>   </entity>
>
>   <entity name="articles-delta rootEntity="false"
> query="<your-delta-query-here>">
>       <!-- this following entity can be a copy articles-full entity
> without any delta query because rootEntity=false for
>        articles-delta the following will be used for creating
> documents. all other rules are same-->
>        <entity name="anyname" ..>
>        </entity>
>  </entity>
> </document>
>
> when you wish to do a full-import pass the request parameter
> entity=articles-full
>
> for delta-import use the request parameter
> entity=articles-delta&clean=false (command has to be full-import only)
>
>
>
> On Wed, Oct 15, 2008 at 1:42 PM, Florian Aumeier
> <fa...@mediaventures.de> wrote:
>   
>> Shalin Shekhar Mangar schrieb:
>>     
>>> You are missing the "pk" field (primary key). This is used for delta
>>> imports.
>>>
>>>       
>> I added the pk field and rebuild the index yesterday. However, when I run
>> the delta-import, I still have this error message in the log:
>>
>> INFO: Starting delta collection.
>> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Running ModifiedRowKey() for Entity: articles
>> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.JdbcDataSource$1
>> call
>> INFO: Creating a connection for entity articles with URL:
>> jdbc:postgresql://bm02:5432/bm
>> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.JdbcDataSource$1
>> call
>> INFO: Time taken for getConnection(): 43
>> Oct 15, 2008 9:37:36 AM org.apache.solr.core.SolrCore execute
>> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>> Oct 15, 2008 9:44:51 AM org.apache.solr.core.SolrCore execute
>> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Completed ModifiedRowKey for Entity: articles rows obtained : 4584
>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Running DeletedRowKey() for Entity: articles
>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Completed DeletedRowKey for Entity: articles rows obtained : 0
>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
>> collectDelta
>> INFO: Completed parentDeltaQuery for Entity: articles
>> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DataImporter
>> doDeltaImport
>> SEVERE: Delta Import Failed
>> java.lang.NullPointerException
>>       at
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
>>       at
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
>>       at
>> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>>       at
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>       at
>> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>>       at
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>>       at
>> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>>       at
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>>       at
>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>> Oct 15, 2008 9:50:58 AM org.apache.solr.core.SolrCore execute
>> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>>
>> Regards
>> Florian
>>
>>     
>
>
>
>   


-- 
Media Ventures GmbH 
Entwicklung Blogmonitor.de

Jabber-ID faumeier@mabber.de
Telefon +49 (0) 2236 480 10 22


Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
The delta implementation in DIH is a bit fragile for complex queries.

I recommend you do the delta-import using a full-import.

It can be done as follows:
define a different entity

<dataConfig>
<dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
url="jdbc:postgresql://bm02:5432/bm" user="user" />

<document name="articles">
  <entity name="articles-full" ..>
  </entity>

  <entity name="articles-delta" rootEntity="false"
query="<your-delta-query-here>">
      <!-- The following entity can be a copy of the articles-full entity
without any delta query. Because rootEntity="false" is set on
       articles-delta, the following entity is used for creating
documents. All other rules are the same. -->
       <entity name="anyname" ..>
       </entity>
 </entity>
</document>
</dataConfig>

When you wish to do a full-import, pass the request parameter
entity=articles-full

For a delta-import, use the request parameters
entity=articles-delta&clean=false (the command still has to be full-import).
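
As a rough illustration of the two requests described above, here is a small Python sketch that builds the URLs. The base URL (host, port and handler path) is an assumption and has to be adapted to the actual installation; dih_url is a hypothetical helper name. Only the parameters command, entity and clean come from this thread.

```python
from urllib.parse import urlencode

# Assumed location of the DataImportHandler; adjust host, port and core.
BASE_URL = "http://localhost:8983/solr/dataimport"


def dih_url(entity, clean):
    """Build a full-import request restricted to one top-level entity."""
    params = {
        "command": "full-import",  # full-import even for the delta entity
        "entity": entity,
        "clean": "true" if clean else "false",
    }
    return BASE_URL + "?" + urlencode(params)


full_url = dih_url("articles-full", clean=True)
delta_url = dih_url("articles-delta", clean=False)

print(full_url)
print(delta_url)
```

With clean=false the existing index is kept, so the rows returned by the articles-delta query are added to or replace what is already indexed.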



On Wed, Oct 15, 2008 at 1:42 PM, Florian Aumeier
<fa...@mediaventures.de> wrote:
> Shalin Shekhar Mangar schrieb:
>>
>> You are missing the "pk" field (primary key). This is used for delta
>> imports.
>>
>
> I added the pk field and rebuild the index yesterday. However, when I run
> the delta-import, I still have this error message in the log:
>
> INFO: Starting delta collection.
> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Running ModifiedRowKey() for Entity: articles
> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Creating a connection for entity articles with URL:
> jdbc:postgresql://bm02:5432/bm
> Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.JdbcDataSource$1
> call
> INFO: Time taken for getConnection(): 43
> Oct 15, 2008 9:37:36 AM org.apache.solr.core.SolrCore execute
> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
> Oct 15, 2008 9:44:51 AM org.apache.solr.core.SolrCore execute
> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed ModifiedRowKey for Entity: articles rows obtained : 4584
> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Running DeletedRowKey() for Entity: articles
> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed DeletedRowKey for Entity: articles rows obtained : 0
> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder
> collectDelta
> INFO: Completed parentDeltaQuery for Entity: articles
> Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DataImporter
> doDeltaImport
> SEVERE: Delta Import Failed
> java.lang.NullPointerException
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
>       at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
>       at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
>       at
> org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
>       at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
>       at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
> Oct 15, 2008 9:50:58 AM org.apache.solr.core.SolrCore execute
> INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
>
> Regards
> Florian
>



-- 
--Noble Paul

Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Shalin Shekhar Mangar schrieb:
> You are missing the "pk" field (primary key). This is used for delta
> imports.
>   
I added the pk field and rebuilt the index yesterday. However, when I 
run the delta-import, I still have this error message in the log:

INFO: Starting delta collection.
Oct 15, 2008 9:37:27 AM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Running ModifiedRowKey() for Entity: articles
Oct 15, 2008 9:37:27 AM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity articles with URL: 
jdbc:postgresql://bm02:5432/bm
Oct 15, 2008 9:37:27 AM 
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 43
Oct 15, 2008 9:37:36 AM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
Oct 15, 2008 9:44:51 AM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0
Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Completed ModifiedRowKey for Entity: articles rows obtained : 4584
Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Running DeletedRowKey() for Entity: articles
Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Completed DeletedRowKey for Entity: articles rows obtained : 0
Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DocBuilder 
collectDelta
INFO: Completed parentDeltaQuery for Entity: articles
Oct 15, 2008 9:50:43 AM org.apache.solr.handler.dataimport.DataImporter 
doDeltaImport
SEVERE: Delta Import Failed
java.lang.NullPointerException
        at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.getDeltaImportQuery(SqlEntityProcessor.java:153)
        at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.getQuery(SqlEntityProcessor.java:125)
        at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
        at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
        at 
org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:211)
        at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:133)
        at 
org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:359)
        at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:388)
        at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
Oct 15, 2008 9:50:58 AM org.apache.solr.core.SolrCore execute
INFO: [db] webapp=/solr path=/dataimport params={} status=0 QTime=0

Regards
Florian

Re: error with delta import

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.
You are missing the "pk" field (primary key). This is used for delta
imports.
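
A minimal sketch of where the pk attribute goes, for illustration only. It assumes a simplified single-table version of the articles entity (the joins and most columns from the posted config are left out), and it assumes the deltaQuery only needs to return the primary key of the changed rows:

```xml
<!-- Sketch only: a shortened, single-table stand-in for the posted entity.
     The column list and the WHERE clause are simplifications, not the
     poster's real queries. -->
<entity name="articles" pk="id"
        query="SELECT id, ref, title FROM articles"
        deltaQuery="SELECT id FROM articles
                    WHERE coalesce(updated,published,added) &gt;
                    '${dataimporter.last_index_time}'">
  <field column="id" name="a_id" />
  <field column="ref" name="id" />
  <field column="title" name="a_title" />
</entity>
```

The value of pk should name a column that both query and deltaQuery return, so the changed rows found by deltaQuery can be matched back to rows of the full query.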



-- 
Regards,
Shalin Shekhar Mangar.

Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
The query makes my head spin.
Joining at the SQL level does not let you populate multivalued fields.
Otherwise, it is all fine.

The pk attribute is missing in the entity.
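
To illustrate the remark about multivalued fields: a nested entity is one way to collect several values per document. This is only a sketch. The table article_tags and its columns id_articles and tag are hypothetical names that do not appear in the posted config, and the Solr field tag is assumed to be declared multiValued in schema.xml:

```xml
<!-- Sketch only: article_tags, id_articles and tag are hypothetical names. -->
<entity name="articles" pk="id" query="SELECT id, ref, title FROM articles">
  <field column="ref" name="id" />
  <field column="title" name="a_title" />
  <!-- The child query runs once per parent row; every row it returns
       adds one value to the "tag" field of that document. -->
  <entity name="tags"
          query="SELECT tag FROM article_tags
                 WHERE id_articles = '${articles.id}'">
    <field column="tag" name="tag" />
  </entity>
</entity>
```

With a single joined query, each result row becomes its own document, which is why the join cannot fill a multivalued field on its own.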




-- 
--Noble Paul

Re: error with delta import

Posted by Florian Aumeier <fa...@mediaventures.de>.
Noble Paul നോബിള്‍ नोब्ळ् wrote:
> apparently you have not specified the deltaQuery attribute in the entity.
>  Check the delta-import section in the wiki
> http://wiki.apache.org/solr/DataImportHandler
> or you can share your data-config file and we can take a quick look
>
>   
Here is my data-config. I configured both the deltaQuery and the query 
for the entity in one data-config. Is this the correct use case?
Also, I found it easier to join the document at the database level 
instead of leaving it to Solr.

<dataConfig>
<dataSource type="JdbcDataSource" driver="org.postgresql.Driver"
url="jdbc:postgresql://bm02:5432/bm" user="user" />

<document name="articles">
<entity name="articles" deltaQuery="SELECT a.id AS article_id,a.stub AS 
article_stub,a.ref AS article_ref,a.id_blogs,a.title AS article_title, 
a.normalized_text, au.url AS article_url, bu.url AS blog_url, b.title AS 
blog_title,b.subtitle AS blog_subtitle, r.rank, 
coalesce(a.updated,a.published,a.added) as ts FROM articles a join blogs 
b on a.id_blogs = b.id join urls au on a.id_urls = au.id join urls bu on 
b.id_urls = bu.id LEFT OUTER JOIN ranks r on a.id = r.id_articles WHERE 
b.id_urls is not null AND a.hidden is false AND b.hidden is false AND 
a.ref is not null AND b.ref is not null AND (rankid in (SELECT rankid 
FROM ranks order by rankid desc limit 1) OR rankid is null) AND 
coalesce(a.updated,a.published,a.added) &gt; 
'${dataimporter.last_index_time}'"
query="SELECT a.id AS article_id,a.stub AS article_stub,a.ref AS 
article_ref,a.id_blogs,a.title AS article_title, a.normalized_text, 
au.url AS article_url, bu.url AS blog_url, b.title AS 
blog_title,b.subtitle AS blog_subtitle, r.rank, 
coalesce(a.updated,a.published,a.added) as ts FROM articles a join blogs 
b on a.id_blogs = b.id join urls au on a.id_urls = au.id join urls bu on 
b.id_urls = bu.id LEFT OUTER JOIN ranks r on a.id = r.id_articles WHERE 
b.id_urls is not null AND a.hidden is false AND b.hidden is false AND 
a.ref is not null AND b.ref is not null AND (rankid in (SELECT rankid 
FROM ranks order by rankid desc limit 1) OR rankid is null) AND 
coalesce(a.updated,a.published,a.added)">
<field column="article_id" name="a_id" />
<field column="normalized_text" name="norm_text" />
<field column="article_ref" name="id" />
<field column="article_stub" name="stub" />
<field column="id_blogs" name="blog_id" />
<field column="article_title" name="a_title" />
<field column="article_url" name="article_url" />
<field column="ts" name="ts" />
<field column="rank" name="rank" />
<field column="blog_ref" name="blog_ref" />
<field column="blog_title" name="b_title" />
<field column="blog_subtitle" name="subtitle" />

<field column="blog_url" name="blog_url" />

</entity>

</document>

</dataConfig>

Florian


Re: error with delta import

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@gmail.com>.
Apparently you have not specified the deltaQuery attribute in the entity.
Check the delta-import section in the wiki:
http://wiki.apache.org/solr/DataImportHandler
Or you can share your data-config file and we can take a quick look.







-- 
--Noble Paul