Posted to solr-user@lucene.apache.org by Hasan Diwan <ha...@gmail.com> on 2012/08/24 16:17:36 UTC

Debugging DIH

I have some data in an H2 database that I'd like to move to SOLR. I
probably should/could extract and post the contents as 1 new document per
record, but I'd like to configure the data import handler and am having
some difficulty doing so. Following the wiki instructions[1], I have the
following in my db-data-config.xml:
<dataConfig>
    <dataSource type="JdbcDataSource" driver="org.h2.Driver"
                url="jdbc:h2:tcp://192.168.1.6/finance" user="sa" />
    <document>
      <entity name="receipt"
              query="select location as location, amount as amount, done_on as when from RECEIPTS as r join APP_USERS as a on r.user_id = a.id"/>
    </document>
</dataConfig>

I also have dropped the JDBC driver into db/lib, witness:
% jar tvf ./lib/h2-1.3.164.jar | grep 'Driver'
    13 Fri Feb 03 12:02:56 PST 2012 META-INF/services/java.sql.Driver
  2508 Fri Feb 03 12:02:56 PST 2012 org/h2/Driver.class
   485 Fri Feb 03 12:02:56 PST 2012 org/h2/util/DbDriverActivator.class

and I've added the appropriate fields to schema.xml:
  <field name="location" type="string" indexed="true" stored="true"/>
  <field name="amount" type="currency" indexed="true" stored="true"/>
  <field name="when" type="date" indexed="true" stored="true"/>

There's nothing in my index and 343 rows in my table. What is going on? -- H
-- 
Sent from my mobile device
Envoyait de mon portable
[1] http://wiki.apache.org/solr/DIHQuickStart

Re: Debugging DIH

Posted by Walter Underwood <wu...@wunderwood.org>.
ISO 8601.

The entire standard is rather complex, so most software uses a subset of it. A useful subset is described here: http://www.w3.org/TR/NOTE-datetime

ISO 8601 does not allow "Z001" for milliseconds. The "Z" is for UTC (Zulu in military time) and follows the time portion. Milliseconds (or any other subdivision) are represented after a decimal point on the seconds value, "2012-01-01T01:01:01.001Z".
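
For the epoch-seconds question raised in this thread, a minimal sketch of producing the form described above might look like the following. It is plain Python run outside Solr, not a DIH transformer, and the function name `epoch_to_solr_date` and its `millis` parameter are made up for illustration.

```python
from datetime import datetime, timezone


def epoch_to_solr_date(epoch_seconds, millis=0):
    """Format epoch seconds as an ISO 8601 UTC timestamp with a trailing Z.

    Hypothetical helper: milliseconds, if given, go after a decimal point
    on the seconds value, and the Z for UTC comes last.
    """
    if not 0 <= millis < 1000:
        raise ValueError("millis must be between 0 and 999")
    # Interpret the epoch value as UTC so the trailing Z is accurate.
    dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    base = dt.strftime("%Y-%m-%dT%H:%M:%S")
    if millis:
        return "%s.%03dZ" % (base, millis)
    return base + "Z"


print(epoch_to_solr_date(1325379661))            # 2012-01-01T01:01:01Z
print(epoch_to_solr_date(1325379661, millis=1))  # 2012-01-01T01:01:01.001Z
```

The same conversion could presumably be done in the SQL query or in a transformer instead; this only shows the target string.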

wunder

On Aug 26, 2012, at 7:34 PM, Lance Norskog wrote:

> The timestamp format is 2012-01-01T01:01:01, with an optional Z001 for
> milliseconds. The timezone is UTC. This is a standard format but I do
> not remember the name of the standard.

Re: Debugging DIH

Posted by Lance Norskog <go...@gmail.com>.
The timestamp format is 2012-01-01T01:01:01, with an optional Z001 for
milliseconds. The timezone is UTC. This is a standard format but I do
not remember the name of the standard.

On Sun, Aug 26, 2012 at 2:43 PM, Hasan Diwan <ha...@gmail.com> wrote:
> Mr Norskog, et al,
>
> On 26 August 2012 14:37, Lance Norskog <go...@gmail.com> wrote:
>
>> Also, there is a logging feature to print intermediate values.
>>
>
> I see the data as it should be. It's just not recorded into SOLR. One
> possible concern is that I have timestamp in epoch seconds, which I'd like
> to store as a date on the SOLR side; I know I can apply a transformer to do
> this, but what's the format for it? Many thanks! -- H
> --
> Sent from my mobile device
> Envoyait de mon portable

-- 
Lance Norskog
goksron@gmail.com

Re: Debugging DIH

Posted by Hasan Diwan <ha...@gmail.com>.
Mr Norskog, et al,

On 26 August 2012 14:37, Lance Norskog <go...@gmail.com> wrote:

> Also, there is a logging feature to print intermediate values.
>

I see the data as it should be. It's just not recorded into SOLR. One
possible concern is that I have the timestamp in epoch seconds, which I'd
like to store as a date on the SOLR side; I know I can apply a transformer
to do this, but what's the format for it? Many thanks! -- H

Re: Debugging DIH

Posted by Lance Norskog <go...@gmail.com>.
Also, there is a logging feature to print intermediate values.

Another point is the complexity of your query. It can be easier to
test with the query as a database view, instead of embedding it in the
DIH script.


Re: Debugging DIH

Posted by Ahmet Arslan <io...@yahoo.com>.
> That is not completely true. If the columns have the same
> names as the fields, the mapping is redundant. Nevertheless,
> it might be the problem. What I've experienced with Oracle,
> at least, is that the columns would be returned in uppercase
> even if my alias would be in lowercase. You might force it
> by adding quotes, though. Or try adding
> 
> <field name="location" column="LOCATION" />
> <field name="amount" column="AMOUNT" />
> <field name="when" column="WHEN" />
> 
> You might check in your preferred SQL client how the column
> names are returned. It might be an indicator. (At least, in
> my case they would be uppercase in SQL Developer.)

There is a JSP page for debugging DIH:

http://localhost:8080/solr/admin/dataimport.jsp?handler=/dataimport

Re: Debugging DIH

Posted by Chantal Ackermann <c....@it-agenten.com>.
> 
> I don't see that you have anything in the DIH that tells what columns from the query go into which fields in the index.  You need something like
> 
> <field name="location" column="location" />
> <field name="amount" column="amount" />
> <field name="when" column="when" />
> 

That is not completely true. If the columns have the same names as the fields, the mapping is redundant. Nevertheless, it might be the problem. With Oracle, at least, I've seen columns returned in uppercase even when my alias was in lowercase. You can force the case by quoting the alias, though. Or try adding

<field name="location" column="LOCATION" />
<field name="amount" column="AMOUNT" />
<field name="when" column="WHEN" />

You might check in your preferred SQL client how the column names are returned. It might be an indicator. (At least, in my case they would be uppercase in SQL Developer.)
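
Once you have the column labels as your SQL client reports them, a small check like the one below can show whether they differ from the schema field names only by case. This is a hedged sketch in Python, not DIH code, and it makes no claim about how DIH itself matches names; the function `compare_columns` and the sample lists are hypothetical.

```python
def compare_columns(returned_columns, schema_fields):
    """Classify returned column labels against schema field names.

    Returns three values:
      exact     - labels that match a field name exactly
      case_only - dict of label -> field name where only the case differs
      unmatched - labels with no corresponding field at all
    """
    by_lower = {field.lower(): field for field in schema_fields}
    exact = [c for c in returned_columns if c in schema_fields]
    case_only = {
        c: by_lower[c.lower()]
        for c in returned_columns
        if c not in schema_fields and c.lower() in by_lower
    }
    unmatched = [c for c in returned_columns if c.lower() not in by_lower]
    return exact, case_only, unmatched


# Assumed example: the client shows uppercase labels, the schema is lowercase.
exact, case_only, unmatched = compare_columns(
    ["LOCATION", "AMOUNT", "WHEN"], ["location", "amount", "when"]
)
print(exact)      # []
print(case_only)  # {'LOCATION': 'location', 'AMOUNT': 'amount', 'WHEN': 'when'}
print(unmatched)  # []
```

If everything lands in `case_only`, the explicit uppercase `column` mappings above are worth trying.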

Cheers,
Chantal

Re: Debugging DIH

Posted by Andy Lester <an...@petdance.com>.
On Aug 24, 2012, at 9:17 AM, Hasan Diwan wrote:

> <dataConfig>
>    <dataSource type="JdbcDataSource" driver="org.h2.Driver"
> url="jdbc:h2:tcp://192.168.1.6/finance" user="sa" />
>    <document>
>      <entity name="receipt" query="select location as location, amount as
> amount, done_on as when from RECEIPTS as r join APP_USERS as a on r.user_id
> = a.id"/>
>    </document>
> </dataConfig>
> 
> and I've added the appropriate fields to schema.xml:
>  <field name="location" type="string" indexed="true" stored="true"/>
>   <field name="amount" type="currency" indexed="true" stored="true"/>
>   <field name="when" type="date" indexed="true" stored="true"/>
> 
> There's nothing in my index and 343 rows in my table. What is going on? -- H


I don't see anything in your DIH config that says which columns from the query go into which fields in the index.  You need something like

<field name="location" column="location" />
<field name="amount" column="amount" />
<field name="when" column="when" />

xoa

--
Andy Lester => andy@petdance.com => www.petdance.com => AIM:petdance


Re: Debugging DIH

Posted by Hasan Diwan <ha...@gmail.com>.
One more data point:
% curl -L "http://localhost:8983/solr/db/dataimport?command=status"

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
</lst>
<lst name="initArgs">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</lst>
<str name="command">status</str>
<str name="status">idle</str>
<str name="importResponse"/>
<lst name="statusMessages">
  <str name="Total Requests made to DataSource">1</str>
  <str name="Total Rows Fetched">343</str>
  <str name="Total Documents Skipped">0</str>
  <str name="Full Dump Started">2012-08-24 07:19:26</str>
  <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
  <str name="Committed">2012-08-24 07:19:26</str>
  <str name="Optimized">2012-08-24 07:19:26</str>
  <str name="Total Documents Processed">0</str>
  <str name="Time taken ">0:0:0.328</str>
</lst>
<str name="WARNING">This response format is experimental.  It is likely to change in the future.</str>
</response>
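
The two numbers that stand out in this response are 343 rows fetched and 0 documents processed. A rough sketch of pulling those out of a status response automatically, using only Python's standard library, is below. The function name `summarize_status` is hypothetical, and the sample string is a trimmed copy of the response above rather than live output.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the status response above, kept only for illustration.
SAMPLE = """<response>
<lst name="statusMessages">
  <str name="Total Rows Fetched">343</str>
  <str name="Total Documents Processed">0</str>
</lst>
</response>"""


def summarize_status(xml_text):
    """Return (rows_fetched, docs_processed, suspicious) from a status response.

    'suspicious' is True when rows were fetched but no documents were built,
    which is the symptom described in this thread.
    """
    # Strip stray whitespace around the document before parsing.
    root = ET.fromstring(xml_text.strip())
    messages = {}
    for node in root.findall(".//lst[@name='statusMessages']/str"):
        messages[node.get("name")] = (node.text or "").strip()
    fetched = int(messages.get("Total Rows Fetched", 0))
    processed = int(messages.get("Total Documents Processed", 0))
    return fetched, processed, fetched > 0 and processed == 0


print(summarize_status(SAMPLE))  # (343, 0, True)
```

This only flags the symptom; it does not say why the rows were not turned into documents.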
