Posted to solr-user@lucene.apache.org by Allistair Crossley <al...@roxxor.co.uk> on 2010/10/04 14:22:37 UTC

DIH sub-entity not indexing

Hello list,

I've been successful with DIH to a large extent, but a seemingly simple extra column I need is posing problems. In a nutshell, I have two entities, say Listing and Contact, in a has-and-belongs-to-many (habtm) relationship. I have copied the relevant parts of the configs below.

I have run my SQL for the sub-entity Contact directly and it produces correct results. Solr reports no errors when running the import, yet no records end up with the contacts array populated.

I have taken out my sub-entity config and replaced it with a simple template value just to check, and the values then come through OK.

So the problem certainly seems limited to my query or query config somehow. I roughly followed the example bundled with DIH.

DIH.xml
=======

<entity name="listing" ...>
  ...
  <entity name="contacts"
          query="select concat(c.first_name, concat(' ', c.last_name)) as full_name from contacts c inner join listing_contacts lc on c.id = lc.contact_id where lc.listing_id = '${listing.id}'">
    <field name="contacts" column="full_name" />
  </entity>
</entity>

schema.xml
==========

<field name="contacts" type="text" indexed="true" stored="true" multiValued="true" required="false" />


Any tips appreciated.

Re: DIH sub-entity not indexing

Posted by Allistair Crossley <al...@roxxor.co.uk>.
Hey,

Yes, that tool doesn't work too well for me. I can load it and get the forms on the left, but when I run a debug the right-hand side tells me that the page is not found. I *think* this is because my DIH XML uses a custom query string parameter for delta querying; the tool doesn't support adding custom query string parameters, so the missing parameter makes the request fail.

Cheers, Allistair

On Oct 4, 2010, at 9:20 AM, Ephraim Ofir wrote:

> The closest you can get to debugging (without actually debugging...) is
> to look at the logs and use
> http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mode
> 
> Ephraim Ofir
> 
> 


Re: DIH sub-entity not indexing

Posted by Allistair Crossley <al...@roxxor.co.uk>.
Very clever thinking indeed. Well, that's certainly revealed the problem: ${listing.id} is empty in my sub-entity query.

And this is because I prefix the indexed ID with a letter:

<field column="id" name="id" template="L${listing.id}" />

This appears to modify the internal value of ${listing.id} for subsequent uses.

Well, I can work around this now. Thanks!
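
For later readers, here is one possible workaround, given as a sketch under assumptions rather than the configuration actually used in this thread. The idea is to select the database key under an alias, so that the template writing the prefixed Solr id never touches the column the sub-entity query depends on. The alias listing_id, the table name listings and the transformer attribute are illustrative assumptions, and the sketch presumes listing_id is not itself a field in schema.xml.

```xml
<!-- Sketch only: "listing_id" and the "listings" table are hypothetical names. -->
<entity name="listing"
        transformer="TemplateTransformer"
        query="select id as listing_id from listings">

  <!-- Build the prefixed Solr id from the aliased column; the aliased
       column itself keeps its raw database value. -->
  <field column="id" template="L${listing.listing_id}" />

  <!-- The sub-entity refers to the untouched alias rather than to "id". -->
  <entity name="contacts"
          query="select concat(c.first_name, concat(' ', c.last_name)) as full_name
                 from contacts c
                 inner join listing_contacts lc on c.id = lc.contact_id
                 where lc.listing_id = '${listing.listing_id}'">
    <field name="contacts" column="full_name" />
  </entity>
</entity>
```

Whether this matches the poster's eventual fix is unknown; it only illustrates keeping the templated column and the join key separate.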

On Oct 4, 2010, at 9:35 AM, Stefan Matheis wrote:

> Allistair,
>
>> Indeed, I have a column that already works using a lower-case id. I wish
>> I could debug it somehow - see the SQL? Something particular about this
>> config it is not liking.
>
> you may want to try the MySQL Query-Log, to check which Queries are
> performed?
> http://dev.mysql.com/doc/refman/5.1/en/query-log.html


Re: DIH sub-entity not indexing

Posted by Stefan Matheis <ma...@googlemail.com>.
Allistair,

> Indeed, I have a column that already works using a lower-case id. I wish
> I could debug it somehow - see the SQL? Something particular about this
> config it is not liking.

You may want to try the MySQL query log to check which queries are
performed:
http://dev.mysql.com/doc/refman/5.1/en/query-log.html

RE: DIH sub-entity not indexing

Posted by Ephraim Ofir <Ep...@icq.com>.
The closest you can get to debugging (without actually debugging...) is
to look at the logs and use
http://wiki.apache.org/solr/DataImportHandler#Interactive_Development_Mode

Ephraim Ofir



Re: DIH sub-entity not indexing

Posted by Allistair Crossley <al...@roxxor.co.uk>.
Thanks Ephraim. I tried your suggestion with the ID but capitalising it did not work. 

Indeed, I have a column that already works using a lower-case id. I wish I could debug it somehow, for instance by seeing the SQL. There is something particular about this config that it does not like.

I read the post you linked to. This is more a performance-related thing for him. I would be happy just to see low performance and my contacts populated right now!! :D

Thanks again

On Oct 4, 2010, at 9:00 AM, Ephraim Ofir wrote:

> Make sure you're not running into a case sensitivity problem, some stuff
> in DIH is case sensitive (and some stuff gets capitalized by the jdbc).
> Try using listing.ID instead of listing.id.
> On a side note, if you're using mysql, you might want to look at the
> CONCAT_WS function.
> You might also want to look into a different approach than sub-entities:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C9F8B39CB3B7C6D4594293EA29CCF438B01702F22@ICQ-MAIL.icq.il.office.aol.com%3E
> 
> Ephraim Ofir
> 

RE: DIH sub-entity not indexing

Posted by Ephraim Ofir <Ep...@icq.com>.
Make sure you're not running into a case sensitivity problem: some stuff
in DIH is case sensitive (and some stuff gets capitalized by the JDBC driver).
Try using listing.ID instead of listing.id.
On a side note, if you're using MySQL, you might want to look at the
CONCAT_WS function.
You might also want to look into a different approach than sub-entities:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201008.mbox/%3C9F8B39CB3B7C6D4594293EA29CCF438B01702F22@ICQ-MAIL.icq.il.office.aol.com%3E

Ephraim Ofir
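
As a sketch of the CONCAT_WS side note, the sub-entity query from the original post could be written as shown here. In MySQL, CONCAT_WS takes the separator as its first argument and skips NULL arguments, whereas CONCAT returns NULL if any argument is NULL. This assumes a MySQL data source and has not been run against the poster's schema.

```xml
<!-- Sketch only: same sub-entity as in the original post, with the nested
     concat() calls replaced by MySQL's concat_ws(separator, ...). -->
<entity name="contacts"
        query="select concat_ws(' ', c.first_name, c.last_name) as full_name
               from contacts c
               inner join listing_contacts lc on c.id = lc.contact_id
               where lc.listing_id = '${listing.id}'">
  <field name="contacts" column="full_name" />
</entity>
```

This only tidies the name concatenation; it does not by itself address why the sub-entity returned no rows.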


Re: DIH sub-entity not indexing

Posted by Allistair Crossley <al...@roxxor.co.uk>.
I have also tried a more elaborate join, following the features example in the DIH example config, but with the same result: the SQL works fine directly, but Solr is not indexing the array of full_names per Listing, e.g.

<entity name="listing" ...>

  <entity name="listing_contact"
          query="select * from listing_contacts where listing_id = '${listing.id}'">
    <entity name="contact"
            query="select concat(first_name, concat(' ', last_name)) as full_name from contacts where id = '${listing_contact.contact_id}'">
      <field name="contacts" column="full_name" />
    </entity>
  </entity>

</entity>

Am I missing the obvious?
