You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-user@lucene.apache.org by Luis Sepúlveda <lu...@abroadwith.com> on 2016/08/15 13:27:21 UTC

Solr incorrectly fetching elements that do not match conditions in where as all-null rows

Hello,

Solr is trying to process non-existing child/nested entities. By
non-existing I mean that they exist in DB but should not be at Solr side
because they don't match the conditions in the query I use to fetch them.

I have the below solr data configuration. The relationship between tables
is complicated, but the point is that I need to fetch child/nested entities
and perform some calculations at query time. My problem is that some
products have onSite services that are not enabled. I would expect Solr
from ignoring those elements because of the conditions in the query. If I
turn debug on when importing, I can see that all fields are null. However,
Solr still tries to process them, which results in invalid SQL queries
because it replaces null fields with nothing.

    <entity name="product" query="SELECT p.id AS productId,
CONCAT(unitPrice,',',currency) AS unitPriceCurr FROM products p LEFT JOIN
companies c ON c.id=p.companyId WHERE h.enabled=true">

        <entity name="serviceTypes"
                query="SELECT s.serviceType, sl.language FROM services s
LEFT JOIN serviceLanguages sl ON s.id=sl.serviceId WHERE
companyId=${product.companyId} AND s.enabled=true">
            <field column="serviceType" name="services" />
            <field column="language" name="languages" />
        </entity>

        <entity name="onSite"
                query="SELECT s.id, s.enabled, ${product.unitPrice} +
(hourlyPrice * MIN(hours)) AS onSitePriceRaw, CONCAT(${product.unitPrice} +
(hourlyPrice * MIN(hours)), ',', '${product.currency}') AS onSitePrice FROM
services s LEFT JOIN serviceHourlyPrices shp ON s.id=shp.serviceId WHERE
companyId=${product.companyId} AND s.enabled=true AND s.serviceType='OS'">
            <field column="onSitePrice" name="onSitePrice" />
            <entity name="fullReducedOnSitePrice"
                    query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
percentage), ',', '${product.currency}') AS fullReducedOnSitePrice FROM
discounts WHERE companyId=${product.companyId} AND category='FULL'">
                <field name="fullReducedOnSitePrice"
column="fullReducedOnSitePrice"/>
            </entity>
            <entity name="partialReducedOnSitePrice"
                    query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
percentage), ',', '${product.currency}') AS partialReducedOnSitePrice FROM
discounts WHERE companyId=${product.companyId} AND category='PARTIAL'">
                <field name="partialReducedOnSitePrice"
column="partialReducedOnSitePrice"/>
            </entity>
        </entity>
    </entity>

The problem seems to be related to the condition s.enabled=true in the
query, because are rows with enabled=false that are causing problems (Solr
interprets them as rows with all fields null). I get an invalid SQL query
SELECT CONCAT( * (1 - percentage), ',', 'USD') AS fullReducedOnSitePrice
FROM discounts WHERE companyId=65.

How can I force Solr to ignore, as it should, those elements?

Re: Solr incorrectly fetching elements that do not match conditions in where as all-null rows

Posted by Luis Sepúlveda <lu...@abroadwith.com>.
I'm very sorry, but you're right. Using one of the queries from the query
log, I get a 1 row(s) returned. So it itsn't a Solr issue.

Thanks a lot Alexandre.

2016-08-15 16:17 GMT+02:00 Alexandre Rafalovitch <ar...@gmail.com>:

> Hmm. I would still take as truth the database logs as opposed to Solr
> logs. Or at least network traces using something like Wireshark.
>
> Otherwise, you need some way to reduce your DIH query to the minimum
> reproducible example. I am used to reading tech support emails and
> even then I am not sure I can parse the significant configuration
> aspects from the multiple parallel and nested entities. Can you reduce
> this to the simplest (two level?) entity definition with a single
> field and explain what you expected and what you are seeing.
>
> Regards,
>    Alex.
> P.s. Solr DIH does have a gotcha with SQL import that it automagically
> tries to match table column names to fields defined in schema and
> populate them even if not explicitly declared. This does not match to
> the way you describe the problem (your select statement still needs to
> return those fields), but perhaps it interacts with something else to
> trigger it.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 15 August 2016 at 23:54, Luis Sepúlveda <lu...@abroadwith.com> wrote:
> > Thanks for the promp reply.
> >
> > h.enabled=true is a typo. It should be c.enabled=true, because the table
> > companies also has a column called enabled. That part is working fine (it
> > doesn't fetch companies with enabled=false).
> >
> > About the DB queries, I've taken, by turning Debug and Verbose on in the
> > Dataimport tab, the queries that Solr is sending to DB, executed the same
> > queries in my MySQL client. It clearly says '0 row(s) returned'.
> >
> > 2016-08-15 15:37 GMT+02:00 Alexandre Rafalovitch <ar...@gmail.com>:
> >
> >> Solr (well DIH) just passes that query to the DB, so if you are
> >> getting extra rows (not extra fields), than I would focus on the
> >> database side of the situation.
> >>
> >> Specifically, I would confirm from the database logs what the sent
> >> query actually looks like.
> >>
> >> Very specifically, in your very first entity, I see the condition
> >> "h.enabled=true" where "h" does not match the table names in the FROM
> >> statement. Perhaps, that's the problem?
> >>
> >> Regards,
> >>    Alex.
> >> ----
> >> Newsletter and resources for Solr beginners and intermediates:
> >> http://www.solr-start.com/
> >>
> >>
> >> On 15 August 2016 at 23:27, Luis Sepúlveda <lu...@abroadwith.com> wrote:
> >> > Hello,
> >> >
> >> > Solr is trying to process non-existing child/nested entities. By
> >> > non-existing I mean that they exist in DB but should not be at Solr
> side
> >> > because they don't match the conditions in the query I use to fetch
> them.
> >> >
> >> > I have the below solr data configuration. The relationship between
> tables
> >> > is complicated, but the point is that I need to fetch child/nested
> >> entities
> >> > and perform some calculations at query time. My problem is that some
> >> > products have onSite services that are not enabled. I would expect
> Solr
> >> > from ignoring those elements because of the conditions in the query.
> If I
> >> > turn debug on when importing, I can see that all fields are null.
> >> However,
> >> > Solr still tries to process them, which results in invalid SQL queries
> >> > because it replaces null fields with nothing.
> >> >
> >> >     <entity name="product" query="SELECT p.id AS productId,
> >> > CONCAT(unitPrice,',',currency) AS unitPriceCurr FROM products p LEFT
> JOIN
> >> > companies c ON c.id=p.companyId WHERE h.enabled=true">
> >> >
> >> >         <entity name="serviceTypes"
> >> >                 query="SELECT s.serviceType, sl.language FROM
> services s
> >> > LEFT JOIN serviceLanguages sl ON s.id=sl.serviceId WHERE
> >> > companyId=${product.companyId} AND s.enabled=true">
> >> >             <field column="serviceType" name="services" />
> >> >             <field column="language" name="languages" />
> >> >         </entity>
> >> >
> >> >         <entity name="onSite"
> >> >                 query="SELECT s.id, s.enabled, ${product.unitPrice} +
> >> > (hourlyPrice * MIN(hours)) AS onSitePriceRaw,
> >> CONCAT(${product.unitPrice} +
> >> > (hourlyPrice * MIN(hours)), ',', '${product.currency}') AS onSitePrice
> >> FROM
> >> > services s LEFT JOIN serviceHourlyPrices shp ON s.id=shp.serviceId
> WHERE
> >> > companyId=${product.companyId} AND s.enabled=true AND
> >> s.serviceType='OS'">
> >> >             <field column="onSitePrice" name="onSitePrice" />
> >> >             <entity name="fullReducedOnSitePrice"
> >> >                     query="SELECT CONCAT(${onSite.onSitePriceRaw} *
> (1 -
> >> > percentage), ',', '${product.currency}') AS fullReducedOnSitePrice
> FROM
> >> > discounts WHERE companyId=${product.companyId} AND category='FULL'">
> >> >                 <field name="fullReducedOnSitePrice"
> >> > column="fullReducedOnSitePrice"/>
> >> >             </entity>
> >> >             <entity name="partialReducedOnSitePrice"
> >> >                     query="SELECT CONCAT(${onSite.onSitePriceRaw} *
> (1 -
> >> > percentage), ',', '${product.currency}') AS partialReducedOnSitePrice
> >> FROM
> >> > discounts WHERE companyId=${product.companyId} AND
> category='PARTIAL'">
> >> >                 <field name="partialReducedOnSitePrice"
> >> > column="partialReducedOnSitePrice"/>
> >> >             </entity>
> >> >         </entity>
> >> >     </entity>
> >> >
> >> > The problem seems to be related to the condition s.enabled=true in the
> >> > query, because are rows with enabled=false that are causing problems
> >> (Solr
> >> > interprets them as rows with all fields null). I get an invalid SQL
> query
> >> > SELECT CONCAT( * (1 - percentage), ',', 'USD') AS
> fullReducedOnSitePrice
> >> > FROM discounts WHERE companyId=65.
> >> >
> >> > How can I force Solr to ignore, as it should, those elements?
> >>
>

Re: Solr incorrectly fetching elements that do not match conditions in where as all-null rows

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Hmm. I would still take as truth the database logs as opposed to Solr
logs. Or at least network traces using something like Wireshark.

Otherwise, you need some way to reduce your DIH query to the minimum
reproducible example. I am used to reading tech support emails and
even then I am not sure I can parse the significant configuration
aspects from the multiple parallel and nested entities. Can you reduce
this to the simplest (two level?) entity definition with a single
field and explain what you expected and what you are seeing.

Regards,
   Alex.
P.s. Solr DIH does have a gotcha with SQL import that it automagically
tries to match table column names to fields defined in schema and
populate them even if not explicitly declared. This does not match to
the way you describe the problem (your select statement still needs to
return those fields), but perhaps it interacts with something else to
trigger it.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 15 August 2016 at 23:54, Luis Sepúlveda <lu...@abroadwith.com> wrote:
> Thanks for the promp reply.
>
> h.enabled=true is a typo. It should be c.enabled=true, because the table
> companies also has a column called enabled. That part is working fine (it
> doesn't fetch companies with enabled=false).
>
> About the DB queries, I've taken, by turning Debug and Verbose on in the
> Dataimport tab, the queries that Solr is sending to DB, executed the same
> queries in my MySQL client. It clearly says '0 row(s) returned'.
>
> 2016-08-15 15:37 GMT+02:00 Alexandre Rafalovitch <ar...@gmail.com>:
>
>> Solr (well DIH) just passes that query to the DB, so if you are
>> getting extra rows (not extra fields), than I would focus on the
>> database side of the situation.
>>
>> Specifically, I would confirm from the database logs what the sent
>> query actually looks like.
>>
>> Very specifically, in your very first entity, I see the condition
>> "h.enabled=true" where "h" does not match the table names in the FROM
>> statement. Perhaps, that's the problem?
>>
>> Regards,
>>    Alex.
>> ----
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 15 August 2016 at 23:27, Luis Sepúlveda <lu...@abroadwith.com> wrote:
>> > Hello,
>> >
>> > Solr is trying to process non-existing child/nested entities. By
>> > non-existing I mean that they exist in DB but should not be at Solr side
>> > because they don't match the conditions in the query I use to fetch them.
>> >
>> > I have the below solr data configuration. The relationship between tables
>> > is complicated, but the point is that I need to fetch child/nested
>> entities
>> > and perform some calculations at query time. My problem is that some
>> > products have onSite services that are not enabled. I would expect Solr
>> > from ignoring those elements because of the conditions in the query. If I
>> > turn debug on when importing, I can see that all fields are null.
>> However,
>> > Solr still tries to process them, which results in invalid SQL queries
>> > because it replaces null fields with nothing.
>> >
>> >     <entity name="product" query="SELECT p.id AS productId,
>> > CONCAT(unitPrice,',',currency) AS unitPriceCurr FROM products p LEFT JOIN
>> > companies c ON c.id=p.companyId WHERE h.enabled=true">
>> >
>> >         <entity name="serviceTypes"
>> >                 query="SELECT s.serviceType, sl.language FROM services s
>> > LEFT JOIN serviceLanguages sl ON s.id=sl.serviceId WHERE
>> > companyId=${product.companyId} AND s.enabled=true">
>> >             <field column="serviceType" name="services" />
>> >             <field column="language" name="languages" />
>> >         </entity>
>> >
>> >         <entity name="onSite"
>> >                 query="SELECT s.id, s.enabled, ${product.unitPrice} +
>> > (hourlyPrice * MIN(hours)) AS onSitePriceRaw,
>> CONCAT(${product.unitPrice} +
>> > (hourlyPrice * MIN(hours)), ',', '${product.currency}') AS onSitePrice
>> FROM
>> > services s LEFT JOIN serviceHourlyPrices shp ON s.id=shp.serviceId WHERE
>> > companyId=${product.companyId} AND s.enabled=true AND
>> s.serviceType='OS'">
>> >             <field column="onSitePrice" name="onSitePrice" />
>> >             <entity name="fullReducedOnSitePrice"
>> >                     query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
>> > percentage), ',', '${product.currency}') AS fullReducedOnSitePrice FROM
>> > discounts WHERE companyId=${product.companyId} AND category='FULL'">
>> >                 <field name="fullReducedOnSitePrice"
>> > column="fullReducedOnSitePrice"/>
>> >             </entity>
>> >             <entity name="partialReducedOnSitePrice"
>> >                     query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
>> > percentage), ',', '${product.currency}') AS partialReducedOnSitePrice
>> FROM
>> > discounts WHERE companyId=${product.companyId} AND category='PARTIAL'">
>> >                 <field name="partialReducedOnSitePrice"
>> > column="partialReducedOnSitePrice"/>
>> >             </entity>
>> >         </entity>
>> >     </entity>
>> >
>> > The problem seems to be related to the condition s.enabled=true in the
>> > query, because are rows with enabled=false that are causing problems
>> (Solr
>> > interprets them as rows with all fields null). I get an invalid SQL query
>> > SELECT CONCAT( * (1 - percentage), ',', 'USD') AS fullReducedOnSitePrice
>> > FROM discounts WHERE companyId=65.
>> >
>> > How can I force Solr to ignore, as it should, those elements?
>>

Re: Solr incorrectly fetching elements that do not match conditions in where as all-null rows

Posted by Luis Sepúlveda <lu...@abroadwith.com>.
Thanks for the promp reply.

h.enabled=true is a typo. It should be c.enabled=true, because the table
companies also has a column called enabled. That part is working fine (it
doesn't fetch companies with enabled=false).

About the DB queries, I've taken, by turning Debug and Verbose on in the
Dataimport tab, the queries that Solr is sending to DB, executed the same
queries in my MySQL client. It clearly says '0 row(s) returned'.

2016-08-15 15:37 GMT+02:00 Alexandre Rafalovitch <ar...@gmail.com>:

> Solr (well DIH) just passes that query to the DB, so if you are
> getting extra rows (not extra fields), than I would focus on the
> database side of the situation.
>
> Specifically, I would confirm from the database logs what the sent
> query actually looks like.
>
> Very specifically, in your very first entity, I see the condition
> "h.enabled=true" where "h" does not match the table names in the FROM
> statement. Perhaps, that's the problem?
>
> Regards,
>    Alex.
> ----
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/
>
>
> On 15 August 2016 at 23:27, Luis Sepúlveda <lu...@abroadwith.com> wrote:
> > Hello,
> >
> > Solr is trying to process non-existing child/nested entities. By
> > non-existing I mean that they exist in DB but should not be at Solr side
> > because they don't match the conditions in the query I use to fetch them.
> >
> > I have the below solr data configuration. The relationship between tables
> > is complicated, but the point is that I need to fetch child/nested
> entities
> > and perform some calculations at query time. My problem is that some
> > products have onSite services that are not enabled. I would expect Solr
> > from ignoring those elements because of the conditions in the query. If I
> > turn debug on when importing, I can see that all fields are null.
> However,
> > Solr still tries to process them, which results in invalid SQL queries
> > because it replaces null fields with nothing.
> >
> >     <entity name="product" query="SELECT p.id AS productId,
> > CONCAT(unitPrice,',',currency) AS unitPriceCurr FROM products p LEFT JOIN
> > companies c ON c.id=p.companyId WHERE h.enabled=true">
> >
> >         <entity name="serviceTypes"
> >                 query="SELECT s.serviceType, sl.language FROM services s
> > LEFT JOIN serviceLanguages sl ON s.id=sl.serviceId WHERE
> > companyId=${product.companyId} AND s.enabled=true">
> >             <field column="serviceType" name="services" />
> >             <field column="language" name="languages" />
> >         </entity>
> >
> >         <entity name="onSite"
> >                 query="SELECT s.id, s.enabled, ${product.unitPrice} +
> > (hourlyPrice * MIN(hours)) AS onSitePriceRaw,
> CONCAT(${product.unitPrice} +
> > (hourlyPrice * MIN(hours)), ',', '${product.currency}') AS onSitePrice
> FROM
> > services s LEFT JOIN serviceHourlyPrices shp ON s.id=shp.serviceId WHERE
> > companyId=${product.companyId} AND s.enabled=true AND
> s.serviceType='OS'">
> >             <field column="onSitePrice" name="onSitePrice" />
> >             <entity name="fullReducedOnSitePrice"
> >                     query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
> > percentage), ',', '${product.currency}') AS fullReducedOnSitePrice FROM
> > discounts WHERE companyId=${product.companyId} AND category='FULL'">
> >                 <field name="fullReducedOnSitePrice"
> > column="fullReducedOnSitePrice"/>
> >             </entity>
> >             <entity name="partialReducedOnSitePrice"
> >                     query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
> > percentage), ',', '${product.currency}') AS partialReducedOnSitePrice
> FROM
> > discounts WHERE companyId=${product.companyId} AND category='PARTIAL'">
> >                 <field name="partialReducedOnSitePrice"
> > column="partialReducedOnSitePrice"/>
> >             </entity>
> >         </entity>
> >     </entity>
> >
> > The problem seems to be related to the condition s.enabled=true in the
> > query, because are rows with enabled=false that are causing problems
> (Solr
> > interprets them as rows with all fields null). I get an invalid SQL query
> > SELECT CONCAT( * (1 - percentage), ',', 'USD') AS fullReducedOnSitePrice
> > FROM discounts WHERE companyId=65.
> >
> > How can I force Solr to ignore, as it should, those elements?
>

Re: Solr incorrectly fetching elements that do not match conditions in where as all-null rows

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Solr (well DIH) just passes that query to the DB, so if you are
getting extra rows (not extra fields), than I would focus on the
database side of the situation.

Specifically, I would confirm from the database logs what the sent
query actually looks like.

Very specifically, in your very first entity, I see the condition
"h.enabled=true" where "h" does not match the table names in the FROM
statement. Perhaps, that's the problem?

Regards,
   Alex.
----
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 15 August 2016 at 23:27, Luis Sepúlveda <lu...@abroadwith.com> wrote:
> Hello,
>
> Solr is trying to process non-existing child/nested entities. By
> non-existing I mean that they exist in DB but should not be at Solr side
> because they don't match the conditions in the query I use to fetch them.
>
> I have the below solr data configuration. The relationship between tables
> is complicated, but the point is that I need to fetch child/nested entities
> and perform some calculations at query time. My problem is that some
> products have onSite services that are not enabled. I would expect Solr
> from ignoring those elements because of the conditions in the query. If I
> turn debug on when importing, I can see that all fields are null. However,
> Solr still tries to process them, which results in invalid SQL queries
> because it replaces null fields with nothing.
>
>     <entity name="product" query="SELECT p.id AS productId,
> CONCAT(unitPrice,',',currency) AS unitPriceCurr FROM products p LEFT JOIN
> companies c ON c.id=p.companyId WHERE h.enabled=true">
>
>         <entity name="serviceTypes"
>                 query="SELECT s.serviceType, sl.language FROM services s
> LEFT JOIN serviceLanguages sl ON s.id=sl.serviceId WHERE
> companyId=${product.companyId} AND s.enabled=true">
>             <field column="serviceType" name="services" />
>             <field column="language" name="languages" />
>         </entity>
>
>         <entity name="onSite"
>                 query="SELECT s.id, s.enabled, ${product.unitPrice} +
> (hourlyPrice * MIN(hours)) AS onSitePriceRaw, CONCAT(${product.unitPrice} +
> (hourlyPrice * MIN(hours)), ',', '${product.currency}') AS onSitePrice FROM
> services s LEFT JOIN serviceHourlyPrices shp ON s.id=shp.serviceId WHERE
> companyId=${product.companyId} AND s.enabled=true AND s.serviceType='OS'">
>             <field column="onSitePrice" name="onSitePrice" />
>             <entity name="fullReducedOnSitePrice"
>                     query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
> percentage), ',', '${product.currency}') AS fullReducedOnSitePrice FROM
> discounts WHERE companyId=${product.companyId} AND category='FULL'">
>                 <field name="fullReducedOnSitePrice"
> column="fullReducedOnSitePrice"/>
>             </entity>
>             <entity name="partialReducedOnSitePrice"
>                     query="SELECT CONCAT(${onSite.onSitePriceRaw} * (1 -
> percentage), ',', '${product.currency}') AS partialReducedOnSitePrice FROM
> discounts WHERE companyId=${product.companyId} AND category='PARTIAL'">
>                 <field name="partialReducedOnSitePrice"
> column="partialReducedOnSitePrice"/>
>             </entity>
>         </entity>
>     </entity>
>
> The problem seems to be related to the condition s.enabled=true in the
> query, because are rows with enabled=false that are causing problems (Solr
> interprets them as rows with all fields null). I get an invalid SQL query
> SELECT CONCAT( * (1 - percentage), ',', 'USD') AS fullReducedOnSitePrice
> FROM discounts WHERE companyId=65.
>
> How can I force Solr to ignore, as it should, those elements?