Posted to users@solr.apache.org by Sjoerd Smeets <ss...@gmail.com> on 2021/12/05 16:28:21 UTC

Relevancy debugging - idf score

Hi all,

I'm debugging the relevancy scores of my query and I see the following for
two document hits. My question is: why is the idf score not the same for
both documents? This is Solr 6.6.

Any guidance would be much appreciated.

Thanks!

*Doc1*
"71d72354eea23b9eae934ab616e8ce38de69d760": "
104.994415 = sum of:
  104.994415 = sum of:
    82.89969 = weight(stemmed_data.timenote.narratives:remedi in 22470) [SchemaSimilarity], result of:
      82.89969 = score(freq=9.0), computed as boost * idf * tf from:
        100.0 = boost
        0.87546873 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *52 = n, number of documents containing term*
          *125 = N, total number of documents with field*
        0.9469177 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          9.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          12312.0 = dl, length of field (approximate)
          54179.03 = avgdl, average length of field
    22.09473 = weight(stemmed_data.timenote.matters:remedi in 22470) [SchemaSimilarity], result of:
      22.09473 = score(freq=4.0), computed as boost * idf * tf from:
        10.0 = boost
        2.4308395 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *9 = n, number of documents containing term*
          *107 = N, total number of documents with field*
        0.9089341 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          4.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          5656.0 = dl, length of field (approximate)
          50520.543 = avgdl, average length of field
  0.0 = FunctionQuery(int(s_integer_search.previews)), product of:
    0.0 = int(s_integer_search.previews)=0
    1.0 = boost
  0.0 = FunctionQuery(int(s_integer_search.downloads)), product of:
    0.0 = int(s_integer_search.downloads)=0
    1.0 = boost
"

*Doc2*
"80302a1ecc44d1e556970ab96c25b1fd3328a854": "
84.61461 = sum of:
  84.61461 = sum of:
    64.68881 = weight(stemmed_data.timenote.narratives:remedi in 0) [SchemaSimilarity], result of:
      64.68881 = score(freq=493.0), computed as boost * idf * tf from:
        100.0 = boost
        0.65094686 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *60 = n, number of documents containing term*
          *115 = N, total number of documents with field*
        0.99376476 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          493.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          229400.0 = dl, length of field (approximate)
          73913.91 = avgdl, average length of field
    19.9258 = weight(stemmed_data.timenote.matters:remedi in 0) [SchemaSimilarity], result of:
      19.9258 = score(freq=340.0), computed as boost * idf * tf from:
        10.0 = boost
        2.0024805 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:
          *13 = n, number of documents containing term*
          *99 = N, total number of documents with field*
        0.99505585 = tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:
          340.0 = freq, occurrences of term within document
          1.2 = k1, term saturation parameter
          0.75 = b, length normalization parameter
          147480.0 = dl, length of field (approximate)
          95534.95 = avgdl, average length of field
  0.0 = FunctionQuery(int(s_integer_search.previews)), product of:
    0.0 = int(s_integer_search.previews)=0
    1.0 = boost
  0.0 = FunctionQuery(int(s_integer_search.downloads)), product of:
    0.0 = int(s_integer_search.downloads)=0
    1.0 = boost
"
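
For anyone checking the numbers, here is a minimal Python sketch that recomputes the idf and tf factors using the formulas printed in the explain output above. The function names are my own, and I am assuming `log` means the natural logarithm, since that is what reproduces the reported values. It suggests the idf differs only because n and N differ between the two explanations, i.e. each document was scored against a different set of index statistics.

```python
import math


def bm25_idf(n, N):
    """idf as printed in the explain output: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


def bm25_tf(freq, dl, avgdl, k1=1.2, b=0.75):
    """tf as printed in the explain output: freq / (freq + k1 * (1 - b + b * dl / avgdl))."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


# (n, N) pairs copied from the explain output for the narratives field.
doc1_idf = bm25_idf(52, 125)
doc2_idf = bm25_idf(60, 115)

# tf inputs copied from the Doc1 narratives clause.
doc1_tf = bm25_tf(9.0, 12312.0, 54179.03)

# boost * idf * tf for the Doc1 narratives clause (boost = 100.0).
doc1_clause_score = 100.0 * doc1_idf * doc1_tf

print("Doc1 idf:", doc1_idf)
print("Doc2 idf:", doc2_idf)
print("Doc1 narratives clause score:", doc1_clause_score)
```

The same term in the same field yields two different idf values purely because the (n, N) inputs differ, which is what you would expect if the two hits were scored with different local statistics.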

Re: Relevancy debugging - idf score

Posted by Alessandro Benedetti <a....@sease.io>.
Hi Markus,
won't the problem still be present across shards without distributed IDF?
You may have skewed shards, and then each of them will have a different IDF
for the same term (and field).
Regarding the performance penalty Walter highlighted, I definitely see room
for a contribution, but I am not sure anyone is looking into that right now.

Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


On Mon, 6 Dec 2021 at 12:07, Markus Jelsma <ma...@openindex.io> wrote:

> Hello Sjoerd,
>
> ExactStatsCache indeed works fine when replicas of the same shard do not
> share identical term stats, but it comes with some overhead. If you can,
> upgrade to at least 7.x and change the default NRT replica types to TLOG.
> You then no longer need to use ExactStatsCache because replicas will be
> identical.
>
> Regards,
> Markus
>
> On Mon, 6 Dec 2021 at 12:09, Alessandro Benedetti <a.benedetti@sease.io> wrote:
>
> > Good to know you solved it!
> > Yes, distributed IDF is definitely a problem when you have skewed
> > document distributions.
> >
> > Cheers
> > --------------------------
> > Alessandro Benedetti
> > Apache Lucene/Solr Committer
> > Director, R&D Software Engineer, Search Consultant
> >
> > www.sease.io
> >
> >
> > On Sun, 5 Dec 2021 at 17:19, Sjoerd Smeets <ss...@gmail.com> wrote:
> >
> > > Found it!
> > >
> > > I had to enable the
> > > ExactStatsCache
> > >
> > > Found a description over here. Thanks for pointing me in the right
> > > direction.
> > >
> > > https://solr.pl/en/2019/05/20/distributed-idf/
> > >
> > >
> > > On Sun, Dec 5, 2021 at 11:09 AM Sjoerd Smeets <ss...@gmail.com> wrote:
> > >
> > >> Hi Alessandro,
> > >>
> > >> Thanks for your reply! Yes, the documents are in the same result list,
> > >> I'm not doing any indexing at the moment, and I executed a commit just
> > >> to be sure. Still the same result. It is an environment with 4 shards.
> > >> Perhaps that plays a role?
> > >>
> > >> Thanks,
> > >> Sjoerd
> > >>
> > >> On Sun, Dec 5, 2021 at 11:02 AM Alessandro Benedetti <a.benedetti@sease.io> wrote:
> > >>
> > >>> It seems like the underlying index changed.
> > >>> Are those two documents in the same result set?
> > >>> Is it just one query?
> > >>> It's definitely curious: even if a commit happened, search results are
> > >>> consistent within one searcher.

Re: Relevancy debugging - idf score

Posted by Markus Jelsma <ma...@openindex.io>.
Hello Sjoerd,

ExactStatsCache indeed works fine when replicas of the same shard do not
share identical term stats, but it comes with some overhead. If you can,
upgrade to at least 7.x and change the default NRT replica types to TLOG.
You then no longer need to use ExactStatsCache because replicas will be
identical.

Regards,
Markus

Re: Relevancy debugging - idf score

Posted by Walter Underwood <wu...@wunderwood.org>.
When we tried exact IDF, it was about 10X slower in our sharded system, so we couldn’t use it.

It is possible to calculate IDF when merging results from shards, with no speed penalty. Infoseek was doing that 25 years ago and the patent has expired. You return df from each shard, then calculate idf by adding the shard dfs. 
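
A rough Python sketch of the merge-time approach described above, as I read it: each shard returns its document frequency (df) and document count for the term, and the coordinator sums them before computing a single idf. The shard numbers are the (n, N) pairs from the explain output earlier in the thread and cover only two of the four shards, so the result is illustrative only. The BM25 idf formula is taken from that output, and `global_idf` is a hypothetical helper, not a Solr API.

```python
import math


def bm25_idf(n, N):
    """BM25 idf as printed in the explain output earlier in the thread."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


def global_idf(shard_stats):
    """Sum per-shard (df, doc_count) pairs, then compute one idf for all shards."""
    total_df = sum(df for df, _ in shard_stats)
    total_docs = sum(docs for _, docs in shard_stats)
    return bm25_idf(total_df, total_docs)


# (df, doc_count) for the narratives field on the two shards visible in the thread.
shards = [(52, 125), (60, 115)]

per_shard = [bm25_idf(df, docs) for df, docs in shards]
merged = global_idf(shards)

print("per-shard idf:", per_shard)
print("merged idf:", merged)
```

Every hit would then be scored with the same merged value, regardless of which shard it came from.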

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


Re: Relevancy debugging - idf score

Posted by Alessandro Benedetti <a....@sease.io>.
Good to know you solved it!
Yes, distributed IDF is definitely a problem when you have skewed
document distributions.

Cheers
--------------------------
Alessandro Benedetti
Apache Lucene/Solr Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


Re: Relevancy debugging - idf score

Posted by Sjoerd Smeets <ss...@gmail.com>.
Found it!

I had to enable the ExactStatsCache.

I found a description here. Thanks for pointing me in the right
direction.

https://solr.pl/en/2019/05/20/distributed-idf/
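
For reference, enabling it is typically a one-line change in solrconfig.xml. The fragment below is a sketch based on the Solr Reference Guide rather than on the configuration actually used in this thread, so treat the placement (directly inside the top-level `<config>` element) as an assumption to verify against your Solr version.

```xml
<!-- Assumed placement: inside the top-level <config> element of solrconfig.xml. -->
<!-- Replaces the default LocalStatsCache, which scores with per-shard statistics only. -->
<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>
```

The collection needs to be reloaded for the change to take effect.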


On Sun, Dec 5, 2021 at 11:09 AM Sjoerd Smeets <ss...@gmail.com> wrote:

> Hi Alessandro,
>
> Thanks for your reply! Yes, the documents are in the same result list,
> I'm not doing any indexing at the moment, and I executed a commit just
> to be sure. Still the same result. It is an environment with 4 shards.
> Perhaps that plays a role?
>
> Thanks,
> Sjoerd
>
> On Sun, Dec 5, 2021 at 11:02 AM Alessandro Benedetti <a....@sease.io> wrote:
>
>> It seems like the underlying index changed.
>> Are those two documents in the same result set?
>> Is it just one query?
>> It's definitely curious: even if a commit happened, search results are
>> consistent within one searcher.
>>
>>
>> On Sun, 5 Dec 2021, 16:28 Sjoerd Smeets, <ss...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I'm debugging the relevancy scores of my query and I see the following
>>> for two document hits. My question is: why is the idf score not the same
>>> for both documents? This is Solr 6.6.
>>>
>>> Any guidance would be much appreciated.
>>>
>>> Thanks!
>>>
>>> *Doc1*
>>> "71d72354eea23b9eae934ab616e8ce38de69d760": "
>>> 104.994415 = sum of:
>>>   104.994415 = sum of:
>>>     82.89969 = weight(stemmed_data.timenote.narratives:remedi in 22470)
>>> [SchemaSimilarity], result of:
>>>       82.89969 = score(freq=9.0), computed as boost * idf * tf from:
>>>         100.0 = boost
>>>         0.87546873 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
>>> from:
>>>           *52 = n, number of documents containing term*
>>>           *125 = N, total number of documents with field*
>>>         0.9469177 = tf, computed as freq / (freq + k1 * (1 - b + b * dl /
>>> avgdl)) from:
>>>           9.0 = freq, occurrences of term within document
>>>           1.2 = k1, term saturation parameter
>>>           0.75 = b, length normalization parameter
>>>           12312.0 = dl, length of field (approximate)
>>>           54179.03 = avgdl, average length of field
>>>     22.09473 = weight(stemmed_data.timenote.matters:remedi in 22470)
>>> [SchemaSimilarity], result of:
>>>       22.09473 = score(freq=4.0), computed as boost * idf * tf from:
>>>         10.0 = boost
>>>         2.4308395 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
>>> from:
>>>           *9 = n, number of documents containing term*
>>>           *107 = N, total number of documents with field*
>>>         0.9089341 = tf, computed as freq / (freq + k1 * (1 - b + b * dl /
>>> avgdl)) from:
>>>           4.0 = freq, occurrences of term within document
>>>           1.2 = k1, term saturation parameter
>>>           0.75 = b, length normalization parameter
>>>           5656.0 = dl, length of field (approximate)
>>>           50520.543 = avgdl, average length of field
>>>   0.0 = FunctionQuery(int(s_integer_search.previews)), product of:
>>>     0.0 = int(s_integer_search.previews)=0
>>>     1.0 = boost
>>>   0.0 = FunctionQuery(int(s_integer_search.downloads)), product of:
>>>     0.0 = int(s_integer_search.downloads)=0
>>>     1.0 = boost
>>> "
>>>
>>> *Doc2*
>>> "80302a1ecc44d1e556970ab96c25b1fd3328a854": "
>>> 84.61461 = sum of:
>>>   84.61461 = sum of:
>>>     64.68881 = weight(stemmed_data.timenote.narratives:remedi in 0)
>>> [SchemaSimilarity], result of:
>>>       64.68881 = score(freq=493.0), computed as boost * idf * tf from:
>>>         100.0 = boost
>>>         0.65094686 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
>>> from:
>>>           *60 = n, number of documents containing term*
>>>           *115 = N, total number of documents with field*
>>>         0.99376476 = tf, computed as freq / (freq + k1 * (1 - b + b * dl
>>> /
>>> avgdl)) from:
>>>           493.0 = freq, occurrences of term within document
>>>           1.2 = k1, term saturation parameter
>>>           0.75 = b, length normalization parameter
>>>           229400.0 = dl, length of field (approximate)
>>>           73913.91 = avgdl, average length of field
>>>     19.9258 = weight(stemmed_data.timenote.matters:remedi in 0)
>>> [SchemaSimilarity], result of:
>>>       19.9258 = score(freq=340.0), computed as boost * idf * tf from:
>>>         10.0 = boost
>>>         2.0024805 = idf, computed as log(1 + (N - n + 0.5) / (n + 0.5))
>>> from:
>>>           *13 = n, number of documents containing term*
>>>           *99 = N, total number of documents with field*
>>>         0.99505585 = tf, computed as freq / (freq + k1 * (1 - b + b * dl
>>> /
>>> avgdl)) from:
>>>           340.0 = freq, occurrences of term within document
>>>           1.2 = k1, term saturation parameter
>>>           0.75 = b, length normalization parameter
>>>           147480.0 = dl, length of field (approximate)
>>>           95534.95 = avgdl, average length of field
>>>   0.0 = FunctionQuery(int(s_integer_search.previews)), product of:
>>>     0.0 = int(s_integer_search.previews)=0
>>>     1.0 = boost
>>>   0.0 = FunctionQuery(int(s_integer_search.downloads)), product of:
>>>     0.0 = int(s_integer_search.downloads)=0
>>>     1.0 = boost
>>> "
>>>
>>

Re: Relevancy debugging - idf score

Posted by Sjoerd Smeets <ss...@gmail.com>.
Hi Alessandro,

Thanks for your reply! Yes, the documents are in the same result list. I'm
not doing any indexing at the moment, and I executed a commit just to be
sure. Still the same result. It is an environment with 4 shards. Perhaps
that is a factor?

Thanks,
Sjoerd
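
As a sanity check, the idf values in the explain output can be recomputed from the n and N printed next to them. The sketch below copies those numbers from the two explain outputs and applies the formula shown there; the helper name bm25_idf and the short field labels are made up for this example. If the recomputed values match, the difference in idf comes entirely from the differing document counts, which is what one would expect if each document was scored against a different shard's statistics.

```python
import math


def bm25_idf(n, N):
    """idf as printed in the explain output: log(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (N - n + 0.5) / (n + 0.5))


# (field, document) -> (n, N, idf reported by Solr), copied from the explain output.
reported = {
    ("narratives", "Doc1"): (52, 125, 0.87546873),
    ("narratives", "Doc2"): (60, 115, 0.65094686),
    ("matters", "Doc1"): (9, 107, 2.4308395),
    ("matters", "Doc2"): (13, 99, 2.0024805),
}

# Recompute each idf from its own n and N.
recomputed = {key: bm25_idf(n, N) for key, (n, N, _) in reported.items()}

for key, (n, N, idf) in reported.items():
    print(key, "n =", n, "N =", N, "reported =", idf, "recomputed =", recomputed[key])
```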

On Sun, Dec 5, 2021 at 11:02 AM Alessandro Benedetti <a....@sease.io>
wrote:

> It seems like the underlying index changed.
> Are those two documents in the same result set?
> Is it just one query?
> It's definitely curious: even if a commit happened, search results are
> consistent within one searcher.

Re: Relevancy debugging - idf score

Posted by Alessandro Benedetti <a....@sease.io>.
It seems like the underlying index changed.
Are those two documents in the same result set?
Is it just one query?
It's definitely curious: even if a commit happened, search results are
consistent within one searcher.

