Posted to dev@lucene.apache.org by Michael Wechner <mi...@wyona.com> on 2023/11/15 09:01:59 UTC
SPLADE implementation
Hi
I have found the following issue re a possible SPLADE implementation
https://github.com/apache/lucene/issues/11799
Is somebody still working on this?
Thanks
Michael
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
Re: SPLADE implementation
Posted by Michael Wechner <mi...@wyona.com>.
I got it running now :-) Thanks again. See the code below, which
might help others as well.
I don't quite understand the relationship between weights, scores, etc.
yet, but I will try to figure it out from the documentation at
https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/document/FeatureField.html
Thanks
Michael
String question = "What animals live in the rainforests of Brazil?";
Query questionQuery = parser.parse(question);
// For example "jungle" as an alternative to "rainforests"
List<String> features = getFeatures(question);
if (!features.isEmpty()) {
    BooleanQuery.Builder bqb = new BooleanQuery.Builder();
    bqb.add(questionQuery, BooleanClause.Occur.SHOULD);
    for (String feature : features) {
        // TODO: Replace hard-coded weight
        bqb.add(new BooleanClause(
            FeatureField.newLinearQuery("feature_field_name", feature, 0.3F),
            BooleanClause.Occur.SHOULD));
    }
    BooleanQuery termExpansionQuery = bqb.build();
    log.info("Term expansion query: " + termExpansionQuery);
    return termExpansionQuery;
} else {
    log.info("Regular query: " + questionQuery);
    return questionQuery;
}
On 15.11.23 at 11:35, Michael Wechner wrote:
> thank you very much, will try this :-)
Re: SPLADE implementation
Posted by Michael Wechner <mi...@wyona.com>.
thank you very much, will try this :-)
On 15.11.23 at 11:25, Adrien Grand wrote:
> Say your model produces a set of weighted terms:
> - At index time, for each (term, weight) pair, you add a "new
> FeatureField(fieldName, term, weight)" field to your document.
> - At search time, for each (term, weight) pair, you add a "new
> BooleanClause(FeatureField.newLinearQuery(fieldName, term, weight),
> BooleanClause.Occur.SHOULD)" to your BooleanQuery.
Re: SPLADE implementation
Posted by Adrien Grand <jp...@gmail.com>.
Say your model produces a set of weighted terms:
- At index time, for each (term, weight) pair, you add a "new
FeatureField(fieldName, term, weight)" field to your document.
- At search time, for each (term, weight) pair, you add a "new
BooleanClause(FeatureField.newLinearQuery(fieldName, term, weight),
BooleanClause.Occur.SHOULD)" to your BooleanQuery.
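To make the relationship between weights and scores concrete, here is a minimal plain-Java sketch of the arithmetic only. It does not call Lucene. It assumes what the FeatureField documentation states, namely that a linear query scores a document as weight * S, where S is the stored feature value, and that a BooleanQuery adds up the scores of its matching SHOULD clauses. The class name, method name and sample weights are made up for illustration.

```java
import java.util.Map;

/** Plain-Java model of the scoring arithmetic; hypothetical names, no Lucene calls. */
final class SpladeScoreSketch {

    /**
     * Sums queryWeight * documentWeight over the terms present on both sides,
     * which is what one linear SHOULD clause per query term would add up to.
     */
    static float score(Map<String, Float> queryWeights, Map<String, Float> docWeights) {
        float total = 0f;
        for (Map.Entry<String, Float> clause : queryWeights.entrySet()) {
            Float stored = docWeights.get(clause.getKey());
            if (stored != null) { // a clause only contributes if the document has the term
                total += clause.getValue() * stored;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Weights a model might assign to a document at index time (invented values).
        Map<String, Float> doc = Map.of("rainforest", 2.0f, "jungle", 1.5f, "brazil", 1.0f);
        // Weights a model might assign to the expanded query (invented values).
        Map<String, Float> query = Map.of("jungle", 0.5f, "brazil", 1.0f, "animal", 0.75f);
        // jungle: 0.5 * 1.5 = 0.75, brazil: 1.0 * 1.0 = 1.0, animal: no match
        System.out.println(score(query, doc)); // prints 1.75
    }
}
```

In other words, the combined score is a sparse dot product between the query-side and document-side term weights. Scores from a real index can differ slightly, since Lucene stores feature values with reduced precision, and any other clause in the same BooleanQuery (such as a parsed text query) adds its own score on top.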
On Wed, Nov 15, 2023 at 11:08 AM Michael Wechner <mi...@wyona.com>
wrote:
> Hi Adrien
>
> Ah ok, I did not realize this, thanks for pointing this out!
>
> I don't quite understand though, how you would implement the "SPLADE"
> approach using FeatureField from the documentation at
>
>
> https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/document/FeatureField.html
>
> For example when indexing a document or doing a query and I use some
> language model (e.g. BERT) to do the term expansion, how
> do I then make use of FeatureField exactly?
>
> I tried to find some code examples, but couldn't, do you maybe have some
> pointers?
>
> Thanks
>
> Michael
--
Adrien
Re: SPLADE implementation
Posted by Michael Wechner <mi...@wyona.com>.
Hi Adrien
Ah ok, I did not realize this, thanks for pointing this out!
I don't quite understand though, how you would implement the "SPLADE"
approach using FeatureField from the documentation at
https://lucene.apache.org/core/9_8_0/core/org/apache/lucene/document/FeatureField.html
For example when indexing a document or doing a query and I use some
language model (e.g. BERT) to do the term expansion, how
do I then make use of FeatureField exactly?
I tried to find some code examples, but couldn't, do you maybe have some
pointers?
Thanks
Michael
On 15.11.23 at 10:34, Adrien Grand wrote:
> Hi Michael,
>
> What functionality are you missing? Lucene already supports
> indexing/querying weighted terms using FeatureField.
Re: SPLADE implementation
Posted by Adrien Grand <jp...@gmail.com>.
Hi Michael,
What functionality are you missing? Lucene already supports
indexing/querying weighted terms using FeatureField.
On Wed, Nov 15, 2023 at 10:03 AM Michael Wechner <mi...@wyona.com>
wrote:
> Hi
>
> I have found the following issue re a possible SPLADE implementation
>
> https://github.com/apache/lucene/issues/11799
>
> Is somebody still working on this?
>
> Thanks
>
> Michael
--
Adrien