Posted to solr-dev@lucene.apache.org by "Kevin Osborn (JIRA)" <ji...@apache.org> on 2009/08/18 01:58:15 UTC

[jira] Created: (SOLR-1365) Add configurable Sweetspot Similarity factory

Add configurable Sweetspot Similarity factory
---------------------------------------------

                 Key: SOLR-1365
                 URL: https://issues.apache.org/jira/browse/SOLR-1365
             Project: Solr
          Issue Type: New Feature
    Affects Versions: 1.3
            Reporter: Kevin Osborn
            Priority: Minor
             Fix For: 1.4


This is some code that I wrote a while back.

Normally, if you use SweetSpotSimilarity, you make it do something useful by extending SweetSpotSimilarity. So, instead, I made a factory class and a configurable SweetSpotSimilarity. There are two classes. SweetSpotSimilarityFactory reads the parameters from schema.xml. It then creates an instance of VariableSweetSpotSimilarity, which is my custom SweetSpotSimilarity class. In addition to the standard functions, it also handles per-field settings, including dynamic fields: a field name or a glob such as supplierDescription_* is appended to the parameter name.

So, in schema.xml, you could have something like this:

<similarity class="org.apache.solr.schema.SweetSpotSimilarityFactory">
    <bool name="useHyperbolicTf">true</bool>

    <float name="hyperbolicTfFactorsMin">1.0</float>
    <float name="hyperbolicTfFactorsMax">1.5</float>
    <float name="hyperbolicTfFactorsBase">1.3</float>
    <float name="hyperbolicTfFactorsXOffset">2.0</float>

    <int name="lengthNormFactorsMin">1</int>
    <int name="lengthNormFactorsMax">1</int>
    <float name="lengthNormFactorsSteepness">0.5</float>

    <int name="lengthNormFactorsMin_description">2</int>
    <int name="lengthNormFactorsMax_description">9</int>
    <float name="lengthNormFactorsSteepness_description">0.2</float>

    <int name="lengthNormFactorsMin_supplierDescription_*">2</int>
    <int name="lengthNormFactorsMax_supplierDescription_*">7</int>
    <float name="lengthNormFactorsSteepness_supplierDescription_*">0.4</float>
</similarity>

So, now everything is in a config file instead of having to create your own subclass.
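
For readers unfamiliar with what the lengthNormFactors* values control, here is a minimal standalone sketch of the length-norm curve, assuming the formula used by Lucene's SweetSpotSimilarity.computeLengthNorm. Fields whose term count falls between min and max get a norm of 1.0, and the norm falls off outside that range at a rate set by steepness. The class and method names are hypothetical, and this is not the code in the patch.

```java
// Hypothetical standalone sketch; not the patch's VariableSweetSpotSimilarity.
final class LengthNormSketch {

    /**
     * Length norm with a flat "sweet spot" plateau between min and max terms.
     * Formula assumed from Lucene's SweetSpotSimilarity.computeLengthNorm:
     *   1 / sqrt(steepness * (|n - min| + |n - max| - (max - min)) + 1)
     */
    static float lengthNorm(int numTerms, int min, int max, float steepness) {
        // Zero inside [min, max]; grows linearly with the distance outside it.
        int distance = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return (float) (1.0 / Math.sqrt(steepness * distance + 1.0));
    }

    public static void main(String[] args) {
        // "description" settings from the example config: min=2, max=9, steepness=0.2
        System.out.println(lengthNorm(5, 2, 9, 0.2f));  // inside the plateau
        System.out.println(lengthNorm(20, 2, 9, 0.2f)); // longer than the plateau, so penalized
    }
}
```

With the default settings above (min=1, max=1), the plateau is a single point, so only one-term fields get the full norm.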

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Kevin Osborn (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Osborn updated SOLR-1365:
-------------------------------

    Attachment: SOLR-1365.patch



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744434#action_12744434 ] 

Erik Hatcher commented on SOLR-1365:
------------------------------------

Any class loaded by SolrResourceLoader (any custom plugin, basically) can implement SolrCoreAware.



[jira] Updated: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Grant Ingersoll (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Ingersoll updated SOLR-1365:
----------------------------------

    Fix Version/s:     (was: 1.4)
                   1.5

Needs tests.  Not sure this will make 1.4, as we are trying to not add new features at this point.



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12834938#action_12834938 ] 

Hoss Man commented on SOLR-1365:
--------------------------------

The constraints on what can be SolrCoreAware exist for two main reasons:

 # to ensure some sanity in initialization .. one of the main reasons the SolrCoreAware interface was needed in the first place was because some plugins wanted to use the SolrCore to get access to other plugins during their initialization -- but those other components weren't necessarily initialized yet.  with the inform(SolrCore) method SolrCoreAware plugins know that all other components have been initialized, but they haven't necessarily been informed about the SolrCore, so they might not be "ready" to deal with other plugins yet ... it's generally just a big initialization-cluster-fuck, so the fewer classes involved the better
 # prevent too much pollution of the SolrCore API.  having direct access to the SolrCore is "a big deal" -- once you have a reference to the core, you can get to pretty much anything, which opens us (ie: Solr maintainers) up to a lot of crazy code paths to worry about -- so the fewer plugin types that we need to consider when making changes to SolrCore the better.
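
To illustrate the first point, here is a rough sketch of the two-phase pattern, using made-up stand-in types rather than Solr's real SolrCore and SolrCoreAware. Every plugin is initialized first, and only afterwards are the "aware" plugins handed the core, so inform() can rely on other plugins having been initialized. All names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are NOT Solr's real SolrCore / SolrCoreAware types.
interface Plugin { void init(); }
interface CoreAware { void inform(Core core); }

final class Core {
    private final List<Plugin> plugins = new ArrayList<>();

    void register(Plugin p) { plugins.add(p); }

    /** Phase 1: init every plugin. Phase 2: only then hand the core to aware plugins. */
    void start() {
        for (Plugin p : plugins) {
            p.init();
        }
        for (Plugin p : plugins) {
            if (p instanceof CoreAware) {
                ((CoreAware) p).inform(this);
            }
        }
    }
}

final class SimplePlugin implements Plugin {
    boolean initialized;
    public void init() { initialized = true; }
}

final class DependentPlugin implements Plugin, CoreAware {
    private final SimplePlugin dependency;
    boolean sawInitializedDependency;

    DependentPlugin(SimplePlugin dependency) { this.dependency = dependency; }

    public void init() {
        // Too early to touch the dependency: it may not be initialized yet.
    }

    public void inform(Core core) {
        // Safe: every registered plugin has completed init() by now.
        sawInitializedDependency = dependency.initialized;
    }
}
```

Even if DependentPlugin is registered before SimplePlugin, it sees an initialized dependency in inform(). What it cannot assume is that the other plugin has itself been informed yet, which is the ordering problem described above.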

In the case of SimilarityFactory, i'm not entirely sure how i feel about making it SolrCoreAware(able) ... we have tried really, REALLY hard to make sure nothing initialized as part of the IndexSchema can be SolrCore aware because it opens up the possibility of plugin behavior being affected by SolrCore configuration which might be different between master and slave machines -- which could produce disastrous results.  a schema.xml needs to be internally consistent regardless of what solrconfig.xml might reference it.

In this case the real issue isn't that we have a use case where SimilarityFactory _needs_ access to SolrCore -- what it wants access to is the IndexSchema, so it might make sense to just provide access to that in some way w/o having to expose the entire SolrCore.

Practically speaking, after re-skimming the patch: I'm not even convinced that would really add anything.  refactoring/reusing some of the *code* that IndexSchema uses to manage dynamicFields might be handy for the SweetSpotSimilarityFactory, but i don't actually see how being able to inspect the IndexSchema to get the list of dynamicFields (or find out if a field is dynamic) would make it any better or easier to use.  We'd still want people to configure it with field names and field name globs directly because there won't necessarily be a one to one correspondence between what fields are dynamic in the schema and how you want the sweetspots defined ... you might have a generic "en_*" dynamicField in your schema for english text, and an "fr_*" dynamicField for french text, but that doesn't mean the sweetspot for all "fr_*" fields will be the same ... you are just as likely to want some very specific field names to have their own sweetspot, or to have the sweetspot be suffix based (ie: "*_title" could have one sweetspot even if the resulting field names are fr_title and en_title).

I think the patch could be improved, and i think there is definitely some code reuse possibility for parsing the field name globs, but i don't know that it really needs run time access to the IndexSchema (and it definitely doesn't need access to the SolrCore)
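
As a rough illustration of configuring sweetspots by field name and field name glob without touching the IndexSchema, something along these lines might do. This is only a sketch: the class name, the precedence rules (exact name, then first matching glob, then default), and the single leading-or-trailing "*" restriction are all assumptions, not what the patch or IndexSchema actually does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names are illustrative and not taken from the patch or IndexSchema.
final class FieldGlobSettings<T> {
    private final Map<String, T> exact = new LinkedHashMap<>();
    private final Map<String, T> globs = new LinkedHashMap<>(); // insertion order = priority
    private final T fallback;

    FieldGlobSettings(T fallback) { this.fallback = fallback; }

    /** Registers an exact field name, or a glob with one leading or trailing '*'. */
    void put(String pattern, T value) {
        if (pattern.startsWith("*") || pattern.endsWith("*")) {
            globs.put(pattern, value);
        } else {
            exact.put(pattern, value);
        }
    }

    /** Exact names win, then the first matching glob, then the global default. */
    T lookup(String fieldName) {
        T hit = exact.get(fieldName);
        if (hit != null) {
            return hit;
        }
        for (Map.Entry<String, T> e : globs.entrySet()) {
            if (matches(e.getKey(), fieldName)) {
                return e.getValue();
            }
        }
        return fallback;
    }

    /** "*_title" is a suffix match; "supplierDescription_*" is a prefix match. */
    static boolean matches(String glob, String fieldName) {
        if (glob.startsWith("*")) {
            return fieldName.endsWith(glob.substring(1));
        }
        return fieldName.startsWith(glob.substring(0, glob.length() - 1));
    }
}
```

With "description", "supplierDescription_*" and a made-up "*_title" rule registered, fr_title and en_title resolve to the same "*_title" value regardless of which dynamicField produced them, and unlisted fields fall back to the default.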



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Kevin Osborn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744376#action_12744376 ] 

Kevin Osborn commented on SOLR-1365:
------------------------------------

Thanks for the feedback. I looked at IndexSchema. It seems like the only useful function in my case is using isDynamicField vs. seeing if the field name ends with a "*".

But also is SimilarityFactory allowed to implement SolrCoreAware? I'm not too familiar with this interface, but my initial research shows that only SolrRequestHandler, QueryResponseWriter, SearchComponent, or UpdateRequestProcessorFactory may implement SolrCoreAware. Is this correct?



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832973#action_12832973 ] 

Erik Hatcher commented on SOLR-1365:
------------------------------------

I'm not really sure why we have that constraint in SolrResourceLoader, and why any class we load can't simply implement SolrCoreAware.  But at the very least, we can update this to support a SimilarityFactory for the sake of this issue.  +1



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744309#action_12744309 ] 

Erik Hatcher commented on SOLR-1365:
------------------------------------

> I took a brief look at the patch, the only feedback I have is that I believe that the dynamic field handling might be able to leverage some of Solr's built-in logic in IndexSchema. But how can a SimilarityFactory get access to that? Hmmm....?

Why, by implementing SolrCoreAware, of course.



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744643#action_12744643 ] 

Hoss Man commented on SOLR-1365:
--------------------------------

FWIW: if a new feature doesn't have any impact on existing users, and has good tests, then i say we might as well commit it for 1.4

 (If we were talking about a new feature on an existing component, then i'd be hesitant because of how that feature might impact existing users of that component -- but in this case even if it has bad performance or some small bug that slips through tests, people have to go out of their way to use it)

But Grant's right: needs tests before it's really a subject for debate.



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Kevin Osborn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832711#action_12832711 ] 

Kevin Osborn commented on SOLR-1365:
------------------------------------

I am finally getting back around to this, and I am having trouble implementing SolrCoreAware. SolrResourceLoader has a method called assertAwareCompatibility, which throws an exception because my class does not extend SolrRequestHandler, QueryResponseWriter, SearchComponent, or UpdateRequestProcessorFactory. Am I missing anything?



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Kevin Osborn (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744682#action_12744682 ] 

Kevin Osborn commented on SOLR-1365:
------------------------------------

Thanks for the comments. I'll make the changes for Erik's suggestions and come up with some tests. If it gets into 1.4, great. If not, then it is not a huge deal since this is already production code for us. But, if it could be put into the main code base, then even better.



[jira] Commented: (SOLR-1365) Add configurable Sweetspot Similarity factory

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744306#action_12744306 ] 

Erik Hatcher commented on SOLR-1365:
------------------------------------

Sweet!  :)

Very nice use of the SimilarityFactory capability.  

I took a brief look at the patch; the only feedback I have is that the dynamic field handling might be able to leverage some of Solr's built-in logic in IndexSchema.  But how can a SimilarityFactory get access to that?   Hmmm....?
