Posted to dev@lucene.apache.org by "Christine Poerschke (JIRA)" <ji...@apache.org> on 2016/02/22 17:28:18 UTC

[jira] [Commented] (SOLR-8542) Integrate Learning to Rank into Solr

    [ https://issues.apache.org/jira/browse/SOLR-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15157231#comment-15157231 ] 

Christine Poerschke commented on SOLR-8542:
-------------------------------------------

Hello. Just a quick note to say that I'm resuming active work on this ticket, today focusing mainly on the [solr/contrib/ltr/src/java/org/apache/solr/ltr/rest|https://github.com/bloomberg/lucene-solr/tree/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/rest] classes.

*code comments/questions:*
* In [ManagedFeatureStore|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/rest/ManagedFeatureStore.java] and [ManagedModelStore|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/rest/ManagedModelStore.java] the doDeleteChild method never calls storeManagedData. Is that an oversight?
* ManagedFeatureStore.doGet throws an exception when the requested childId is not present. Might it instead return a response without features?
* The ManagedResource.doPut->ManagedFeatureStore.applyUpdatesToManagedData->update->addFeature call chain appears to throw an exception when a name being updated or added already exists. The [REST Wikipedia page|https://en.wikipedia.org/wiki/Representational_state_transfer] notes that PUT and DELETE are idempotent, so should a repeat of the same name simply replace the existing entry for that name?
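
To make the three points above concrete, here is a minimal sketch of the behaviour being asked about. It is not the actual ManagedFeatureStore code: the class {{IdempotentStoreSketch}} and all of its methods are hypothetical, and the {{persist()}} method only stands in for a storeManagedData call.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a managed store; NOT the real ManagedFeatureStore. */
class IdempotentStoreSketch {
  private final Map<String, Map<String, Object>> features = new LinkedHashMap<>();
  private int persistCount = 0; // counts calls to the hypothetical persist step

  /** PUT: a repeated name replaces the existing entry instead of throwing. */
  void put(String name, Map<String, Object> params) {
    features.put(name, params);
    persist();
  }

  /** DELETE: removing an absent name is a no-op; an actual removal is persisted. */
  boolean delete(String name) {
    boolean removed = features.remove(name) != null;
    if (removed) {
      persist();
    }
    return removed;
  }

  /** GET: an unknown name yields an empty result rather than an exception. */
  Map<String, Object> getOrEmpty(String name) {
    Map<String, Object> params = features.get(name);
    return params != null ? params : Collections.<String, Object>emptyMap();
  }

  int size() {
    return features.size();
  }

  int persistCount() {
    return persistCount;
  }

  /** Assumption: stands in for a storeManagedData call in a real ManagedResource. */
  private void persist() {
    persistCount++;
  }
}
```

With this shape, issuing the same PUT twice leaves a single entry holding the latest parameters, and a second DELETE of the same name changes nothing.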

*observations (question to follow):*
* ManagedFeatureStore.addFeature calls NameValidator.check and could throw an InvalidFeatureNameException
* ManagedFeatureStore.createFeature would throw an exception if Class.forName(type) finds no class or f.init(name, params, id) throws an exception
* ManagedModelStore.applyUpdatesToManagedData->update->makeModelMetaData throws an exception when the data has no features field or when there are other 'invalid input' type problems
* [LTRComponent|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/ranking/LTRComponent.java] uses ManagedFeatureStore and ManagedModelStore
* [LTRQParserPlugin|https://github.com/bloomberg/lucene-solr/blob/master-ltr-plugin-rfc/solr/contrib/ltr/src/java/org/apache/solr/ltr/ranking/LTRQParserPlugin.java] uses ManagedModelStore, and ManagedModelStore in turn uses ManagedFeatureStore

*question (for everyone, and perhaps more REST than LTR related):*
* To what extent should the REST/ManagedResource class only represent state, and to what extent should it also contain 'invalid input' type logic and associated error handling?
* If the represented state could be logically valid as well as invalid, might the state representation and the use of the represented state be separated out, perhaps along these lines in {{LTRComponent.inform(SolrCore core)}}?

{code}
core.getRestManager().addManagedResource(LTRParams.FSTORE_END_POINT, ManagedFeatureStoreInfo.class);
ManagedFeatureStoreInfo fri = (ManagedFeatureStoreInfo) core.getRestManager().getManagedResource(LTRParams.FSTORE_END_POINT);

core.getRestManager().addManagedResource(LTRParams.MSTORE_END_POINT, ManagedModelStoreInfo.class);
ManagedModelStoreInfo mri = (ManagedModelStoreInfo) core.getRestManager().getManagedResource(LTRParams.MSTORE_END_POINT);

LTRModelStore ltr_ms;
try {
  ltr_ms = new LTRModelStore(fri, mri);
} catch ... {
  // exception handling here
}
// TODO: do something here so that ltr_ms is available to LTRQParserPlugin
// question: would feature store and model store changes still propagate through to ltr_ms?
{code}
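
On the propagation question in the last comment, one possible approach (a sketch only, not part of the patch) is for the state-only resource to notify listeners on every change, with the validating layer rebuilding its view and keeping the last valid one when validation fails. All names below ({{RawStateSketch}}, {{ValidatedViewSketch}}, {{onChange}}) are hypothetical, and the validation rule is a placeholder.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical: holds raw state only and notifies listeners on every change. */
class RawStateSketch {
  private final Map<String, String> entries = new LinkedHashMap<>();
  private final List<Runnable> listeners = new ArrayList<>();

  void onChange(Runnable listener) {
    listeners.add(listener);
  }

  /** Stores the entry without any validation, then notifies listeners. */
  void put(String name, String type) {
    entries.put(name, type);
    for (Runnable listener : listeners) {
      listener.run();
    }
  }

  Map<String, String> snapshot() {
    return new LinkedHashMap<>(entries);
  }
}

/** Hypothetical: validates the raw state and keeps the last valid view. */
class ValidatedViewSketch {
  private final RawStateSketch raw;
  private Map<String, String> valid = new LinkedHashMap<>();
  private String lastError; // null when the latest raw state validated cleanly

  ValidatedViewSketch(RawStateSketch raw) {
    this.raw = raw;
    raw.onChange(this::rebuild); // raw changes propagate through this callback
    rebuild();
  }

  private void rebuild() {
    Map<String, String> candidate = raw.snapshot();
    for (Map.Entry<String, String> e : candidate.entrySet()) {
      // Placeholder rule: every entry needs a non-empty type.
      if (e.getValue() == null || e.getValue().isEmpty()) {
        lastError = "entry '" + e.getKey() + "' has no type";
        return; // keep serving the previous valid view
      }
    }
    valid = candidate;
    lastError = null;
  }

  int size() {
    return valid.size();
  }

  String lastError() {
    return lastError;
  }
}
```

In this arrangement the REST-facing class never throws on invalid input; the error is recorded where the state is used, and a later corrected update clears it.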

> Integrate Learning to Rank into Solr
> ------------------------------------
>
>                 Key: SOLR-8542
>                 URL: https://issues.apache.org/jira/browse/SOLR-8542
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joshua Pantony
>            Assignee: Christine Poerschke
>            Priority: Minor
>         Attachments: README.md, README.md, SOLR-8542-branch_5x.patch, SOLR-8542-trunk.patch
>
>
> This is a ticket to integrate learning to rank machine learning models into Solr. Solr Learning to Rank (LTR) provides a way for you to extract features directly inside Solr for use in training a machine learned model. You can then deploy that model to Solr and use it to rerank your top X search results. This concept was previously presented by the authors at Lucene/Solr Revolution 2015 ( http://www.slideshare.net/lucidworks/learning-to-rank-in-solr-presented-by-michael-nilsson-diego-ceccarelli-bloomberg-lp ).
> The attached code was jointly worked on by Joshua Pantony, Michael Nilsson, David Grohmann and Diego Ceccarelli.
> Any chance this could make it into a 5x release? We've also attached documentation as a github MD file, but are happy to convert to a desired format.
> h3. Test the plugin with solr/example/techproducts in 6 steps
> Solr provides some simple example indices. To test the plugin with 
> the techproducts example, please follow these steps:
> h4. 1. compile solr and the examples 
> cd solr
> ant dist
> ant example
> h4. 2. run the example
> ./bin/solr -e techproducts 
> h4. 3. stop it and install the plugin:
>    
> ./bin/solr stop
> mkdir example/techproducts/solr/techproducts/lib
> cp build/contrib/ltr/lucene-ltr-6.0.0-SNAPSHOT.jar example/techproducts/solr/techproducts/lib/
> cp contrib/ltr/example/solrconfig.xml example/techproducts/solr/techproducts/conf/
> h4. 4. run the example again
>     
> ./bin/solr -e techproducts
> h4. 5. index some features and a model
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/fstore'  --data-binary "@./contrib/ltr/example/techproducts-features.json"  -H 'Content-type:application/json'
> curl -XPUT 'http://localhost:8983/solr/techproducts/schema/mstore'  --data-binary "@./contrib/ltr/example/techproducts-model.json"  -H 'Content-type:application/json'
> h4. 6. have fun !
> *access to the default feature store*
> http://localhost:8983/solr/techproducts/schema/fstore/_DEFAULT_ 
> *access to the model store*
> http://localhost:8983/solr/techproducts/schema/mstore
> *perform a query using the model, and retrieve the features*
> http://localhost:8983/solr/techproducts/query?indent=on&q=test&wt=json&rq={!ltr%20model=svm%20reRankDocs=25%20efi.query=%27test%27}&fl=*,[features],price,score,name&fv=true


