Posted to issues@solr.apache.org by "Alessandro Benedetti (Jira)" <ji...@apache.org> on 2023/02/21 10:24:00 UTC

[jira] [Updated] (SOLR-16596) LTR MultipleAdditiveTreeModel do not support missing features' value

     [ https://issues.apache.org/jira/browse/SOLR-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alessandro Benedetti updated SOLR-16596:
----------------------------------------
    Component/s: contrib - LTR

> LTR MultipleAdditiveTreeModel do not support missing features' value
> --------------------------------------------------------------------
>
>                 Key: SOLR-16596
>                 URL: https://issues.apache.org/jira/browse/SOLR-16596
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LTR
>            Reporter: Anna
>            Priority: Minor
>             Fix For: 9.2
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current MultipleAdditiveTreesModel does not support missing feature values.
> When a feature value is not supplied, the model treats it as zero.
> Other learning-to-rank libraries, such as XGBoost, distinguish missing values from all other values, including zero. They learn at training time how to handle a missing value and record, for each split, a "missing" branch pointing in the direction found to work best in that case.
> It would be useful to support this in Solr's MultipleAdditiveTreesModel as well. An additional "missing" parameter should be added to RegressionTreeNode to determine which branch to take when the feature value is missing.
> This change would let the model differentiate between zero-valued and missing features.
> For example, if the feature is "hotel_avg_review" (a rating between zero and five stars), the model should behave differently when the hotel has no reviews (its quality is unknown) than when its average review is zero stars (the hotel is bad).
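
The proposal can be illustrated with a minimal sketch. This is not Solr's implementation: the TreeNode class, its field names, the score function, the use of NaN to mark a missing value, the "value <= threshold goes left" convention, and all thresholds and leaf values are assumptions made up for the example. It only shows how a per-node "missing" direction lets a tree score a hotel with no reviews differently from a hotel rated zero stars, and how a node without that direction falls back to treating the missing value as zero, as described above.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TreeNode:
    """Hypothetical regression tree node (illustrative only, not Solr's class)."""
    value: Optional[float] = None       # leaf score; None on split nodes
    feature_index: int = -1             # index into the feature vector
    threshold: float = 0.0              # assumed rule: x <= threshold goes left
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None
    missing: Optional[str] = None       # "left", "right", or None (no direction)

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def score(node: TreeNode, features: Sequence[float]) -> float:
    """Walk one tree; NaN stands for a missing feature value (an assumption)."""
    while not node.is_leaf():
        x = features[node.feature_index]
        if math.isnan(x):
            if node.missing is not None:
                # Follow the direction stored on the node for missing values.
                node = node.left if node.missing == "left" else node.right
                continue
            # No direction stored: fall back to treating the value as zero.
            x = 0.0
        node = node.left if x <= node.threshold else node.right
    return node.value


# Feature 0 is "hotel_avg_review"; all numbers below are invented.
tree = TreeNode(
    feature_index=0, threshold=2.5, missing="right",
    left=TreeNode(value=-1.0),                      # poorly rated hotels
    right=TreeNode(
        feature_index=0, threshold=4.5, missing="left",
        left=TreeNode(value=0.2),                   # average or unknown quality
        right=TreeNode(value=1.0),                  # highly rated hotels
    ),
)

# Same root split without a "missing" direction: missing is treated as zero.
legacy = TreeNode(
    feature_index=0, threshold=2.5,
    left=TreeNode(value=-1.0),
    right=TreeNode(value=1.0),
)

NO_REVIEWS = float("nan")
zero_star_score = score(tree, [0.0])            # follows the "bad hotel" branch
no_review_score = score(tree, [NO_REVIEWS])     # follows the "missing" branches
legacy_no_review = score(legacy, [NO_REVIEWS])  # same as a zero-star hotel
```

In this sketch the hotel with no reviews reaches the neutral leaf in the first tree, while in the second tree it is indistinguishable from a zero-star hotel. Whether the missing direction would be stored per node, and how a missing value would be signalled to the model, is left open here.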



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
