Posted to dev@lucene.apache.org by "Joel Bernstein (JIRA)" <ji...@apache.org> on 2018/04/06 15:53:00 UTC
[jira] [Created] (SOLR-12197) Implement sampling for logistic regression classifier
Joel Bernstein created SOLR-12197:
-------------------------------------
Summary: Implement sampling for logistic regression classifier
Key: SOLR-12197
URL: https://issues.apache.org/jira/browse/SOLR-12197
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein
Currently, the *train* function trains a logistic regression model by iterating over the entire distributed training set on each pass. On each iteration, every shard builds a matrix with one row per training example stored on that shard and one column per feature. With large training sets and large feature sets, these matrices can become very large.
This ticket will add a *sample* parameter that limits the training set used on each iteration to a random sample of the full training set.
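
To illustrate the idea, here is a minimal Python sketch of per-iteration sampling on a single shard. It is not Solr's implementation: the names sample_rows, train, sample_size, alpha and seed are hypothetical, and it assumes the sample is a fixed row count drawn uniformly without replacement, which the ticket does not specify. The point it shows is that the matrix built on each iteration has sample_size rows instead of one row per training example on the shard.

```python
import math
import random


def sample_rows(rows, sample_size, rng):
    """Return a uniform random sample of rows, drawn without replacement.

    Assumption: sample_size is a row count; None means "use every row".
    """
    if sample_size is not None and sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    if sample_size is None or sample_size >= len(rows):
        return list(rows)
    return rng.sample(rows, sample_size)


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, num_features, iterations, sample_size=None, alpha=0.1, seed=0):
    """Gradient descent in which each iteration sees only a random sample.

    rows: non-empty list of (features, label) pairs held by one
    hypothetical shard. Returns the weights and the (rows, columns)
    shape of the matrix used on each iteration.
    """
    rng = random.Random(seed)
    weights = [0.0] * num_features
    matrix_shapes = []
    for _ in range(iterations):
        # A fresh sample is drawn on every iteration.
        batch = sample_rows(rows, sample_size, rng)
        matrix_shapes.append((len(batch), num_features))
        gradient = [0.0] * num_features
        for features, label in batch:
            score = sum(w * x for w, x in zip(weights, features))
            error = sigmoid(score) - label
            for j, x in enumerate(features):
                gradient[j] += error * x
        # Average the gradient over the rows actually used.
        weights = [w - alpha * g / len(batch) for w, g in zip(weights, gradient)]
    return weights, matrix_shapes
```

With 100 rows on the shard and sample_size=10, each iteration works on a 10-row matrix rather than a 100-row one. Leaving sample_size as None reproduces the current behaviour of using every row on every pass.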
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)