Posted to dev@spark.apache.org by Dhanesh Padmanabhan <dh...@gmail.com> on 2017/03/19 11:08:22 UTC

Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

Binomial. In Spark 2.0.2, please use it in combination with OneVsRest for
multi-class problems.
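
As a rough sketch of the suggestion above (spark-shell style, not a tested recipe), a binomial LogisticRegression can be wrapped in Spark ML's OneVsRest estimator. The helper name `fitMulticlass` is made up for illustration, and the input is assumed to be a DataFrame with the usual "label" and "features" columns.

```scala
import org.apache.spark.ml.classification.{LogisticRegression, OneVsRest, OneVsRestModel}
import org.apache.spark.sql.DataFrame

// Binary base learner; per the reply above, Spark 2.0.2 fits a binomial model.
val lr = new LogisticRegression()

// OneVsRest trains one copy of the base classifier per class.
val ovr = new OneVsRest().setClassifier(lr)

// Hypothetical helper: `training` is assumed to have "label" and "features" columns.
def fitMulticlass(training: DataFrame): OneVsRestModel = ovr.fit(training)
```

Calling `fitMulticlass(trainingData).transform(testData)` would then produce a "prediction" column, much like the single-model code in the original question.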

Dhanesh
+91-9741125245

On Sun, Mar 19, 2017 at 4:29 PM, jinhong lu <lu...@gmail.com> wrote:

> By the way, I found that in Spark 2.1 I can use setFamily() to choose
> binomial or multinomial, but how can I do the same thing in Spark 2.0.2?
> If it is not supported, which one is used in Spark 2.0.2: binomial or
> multinomial?
>
> On Mar 19, 2017, at 18:12, jinhong lu <lu...@gmail.com> wrote:
>
>
> I train my LogisticRegressionModel like this. I want my model to retain
> only some of the features (e.g. 500 of them), not all 5555 features.
> What should I do?
> I use .setElasticNetParam(1.0), but all the features are still
> in lrModel.coefficients.
>
>  import org.apache.spark.ml.classification.LogisticRegression
>  val data = spark.read.format("libsvm").option("numFeatures", "5555").load("/tmp/data/training_data3")
>  val Array(trainingData, testData) = data.randomSplit(Array(0.5, 0.5), seed = 1234L)
>
>  val lr = new LogisticRegression()
>  val lrModel = lr.fit(trainingData)
>  println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
>
>  val predictions = lrModel.transform(testData)
>  predictions.show()
>
>
> Thanks,
> lujinhong
>
>
> Thanks,
> lujinhong
>
>

Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

Posted by "颜发才 (Yan Facai)" <fa...@gmail.com>.

Hi, jinhong.
Did you call `setRegParam`? It is 0.0 by default.

Both elasticNetParam and regParam are required if regularization is needed.
With regParam at 0.0, both penalties below are zero whatever elasticNetParam is:

val regParamL1 = $(elasticNetParam) * $(regParam)
val regParamL2 = (1.0 - $(elasticNetParam)) * $(regParam)
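
To make the two expressions above concrete, here is a minimal sketch in plain Scala. Ordinary Doubles stand in for the Spark params `$(elasticNetParam)` and `$(regParam)`; the function name `effectivePenalties` and the value 0.1 are illustrative assumptions, not anything from Spark itself.

```scala
// Mirrors the two expressions quoted above, with plain Doubles standing in
// for the Spark params $(elasticNetParam) and $(regParam).
def effectivePenalties(elasticNetParam: Double, regParam: Double): (Double, Double) = {
  val regParamL1 = elasticNetParam * regParam
  val regParamL2 = (1.0 - elasticNetParam) * regParam
  (regParamL1, regParamL2)
}

// Default regParam = 0.0: no penalty at all, even with elasticNetParam = 1.0.
val noReg = effectivePenalties(elasticNetParam = 1.0, regParam = 0.0)

// Pure L1 once regParam is positive (0.1 is an arbitrary illustrative value).
val pureL1 = effectivePenalties(elasticNetParam = 1.0, regParam = 0.1)

println(s"regParam=0.0 -> L1=${noReg._1}, L2=${noReg._2}")
println(s"regParam=0.1 -> L1=${pureL1._1}, L2=${pureL1._2}")
```

On the estimator itself this would correspond to something like `new LogisticRegression().setElasticNetParam(1.0).setRegParam(0.1)`, with the regParam value tuned for the data at hand.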




On Mon, Mar 20, 2017 at 6:31 PM, Yanbo Liang <yb...@gmail.com> wrote:

> Do you want a sparse model in which most of the coefficients are zero?
> If so, L1 regularization leads to sparsity. However, the size of the
> LogisticRegressionModel coefficients vector still equals the number of
> features, so you have to pick out the non-zero elements manually. In fact,
> it will be a sparse vector (or matrix, in the multinomial case) if it is
> sparse enough.
>
> Thanks
> Yanbo
>
> On Sun, Mar 19, 2017 at 5:02 AM, Dhanesh Padmanabhan <dhanesh123us@gmail.com> wrote:
>
>> It shouldn't be difficult to convert the coefficients to a sparse vector.
>> Not sure if that is what you are looking for
>>
>> -Dhanesh
>>
>> On Sun, Mar 19, 2017 at 5:02 PM jinhong lu <lu...@gmail.com> wrote:
>>
>> Thanks Dhanesh, and what about the features question?
>>
>> On Mar 19, 2017, at 19:08, Dhanesh Padmanabhan <dh...@gmail.com> wrote:
>>
>> Dhanesh
>>
>>
>> Thanks,
>> lujinhong
>>
>> --
>> Dhanesh
>> +91-9741125245
>>
>
>

Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

Posted by Yanbo Liang <yb...@gmail.com>.

Do you want a sparse model in which most of the coefficients are zero? If
so, L1 regularization leads to sparsity. However, the size of the
LogisticRegressionModel coefficients vector still equals the number of
features, so you have to pick out the non-zero elements manually. In fact,
it will be a sparse vector (or matrix, in the multinomial case) if it is
sparse enough.
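
Picking out the non-zero elements manually could look roughly like the sketch below. It is plain Scala over an Array[Double]; for a fitted model the input would presumably be `lrModel.coefficients.toArray`. The function name `nonZeroFeatures` and the toy values are assumptions made for illustration.

```scala
// Returns (featureIndex, coefficient) pairs for every non-zero coefficient.
def nonZeroFeatures(coefficients: Array[Double]): Seq[(Int, Double)] =
  coefficients.toSeq.zipWithIndex.collect { case (w, i) if w != 0.0 => (i, w) }

// Toy stand-in for lrModel.coefficients.toArray after L1 regularization.
val toy = Array(0.0, 1.5, 0.0, 0.0, -0.3)
val kept = nonZeroFeatures(toy)

println(s"Retained ${kept.size} of ${toy.length} features: $kept")
```

The indices in `kept` identify the features the model actually uses. How many survive depends on the regularization strength, so getting close to a target such as 500 would take some tuning of regParam.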

Thanks
Yanbo

On Sun, Mar 19, 2017 at 5:02 AM, Dhanesh Padmanabhan <dhanesh123us@gmail.com> wrote:

> It shouldn't be difficult to convert the coefficients to a sparse vector.
> Not sure if that is what you are looking for
>
> -Dhanesh
>
> On Sun, Mar 19, 2017 at 5:02 PM jinhong lu <lu...@gmail.com> wrote:
>
> Thanks Dhanesh, and what about the features question?
>
> On Mar 19, 2017, at 19:08, Dhanesh Padmanabhan <dh...@gmail.com> wrote:
>
> Dhanesh
>
>
> Thanks,
> lujinhong
>
> --
> Dhanesh
> +91-9741125245
>

Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

Posted by Dhanesh Padmanabhan <dh...@gmail.com>.

It shouldn't be difficult to convert the coefficients to a sparse vector.
Not sure if that is what you are looking for

-Dhanesh

On Sun, Mar 19, 2017 at 5:02 PM jinhong lu <lu...@gmail.com> wrote:

Thanks Dhanesh, and what about the features question?

On Mar 19, 2017, at 19:08, Dhanesh Padmanabhan <dh...@gmail.com> wrote:

Dhanesh


Thanks,
lujinhong

-- 
Dhanesh
+91-9741125245

Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

Posted by jinhong lu <lu...@gmail.com>.

Thanks Dhanesh, and what about the features question?

> On Mar 19, 2017, at 19:08, Dhanesh Padmanabhan <dh...@gmail.com> wrote:
> 
> Dhanesh

Thanks,
lujinhong
