Posted to user@spark.apache.org by Gerald Loeffler <ge...@googlemail.com> on 2015/08/07 10:45:40 UTC

miniBatchFraction for LinearRegressionWithSGD

hi,

if new LinearRegressionWithSGD() uses a miniBatchFraction of 1.0,
doesn’t that make it deterministic/classical gradient descent rather
than SGD?

Specifically, miniBatchFraction=1.0 means the entire data set, i.e.
all rows. In the spirit of SGD, shouldn’t the default be the fraction
that results in exactly one row of the data set?
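
To make the question concrete, here is a pure-Python sketch (not
Spark's actual code; the function name and details are illustrative
stand-ins) of Bernoulli sampling without replacement, the scheme
behind RDD.sample, at the two fractions being discussed:

```python
import random

def sample_minibatch(rows, fraction, seed):
    # Stand-in for RDD.sample(withReplacement=False, fraction, seed):
    # each row is kept independently with probability `fraction`.
    rng = random.Random(seed)
    return [r for r in rows if rng.random() < fraction]

rows = list(range(100))

# fraction = 1.0: every row passes the test, so the "mini batch"
# is the whole data set and each step is a classical gradient step.
assert sample_minibatch(rows, 1.0, seed=7) == rows

# fraction = 1/len(rows): roughly one row per iteration, on average.
batch = sample_minibatch(rows, 1.0 / len(rows), seed=7)
```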

thank you
gerald

-- 
Gerald Loeffler
mailto:gerald.loeffler@googlemail.com
http://www.gerald-loeffler.net

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: miniBatchFraction for LinearRegressionWithSGD

Posted by Koen Vantomme <ko...@gmail.com>.

Sent from my Sony Xperia™ smartphone


Re: miniBatchFraction for LinearRegressionWithSGD

Posted by Feynman Liang <fl...@databricks.com>.
Good point; I agree that defaulting to online SGD (single example per
iteration) would be a poor UX due to performance.
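
A back-of-envelope model makes the performance point visible (the
constants below are made up purely for illustration): each SGD
iteration in MLlib launches a distributed job, so a fixed per-job
scheduling overhead is paid once per iteration regardless of how many
rows the mini batch touches.

```python
def epoch_cost(num_rows, batch_size, job_overhead=0.1, row_cost=1e-6):
    # Cost of one pass over the data: (iterations per epoch) x
    # (fixed per-job overhead + work proportional to the batch size).
    iterations = num_rows / batch_size
    return iterations * (job_overhead + batch_size * row_cost)

n = 1_000_000
full_batch = epoch_cost(n, n)  # one job per epoch
online = epoch_cost(n, 1)      # one job per row

# Online SGD pays the job overhead a million times per epoch.
assert online > 1000 * full_batch
```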


Re: miniBatchFraction for LinearRegressionWithSGD

Posted by Meihua Wu <ro...@gmail.com>.
Feynman, thanks for clarifying.

If we default miniBatchFraction = (1 / numInstances), then we will
only hit one row for every iteration of SGD, regardless of the number
of partitions and executors. In other words, the parallelism provided
by the RDD is lost in this approach. I think this is something we need
to consider for the default value of miniBatchFraction.
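
To illustrate with a toy simulation (assumed partition counts, not an
actual Spark run): Bernoulli sampling at fraction = 1/numInstances
leaves almost every partition empty in a given iteration, so most
executors have nothing to compute.

```python
import random

def rows_per_partition(partition_sizes, fraction, seed=0):
    # Bernoulli-sample each partition independently and report how
    # many rows each partition contributes to the mini batch.
    rng = random.Random(seed)
    return [sum(1 for _ in range(size) if rng.random() < fraction)
            for size in partition_sizes]

partitions = [25_000] * 4  # 100k rows spread over 4 partitions
counts = rows_per_partition(partitions, fraction=1 / 100_000)

# Roughly one sampled row in total, so in a typical iteration at most
# one partition has any work while the other executors sit idle.
assert sum(counts) < 10
```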




Re: miniBatchFraction for LinearRegressionWithSGD

Posted by Feynman Liang <fl...@databricks.com>.
Yep, I think that's what Gerald is saying and they are proposing to default
miniBatchFraction = (1 / numInstances). Is that correct?

>

Re: miniBatchFraction for LinearRegressionWithSGD

Posted by Meihua Wu <ro...@gmail.com>.
I think in the SGD algorithm, the mini batch sample is done without
replacement. So with fraction=1, all the rows will be sampled
exactly once to form the mini batch, resulting in the
deterministic/classical case.
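
A small sketch of that point (a hand-rolled least-squares gradient,
not MLlib's implementation): with fraction=1.0 the without-replacement
sample is always the full data set, so the gradient is identical for
every seed.

```python
import random

def minibatch_gradient(data, w, fraction, seed):
    # Least-squares gradient d/dw of mean((w*x - y)^2), computed over
    # a Bernoulli sample of the rows (sampling without replacement).
    rng = random.Random(seed)
    batch = [(x, y) for x, y in data if rng.random() < fraction]
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

data = [(float(i), 2.0 * i) for i in range(1, 11)]

# fraction = 1.0: every seed yields the same full-batch gradient,
# i.e. deterministic/classical gradient descent.
g1 = minibatch_gradient(data, 0.0, 1.0, seed=1)
g2 = minibatch_gradient(data, 0.0, 1.0, seed=999)
assert g1 == g2
```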

>



Re: miniBatchFraction for LinearRegressionWithSGD

Posted by Feynman Liang <fl...@databricks.com>.
Sounds reasonable to me, feel free to create a JIRA (and PR if you're up
for it) so we can see what others think!
