
Posted to user@mahout.apache.org by damodar shetyo <ak...@gmail.com> on 2012/06/28 15:59:19 UTC

Continued : simple OnlineLogisticRegression classication example using mahout

This post is a continuation of another mailing thread that is still going
on. Sorry for creating a new thread, but I was not getting mails from the
group before.

The following code was implemented by Ted Dunning. Now I have a few questions:

1) The point (x,y) has 2 dimensions. Why are we using 3 instead of 2
when creating the DenseVector?
  Vector v = new DenseVector(3);   // why 3, why not 2?

2) In the getVector method, why do we set v.set(2, 1)?

3) What is the use of setting lambda?

4) What happens if I increase or decrease the learning rate?

I have read the book "Mahout in Action" and am not able to understand
what these 2 parameters (lambda and the learning rate) are for.


import com.google.common.collect.Lists;
import org.apache.mahout.classifier.sgd.L1;
import org.apache.mahout.classifier.sgd.OnlineLogisticRegression;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;

import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ClassifierExample {

    public static class Point {
        public int x;
        public int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

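        @Override
        public boolean equals(Object other) {
            if (!(other instanceof Point)) {
                return false;
            }
            Point p = (Point) other;
            return this.x == p.x && this.y == p.y;
        }

        // equals and hashCode must agree for Point to work as a HashMap key
        @Override
        public int hashCode() {
            return 31 * x + y;
        }

        @Override
        public String toString() {
            return this.x + " , " + this.y;
        }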
    }

    public static void main(String[] args) {

        Map<Point, Integer> points = new HashMap<Point, Integer>();

        points.put(new Point(0, 0), 0);
        points.put(new Point(1, 1), 0);
        points.put(new Point(1, 0), 0);
        points.put(new Point(0, 1), 0);
        points.put(new Point(2, 2), 0);


        points.put(new Point(8, 8), 1);
        points.put(new Point(8, 9), 1);
        points.put(new Point(9, 8), 1);
        points.put(new Point(9, 9), 1);


        // 2 categories, 3 features (x, y, and the constant bias term), L1 prior
        OnlineLogisticRegression learningAlgo =
                new OnlineLogisticRegression(2, 3, new L1());
        // this is a really big value which will make the model very cautious
        // for lambda = 0.1, the first example below should be about .83 certain
        // for lambda = 0.01, the first example below should be about 0.98 certain

        learningAlgo.lambda(0.1);
        learningAlgo.learningRate(4);

        System.out.println("training model  \n");
        final List<Point> keys = Lists.newArrayList(points.keySet());
        // 200 times through the training data is probably over-kill.
        // It doesn't matter for tiny data.  The key here is total number
        // of points seen, not number of passes.

        for (int i = 0; i < 200; i++) {
            // randomize training data on each iteration
            Collections.shuffle(keys);
            for (Point point : keys) {
                Vector v = getVector(point);
                learningAlgo.train(points.get(point), v);
            }
        }
        learningAlgo.close();


        //now classify real data
        Vector v = new RandomAccessSparseVector(3);
        v.set(0, 0.5);
        v.set(1, 0.5);
        v.set(2, 1);

        Vector r = learningAlgo.classifyFull(v);
        System.out.println(r);

        System.out.println("ans = ");
        System.out.printf("no of categories = %d\n", learningAlgo.numCategories());
        System.out.printf("no of features = %d\n", learningAlgo.numFeatures());
        System.out.printf("Probability of cluster 0 = %.3f\n", r.get(0));
        System.out.printf("Probability of cluster 1 = %.3f\n", r.get(1));

        v.set(0, 4.5);
        v.set(1, 6.5);
        v.set(2, 1);

        r = learningAlgo.classifyFull(v);

        System.out.println("ans = ");
        System.out.printf("no of categories = %d\n", learningAlgo.numCategories());
        System.out.printf("no of features = %d\n", learningAlgo.numFeatures());
        System.out.printf("Probability of cluster 0 = %.3f\n", r.get(0));
        System.out.printf("Probability of cluster 1 = %.3f\n", r.get(1));

        // show how the score varies along the diagonal from (0,0) to (9.9,9.9)
        System.out.printf("\nx\tscore\n");
        for (int i = 0; i < 100; i++) {
            final double x = 0.0 + i / 10.0;
            v.set(0, x);
            v.set(1, x);
            v.set(2, 1);

            r = learningAlgo.classifyFull(v);

            System.out.printf("%.2f\t%.3f\n", x, r.get(1));
        }

    }

    public static Vector getVector(Point point) {
        Vector v = new DenseVector(3);
        v.set(0, point.x);
        v.set(1, point.y);
        v.set(2, 1);

        return v;
    }
}


-- 
Regards,
Damodar Shetyo

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Ted Dunning <te...@gmail.com>.

You should pretty much always use 1.

On Mon, Jul 2, 2012 at 7:39 PM, Lance Norskog <go...@gmail.com> wrote:

> Should 1 be the default?

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Lance Norskog <go...@gmail.com>.

Should 1 be the default?

On Mon, Jul 2, 2012 at 6:19 AM, Ted Dunning <te...@gmail.com> wrote:
> To be clear, you can set it to any non-zero value you like but the
> algorithm will adjust what it learns and you will get the same result.
>
> Setting the intercept to anything except 1 is thus a way to make your
> program perversely hard for somebody to understand when they read your
> program.



-- 
Lance Norskog
goksron@gmail.com

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Ted Dunning <te...@gmail.com>.

To be clear, you can set it to any non-zero value you like but the
algorithm will adjust what it learns and you will get the same result.

Setting the intercept to anything except 1 is thus a way to make your
program perversely hard for somebody to understand when they read your
program.
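
A minimal sketch of why the constant does not matter, in plain Java rather
than Mahout. The class name, the weights, and the constant c = 5 are made up
for illustration. If the constant feature is c instead of 1, a bias weight
that is 1/c times as large gives the same logit, so the learner can
compensate. This equivalence is exact only for the unregularized model; if
the regularizer also penalizes the bias weight, the learned values may
differ slightly.

```java
class BiasScaleSketch {
    // Logit of a linear model whose third feature is the constant biasValue.
    public static double logit(double[] w, double x, double y, double biasValue) {
        return w[0] * x + w[1] * y + w[2] * biasValue;
    }

    public static void main(String[] args) {
        double c = 5.0;                           // any non-zero constant (hypothetical)
        double[] wWithOne = {1.0, 1.0, -10.0};    // hand-picked weights, bias feature = 1
        double[] wWithC = {1.0, 1.0, -10.0 / c};  // same model, bias weight rescaled by 1/c

        // Both lines print the same logit for the point (4.5, 6.5).
        System.out.println(logit(wWithOne, 4.5, 6.5, 1.0));
        System.out.println(logit(wWithC, 4.5, 6.5, c));
    }
}
```

Using 1 simply keeps the bias weight directly readable as the intercept.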

On Mon, Jul 2, 2012 at 12:59 AM, Sean Owen <sr...@gmail.com> wrote:

> No just set the bias term to 1 in all cases.

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Sean Owen <sr...@gmail.com>.

No just set the bias term to 1 in all cases.
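
As a rough illustration, the same recipe applies whatever the features
represent (word counts for spam, news categories, and so on): append a
constant 1 to every feature vector. The helper below is hypothetical and
uses plain arrays rather than Mahout's Vector type.

```java
import java.util.Arrays;

class BiasFeatureSketch {
    // Returns a copy of the features with a constant 1.0 appended as the bias term.
    public static double[] withBias(double[] features) {
        double[] out = Arrays.copyOf(features, features.length + 1);
        out[features.length] = 1.0;
        return out;
    }

    public static void main(String[] args) {
        // The point (0.5, 0.5) becomes a 3-element vector ending in the bias term.
        System.out.println(Arrays.toString(withBias(new double[] {0.5, 0.5})));
    }
}
```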

On Mon, Jul 2, 2012 at 10:13 AM, damodar shetyo <ak...@gmail.com> wrote:
> Is it required that i set the bias(intercept) equal to one only?Or can i
> set it to any constant value x?
>
> Also How can choose value of bias for different types of data (other data
> like  spam .non spam /email data etc , or assigning category to news items)

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by damodar shetyo <ak...@gmail.com>.

Is it required that I set the bias (intercept) to exactly one, or can I
set it to any constant value x?

Also, how do I choose the value of the bias for different types of data
(for example spam / non-spam email data, or assigning categories to news
items)?

Regards,
Damodar

On Thu, Jun 28, 2012 at 9:02 PM, Ted Dunning <te...@gmail.com> wrote:

> http://alias-i.com/lingpipe/demos/tutorial/logistic-regression/read-me.html
>
> Search for intercept.
>
> Another way to look at this is that the model is trying to find a line that
> separates your examples.  Without the constant (intercept) term, all of
> these lines will have to go through the origin.  For your data, this isn't
> going to find a usable model.  Adding the 1 allows the lines to not go
> through the origin.



-- 
Regards,
Damodar Shetyo

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Ted Dunning <te...@gmail.com>.

http://alias-i.com/lingpipe/demos/tutorial/logistic-regression/read-me.html

Search for intercept.

Another way to look at this is that the model is trying to find a line that
separates your examples.  Without the constant (intercept) term, all of
these lines will have to go through the origin.  For your data, this isn't
going to find a usable model.  Adding the 1 allows the lines to not go
through the origin.
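
A small sketch of that point in plain Java (not Mahout code; the class name
and the weights are hand-picked for illustration, not learned). With a zero
bias weight the logit at the origin is always 0, so the decision line passes
through (0,0), and points such as (2,2) and (8,8) that lie on the same ray
from the origin always fall on the same side. A non-zero bias weight moves
the line away from the origin.

```java
class InterceptSketch {
    // Probability of class 1 for the point (x, y); the third feature is the constant 1.
    public static double score(double wx, double wy, double wBias, double x, double y) {
        double logit = wx * x + wy * y + wBias * 1.0;
        return 1.0 / (1.0 + Math.exp(-logit));
    }

    public static void main(String[] args) {
        // No intercept: the origin sits exactly on the boundary, and (2,2) and
        // (8,8) both score above 0.5, so they cannot be separated.
        System.out.println(score(0.3, 0.3, 0.0, 0, 0));
        System.out.println(score(0.3, 0.3, 0.0, 2, 2));
        System.out.println(score(0.3, 0.3, 0.0, 8, 8));

        // With an intercept the boundary is x + y = 10, which separates them.
        System.out.println(score(1.0, 1.0, -10.0, 2, 2));
        System.out.println(score(1.0, 1.0, -10.0, 8, 8));
    }
}
```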

On Thu, Jun 28, 2012 at 11:07 AM, Sean Owen <sr...@gmail.com> wrote:

> (The third dimension, 1, is the bias / intercept term. You will
> probably see this in the literature -- go have a look at a basic intro
> to logistic regression. I found Andrew Ng's videos on Coursera a good
> intro-level survey of exactly this.)

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Sean Owen <sr...@gmail.com>.

(The third dimension, 1, is the bias / intercept term. You will
probably see this in the literature -- go have a look at a basic intro
to logistic regression. I found Andrew Ng's videos on Coursera a good
intro-level survey of exactly this.)

On Thu, Jun 28, 2012 at 3:57 PM, Ted Dunning <te...@gmail.com> wrote:
> On Thu, Jun 28, 2012 at 9:59 AM, damodar shetyo <ak...@gmail.com>wrote:
>
>> This post is continuation to another mailing thread thats going on,Sorry
>> for creating a new thread but  i was not getting mails from group before .
>>
>> Following code was implemented By Ted Dunning .Now i have few questions:
>>
>> 1)The point (x,y) has 2 dimensions.But why are we using 3 instead of 2
>> while creating DenseVector?
>>  Vector v = new DenseVector(3);   / / why 3 , why not 2?
>>
>> 2) In getVector method why we set       v.set(2, 1); ??
>>
>> 3)Whats the use of setting lambda?
>>
>
> http://cseweb.ucsd.edu/~saul/teaching/cse291s07/L1norm.pdf
>
> (in this next, C is used instead of lambda)
> http://www.ttic.edu/sigml/symposium2011/papers/Moore+DeNero_Regularization.pdf
>
> (and in this one, alpha is used)
> http://en.wikipedia.org/wiki/Least_squares#LASSO_method
>
> 4)What happens if i increase or decrease learning rate?
>>
>
> It affects speed to converge.  Very high starting point can be useful in
> some cases, but mostly just makes it take longer to converge.   Very low
> starting point can make convergence fail.
>
> http://leon.bottou.org/projects/sgd

Re: Continued : simple OnlineLogisticRegression classication example using mahout

Posted by Ted Dunning <te...@gmail.com>.

On Thu, Jun 28, 2012 at 9:59 AM, damodar shetyo <ak...@gmail.com> wrote:

> This post is continuation to another mailing thread thats going on,Sorry
> for creating a new thread but  i was not getting mails from group before .
>
> Following code was implemented By Ted Dunning .Now i have few questions:
>
> 1)The point (x,y) has 2 dimensions.But why are we using 3 instead of 2
> while creating DenseVector?
>  Vector v = new DenseVector(3);   // why 3 , why not 2?
>
> 2) In getVector method why we set       v.set(2, 1); ??
>
> 3)Whats the use of setting lambda?
>

http://cseweb.ucsd.edu/~saul/teaching/cse291s07/L1norm.pdf

(in this next, C is used instead of lambda)
http://www.ttic.edu/sigml/symposium2011/papers/Moore+DeNero_Regularization.pdf

(and in this one, alpha is used)
http://en.wikipedia.org/wiki/Least_squares#LASSO_method
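
In short, lambda sets the strength of the regularization: the comments in
the posted code note that lambda = 0.1 leaves the model about .83 certain on
the first example while lambda = 0.01 leaves it about 0.98 certain. One
common way to picture an L1 penalty is as a shrinkage step that pulls each
weight toward zero and clips small weights to exactly zero. The sketch below
shows only that generic step; it is not taken from Mahout's L1 class, and the
class and method names are made up.

```java
class L1ShrinkSketch {
    // Moves a weight toward zero by `amount`, stopping at zero (soft thresholding).
    public static double shrink(double weight, double amount) {
        if (weight > amount) {
            return weight - amount;
        }
        if (weight < -amount) {
            return weight + amount;
        }
        return 0.0;
    }

    public static void main(String[] args) {
        // A larger shrink amount (think: larger lambda) leaves smaller weights,
        // which in turn gives less extreme, more cautious probabilities.
        System.out.println(shrink(0.8, 0.01));
        System.out.println(shrink(0.8, 0.1));
        System.out.println(shrink(0.05, 0.1));  // small weights are clipped to zero
    }
}
```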

> 4)What happens if i increase or decrease learning rate?
>

It affects the speed of convergence.  A very high starting value can be
useful in some cases, but mostly it just makes convergence take longer.  A
very low starting value can make convergence fail.

http://leon.bottou.org/projects/sgd
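
To experiment with this outside Mahout, here is a minimal sketch of plain
stochastic gradient descent for logistic regression on the nine points from
the posted example. It assumes a fixed learning rate, no regularization, and
a fixed visiting order, all of which are simplifications; the "starting
point" wording above suggests that Mahout's learner adjusts the rate as
training proceeds. The class and method names are hypothetical.

```java
class LearningRateSketch {
    // The nine training points from the posted example: {x, y, label}.
    static final double[][] DATA = {
        {0, 0, 0}, {1, 1, 0}, {1, 0, 0}, {0, 1, 0}, {2, 2, 0},
        {8, 8, 1}, {8, 9, 1}, {9, 8, 1}, {9, 9, 1}
    };

    // Numerically stable log(1 + exp(t)).
    static double softplus(double t) {
        return Math.max(t, 0.0) + Math.log1p(Math.exp(-Math.abs(t)));
    }

    // Runs SGD with a fixed learning rate and returns the mean log loss afterwards.
    public static double finalLoss(double rate, int passes) {
        double[] w = new double[3];                // weights for x, y, and the constant 1
        for (int pass = 0; pass < passes; pass++) {
            for (double[] row : DATA) {
                double z = w[0] * row[0] + w[1] * row[1] + w[2];
                double p = 1.0 / (1.0 + Math.exp(-z));
                double step = rate * (row[2] - p); // gradient of the log likelihood
                w[0] += step * row[0];
                w[1] += step * row[1];
                w[2] += step;                      // the bias feature is always 1
            }
        }
        double loss = 0.0;
        for (double[] row : DATA) {
            double z = w[0] * row[0] + w[1] * row[1] + w[2];
            loss += (row[2] == 1.0) ? softplus(-z) : softplus(z);
        }
        return loss / DATA.length;
    }

    public static void main(String[] args) {
        // Compare a very low, a moderate, and a high rate after the same number of passes.
        for (double rate : new double[] {1e-6, 0.1, 4.0}) {
            System.out.printf("rate=%g  mean log loss after 200 passes=%.4f%n",
                    rate, finalLoss(rate, 200));
        }
    }
}
```

With a very low rate the weights barely move from zero in 200 passes, so the
loss stays close to its starting value of ln 2; that is the sense in which a
rate that is too low fails to converge in any practical amount of training.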