Posted to user@mahout.apache.org by XiaoboGu <gu...@gmail.com> on 2011/05/23 17:23:50 UTC

Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Hi,
	TrainNewsGroup.java just uses OnlineLogisticRegression model = state.getModels().get(0); to get an OLR object for the overall description of the AdaptiveLogisticRegression’s performance. I have two questions:
1. Are the OLR objects of the best CrossFolderLearner equal?
2. According to what criterion is the best CrossFolderLearner object chosen?

Regards,

Xiaobo Gu


RE: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by XiaoboGu <gu...@gmail.com>.
I see now.

> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Sunday, May 29, 2011 1:44 PM
> To: user@mahout.apache.org
> Subject: Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after
> training?
> 
> I think not.
> 
> When you present the first symbol to the Dictionary, dict.size() will be
> zero.  That value will be inserted into the table under that symbol.  Each
> new symbol will be inserted with the size of the table as it was *before*
> that symbol was inserted.
> 
> I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
> demonstrate and enforce this.  It won't be committed until the current
> release goes out.
> 
> On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com> wrote:
> >
> >
> > Ok, then target values are always more than 0, I refer to this
> >
> > public class Dictionary {
> >  private final Map<String, Integer> dict = Maps.newLinkedHashMap();
> >
> >  public int intern(String s) {
> >    if (!dict.containsKey(s)) {
> >      dict.put(s, dict.size());
> >    }
> >    return dict.get(s);
> >   }
> >
> >
> >


Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Stanley Xu <we...@gmail.com>.
Yes. What I meant was the number of rows, which is num_targets - 1.
The number of features is the same as the vector length of the training
example.

Best wishes,
Stanley Xu



On Mon, May 30, 2011 at 12:25 PM, Ted Dunning <te...@gmail.com> wrote:

> Stanley is correct in his first point because the number of items has to
> match.
>
> The second point confuses me.  beta is a matrix and thus has rows and
> columns.  It should be NUM_FEATURES x (NUM_TARGETS - 1) in size.
>
> On Sun, May 29, 2011 at 7:50 PM, Stanley Xu <we...@gmail.com> wrote:
>
> > Nope. The target values from 1.... to n-1 will be mapper to a list of
> > target
> > value from 0....to n-2.
> > And in the beta matrix, we will use all 0 weights for the first line
> > vector,
> > so the weight generated would be beta[0]..... to beta[n-3].
> >
> > Best wishes,
> > Stanley Xu
> >
> >
> >
> > On Mon, May 30, 2011 at 10:16 AM, Xiaobo Gu <gu...@gmail.com>
> > wrote:
> >
> > > On Mon, May 30, 2011 at 2:30 AM, Ted Dunning <te...@gmail.com>
> > > wrote:
> > > > Target values 1 ... n-1 correspond to columns 0 ... n-2 of the beta
> > > matrix.
> > > >  classifyFull puts a synthetic result at location 0.
> > >
> > > I think Target values 1 ... n-1 correspond to row 0, ... n-1 of the
> beta
> > > matrix,
> > >
> > > is it ?
> > >
> > >
> > >
> > > >
> > > > If you can afford the (very small) cost of allocating a larger
> vector,
> > I
> > > > recommend using classifyFull to make your life simpler.  I almost
> > regret
> > > > using the simpler name for the method that imposes complexity on the
> > > user.
> > > >
> > > > On Sun, May 29, 2011 at 12:57 AM, XiaoboGu <gu...@gmail.com>
> > > wrote:
> > > >
> > > >> Then which value is missed in the beta matrix of
> > > OnlineLogisticRegression,
> > > >> the last value of the target present to LR.train(), that is n - 1 is
> > > missed?
> > > >>
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: Ted Dunning [mailto:ted.dunning@gmail.com]
> > > >> > Sent: Sunday, May 29, 2011 1:44 PM
> > > >> > To: user@mahout.apache.org
> > > >> > Subject: Re: Are the OnlineLogisticRegression s of a
> > > CrossFolderLearner
> > > >> object equal after
> > > >> > training?
> > > >> >
> > > >> > I think not.
> > > >> >
> > > >> > When you present the first symbol to the Dictionary, dict.size()
> > will
> > > be
> > > >> > zero.  That value will be inserted into the table under that
> symbol.
> > > >>  Each
> > > >> > new symbol will be inserted with the size of the table as it was
> > > *before*
> > > >> > that symbol was inserted.
> > > >> >
> > > >> > I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
> > > >> > demonstrate and enforce this.  It won't be committed until the
> > current
> > > >> > release goes out.
> > > >> >
> > > >> > On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <guxiaobo1982@gmail.com
> >
> > > >> wrote:
> > > >> > >
> > > >> > >
> > > >> > > Ok, then target values are always more than 0, I refer to this
> > > >> > >
> > > >> > > public class Dictionary {
> > > >> > >  private final Map<String, Integer> dict =
> > Maps.newLinkedHashMap();
> > > >> > >
> > > >> > >  public int intern(String s) {
> > > >> > >    if (!dict.containsKey(s)) {
> > > >> > >      dict.put(s, dict.size());
> > > >> > >    }
> > > >> > >    return dict.get(s);
> > > >> > >   }
> > > >> > >
> > > >> > >
> > > >> > >
> > > >>
> > > >>
> > > >
> > >
> >
>

Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Ted Dunning <te...@gmail.com>.
Stanley is correct in his first point because the number of items has to
match.

The second point confuses me.  beta is a matrix and thus has rows and
columns.  It should be NUM_FEATURES x (NUM_TARGETS - 1) in size.

On Sun, May 29, 2011 at 7:50 PM, Stanley Xu <we...@gmail.com> wrote:

> Nope. The target values from 1.... to n-1 will be mapper to a list of
> target
> value from 0....to n-2.
> And in the beta matrix, we will use all 0 weights for the first line
> vector,
> so the weight generated would be beta[0]..... to beta[n-3].
>
> Best wishes,
> Stanley Xu
>
>
>
> On Mon, May 30, 2011 at 10:16 AM, Xiaobo Gu <gu...@gmail.com>
> wrote:
>
> > On Mon, May 30, 2011 at 2:30 AM, Ted Dunning <te...@gmail.com>
> > wrote:
> > > Target values 1 ... n-1 correspond to columns 0 ... n-2 of the beta
> > matrix.
> > >  classifyFull puts a synthetic result at location 0.
> >
> > I think Target values 1 ... n-1 correspond to row 0, ... n-1 of the beta
> > matrix,
> >
> > is it ?
> >
> >
> >
> > >
> > > If you can afford the (very small) cost of allocating a larger vector,
> I
> > > recommend using classifyFull to make your life simpler.  I almost
> regret
> > > using the simpler name for the method that imposes complexity on the
> > user.
> > >
> > > On Sun, May 29, 2011 at 12:57 AM, XiaoboGu <gu...@gmail.com>
> > wrote:
> > >
> > >> Then which value is missed in the beta matrix of
> > OnlineLogisticRegression,
> > >> the last value of the target present to LR.train(), that is n - 1 is
> > missed?
> > >>
> > >>
> > >> > -----Original Message-----
> > >> > From: Ted Dunning [mailto:ted.dunning@gmail.com]
> > >> > Sent: Sunday, May 29, 2011 1:44 PM
> > >> > To: user@mahout.apache.org
> > >> > Subject: Re: Are the OnlineLogisticRegression s of a
> > CrossFolderLearner
> > >> object equal after
> > >> > training?
> > >> >
> > >> > I think not.
> > >> >
> > >> > When you present the first symbol to the Dictionary, dict.size()
> will
> > be
> > >> > zero.  That value will be inserted into the table under that symbol.
> > >>  Each
> > >> > new symbol will be inserted with the size of the table as it was
> > *before*
> > >> > that symbol was inserted.
> > >> >
> > >> > I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
> > >> > demonstrate and enforce this.  It won't be committed until the
> current
> > >> > release goes out.
> > >> >
> > >> > On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com>
> > >> wrote:
> > >> > >
> > >> > >
> > >> > > Ok, then target values are always more than 0, I refer to this
> > >> > >
> > >> > > public class Dictionary {
> > >> > >  private final Map<String, Integer> dict =
> Maps.newLinkedHashMap();
> > >> > >
> > >> > >  public int intern(String s) {
> > >> > >    if (!dict.containsKey(s)) {
> > >> > >      dict.put(s, dict.size());
> > >> > >    }
> > >> > >    return dict.get(s);
> > >> > >   }
> > >> > >
> > >> > >
> > >> > >
> > >>
> > >>
> > >
> >
>

Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Stanley Xu <we...@gmail.com>.
Nope. The target values 1 ... n-1 will be mapped to a list of target
values 0 ... n-2.
And in the beta matrix, we use all-zero weights for the first row vector,
so the weights generated would be beta[0] ... beta[n-3].

Best wishes,
Stanley Xu



On Mon, May 30, 2011 at 10:16 AM, Xiaobo Gu <gu...@gmail.com> wrote:

> On Mon, May 30, 2011 at 2:30 AM, Ted Dunning <te...@gmail.com>
> wrote:
> > Target values 1 ... n-1 correspond to columns 0 ... n-2 of the beta
> matrix.
> >  classifyFull puts a synthetic result at location 0.
>
> I think Target values 1 ... n-1 correspond to row 0, ... n-1 of the beta
> matrix,
>
> is it ?
>
>
>
> >
> > If you can afford the (very small) cost of allocating a larger vector, I
> > recommend using classifyFull to make your life simpler.  I almost regret
> > using the simpler name for the method that imposes complexity on the
> user.
> >
> > On Sun, May 29, 2011 at 12:57 AM, XiaoboGu <gu...@gmail.com>
> wrote:
> >
> >> Then which value is missed in the beta matrix of
> OnlineLogisticRegression,
> >> the last value of the target present to LR.train(), that is n - 1 is
> missed?
> >>
> >>
> >> > -----Original Message-----
> >> > From: Ted Dunning [mailto:ted.dunning@gmail.com]
> >> > Sent: Sunday, May 29, 2011 1:44 PM
> >> > To: user@mahout.apache.org
> >> > Subject: Re: Are the OnlineLogisticRegression s of a
> CrossFolderLearner
> >> object equal after
> >> > training?
> >> >
> >> > I think not.
> >> >
> >> > When you present the first symbol to the Dictionary, dict.size() will
> be
> >> > zero.  That value will be inserted into the table under that symbol.
> >>  Each
> >> > new symbol will be inserted with the size of the table as it was
> *before*
> >> > that symbol was inserted.
> >> >
> >> > I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
> >> > demonstrate and enforce this.  It won't be committed until the current
> >> > release goes out.
> >> >
> >> > On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com>
> >> wrote:
> >> > >
> >> > >
> >> > > Ok, then target values are always more than 0, I refer to this
> >> > >
> >> > > public class Dictionary {
> >> > >  private final Map<String, Integer> dict = Maps.newLinkedHashMap();
> >> > >
> >> > >  public int intern(String s) {
> >> > >    if (!dict.containsKey(s)) {
> >> > >      dict.put(s, dict.size());
> >> > >    }
> >> > >    return dict.get(s);
> >> > >   }
> >> > >
> >> > >
> >> > >
> >>
> >>
> >
>

Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Xiaobo Gu <gu...@gmail.com>.
On Mon, May 30, 2011 at 2:30 AM, Ted Dunning <te...@gmail.com> wrote:
> Target values 1 ... n-1 correspond to columns 0 ... n-2 of the beta matrix.
>  classifyFull puts a synthetic result at location 0.

I think target values 1 ... n-1 correspond to rows 0 ... n-1 of the beta matrix.

Is that right?



>
> If you can afford the (very small) cost of allocating a larger vector, I
> recommend using classifyFull to make your life simpler.  I almost regret
> using the simpler name for the method that imposes complexity on the user.
>
> On Sun, May 29, 2011 at 12:57 AM, XiaoboGu <gu...@gmail.com> wrote:
>
>> Then which value is missed in the beta matrix of OnlineLogisticRegression,
>> the last value of the target present to LR.train(), that is n - 1 is missed?
>>
>>
>> > -----Original Message-----
>> > From: Ted Dunning [mailto:ted.dunning@gmail.com]
>> > Sent: Sunday, May 29, 2011 1:44 PM
>> > To: user@mahout.apache.org
>> > Subject: Re: Are the OnlineLogisticRegression s of a CrossFolderLearner
>> object equal after
>> > training?
>> >
>> > I think not.
>> >
>> > When you present the first symbol to the Dictionary, dict.size() will be
>> > zero.  That value will be inserted into the table under that symbol.
>>  Each
>> > new symbol will be inserted with the size of the table as it was *before*
>> > that symbol was inserted.
>> >
>> > I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
>> > demonstrate and enforce this.  It won't be committed until the current
>> > release goes out.
>> >
>> > On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com>
>> wrote:
>> > >
>> > >
>> > > Ok, then target values are always more than 0, I refer to this
>> > >
>> > > public class Dictionary {
>> > >  private final Map<String, Integer> dict = Maps.newLinkedHashMap();
>> > >
>> > >  public int intern(String s) {
>> > >    if (!dict.containsKey(s)) {
>> > >      dict.put(s, dict.size());
>> > >    }
>> > >    return dict.get(s);
>> > >   }
>> > >
>> > >
>> > >
>>
>>
>

Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Ted Dunning <te...@gmail.com>.
Target values 1 ... n-1 correspond to columns 0 ... n-2 of the beta matrix.
 classifyFull puts a synthetic result at location 0.

If you can afford the (very small) cost of allocating a larger vector, I
recommend using classifyFull to make your life simpler.  I almost regret
using the simpler name for the method that imposes complexity on the user.
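
To make the two layouts concrete, here is a minimal sketch in plain Java. It does not call Mahout; the class ScoreLayout and the helper toFull are hypothetical names used only for illustration. It assumes that the short result holds the scores for target values 1 ... n-1 at positions 0 ... n-2, and that the synthetic result at location 0 is whatever is left over so that all n scores sum to 1.

```java
import java.util.Arrays;

public class ScoreLayout {
  // Hypothetical helper: expands n-1 scores (targets 1 .. n-1) into
  // n scores (targets 0 .. n-1), filling location 0 with the remainder.
  static double[] toFull(double[] shortScores) {
    double[] full = new double[shortScores.length + 1];
    double sum = 0.0;
    for (int i = 0; i < shortScores.length; i++) {
      full[i + 1] = shortScores[i];   // target i+1 lives at position i in the short form
      sum += shortScores[i];
    }
    full[0] = 1.0 - sum;              // synthetic score for target 0 (assumed)
    return full;
  }

  public static void main(String[] args) {
    double[] shortScores = {0.25, 0.5};   // scores for targets 1 and 2 of a 3-class model
    System.out.println(Arrays.toString(toFull(shortScores)));  // prints [0.25, 0.25, 0.5]
  }
}
```

With the full form, the index into the result is the target value itself, so no off-by-one bookkeeping is needed.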

On Sun, May 29, 2011 at 12:57 AM, XiaoboGu <gu...@gmail.com> wrote:

> Then which value is missed in the beta matrix of OnlineLogisticRegression,
> the last value of the target present to LR.train(), that is n - 1 is missed?
>
>
> > -----Original Message-----
> > From: Ted Dunning [mailto:ted.dunning@gmail.com]
> > Sent: Sunday, May 29, 2011 1:44 PM
> > To: user@mahout.apache.org
> > Subject: Re: Are the OnlineLogisticRegression s of a CrossFolderLearner
> object equal after
> > training?
> >
> > I think not.
> >
> > When you present the first symbol to the Dictionary, dict.size() will be
> > zero.  That value will be inserted into the table under that symbol.
>  Each
> > new symbol will be inserted with the size of the table as it was *before*
> > that symbol was inserted.
> >
> > I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
> > demonstrate and enforce this.  It won't be committed until the current
> > release goes out.
> >
> > On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com>
> wrote:
> > >
> > >
> > > Ok, then target values are always more than 0, I refer to this
> > >
> > > public class Dictionary {
> > >  private final Map<String, Integer> dict = Maps.newLinkedHashMap();
> > >
> > >  public int intern(String s) {
> > >    if (!dict.containsKey(s)) {
> > >      dict.put(s, dict.size());
> > >    }
> > >    return dict.get(s);
> > >   }
> > >
> > >
> > >
>
>

RE: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by XiaoboGu <gu...@gmail.com>.
Then which value is missing from the beta matrix of OnlineLogisticRegression? Is it the last target value presented to LR.train(), that is, n - 1?


> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Sunday, May 29, 2011 1:44 PM
> To: user@mahout.apache.org
> Subject: Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after
> training?
> 
> I think not.
> 
> When you present the first symbol to the Dictionary, dict.size() will be
> zero.  That value will be inserted into the table under that symbol.  Each
> new symbol will be inserted with the size of the table as it was *before*
> that symbol was inserted.
> 
> I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
> demonstrate and enforce this.  It won't be committed until the current
> release goes out.
> 
> On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com> wrote:
> >
> >
> > Ok, then target values are always more than 0, I refer to this
> >
> > public class Dictionary {
> >  private final Map<String, Integer> dict = Maps.newLinkedHashMap();
> >
> >  public int intern(String s) {
> >    if (!dict.containsKey(s)) {
> >      dict.put(s, dict.size());
> >    }
> >    return dict.get(s);
> >   }
> >
> >
> >


Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Ted Dunning <te...@gmail.com>.
I think not.

When you present the first symbol to the Dictionary, dict.size() will be
zero.  That value will be inserted into the table under that symbol.  Each
new symbol will be inserted with the size of the table as it was *before*
that symbol was inserted.

I have added a line to CsvRecordFactoryTest.testDictionaryOrder to
demonstrate and enforce this.  It won't be committed until the current
release goes out.
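
A small standalone sketch of the same behaviour, for anyone who wants to try it. It repeats the intern() logic of the Dictionary quoted below, but uses java.util.LinkedHashMap instead of Guava so that it runs without extra libraries. The class name DictionaryDemo and the input sequence are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DictionaryDemo {
  // Same logic as the quoted Dictionary, with a plain LinkedHashMap.
  static class Dictionary {
    private final Map<String, Integer> dict = new LinkedHashMap<>();

    int intern(String s) {
      if (!dict.containsKey(s)) {
        // dict.size() is evaluated before the entry is added,
        // so the first symbol gets code 0.
        dict.put(s, dict.size());
      }
      return dict.get(s);
    }
  }

  public static void main(String[] args) {
    Dictionary d = new Dictionary();
    System.out.println(d.intern("2"));  // prints 0: first symbol seen
    System.out.println(d.intern("1"));  // prints 1: second symbol seen
    System.out.println(d.intern("2"));  // prints 0: already known
  }
}
```

The codes depend only on the order in which the strings first appear, not on what the strings look like, so a column holding "1" and "2" still ends up with codes starting at 0.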

On Sat, May 28, 2011 at 9:57 PM, XiaoboGu <gu...@gmail.com> wrote:
>
>
> Ok, then target values are always more than 0, I refer to this
>
> public class Dictionary {
>  private final Map<String, Integer> dict = Maps.newLinkedHashMap();
>
>  public int intern(String s) {
>    if (!dict.containsKey(s)) {
>      dict.put(s, dict.size());
>    }
>    return dict.get(s);
>   }
>
>
>

RE: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by XiaoboGu <gu...@gmail.com>.

> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Sunday, May 29, 2011 12:48 PM
> To: user@mahout.apache.org
> Subject: Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after
> training?
> 
> The color column contains strings.  Internally, codes are assigned to those
> strings.
> 
> The book talks about the confusion that comes from using numerical values as
> codes for categorical values.  This is an example of that confusion.

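OK, so the target values are always greater than 0. I am referring to this:

import java.util.Map;
import com.google.common.collect.Maps;

public class Dictionary {
  // Maps each distinct string to an integer code, in order of first appearance.
  private final Map<String, Integer> dict = Maps.newLinkedHashMap();

  // Returns the code for s, assigning the current table size as the code if s is new.
  public int intern(String s) {
    if (!dict.containsKey(s)) {
      dict.put(s, dict.size());
    }
    return dict.get(s);
  }
}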



> 
> On Sat, May 28, 2011 at 8:48 PM, XiaoboGu <gu...@gmail.com> wrote:
> 
> > >
> > >
> > > > "Multinomial models" means the number n of distinct values the target
> > is
> > > > more than 2, and they should be encoded as 0, 1, 2,......, n-1,
> > > >
> > >
> > > Yes.
> > But the values of the color column of donut.csv are encoded as 1 and 2.
> >
> >
> >


Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Ted Dunning <te...@gmail.com>.
The color column contains strings.  Internally, codes are assigned to those
strings.

The book talks about the confusion that comes from using numerical values as
codes for categorical values.  This is an example of that confusion.

On Sat, May 28, 2011 at 8:48 PM, XiaoboGu <gu...@gmail.com> wrote:

> >
> >
> > > "Multinomial models" means the number n of distinct values the target
> is
> > > more than 2, and they should be encoded as 0, 1, 2,......, n-1,
> > >
> >
> > Yes.
> But the values of the color column of donut.csv are encoded as 1 and 2.
>
>
>

RE: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by XiaoboGu <gu...@gmail.com>.
> 
> 
> > "Multinomial models" means the number n of distinct values the target is
> > more than 2, and they should be encoded as 0, 1, 2,......, n-1,
> >
> 
> Yes.
But the values of the color column of donut.csv are encoded as 1 and 2.



Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Ted Dunning <te...@gmail.com>.
On Sat, May 28, 2011 at 5:45 PM, XiaoboGu <gu...@gmail.com> wrote:

>
> ...

> For now, any of the OLR's is as good as any other.
> >
> > For your second question, I think that you are asking "According to what
> > criterion is the best ...".
> >
> > Typically the choice is based on AUC for binary models and log-likelihood
> > for multinomial models.  You could change that to be percent correct or
> any
> > other metric you might like.  Grouped AUC is common, for instance.
>
> Just to confirm,
> "Binary models" means the target only has two distinct values, and they
> must be 0 and 1.
>

Yes.


> "Multinomial models" means the number n of distinct values the target is
> more than 2, and they should be encoded as 0, 1, 2,......, n-1,
>

Yes.


> And AUC and log-likelihood are used for evaluating the performance of
> binary and multinomial models respectively, can't mix them up?
>

No.

Log-likelihood can be used for either.  AUC normally is only used for
binomial cases.  There are generalizations of AUC, but we haven't
implemented them.
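
As a rough illustration of the two metrics, here is a sketch in plain Java on toy data. It is not Mahout's evaluation code, which may compute these quantities differently; the class MetricsSketch and its method names are hypothetical. Log-likelihood is taken as the average log of the probability assigned to the actual target, which works for any number of targets. AUC is taken as the fraction of (positive, negative) pairs that the score ranks correctly, which only makes sense with two targets.

```java
public class MetricsSketch {
  // Average log-likelihood: scores[i][t] is the probability given to target t for example i.
  static double logLikelihood(int[] actual, double[][] scores) {
    double sum = 0.0;
    for (int i = 0; i < actual.length; i++) {
      sum += Math.log(scores[i][actual[i]]);
    }
    return sum / actual.length;
  }

  // Pairwise AUC for a binary target: scoreForOne[i] is the score for target 1.
  // Assumes at least one example of each target; ties count as half a win.
  static double auc(int[] actual, double[] scoreForOne) {
    double wins = 0.0;
    long pairs = 0;
    for (int i = 0; i < actual.length; i++) {
      if (actual[i] != 1) continue;
      for (int j = 0; j < actual.length; j++) {
        if (actual[j] != 0) continue;
        pairs++;
        if (scoreForOne[i] > scoreForOne[j]) wins += 1.0;
        else if (scoreForOne[i] == scoreForOne[j]) wins += 0.5;
      }
    }
    return wins / pairs;
  }

  public static void main(String[] args) {
    int[] actual = {0, 0, 1, 1};
    double[] scoreForOne = {0.1, 0.4, 0.35, 0.8};
    System.out.println(auc(actual, scoreForOne));  // prints 0.75

    // Three-target example: log-likelihood still applies.
    int[] actual3 = {0, 2};
    double[][] scores3 = {{0.5, 0.25, 0.25}, {0.25, 0.25, 0.5}};
    System.out.println(logLikelihood(actual3, scores3));
  }
}
```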


>
>
>

RE: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by XiaoboGu <gu...@gmail.com>.

> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
> Sent: Thursday, May 26, 2011 12:19 PM
> To: user@mahout.apache.org
> Subject: Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after
> trainning?
> 
> Xiaobo,
> 
> Sorry to be slow answering you.
> 
> In general, there is no reason to pick one OLR inside a CrossFolderLearner
> over another.  They all have seen 80% of the data.  Some day, we might want
> to produce a different kind of CFL that is not symmetrical, but I haven't
> had a need for that yet.  For instance, we might have one OLR that gets all
> of the data for training and another that gets 80% for training and 20% for
> evaluation.
> 
> For now, any of the OLR's is as good as any other.
> 
> For your second question, I think that you are asking "According to what
> criterion is the best ...".
> 
> Typically the choice is based on AUC for binary models and log-likelihood
> for multinomial models.  You could change that to be percent correct or any
> other metric you might like.  Grouped AUC is common, for instance.

Just to confirm:
"Binary models" means the target has only two distinct values, and they must be 0 and 1.

"Multinomial models" means the number n of distinct target values is more than 2, and they should be encoded as 0, 1, 2, ..., n-1.

And AUC and log-likelihood are used to evaluate the performance of binary and multinomial models respectively, and they can't be mixed up?



> On Mon, May 23, 2011 at 8:23 AM, XiaoboGu <gu...@gmail.com> wrote:
> 
> > Hi,
> >        The TrainNewsGroup.java just use OnlineLogisticRegression model =
> > state.getModels().get(0); to get an OLR object to do the overall description
> > of the AdaptiveLogisticRegression’s performance, there are two questions:
> > 1. Are the OLR objects of the best CrosFolderLearner equal.
> > 2. According to what cretirear, the best CrossFolderLearner object is
> > chosen?
> >
> > Regards,
> >
> > Xiaobo Gu
> >
> >


Re: Are the OnlineLogisticRegression s of a CrossFolderLearner object equal after training?

Posted by Ted Dunning <te...@gmail.com>.
Xiaobo,

Sorry to be slow answering you.

In general, there is no reason to pick one OLR inside a CrossFolderLearner
over another.  They all have seen 80% of the data.  Some day, we might want
to produce a different kind of CFL that is not symmetrical, but I haven't
had a need for that yet.  For instance, we might have one OLR that gets all
of the data for training and another that gets 80% for training and 20% for
evaluation.

For now, any of the OLRs is as good as any other.

For your second question, I think that you are asking "According to what
criterion is the best ...".

Typically the choice is based on AUC for binary models and log-likelihood
for multinomial models.  You could change that to be percent correct or any
other metric you might like.  Grouped AUC is common, for instance.
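
To see why the OLRs are interchangeable, here is a sketch of a symmetric split in plain Java. It is only an illustration: it assumes five folds (which is what the 80% figure suggests) and a simple round-robin assignment of records to folds, which is not necessarily how CrossFolderLearner assigns them. The class CrossFoldSketch and its method are hypothetical.

```java
import java.util.Arrays;

public class CrossFoldSketch {
  // For each of `folds` models, counts how many of `records` examples it trains on
  // when every record is held out from exactly one model (assumed round-robin).
  static int[] trainedCounts(int folds, int records) {
    int[] trained = new int[folds];
    for (int r = 0; r < records; r++) {
      int heldOutBy = r % folds;        // this model only evaluates on record r
      for (int k = 0; k < folds; k++) {
        if (k != heldOutBy) {
          trained[k]++;                 // every other model trains on record r
        }
      }
    }
    return trained;
  }

  public static void main(String[] args) {
    // With 5 folds and 100 records, each model trains on 80 records.
    System.out.println(Arrays.toString(trainedCounts(5, 100)));  // prints [80, 80, 80, 80, 80]
  }
}
```

Each model sees the same amount of data and differs only in which slice it was evaluated on, so none of them is privileged.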

On Mon, May 23, 2011 at 8:23 AM, XiaoboGu <gu...@gmail.com> wrote:

> Hi,
>        The TrainNewsGroup.java just use OnlineLogisticRegression model =
> state.getModels().get(0); to get an OLR object to do the overall description
> of the AdaptiveLogisticRegression’s performance, there are two questions:
> 1. Are the OLR objects of the best CrosFolderLearner equal.
> 2. According to what cretirear, the best CrossFolderLearner object is
> chosen?
>
> Regards,
>
> Xiaobo Gu
>
>