Posted to dev@mahout.apache.org by Dirk Weissenborn <di...@googlemail.com> on 2012/03/26 20:54:47 UTC

online lda learning algorithm

Hello,

I wanted to ask whether there is already an online learning algorithm
implementation for lda or not?

http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf

cheers,
Dirk

Re: online lda learning algorithm

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I don't think there's an online variety, no.
An MR variety, yes.

On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn
<di...@googlemail.com> wrote:
> Hello,
>
> I wanted to ask whether there is already an online learning algorithm
> implementation for lda or not?
>
> http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
>
> cheers,
> Dirk

Re: online lda learning algorithm

Posted by Dirk Weissenborn <di...@googlemail.com>.
I found this paper on supervised LDA:
http://www.cs.princeton.edu/~blei/papers/BleiMcAuliffe2007.pdf

In your opinion, could supervised LDA be implemented easily given the
already existing implementation?
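
(For reference, and paraphrasing the Blei & McAuliffe paper rather than quoting it: in
sLDA each document d additionally carries a response y_d drawn from a generalized linear
model with mean \eta^\top \bar z_d, where \bar z_d is the document's empirical topic
frequency vector. So on top of the existing LDA code one would at minimum need to carry
the responses through inference and fit the regression coefficients \eta in the M-step.)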

2012/3/27 Dirk Weissenborn <di...@googlemail.com>

> The fact that the dictionary stays fixed could be a problem for what I
> want to do. Would it be hard to change it in the code?
>
> I now of one kind of supervised lda, called disclda... do you know others?
>
> If the model is slightly changed (training on just few new inputs compared
> to the overall number of documents the model was already trained on), I
> think retraining a classifier should converge fast, taking the old model
> parameters as starting parameters, so I guess this probably would not be to
> much of a drawback.
> The reason I want to do this is, that lda works really good on sparse
> datasets for training, and it has been shown that a composition of lda and
> a classifier is much faster at the classification task, with similar
> results to tf-idf based classifiers.
>
> 2012/3/27 Jake Mannix <ja...@gmail.com>
>
>> On Mon, Mar 26, 2012 at 5:13 PM, Dirk Weissenborn <
>> dirk.weissenborn@googlemail.com> wrote:
>>
>> > Ok thanks,
>> >
>> > maybe one question. Is it possible to train an already trained model on
>> > just a few new documents with the provided algorithms or do you have to
>> > train through the whole corpus again. What I mean is whether you can
>> train
>> > an lda model incrementally or not?
>> >
>>
>> So the answer to *this* above question is that yes, you can incrementally
>> train your LDA model with new documents, and in fact you can do this
>> with the current codebase, although it's not really documented how you
>> would
>> do it: we currently persist a full-fledged model in between passes over
>> the corpus (because we're doing MapReduce iteratively, this happens
>> naturally), and this model doesn't "know" what documents were used to
>> train it (its a big matrix of topic <-> feature counts, that's all), so
>> you
>> could
>> take a fully trained model, and then run the current LDA over it, using
>> new documents as input, and it will start learning from them.  Now, it
>> won't be able to learn with *new vocabulary* without changing the code
>> a little, however.  The dictionary stays fixed in the current codebase.
>>
>>
>> > What I would actually like to do is training a topical classifier on
>> top of
>> > a lda model. Do you have any experience with that? I mean by changing
>> the
>> > lda model, the inputs for the classifier would also change. Do I have to
>> > train a classifier from scratch again or can I reuse the classifier
>> trained
>> > on top of the older lda model and just ajust that one slightly?
>> >
>>
>> This is actually a rather different question, it seems.  Let me make sure
>> I'm understanding what you're asking:
>>
>> You are using the p(topic | doc) values for each doc as *input features*
>> for another classifier, right?   Trying to update your model
>> (which itself can be thought of as a fuzzy classifier consisting of
>> weights
>> p(feature | topic) and an inference algorithm to produce p(topic | doc)
>> if fed a document consisting of a weighted feature vector) and keeping
>> the p(topic | doc) vectors fixed will definitely be wrong - if the model
>> changes, these weights will need to be updated.  The only way
>> to do this is to run these documents against the model in some way.
>>
>> But you don't really want to do that, you want to update your secondary
>> classifier, after your input topics drift a bit upon having been updated.
>> I think this problem still remains, however: your secondary classifier
>> was trained on input features which have changed, so it most likely
>> needs to be retrained as well.
>>
>> If, on the other hand, you are training a joint classifier (like
>> Supervised LDA, or Labeled LDA), and you ran *this* in an online
>> mode, you could probably update your classifier continually as you
>> got new labeled training data to train on.  But I'm speculating at this
>> point. :)
>>
>>
>>
>> >
>> > 2012/3/27 Dirk Weissenborn <di...@googlemail.com>
>> >
>> > > no problem. I ll post it
>> > >
>> > >
>> > > 2012/3/27 Jake Mannix <ja...@gmail.com>
>> > >
>> > >> Hey Dirk,
>> > >>
>> > >>   Do you mind continuing this discussion on the mailing list?  Lots
>> of
>> > >> our users may ask this kind of question in the future...
>> > >>
>> > >> On Mon, Mar 26, 2012 at 3:36 PM, Dirk Weissenborn <
>> > >> dirk.weissenborn@googlemail.com> wrote:
>> > >>
>> > >>> Ok thanks,
>> > >>>
>> > >>> maybe one question. Is it possible to train an already trained
>> model on
>> > >>> just a few new documents with the provided algorithms or do you
>> have to
>> > >>> train through the whole corpus again. What I mean is whether you can
>> > train
>> > >>> an lda model incrementally or not?
>> > >>> What I would actually like to do is training a topical classifier on
>> > top
>> > >>> of a lda model. Do you have any experience with that? I mean by
>> > changing
>> > >>> the lda model, the inputs for the classifier would also change. Do I
>> > have
>> > >>> to train a classifier from scratch again or can I reuse the
>> classifier
>> > >>> trained on top of the older lda model and just ajust that one?
>> > >>>
>> > >>>
>> > >>> 2012/3/26 Jake Mannix <ja...@gmail.com>
>> > >>>
>> > >>>> On Mon, Mar 26, 2012 at 12:58 PM, Dirk Weissenborn <
>> > >>>> dirk.weissenborn@googlemail.com> wrote:
>> > >>>>
>> > >>>> > Thank you for the quick response! It is possible that I need it
>> in
>> > >>>> not too
>> > >>>> > far future maybe I ll implement on top what already exists, which
>> > >>>> should
>> > >>>> > not be that hard as you mentioned. I ll provide a patch when the
>> > time
>> > >>>> > comes.
>> > >>>> >
>> > >>>>
>> > >>>> Feel free to email any questions about using the
>> > >>>> InMemoryCollapsedVariationalBayes0
>> > >>>> class - it's mainly been used for testing, so far, but if you want
>> to
>> > >>>> take
>> > >>>> that class
>> > >>>> and clean it up and look into fixing the online learning aspect of
>> it,
>> > >>>> that'd be
>> > >>>> excellent.  Let me know if you make any progress, because I'll
>> > probably
>> > >>>> be
>> > >>>> looking
>> > >>>> to work on this at some point as well, but I won't if you're
>> already
>> > >>>> working on it. :)
>> > >>>>
>> > >>>>
>> > >>>> >
>> > >>>> > 2012/3/26 Jake Mannix <ja...@gmail.com>
>> > >>>> >
>> > >>>> > > Hi Dirk,
>> > >>>> > >
>> > >>>> > >  This has not been implemented in Mahout, but the version of
>> > >>>> map-reduce
>> > >>>> > > (batch)-learned
>> > >>>> > > LDA which is done via (approximate+collapsed-) variational
>> bayes
>> > >>>> [1] is
>> > >>>> > > reasonably easily
>> > >>>> > > modifiable to the methods in this paper, as the LDA learner we
>> > >>>> currently
>> > >>>> > do
>> > >>>> > > via iterative
>> > >>>> > > MR passes is essentially an ensemble learner: each subset of
>> the
>> > >>>> data
>> > >>>> > > partially trains a
>> > >>>> > > full LDA model starting from the aggregate (summed) counts of
>> all
>> > >>>> of the
>> > >>>> > > data from
>> > >>>> > > previous iterations (see essentially the method named
>> > "approximately
>> > >>>> > > distributed LDA" /
>> > >>>> > > AD-LDA in Ref-[2]).
>> > >>>> > >
>> > >>>> > >  The method in the paper you refer to turns traditional VB (the
>> > >>>> slower,
>> > >>>> > > uncollapsed kind,
>> > >>>> > > with the nasty digamma functions all over the place) into a
>> > >>>> streaming
>> > >>>> > > learner, by accreting
>> > >>>> > > the word-counts of each document onto the model you're using
>> for
>> > >>>> > inference
>> > >>>> > > on the next
>> > >>>> > > documents.  The same exact idea can be done on the CVB0
>> inference
>> > >>>> > > technique, almost
>> > >>>> > > without change - as VB differs from CVB0 only in the E-step,
>> not
>> > the
>> > >>>> > > M-step.
>> > >>>> > >
>> > >>>> > >  The problem which comes up when I've considered doing this
>> kind
>> > of
>> > >>>> thing
>> > >>>> > > in the past
>> > >>>> > > is that if you do this in a distributed fashion, each member of
>> > the
>> > >>>> > > ensemble starts learning
>> > >>>> > > different topics simultaneously, and then the merge gets
>> trickier.
>> > >>>>  You
>> > >>>> > can
>> > >>>> > > avoid this by
>> > >>>> > > doing some of the techniques mentioned in [2] for HDP, where
>> you
>> > >>>> swap
>> > >>>> > > topic-ids on
>> > >>>> > > merge to make sure they match up, but I haven't investigated
>> that
>> > >>>> very
>> > >>>> > > thoroughly.  The
>> > >>>> > > other way to avoid this problem is to use the parameter denoted
>> > >>>> \rho_t in
>> > >>>> > > Hoffman et al -
>> > >>>> > > this parameter is telling us how much to weight the model as it
>> > was
>> > >>>> up
>> > >>>> > > until now, against
>> > >>>> > > the updates from the latest document (alternatively, how much
>> to
>> > >>>> "decay"
>> > >>>> > > previous
>> > >>>> > > documents).  If you don't let the topics drift *too much*
>> during
>> > >>>> parallel
>> > >>>> > > learning, you could
>> > >>>> > > probably make sure that they match up just fine on each merge,
>> > while
>> > >>>> > still
>> > >>>> > > speeding up
>> > >>>> > > the process faster than fully batch learning.
>> > >>>> > >
>> > >>>> > >  So yeah, this is a great idea, but getting it to work in a
>> > >>>> distributed
>> > >>>> > > fashion is tricky.
>> > >>>> > > In a non-distributed form, this idea is almost completely
>> > >>>> implemented in
>> > >>>> > > the class
>> > >>>> > > InMemoryCollapsedVariationalBayes0.  I say "almost" because
>> it's
>> > >>>> > > technically in
>> > >>>> > > there already, as a parameter choice
>> (initialModelCorpusFraction
>> > !=
>> > >>>> 0),
>> > >>>> > but
>> > >>>> > > I don't
>> > >>>> > > think it's working properly yet.  If you're interested in the
>> > >>>> problem,
>> > >>>> > > playing with this
>> > >>>> > > class would be a great place to start!
>> > >>>> > >
>> > >>>> > > Refrences:
>> > >>>> > > 1)
>> > >>>> > >
>> > >>>>
>> >
>> http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
>> > >>>> > > 2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf
>> > >>>> > >
>> > >>>> > > On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
>> > >>>> > > dirk.weissenborn@googlemail.com> wrote:
>> > >>>> > >
>> > >>>> > > > Hello,
>> > >>>> > > >
>> > >>>> > > > I wanted to ask whether there is already an online learning
>> > >>>> algorithm
>> > >>>> > > > implementation for lda or not?
>> > >>>> > > >
>> > >>>> > > >
>> > http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
>> > >>>> > > >
>> > >>>> > > > cheers,
>> > >>>> > > > Dirk
>> > >>>> > > >
>> > >>>> > >
>> > >>>> > >
>> > >>>> > >
>> > >>>> > > --
>> > >>>> > >
>> > >>>> > >  -jake
>> > >>>> > >
>> > >>>> >
>> > >>>>
>> > >>>>
>> > >>>>
>> > >>>> --
>> > >>>>
>> > >>>>  -jake
>> > >>>>
>> > >>>
>> > >>>
>> > >>
>> > >>
>> > >> --
>> > >>
>> > >>   -jake
>> > >>
>> > >>
>> > >
>> >
>>
>>
>>
>> --
>>
>>  -jake
>>
>
>

Re: online lda learning algorithm

Posted by Dirk Weissenborn <di...@googlemail.com>.
The fact that the dictionary stays fixed could be a problem for what I want
to do. Would it be hard to change that in the code?

I know of one kind of supervised LDA, called DiscLDA... do you know of others?

If the model is only slightly changed (training on just a few new inputs
compared to the overall number of documents the model was already trained
on), I think retraining a classifier should converge quickly when the old
model parameters are taken as starting parameters, so I guess this would not
be too much of a drawback.
The reason I want to do this is that LDA works really well on sparse training
datasets, and it has been shown that the composition of LDA and a classifier
is much faster at the classification task, with results similar to those of
tf-idf based classifiers.

2012/3/27 Jake Mannix <ja...@gmail.com>

> On Mon, Mar 26, 2012 at 5:13 PM, Dirk Weissenborn <
> dirk.weissenborn@googlemail.com> wrote:
>
> > Ok thanks,
> >
> > maybe one question. Is it possible to train an already trained model on
> > just a few new documents with the provided algorithms or do you have to
> > train through the whole corpus again. What I mean is whether you can
> train
> > an lda model incrementally or not?
> >
>
> So the answer to *this* above question is that yes, you can incrementally
> train your LDA model with new documents, and in fact you can do this
> with the current codebase, although it's not really documented how you
> would
> do it: we currently persist a full-fledged model in between passes over
> the corpus (because we're doing MapReduce iteratively, this happens
> naturally), and this model doesn't "know" what documents were used to
> train it (its a big matrix of topic <-> feature counts, that's all), so you
> could
> take a fully trained model, and then run the current LDA over it, using
> new documents as input, and it will start learning from them.  Now, it
> won't be able to learn with *new vocabulary* without changing the code
> a little, however.  The dictionary stays fixed in the current codebase.
>
>
> > What I would actually like to do is training a topical classifier on top
> of
> > a lda model. Do you have any experience with that? I mean by changing the
> > lda model, the inputs for the classifier would also change. Do I have to
> > train a classifier from scratch again or can I reuse the classifier
> trained
> > on top of the older lda model and just ajust that one slightly?
> >
>
> This is actually a rather different question, it seems.  Let me make sure
> I'm understanding what you're asking:
>
> You are using the p(topic | doc) values for each doc as *input features*
> for another classifier, right?   Trying to update your model
> (which itself can be thought of as a fuzzy classifier consisting of weights
> p(feature | topic) and an inference algorithm to produce p(topic | doc)
> if fed a document consisting of a weighted feature vector) and keeping
> the p(topic | doc) vectors fixed will definitely be wrong - if the model
> changes, these weights will need to be updated.  The only way
> to do this is to run these documents against the model in some way.
>
> But you don't really want to do that, you want to update your secondary
> classifier, after your input topics drift a bit upon having been updated.
> I think this problem still remains, however: your secondary classifier
> was trained on input features which have changed, so it most likely
> needs to be retrained as well.
>
> If, on the other hand, you are training a joint classifier (like
> Supervised LDA, or Labeled LDA), and you ran *this* in an online
> mode, you could probably update your classifier continually as you
> got new labeled training data to train on.  But I'm speculating at this
> point. :)
>
>
>
> >
> > 2012/3/27 Dirk Weissenborn <di...@googlemail.com>
> >
> > > no problem. I ll post it
> > >
> > >
> > > 2012/3/27 Jake Mannix <ja...@gmail.com>
> > >
> > >> Hey Dirk,
> > >>
> > >>   Do you mind continuing this discussion on the mailing list?  Lots of
> > >> our users may ask this kind of question in the future...
> > >>
> > >> On Mon, Mar 26, 2012 at 3:36 PM, Dirk Weissenborn <
> > >> dirk.weissenborn@googlemail.com> wrote:
> > >>
> > >>> Ok thanks,
> > >>>
> > >>> maybe one question. Is it possible to train an already trained model
> on
> > >>> just a few new documents with the provided algorithms or do you have
> to
> > >>> train through the whole corpus again. What I mean is whether you can
> > train
> > >>> an lda model incrementally or not?
> > >>> What I would actually like to do is training a topical classifier on
> > top
> > >>> of a lda model. Do you have any experience with that? I mean by
> > changing
> > >>> the lda model, the inputs for the classifier would also change. Do I
> > have
> > >>> to train a classifier from scratch again or can I reuse the
> classifier
> > >>> trained on top of the older lda model and just ajust that one?
> > >>>
> > >>>
> > >>> 2012/3/26 Jake Mannix <ja...@gmail.com>
> > >>>
> > >>>> On Mon, Mar 26, 2012 at 12:58 PM, Dirk Weissenborn <
> > >>>> dirk.weissenborn@googlemail.com> wrote:
> > >>>>
> > >>>> > Thank you for the quick response! It is possible that I need it in
> > >>>> not too
> > >>>> > far future maybe I ll implement on top what already exists, which
> > >>>> should
> > >>>> > not be that hard as you mentioned. I ll provide a patch when the
> > time
> > >>>> > comes.
> > >>>> >
> > >>>>
> > >>>> Feel free to email any questions about using the
> > >>>> InMemoryCollapsedVariationalBayes0
> > >>>> class - it's mainly been used for testing, so far, but if you want
> to
> > >>>> take
> > >>>> that class
> > >>>> and clean it up and look into fixing the online learning aspect of
> it,
> > >>>> that'd be
> > >>>> excellent.  Let me know if you make any progress, because I'll
> > probably
> > >>>> be
> > >>>> looking
> > >>>> to work on this at some point as well, but I won't if you're already
> > >>>> working on it. :)
> > >>>>
> > >>>>
> > >>>> >
> > >>>> > 2012/3/26 Jake Mannix <ja...@gmail.com>
> > >>>> >
> > >>>> > > Hi Dirk,
> > >>>> > >
> > >>>> > >  This has not been implemented in Mahout, but the version of
> > >>>> map-reduce
> > >>>> > > (batch)-learned
> > >>>> > > LDA which is done via (approximate+collapsed-) variational bayes
> > >>>> [1] is
> > >>>> > > reasonably easily
> > >>>> > > modifiable to the methods in this paper, as the LDA learner we
> > >>>> currently
> > >>>> > do
> > >>>> > > via iterative
> > >>>> > > MR passes is essentially an ensemble learner: each subset of the
> > >>>> data
> > >>>> > > partially trains a
> > >>>> > > full LDA model starting from the aggregate (summed) counts of
> all
> > >>>> of the
> > >>>> > > data from
> > >>>> > > previous iterations (see essentially the method named
> > "approximately
> > >>>> > > distributed LDA" /
> > >>>> > > AD-LDA in Ref-[2]).
> > >>>> > >
> > >>>> > >  The method in the paper you refer to turns traditional VB (the
> > >>>> slower,
> > >>>> > > uncollapsed kind,
> > >>>> > > with the nasty digamma functions all over the place) into a
> > >>>> streaming
> > >>>> > > learner, by accreting
> > >>>> > > the word-counts of each document onto the model you're using for
> > >>>> > inference
> > >>>> > > on the next
> > >>>> > > documents.  The same exact idea can be done on the CVB0
> inference
> > >>>> > > technique, almost
> > >>>> > > without change - as VB differs from CVB0 only in the E-step, not
> > the
> > >>>> > > M-step.
> > >>>> > >
> > >>>> > >  The problem which comes up when I've considered doing this kind
> > of
> > >>>> thing
> > >>>> > > in the past
> > >>>> > > is that if you do this in a distributed fashion, each member of
> > the
> > >>>> > > ensemble starts learning
> > >>>> > > different topics simultaneously, and then the merge gets
> trickier.
> > >>>>  You
> > >>>> > can
> > >>>> > > avoid this by
> > >>>> > > doing some of the techniques mentioned in [2] for HDP, where you
> > >>>> swap
> > >>>> > > topic-ids on
> > >>>> > > merge to make sure they match up, but I haven't investigated
> that
> > >>>> very
> > >>>> > > thoroughly.  The
> > >>>> > > other way to avoid this problem is to use the parameter denoted
> > >>>> \rho_t in
> > >>>> > > Hoffman et al -
> > >>>> > > this parameter is telling us how much to weight the model as it
> > was
> > >>>> up
> > >>>> > > until now, against
> > >>>> > > the updates from the latest document (alternatively, how much to
> > >>>> "decay"
> > >>>> > > previous
> > >>>> > > documents).  If you don't let the topics drift *too much* during
> > >>>> parallel
> > >>>> > > learning, you could
> > >>>> > > probably make sure that they match up just fine on each merge,
> > while
> > >>>> > still
> > >>>> > > speeding up
> > >>>> > > the process faster than fully batch learning.
> > >>>> > >
> > >>>> > >  So yeah, this is a great idea, but getting it to work in a
> > >>>> distributed
> > >>>> > > fashion is tricky.
> > >>>> > > In a non-distributed form, this idea is almost completely
> > >>>> implemented in
> > >>>> > > the class
> > >>>> > > InMemoryCollapsedVariationalBayes0.  I say "almost" because it's
> > >>>> > > technically in
> > >>>> > > there already, as a parameter choice (initialModelCorpusFraction
> > !=
> > >>>> 0),
> > >>>> > but
> > >>>> > > I don't
> > >>>> > > think it's working properly yet.  If you're interested in the
> > >>>> problem,
> > >>>> > > playing with this
> > >>>> > > class would be a great place to start!
> > >>>> > >
> > >>>> > > Refrences:
> > >>>> > > 1)
> > >>>> > >
> > >>>>
> > http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
> > >>>> > > 2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf
> > >>>> > >
> > >>>> > > On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
> > >>>> > > dirk.weissenborn@googlemail.com> wrote:
> > >>>> > >
> > >>>> > > > Hello,
> > >>>> > > >
> > >>>> > > > I wanted to ask whether there is already an online learning
> > >>>> algorithm
> > >>>> > > > implementation for lda or not?
> > >>>> > > >
> > >>>> > > >
> > http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
> > >>>> > > >
> > >>>> > > > cheers,
> > >>>> > > > Dirk
> > >>>> > > >
> > >>>> > >
> > >>>> > >
> > >>>> > >
> > >>>> > > --
> > >>>> > >
> > >>>> > >  -jake
> > >>>> > >
> > >>>> >
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>>
> > >>>>  -jake
> > >>>>
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >>
> > >>   -jake
> > >>
> > >>
> > >
> >
>
>
>
> --
>
>  -jake
>

Re: online lda learning algorithm

Posted by Jake Mannix <ja...@gmail.com>.
On Mon, Mar 26, 2012 at 5:13 PM, Dirk Weissenborn <
dirk.weissenborn@googlemail.com> wrote:

> Ok thanks,
>
> maybe one question. Is it possible to train an already trained model on
> just a few new documents with the provided algorithms or do you have to
> train through the whole corpus again. What I mean is whether you can train
> an lda model incrementally or not?
>

So the answer to *this* above question is that yes, you can incrementally
train your LDA model with new documents, and in fact you can do this with
the current codebase, although it's not really documented how you would do
it: we currently persist a full-fledged model in between passes over the
corpus (because we're doing MapReduce iteratively, this happens naturally),
and this model doesn't "know" what documents were used to train it (it's a
big matrix of topic <-> feature counts, that's all), so you could take a
fully trained model, run the current LDA over it using new documents as
input, and it will start learning from them.  Now, it won't be able to learn
from *new vocabulary* without changing the code a little, however.  The
dictionary stays fixed in the current codebase.
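To make the "it's just a count matrix" point concrete, here is a minimal, hypothetical
sketch of folding a few new documents into an already-trained topic <-> term count matrix
(plain Java, not Mahout's actual classes or method names; the single sweep and the
alpha/eta smoothing parameters are simplifying assumptions):

// Hypothetical sketch: the persisted model is just a topic x term matrix of expected
// counts, so "incremental" training amounts to loading it and sweeping over the new
// documents only. A real run would iterate each document's topic distribution to
// convergence and exclude the current token's own count (the CVB0 correction).
static void foldInNewDocuments(double[][] model, int[][] newDocs, double alpha, double eta) {
  int numTopics = model.length;
  int numTerms = model[0].length;
  double[] topicTotals = new double[numTopics];
  for (int k = 0; k < numTopics; k++) {
    for (int w = 0; w < numTerms; w++) {
      topicTotals[k] += model[k][w];
    }
  }
  for (int[] doc : newDocs) {                         // doc = term ids, fixed dictionary
    double[] docTopicCounts = new double[numTopics];
    for (int w : doc) {
      double[] gamma = new double[numTopics];
      double norm = 0.0;
      for (int k = 0; k < numTopics; k++) {
        // CVB0-style E-step: (docTopic + alpha) * (topicTerm + eta) / (topicTotal + W * eta)
        gamma[k] = (docTopicCounts[k] + alpha) * (model[k][w] + eta)
            / (topicTotals[k] + numTerms * eta);
        norm += gamma[k];
      }
      for (int k = 0; k < numTopics; k++) {
        gamma[k] /= norm;
        docTopicCounts[k] += gamma[k];
        model[k][w] += gamma[k];                      // accrete expected counts onto the old model
        topicTotals[k] += gamma[k];
      }
    }
  }
}

Anything that accretes expected counts onto the persisted matrix like this keeps learning
from new input, which is why no code change is needed as long as the dictionary does not grow.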


> What I would actually like to do is training a topical classifier on top of
> a lda model. Do you have any experience with that? I mean by changing the
> lda model, the inputs for the classifier would also change. Do I have to
> train a classifier from scratch again or can I reuse the classifier trained
> on top of the older lda model and just ajust that one slightly?
>

This is actually a rather different question, it seems.  Let me make sure
I'm understanding what you're asking:

You are using the p(topic | doc) values for each doc as *input features*
for another classifier, right?   Trying to update your model
(which itself can be thought of as a fuzzy classifier consisting of weights
p(feature | topic) and an inference algorithm to produce p(topic | doc)
if fed a document consisting of a weighted feature vector) and keeping
the p(topic | doc) vectors fixed will definitely be wrong - if the model
changes, these weights will need to be updated.  The only way
to do this is to run these documents against the model in some way.

But you don't really want to do that, you want to update your secondary
classifier, after your input topics drift a bit upon having been updated.
I think this problem still remains, however: your secondary classifier
was trained on input features which have changed, so it most likely
needs to be retrained as well.
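
As a rough illustration of that retraining step, here is a hypothetical sketch (not an
existing Mahout API) that recomputes the p(topic | doc) features against the updated model
and then continues training a simple logistic-regression classifier from its old weights
instead of from scratch; inferTopicMixture, the learning rate, and the epoch count are all
assumptions made for the sake of the example:

// Hypothetical sketch: refresh the p(topic | doc) features against the updated LDA model,
// then warm-start the secondary classifier from its previous weights.
// inferTopicMixture(...) stands in for running LDA inference for one document.
static double[] warmStartRetrain(double[][] updatedLdaModel, int[][] trainDocs,
                                 double[] labels, double[] oldWeights,
                                 double learningRate, int epochs) {
  double[] w = oldWeights.clone();                    // warm start: keep the old parameters
  for (int e = 0; e < epochs; e++) {
    for (int d = 0; d < trainDocs.length; d++) {
      double[] topics = inferTopicMixture(trainDocs[d], updatedLdaModel); // hypothetical helper
      double margin = 0.0;
      for (int k = 0; k < topics.length; k++) {
        margin += w[k] * topics[k];
      }
      double predicted = 1.0 / (1.0 + Math.exp(-margin));   // logistic regression on topic features
      double error = labels[d] - predicted;
      for (int k = 0; k < topics.length; k++) {
        w[k] += learningRate * error * topics[k];           // plain SGD update
      }
    }
  }
  return w;
}

Warm-starting like this should only help if the topics have drifted a little, which matches
the "slightly changed model" assumption above.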

If, on the other hand, you are training a joint classifier (like
Supervised LDA, or Labeled LDA), and you ran *this* in an online
mode, you could probably update your classifier continually as you
got new labeled training data to train on.  But I'm speculating at this
point. :)



>
> 2012/3/27 Dirk Weissenborn <di...@googlemail.com>
>
> > no problem. I ll post it
> >
> >
> > 2012/3/27 Jake Mannix <ja...@gmail.com>
> >
> >> Hey Dirk,
> >>
> >>   Do you mind continuing this discussion on the mailing list?  Lots of
> >> our users may ask this kind of question in the future...
> >>
> >> On Mon, Mar 26, 2012 at 3:36 PM, Dirk Weissenborn <
> >> dirk.weissenborn@googlemail.com> wrote:
> >>
> >>> Ok thanks,
> >>>
> >>> maybe one question. Is it possible to train an already trained model on
> >>> just a few new documents with the provided algorithms or do you have to
> >>> train through the whole corpus again. What I mean is whether you can
> train
> >>> an lda model incrementally or not?
> >>> What I would actually like to do is training a topical classifier on
> top
> >>> of a lda model. Do you have any experience with that? I mean by
> changing
> >>> the lda model, the inputs for the classifier would also change. Do I
> have
> >>> to train a classifier from scratch again or can I reuse the classifier
> >>> trained on top of the older lda model and just ajust that one?
> >>>
> >>>
> >>> 2012/3/26 Jake Mannix <ja...@gmail.com>
> >>>
> >>>> On Mon, Mar 26, 2012 at 12:58 PM, Dirk Weissenborn <
> >>>> dirk.weissenborn@googlemail.com> wrote:
> >>>>
> >>>> > Thank you for the quick response! It is possible that I need it in
> >>>> not too
> >>>> > far future maybe I ll implement on top what already exists, which
> >>>> should
> >>>> > not be that hard as you mentioned. I ll provide a patch when the
> time
> >>>> > comes.
> >>>> >
> >>>>
> >>>> Feel free to email any questions about using the
> >>>> InMemoryCollapsedVariationalBayes0
> >>>> class - it's mainly been used for testing, so far, but if you want to
> >>>> take
> >>>> that class
> >>>> and clean it up and look into fixing the online learning aspect of it,
> >>>> that'd be
> >>>> excellent.  Let me know if you make any progress, because I'll
> probably
> >>>> be
> >>>> looking
> >>>> to work on this at some point as well, but I won't if you're already
> >>>> working on it. :)
> >>>>
> >>>>
> >>>> >
> >>>> > 2012/3/26 Jake Mannix <ja...@gmail.com>
> >>>> >
> >>>> > > Hi Dirk,
> >>>> > >
> >>>> > >  This has not been implemented in Mahout, but the version of
> >>>> map-reduce
> >>>> > > (batch)-learned
> >>>> > > LDA which is done via (approximate+collapsed-) variational bayes
> >>>> [1] is
> >>>> > > reasonably easily
> >>>> > > modifiable to the methods in this paper, as the LDA learner we
> >>>> currently
> >>>> > do
> >>>> > > via iterative
> >>>> > > MR passes is essentially an ensemble learner: each subset of the
> >>>> data
> >>>> > > partially trains a
> >>>> > > full LDA model starting from the aggregate (summed) counts of all
> >>>> of the
> >>>> > > data from
> >>>> > > previous iterations (see essentially the method named
> "approximately
> >>>> > > distributed LDA" /
> >>>> > > AD-LDA in Ref-[2]).
> >>>> > >
> >>>> > >  The method in the paper you refer to turns traditional VB (the
> >>>> slower,
> >>>> > > uncollapsed kind,
> >>>> > > with the nasty digamma functions all over the place) into a
> >>>> streaming
> >>>> > > learner, by accreting
> >>>> > > the word-counts of each document onto the model you're using for
> >>>> > inference
> >>>> > > on the next
> >>>> > > documents.  The same exact idea can be done on the CVB0 inference
> >>>> > > technique, almost
> >>>> > > without change - as VB differs from CVB0 only in the E-step, not
> the
> >>>> > > M-step.
> >>>> > >
> >>>> > >  The problem which comes up when I've considered doing this kind
> of
> >>>> thing
> >>>> > > in the past
> >>>> > > is that if you do this in a distributed fashion, each member of
> the
> >>>> > > ensemble starts learning
> >>>> > > different topics simultaneously, and then the merge gets trickier.
> >>>>  You
> >>>> > can
> >>>> > > avoid this by
> >>>> > > doing some of the techniques mentioned in [2] for HDP, where you
> >>>> swap
> >>>> > > topic-ids on
> >>>> > > merge to make sure they match up, but I haven't investigated that
> >>>> very
> >>>> > > thoroughly.  The
> >>>> > > other way to avoid this problem is to use the parameter denoted
> >>>> \rho_t in
> >>>> > > Hoffman et al -
> >>>> > > this parameter is telling us how much to weight the model as it
> was
> >>>> up
> >>>> > > until now, against
> >>>> > > the updates from the latest document (alternatively, how much to
> >>>> "decay"
> >>>> > > previous
> >>>> > > documents).  If you don't let the topics drift *too much* during
> >>>> parallel
> >>>> > > learning, you could
> >>>> > > probably make sure that they match up just fine on each merge,
> while
> >>>> > still
> >>>> > > speeding up
> >>>> > > the process faster than fully batch learning.
> >>>> > >
> >>>> > >  So yeah, this is a great idea, but getting it to work in a
> >>>> distributed
> >>>> > > fashion is tricky.
> >>>> > > In a non-distributed form, this idea is almost completely
> >>>> implemented in
> >>>> > > the class
> >>>> > > InMemoryCollapsedVariationalBayes0.  I say "almost" because it's
> >>>> > > technically in
> >>>> > > there already, as a parameter choice (initialModelCorpusFraction
> !=
> >>>> 0),
> >>>> > but
> >>>> > > I don't
> >>>> > > think it's working properly yet.  If you're interested in the
> >>>> problem,
> >>>> > > playing with this
> >>>> > > class would be a great place to start!
> >>>> > >
> >>>> > > Refrences:
> >>>> > > 1)
> >>>> > >
> >>>>
> http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
> >>>> > > 2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf
> >>>> > >
> >>>> > > On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
> >>>> > > dirk.weissenborn@googlemail.com> wrote:
> >>>> > >
> >>>> > > > Hello,
> >>>> > > >
> >>>> > > > I wanted to ask whether there is already an online learning
> >>>> algorithm
> >>>> > > > implementation for lda or not?
> >>>> > > >
> >>>> > > >
> http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
> >>>> > > >
> >>>> > > > cheers,
> >>>> > > > Dirk
> >>>> > > >
> >>>> > >
> >>>> > >
> >>>> > >
> >>>> > > --
> >>>> > >
> >>>> > >  -jake
> >>>> > >
> >>>> >
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>>  -jake
> >>>>
> >>>
> >>>
> >>
> >>
> >> --
> >>
> >>   -jake
> >>
> >>
> >
>



-- 

  -jake

Re: online lda learning algorithm

Posted by Dirk Weissenborn <di...@googlemail.com>.
Ok, thanks.

Maybe one more question: is it possible to train an already trained model on
just a few new documents with the provided algorithms, or do you have to
train through the whole corpus again? In other words, can you train an LDA
model incrementally?
What I would actually like to do is train a topical classifier on top of
an LDA model. Do you have any experience with that? I mean, by changing the
LDA model, the inputs for the classifier would also change. Do I have to
train the classifier from scratch again, or can I reuse the classifier
trained on top of the older LDA model and just adjust it slightly?

2012/3/27 Dirk Weissenborn <di...@googlemail.com>

> no problem. I ll post it
>
>
> 2012/3/27 Jake Mannix <ja...@gmail.com>
>
>> Hey Dirk,
>>
>>   Do you mind continuing this discussion on the mailing list?  Lots of
>> our users may ask this kind of question in the future...
>>
>> On Mon, Mar 26, 2012 at 3:36 PM, Dirk Weissenborn <
>> dirk.weissenborn@googlemail.com> wrote:
>>
>>> Ok thanks,
>>>
>>> maybe one question. Is it possible to train an already trained model on
>>> just a few new documents with the provided algorithms or do you have to
>>> train through the whole corpus again. What I mean is whether you can train
>>> an lda model incrementally or not?
>>> What I would actually like to do is training a topical classifier on top
>>> of a lda model. Do you have any experience with that? I mean by changing
>>> the lda model, the inputs for the classifier would also change. Do I have
>>> to train a classifier from scratch again or can I reuse the classifier
>>> trained on top of the older lda model and just ajust that one?
>>>
>>>
>>> 2012/3/26 Jake Mannix <ja...@gmail.com>
>>>
>>>> On Mon, Mar 26, 2012 at 12:58 PM, Dirk Weissenborn <
>>>> dirk.weissenborn@googlemail.com> wrote:
>>>>
>>>> > Thank you for the quick response! It is possible that I need it in
>>>> not too
>>>> > far future maybe I ll implement on top what already exists, which
>>>> should
>>>> > not be that hard as you mentioned. I ll provide a patch when the time
>>>> > comes.
>>>> >
>>>>
>>>> Feel free to email any questions about using the
>>>> InMemoryCollapsedVariationalBayes0
>>>> class - it's mainly been used for testing, so far, but if you want to
>>>> take
>>>> that class
>>>> and clean it up and look into fixing the online learning aspect of it,
>>>> that'd be
>>>> excellent.  Let me know if you make any progress, because I'll probably
>>>> be
>>>> looking
>>>> to work on this at some point as well, but I won't if you're already
>>>> working on it. :)
>>>>
>>>>
>>>> >
>>>> > 2012/3/26 Jake Mannix <ja...@gmail.com>
>>>> >
>>>> > > Hi Dirk,
>>>> > >
>>>> > >  This has not been implemented in Mahout, but the version of
>>>> map-reduce
>>>> > > (batch)-learned
>>>> > > LDA which is done via (approximate+collapsed-) variational bayes
>>>> [1] is
>>>> > > reasonably easily
>>>> > > modifiable to the methods in this paper, as the LDA learner we
>>>> currently
>>>> > do
>>>> > > via iterative
>>>> > > MR passes is essentially an ensemble learner: each subset of the
>>>> data
>>>> > > partially trains a
>>>> > > full LDA model starting from the aggregate (summed) counts of all
>>>> of the
>>>> > > data from
>>>> > > previous iterations (see essentially the method named "approximately
>>>> > > distributed LDA" /
>>>> > > AD-LDA in Ref-[2]).
>>>> > >
>>>> > >  The method in the paper you refer to turns traditional VB (the
>>>> slower,
>>>> > > uncollapsed kind,
>>>> > > with the nasty digamma functions all over the place) into a
>>>> streaming
>>>> > > learner, by accreting
>>>> > > the word-counts of each document onto the model you're using for
>>>> > inference
>>>> > > on the next
>>>> > > documents.  The same exact idea can be done on the CVB0 inference
>>>> > > technique, almost
>>>> > > without change - as VB differs from CVB0 only in the E-step, not the
>>>> > > M-step.
>>>> > >
>>>> > >  The problem which comes up when I've considered doing this kind of
>>>> thing
>>>> > > in the past
>>>> > > is that if you do this in a distributed fashion, each member of the
>>>> > > ensemble starts learning
>>>> > > different topics simultaneously, and then the merge gets trickier.
>>>>  You
>>>> > can
>>>> > > avoid this by
>>>> > > doing some of the techniques mentioned in [2] for HDP, where you
>>>> swap
>>>> > > topic-ids on
>>>> > > merge to make sure they match up, but I haven't investigated that
>>>> very
>>>> > > thoroughly.  The
>>>> > > other way to avoid this problem is to use the parameter denoted
>>>> \rho_t in
>>>> > > Hoffman et al -
>>>> > > this parameter is telling us how much to weight the model as it was
>>>> up
>>>> > > until now, against
>>>> > > the updates from the latest document (alternatively, how much to
>>>> "decay"
>>>> > > previous
>>>> > > documents).  If you don't let the topics drift *too much* during
>>>> parallel
>>>> > > learning, you could
>>>> > > probably make sure that they match up just fine on each merge, while
>>>> > still
>>>> > > speeding up
>>>> > > the process faster than fully batch learning.
>>>> > >
>>>> > >  So yeah, this is a great idea, but getting it to work in a
>>>> distributed
>>>> > > fashion is tricky.
>>>> > > In a non-distributed form, this idea is almost completely
>>>> implemented in
>>>> > > the class
>>>> > > InMemoryCollapsedVariationalBayes0.  I say "almost" because it's
>>>> > > technically in
>>>> > > there already, as a parameter choice (initialModelCorpusFraction !=
>>>> 0),
>>>> > but
>>>> > > I don't
>>>> > > think it's working properly yet.  If you're interested in the
>>>> problem,
>>>> > > playing with this
>>>> > > class would be a great place to start!
>>>> > >
>>>> > > Refrences:
>>>> > > 1)
>>>> > >
>>>> http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
>>>> > > 2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf
>>>> > >
>>>> > > On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
>>>> > > dirk.weissenborn@googlemail.com> wrote:
>>>> > >
>>>> > > > Hello,
>>>> > > >
>>>> > > > I wanted to ask whether there is already an online learning
>>>> algorithm
>>>> > > > implementation for lda or not?
>>>> > > >
>>>> > > > http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
>>>> > > >
>>>> > > > cheers,
>>>> > > > Dirk
>>>> > > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > >
>>>> > >  -jake
>>>> > >
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>  -jake
>>>>
>>>
>>>
>>
>>
>> --
>>
>>   -jake
>>
>>
>

Re: online lda learning algorithm

Posted by Jake Mannix <ja...@gmail.com>.
On Mon, Mar 26, 2012 at 12:58 PM, Dirk Weissenborn <
dirk.weissenborn@googlemail.com> wrote:

> Thank you for the quick response! It is possible that I need it in not too
> far future maybe I ll implement on top what already exists, which should
> not be that hard as you mentioned. I ll provide a patch when the time
> comes.
>

Feel free to email any questions about using the
InMemoryCollapsedVariationalBayes0 class - it's mainly been used for testing,
so far, but if you want to take that class and clean it up and look into
fixing the online learning aspect of it, that'd be excellent.  Let me know if
you make any progress, because I'll probably be looking to work on this at
some point as well, but I won't if you're already working on it. :)


>
> 2012/3/26 Jake Mannix <ja...@gmail.com>
>
> > Hi Dirk,
> >
> >  This has not been implemented in Mahout, but the version of map-reduce
> > (batch)-learned
> > LDA which is done via (approximate+collapsed-) variational bayes [1] is
> > reasonably easily
> > modifiable to the methods in this paper, as the LDA learner we currently
> do
> > via iterative
> > MR passes is essentially an ensemble learner: each subset of the data
> > partially trains a
> > full LDA model starting from the aggregate (summed) counts of all of the
> > data from
> > previous iterations (see essentially the method named "approximately
> > distributed LDA" /
> > AD-LDA in Ref-[2]).
> >
> >  The method in the paper you refer to turns traditional VB (the slower,
> > uncollapsed kind,
> > with the nasty digamma functions all over the place) into a streaming
> > learner, by accreting
> > the word-counts of each document onto the model you're using for
> inference
> > on the next
> > documents.  The same exact idea can be done on the CVB0 inference
> > technique, almost
> > without change - as VB differs from CVB0 only in the E-step, not the
> > M-step.
> >
> >  The problem which comes up when I've considered doing this kind of thing
> > in the past
> > is that if you do this in a distributed fashion, each member of the
> > ensemble starts learning
> > different topics simultaneously, and then the merge gets trickier.  You
> can
> > avoid this by
> > doing some of the techniques mentioned in [2] for HDP, where you swap
> > topic-ids on
> > merge to make sure they match up, but I haven't investigated that very
> > thoroughly.  The
> > other way to avoid this problem is to use the parameter denoted \rho_t in
> > Hoffman et al -
> > this parameter is telling us how much to weight the model as it was up
> > until now, against
> > the updates from the latest document (alternatively, how much to "decay"
> > previous
> > documents).  If you don't let the topics drift *too much* during parallel
> > learning, you could
> > probably make sure that they match up just fine on each merge, while
> still
> > speeding up
> > the process faster than fully batch learning.
> >
> >  So yeah, this is a great idea, but getting it to work in a distributed
> > fashion is tricky.
> > In a non-distributed form, this idea is almost completely implemented in
> > the class
> > InMemoryCollapsedVariationalBayes0.  I say "almost" because it's
> > technically in
> > there already, as a parameter choice (initialModelCorpusFraction != 0),
> but
> > I don't
> > think it's working properly yet.  If you're interested in the problem,
> > playing with this
> > class would be a great place to start!
> >
> > Refrences:
> > 1)
> > http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
> > 2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf
> >
> > On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
> > dirk.weissenborn@googlemail.com> wrote:
> >
> > > Hello,
> > >
> > > I wanted to ask whether there is already an online learning algorithm
> > > implementation for lda or not?
> > >
> > > http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
> > >
> > > cheers,
> > > Dirk
> > >
> >
> >
> >
> > --
> >
> >  -jake
> >
>



-- 

  -jake

Re: online lda learning algorithm

Posted by Dirk Weissenborn <di...@googlemail.com>.
Thank you for the quick response! It is possible that I will need it in the
not-too-distant future; maybe I'll implement it on top of what already
exists, which should not be that hard, as you mentioned. I'll provide a patch
when the time comes.

2012/3/26 Jake Mannix <ja...@gmail.com>

> Hi Dirk,
>
>  This has not been implemented in Mahout, but the version of map-reduce
> (batch)-learned
> LDA which is done via (approximate+collapsed-) variational bayes [1] is
> reasonably easily
> modifiable to the methods in this paper, as the LDA learner we currently do
> via iterative
> MR passes is essentially an ensemble learner: each subset of the data
> partially trains a
> full LDA model starting from the aggregate (summed) counts of all of the
> data from
> previous iterations (see essentially the method named "approximately
> distributed LDA" /
> AD-LDA in Ref-[2]).
>
>  The method in the paper you refer to turns traditional VB (the slower,
> uncollapsed kind,
> with the nasty digamma functions all over the place) into a streaming
> learner, by accreting
> the word-counts of each document onto the model you're using for inference
> on the next
> documents.  The same exact idea can be done on the CVB0 inference
> technique, almost
> without change - as VB differs from CVB0 only in the E-step, not the
> M-step.
>
>  The problem which comes up when I've considered doing this kind of thing
> in the past
> is that if you do this in a distributed fashion, each member of the
> ensemble starts learning
> different topics simultaneously, and then the merge gets trickier.  You can
> avoid this by
> doing some of the techniques mentioned in [2] for HDP, where you swap
> topic-ids on
> merge to make sure they match up, but I haven't investigated that very
> thoroughly.  The
> other way to avoid this problem is to use the parameter denoted \rho_t in
> Hoffman et al -
> this parameter is telling us how much to weight the model as it was up
> until now, against
> the updates from the latest document (alternatively, how much to "decay"
> previous
> documents).  If you don't let the topics drift *too much* during parallel
> learning, you could
> probably make sure that they match up just fine on each merge, while still
> speeding up
> the process faster than fully batch learning.
>
>  So yeah, this is a great idea, but getting it to work in a distributed
> fashion is tricky.
> In a non-distributed form, this idea is almost completely implemented in
> the class
> InMemoryCollapsedVariationalBayes0.  I say "almost" because it's
> technically in
> there already, as a parameter choice (initialModelCorpusFraction != 0), but
> I don't
> think it's working properly yet.  If you're interested in the problem,
> playing with this
> class would be a great place to start!
>
> Refrences:
> 1)
> http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
> 2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf
>
> On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
> dirk.weissenborn@googlemail.com> wrote:
>
> > Hello,
> >
> > I wanted to ask whether there is already an online learning algorithm
> > implementation for lda or not?
> >
> > http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
> >
> > cheers,
> > Dirk
> >
>
>
>
> --
>
>  -jake
>

Re: online lda learning algorithm

Posted by Jake Mannix <ja...@gmail.com>.
Hi Dirk,

  This has not been implemented in Mahout, but the version of map-reduce
(batch)-learned LDA which is done via (approximate + collapsed) variational
Bayes [1] is reasonably easily modifiable to the methods in this paper, as
the LDA learner we currently do via iterative MR passes is essentially an
ensemble learner: each subset of the data partially trains a full LDA model
starting from the aggregate (summed) counts of all of the data from previous
iterations (see essentially the method named "approximately distributed
LDA" / AD-LDA in Ref-[2]).
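
To illustrate the merge half of that ensemble picture, here is a hypothetical sketch of
the AD-LDA-style aggregation (a distillation of the idea, not the actual code in Mahout's
MR driver; the method name and in-memory arrays are assumptions):

// Hypothetical sketch: every partition starts the iteration from the same global counts,
// and the merge adds up only what each partition learned on top of those starting counts.
static double[][] mergePartialModels(double[][] startCounts, double[][][] partialModels) {
  int numTopics = startCounts.length;
  int numTerms = startCounts[0].length;
  double[][] merged = new double[numTopics][numTerms];
  for (int k = 0; k < numTopics; k++) {
    for (int w = 0; w < numTerms; w++) {
      merged[k][w] = startCounts[k][w];
      for (double[][] partial : partialModels) {
        merged[k][w] += partial[k][w] - startCounts[k][w];   // this partition's delta only
      }
    }
  }
  return merged;
}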

  The method in the paper you refer to turns traditional VB (the slower,
uncollapsed kind, with the nasty digamma functions all over the place) into
a streaming learner, by accreting the word-counts of each document onto the
model you're using for inference on the next documents.  The same exact idea
can be done on the CVB0 inference technique, almost without change - as VB
differs from CVB0 only in the E-step, not the M-step.
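
For reference, the two E-steps being contrasted here look roughly like this (paraphrasing
the notation of [1], with \Psi the digamma function and the "exclude the current token's
own count" detail omitted; see the paper for the exact form):

  VB:    \gamma_{ijk} \propto \exp(\Psi(N_{jk} + \alpha)) \cdot \exp(\Psi(N_{k w_{ij}} + \eta)) / \exp(\Psi(N_{k} + W\eta))
  CVB0:  \gamma_{ijk} \propto (N_{jk} + \alpha) \cdot (N_{k w_{ij}} + \eta) / (N_{k} + W\eta)

and the M-step in both cases just accumulates the expected counts N_{jk}, N_{kw}, N_{k}
from the \gamma's, which is why a streaming M-step carries over to CVB0 essentially unchanged.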

  The problem which comes up when I've considered doing this kind of thing
in the past is that if you do this in a distributed fashion, each member of
the ensemble starts learning different topics simultaneously, and then the
merge gets trickier.  You can avoid this by doing some of the techniques
mentioned in [2] for HDP, where you swap topic-ids on merge to make sure
they match up, but I haven't investigated that very thoroughly.  The other
way to avoid this problem is to use the parameter denoted \rho_t in Hoffman
et al - this parameter is telling us how much to weight the model as it was
up until now, against the updates from the latest document (alternatively,
how much to "decay" previous documents).  If you don't let the topics drift
*too much* during parallel learning, you could probably make sure that they
match up just fine on each merge, while still speeding up the process faster
than fully batch learning.
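
For concreteness, a minimal sketch of that \rho_t step as described in Hoffman et al.,
assuming the mini-batch estimate has already been scaled up to corpus size; the method
name and array layout are made up for the example:

// Hypothetical sketch: after each mini-batch, the model becomes a decayed average of the
// old model and the estimate computed from that mini-batch alone.
static void onlineBlend(double[][] model, double[][] miniBatchEstimate,
                        long t, double tau0, double kappa) {
  double rho = Math.pow(tau0 + t, -kappa);   // step size; kappa in (0.5, 1] for convergence
  for (int k = 0; k < model.length; k++) {
    for (int w = 0; w < model[k].length; w++) {
      model[k][w] = (1.0 - rho) * model[k][w] + rho * miniBatchEstimate[k][w];
    }
  }
}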

  So yeah, this is a great idea, but getting it to work in a distributed
fashion is tricky.  In a non-distributed form, this idea is almost completely
implemented in the class InMemoryCollapsedVariationalBayes0.  I say "almost"
because it's technically in there already, as a parameter choice
(initialModelCorpusFraction != 0), but I don't think it's working properly
yet.  If you're interested in the problem, playing with this class would be
a great place to start!

References:
1) http://eprints.pascal-network.org/archive/00006729/01/AsuWelSmy2009a.pdf
2) http://www.csee.ogi.edu/~zak/cs506-pslc/dist_lda.pdf

On Mon, Mar 26, 2012 at 11:54 AM, Dirk Weissenborn <
dirk.weissenborn@googlemail.com> wrote:

> Hello,
>
> I wanted to ask whether there is already an online learning algorithm
> implementation for lda or not?
>
> http://www.cs.princeton.edu/~blei/papers/HoffmanBleiBach2010b.pdf
>
> cheers,
> Dirk
>



-- 

  -jake