
Posted to dev@opennlp.apache.org by "Manoj B. Narayanan" <ma...@gmail.com> on 2018/01/18 13:23:15 UTC

Model file

Hi all,

Just curious to know what the content of the *.bin* file is. How are the
probabilities of the features calculated and how are they used for
prediction?

I believe it will make my understanding better. Kindly guide me.

Thanks,
Manoj.

Re: Model file

Posted by "Manoj B. Narayanan" <ma...@gmail.com>.

Thanks a lot Dan.

On Thu, Jan 18, 2018 at 7:54 PM, Dan Russ <da...@gmail.com> wrote:

> Have you looked at either the GISBinaryModelWriter or Reader?  It’s fairly
> simple, something like
>
>
> For models trained with the GISTrainer…
>
> GIS
> # of outcomes
> <list of outcomes>
> # of predictors
> Predictor-1 # of outcomes for this predictor;
> outcome-1-id,weight;outcome-2,weight;…outcome-n,weight;
> Predictor-2 # of outcomes for this predictor;
> outcome-1-id,weight;outcome-2,weight;…outcome-n,weight;
> …
> Predictor-z # of outcomes for this predictor;
> outcome-1-id,weight;outcome-2,weight;…outcome-n,weight;
>
> Models trained with other trainers are similar, but with slight
> variations.  I think the QNTrainer starts with “QN” instead of GIS and the
> predictors/outcomes are reversed.
>
> I’m doing this from memory, so it may be slightly different.  But this
> logic AND the source code should get you started.
>
> If you are looking at BaseModels, e.g. POSModel, SentenceDetectorModel,
> the format is a little more complicated and you will need to look at the
> code.  These models have more than just a maxent model, but also associated
> code to make the results what you expect.
>
> Hope it helps.
> Dan
>


-- 
Regards,
Manoj.

Re: Model file

Posted by Dan Russ <da...@gmail.com>.

Have you looked at either the GISBinaryModelWriter or Reader?  It’s fairly simple, something like 

For models trained with the GISTrainer…

GIS
# of outcomes
<list of outcomes> 
# of predictors
Predictor-1 # of outcomes for this predictor; outcome-1-id,weight;outcome-2,weight;…outcome-n,weight;
Predictor-2 # of outcomes for this predictor; outcome-1-id,weight;outcome-2,weight;…outcome-n,weight;
…
Predictor-z # of outcomes for this predictor; outcome-1-id,weight;outcome-2,weight;…outcome-n,weight;

Models trained with other trainers are similar, but with slight variations.  I think the QNTrainer starts with “QN” instead of GIS and the predictors/outcomes are reversed.

I’m doing this from memory, so it may be slightly different.  But this logic AND the source code should get you started.
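
To make the "how are they used for prediction" part concrete, here is a minimal sketch of how per-predictor weights like the ones in the layout above could be turned into outcome probabilities. It is the generic maxent (softmax) form, not OpenNLP's actual evaluation code, which may apply extra prior or correction terms. The class name MaxentSketch, the PredictorWeights holder, and the toy predictors and weights are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MaxentSketch {

    /** Hypothetical holder: the outcome ids a predictor has weights for, and one weight per id. */
    static final class PredictorWeights {
        final int[] outcomeIds;
        final double[] weights;

        PredictorWeights(int[] outcomeIds, double[] weights) {
            this.outcomeIds = outcomeIds;
            this.weights = weights;
        }
    }

    /**
     * Sums the weights of every active predictor per outcome, then normalizes
     * with a softmax so the scores become probabilities that sum to 1.
     */
    static double[] predict(Map<String, PredictorWeights> model, int numOutcomes, List<String> context) {
        double[] scores = new double[numOutcomes];
        for (String predictor : context) {
            PredictorWeights pw = model.get(predictor);
            if (pw == null) {
                continue; // a predictor not in the model contributes nothing
            }
            for (int i = 0; i < pw.outcomeIds.length; i++) {
                scores[pw.outcomeIds[i]] += pw.weights[i];
            }
        }

        // Subtract the maximum score before exponentiating for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] probs = new double[numOutcomes];
        for (int o = 0; o < numOutcomes; o++) {
            probs[o] = Math.exp(scores[o] - max);
            sum += probs[o];
        }
        for (int o = 0; o < numOutcomes; o++) {
            probs[o] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        String[] outcomes = {"NOUN", "VERB"}; // toy outcome list
        Map<String, PredictorWeights> model = new HashMap<>();
        // Toy weights, made up for this example.
        model.put("suffix=ing", new PredictorWeights(new int[] {0, 1}, new double[] {-0.5, 1.5}));
        model.put("prev=the", new PredictorWeights(new int[] {0}, new double[] {1.0}));

        double[] probs = predict(model, outcomes.length, List.of("suffix=ing", "prev=the", "unseen"));
        for (int o = 0; o < outcomes.length; o++) {
            System.out.printf("%s %.3f%n", outcomes[o], probs[o]);
        }
    }
}
```

Each predictor only stores weights for the outcomes it was seen with, which matches the "# of outcomes for this predictor" entry in the layout; outcomes without a weight for a given predictor simply get no contribution from it.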

If you are looking at BaseModels, e.g. POSModel, SentenceDetectorModel, the format is a little more complicated and you will need to look at the code.  These models have more than just a maxent model, but also associated code to make the results what you expect.

Hope it helps.
Dan
