Posted to dev@mahout.apache.org by 张佳宝 <zh...@gmail.com> on 2010/05/12 04:04:46 UTC

Named entity recognition through the Hadoop MapReduce framework

Hi,
I am working on named entity recognition with the Hadoop MapReduce framework,
using a large amount of website data. It is similar to the work in Mahout, so I
want to know whether anyone has already done this. If you are interested in it,
I can contribute it once I have completely finished it.

Re: Named entity recognition through the Hadoop MapReduce framework

Posted by Grant Ingersoll <gs...@apache.org>.
On May 13, 2010, at 2:38 AM, Jake Mannix wrote:

> It sounds like if someone is going to run CRF-NER on a massive
> data set (Wikipedia, the web, etc.), then parallelizing it on Hadoop
> makes total sense, yes.  In this case, the parallelization is of
> the "trivial" sort (null Reducer), most likely, but it's still a totally
> sensible thing to do.
> 
> If Mahout had an NLP subproject, this kind of thing would fit
> in there, and would certainly be a welcome contribution, but we
> don't yet.

I think we could still take the contribution, since it doesn't require a subproject/module to be set up just yet.  We have collocations now, so it is heading towards critical mass.


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/



Re: Named entity recognition through the Hadoop MapReduce framework

Posted by Jake Mannix <ja...@gmail.com>.
It sounds like if someone is going to run CRF-NER on a massive
data set (Wikipedia, the web, etc.), then parallelizing it on Hadoop
makes total sense, yes.  In this case, the parallelization is of
the "trivial" sort (null Reducer), most likely, but it's still a totally
sensible thing to do.

If Mahout had an NLP subproject, this kind of thing would fit
in there, and would certainly be a welcome contribution, but we
don't yet.

Maybe you should put it up on google-code or github, and if
you Apache-license it, it could be easily incorporated (here or
elsewhere) later.

  -jake
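
The "trivial" (null Reducer) parallelization described above could look
roughly like the following Hadoop Streaming mapper. This is only a sketch
under assumptions that are not in the thread: the function names
tag_entities and map_lines are hypothetical, and the capitalization rule is
a stand-in for a trained CRF decoder. Each input line is tagged on its own,
so nothing has to be aggregated afterwards.

```python
import sys


def tag_entities(line):
    """Hypothetical placeholder for a trained CRF decoder.

    Labels capitalized tokens as ENT and all other tokens as O, purely so
    that the mapper has something to emit. A real job would load a model
    once per task and decode each record with it.
    """
    tokens = line.split()
    return [(tok, "ENT" if tok[:1].isupper() else "O") for tok in tokens]


def map_lines(lines):
    """Map step: tag every non-empty record independently of the others."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank records
        yield " ".join(f"{tok}/{label}" for tok, label in tag_entities(line))


if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin, one per line,
    # and collects whatever the mapper prints on stdout.
    for tagged in map_lines(sys.stdin):
        print(tagged)
```

With Hadoop Streaming, a script like this would be passed as the mapper and
the number of reduce tasks set to zero, so the mapper output is written out
directly as the job output.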


Re: Named entity recognition through the Hadoop MapReduce framework

Posted by 张佳宝 <zh...@gmail.com>.
I am not hadooping the training. I am concerned with splitting the dataset
using Hadoop. Is this useful work?


Re: Named entity recognition through the Hadoop MapReduce framework

Posted by Benson Margulies <bi...@gmail.com>.
I assume that you are hadooping the training? The decoder is likely too
fast to bother with.



Re: Named entity recognition through the Hadoop MapReduce framework

Posted by 张佳宝 <zh...@gmail.com>.
CRF


Re: Named entity recognition through the Hadoop MapReduce framework

Posted by Benson Margulies <bi...@gmail.com>.
What sort of model are you using?
