Posted to mapreduce-user@hadoop.apache.org by Luiz Antonio Falaguasta Barbosa <la...@gmail.com> on 2012/01/25 19:20:33 UTC

Ivory (per term indexing) with Hadoop

Hi people,

Does anybody know where I could find an implementation of per-term
inverted indexing (Ivory), like the one shown in Figure 4 of the paper
http://www.dcs.gla.ac.uk/~richardm/papers/IPM_MapReduce.pdf ?

I would just like to take some source code like the example at
http://developer.yahoo.com/hadoop/tutorial/module4.html and change it
to use per-term indexing.

Does anybody have it?

Thanks in advance!

Regards,

Luiz

Re: Ivory (per term indexing) with Hadoop

Posted by Luiz Antonio Falaguasta Barbosa <la...@gmail.com>.
I don't know, Kumar. I have only just started working with Hadoop.



-- 
[]s,

Luiz

Re: Ivory (per term indexing) with Hadoop

Posted by Ashwanth Kumar <as...@googlemail.com>.
Don't we have some kind of Lucene plugin for Hadoop?

 - Ashwanth


Re: Ivory (per term indexing) with Hadoop

Posted by Luiz Antonio Falaguasta Barbosa <la...@gmail.com>.
Thanks Leonardo!

I'll try it.

Luiz


Re: Ivory (per term indexing) with Hadoop

Posted by Leonardo Gamas <le...@jusbrasil.com.br>.
Luiz,

You could create a class, say DocTF, that implements the Writable
interface and holds the Doc-ID and the TF. You then emit it from your
Mapper class with context.write(term, docTF).
Another option is to use a generic Pair<A,B> that implements Writable to
hold your data.

P.S.: It's not mandatory to implement Writable. You could use another
serialization framework, but Writable will work without any additional
configuration.
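
The DocTF idea above can be sketched roughly as follows. This is only a
minimal illustration, not code from Ivory or Cloud9: the names
PerTermIndexSketch, termFrequencies and roundTrip are made up for this
example, and the tokenization (lowercase, split on non-word characters)
is an assumption. So that the sketch runs without Hadoop on the
classpath, DocTF does not actually declare "implements Writable"; its
write() and readFields() methods use the signatures that
org.apache.hadoop.io.Writable requires, so adding the interface in a
real job should be a one-line change.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical wrapper class, used here only to keep the sketch in one file.
class PerTermIndexSketch {

    // One posting: a document ID plus the term's frequency in that document.
    // In a Hadoop job this would be declared as
    // "implements org.apache.hadoop.io.Writable" (assumption: Hadoop on the classpath).
    static class DocTF {
        private String docId = "";
        private int tf;

        // Hadoop creates Writable instances reflectively, so keep a no-arg constructor.
        DocTF() {}

        DocTF(String docId, int tf) {
            this.docId = docId;
            this.tf = tf;
        }

        String getDocId() { return docId; }
        int getTf() { return tf; }

        // Serialize the fields in a fixed order.
        public void write(DataOutput out) throws IOException {
            out.writeUTF(docId);
            out.writeInt(tf);
        }

        // Read the fields back in the same order they were written.
        public void readFields(DataInput in) throws IOException {
            docId = in.readUTF();
            tf = in.readInt();
        }

        @Override
        public String toString() { return "<" + docId + "," + tf + ">"; }
    }

    // Map-side work for one document: count each term, then build one DocTF per term.
    // In a Mapper, each entry would be emitted with context.write(term, docTF).
    static Map<String, DocTF> termFrequencies(String docId, String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Map<String, DocTF> postings = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            postings.put(e.getKey(), new DocTF(docId, e.getValue()));
        }
        return postings;
    }

    // Serialize and deserialize a DocTF in memory, as the framework would between map and reduce.
    static DocTF roundTrip(DocTF original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            DocTF copy = new DocTF();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, DocTF> postings = termFrequencies("doc1", "Hadoop indexing with Hadoop");
        for (Map.Entry<String, DocTF> e : postings.entrySet()) {
            System.out.println(e.getKey() + "\t" + roundTrip(e.getValue()));
        }
    }
}
```

In the reducer, all DocTF values that arrive for the same term key could
then be collected into that term's posting list.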




-- 

*Leonardo Gamas*
Software Engineer
T +55 (71) 3494-3514
C +55 (75) 8134-7440
leogamas@jusbrasil.com.br
www.jusbrasil.com.br

Re: Ivory (per term indexing) with Hadoop

Posted by Luiz Antonio Falaguasta Barbosa <la...@gmail.com>.
People,

Just to explain it better, this is Figure 4:

[image: image.png]

Lines 9 to 11 of the map method seem difficult to implement.

Does anybody know how to do this? I tried to find it in Ivory
(http://lintool.github.com/Ivory/) and Cloud9
(https://github.com/lintool/Cloud9), but I couldn't find it.

Regards,

Luiz
