Posted to dev@gora.apache.org by Lahiru Jayasekera <ml...@gmail.com> on 2019/04/06 19:53:28 UTC

POC for hazelcast jet integration

Hi all,
I have come up with a PoC for Hazelcast Jet execution engine support in
Gora.
https://github.com/LahiruJayasekara/gora/blob/poc-hazelcast-jet/gora-tutorial/src/main/java/org/apache/gora/tutorial/log/HazelcastJetPOC.java

Here I have written a custom source for Jet. It reads the AccessLog table
created by the LogManager example and feeds the PageView objects to Jet.
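
A minimal sketch of the general shape of such a batch source, assuming the
records are exposed as a plain iterator and handed downstream in small
chunks. It uses only the JDK; it is not the Jet API and not the code from the
linked PoC. The names ChunkedSource and fillBuffer are hypothetical, and the
page-view strings are made-up stand-ins for PageView objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batch source: wraps an iterator of records (a stand-in for
// the result of a datastore query) and emits them in bounded chunks.
final class ChunkedSource<T> {
    private final Iterator<T> records;
    private final int chunkSize;

    ChunkedSource(Iterator<T> records, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.records = records;
        this.chunkSize = chunkSize;
    }

    // Pushes up to chunkSize records into the buffer.
    // Returns true once the underlying iterator is exhausted.
    boolean fillBuffer(Consumer<? super T> buffer) {
        for (int i = 0; i < chunkSize && records.hasNext(); i++) {
            buffer.accept(records.next());
        }
        return !records.hasNext();
    }

    public static void main(String[] args) {
        // Made-up stand-ins for the PageView objects read from the AccessLog table.
        List<String> pageViews = Arrays.asList("/index.html", "/about.html", "/index.html");
        ChunkedSource<String> source = new ChunkedSource<>(pageViews.iterator(), 2);

        List<String> received = new ArrayList<>();
        boolean done = false;
        while (!done) {
            // The execution engine would call this repeatedly until the source reports completion.
            done = source.fillBuffer(received::add);
        }
        System.out.println(received);
    }
}
```

Emitting in bounded chunks keeps each call short, so the caller decides how
often to come back for more data.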

I have also looked into using the HDFS connectors of Jet, but I ran into a
problem with conflicting Hadoop versions. The InputFormat class used in Gora
is from the package 'org.apache.hadoop.mapreduce', but Jet uses
'org.apache.hadoop.mapred.InputFormat'.
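
When chasing this kind of conflict, it can help to check which of the two
InputFormat types is visible on the classpath and which jar each one comes
from. The following is a small JDK-only sketch under that assumption; the
class name HadoopApiProbe and its locate method are hypothetical helpers, not
part of Gora, Jet, or Hadoop.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports where a class would be loaded from.
final class HadoopApiProbe {

    // Returns the code source location of the named class, or a marker string
    // if the class is missing or its origin is not exposed.
    static String locate(String className) {
        try {
            // 'false' means: locate the class without running its static initializers.
            Class<?> type = Class.forName(className, false, HadoopApiProbe.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.hadoop.mapred.InputFormat",    // type named for Jet in this thread
            "org.apache.hadoop.mapreduce.InputFormat"  // type named for Gora in this thread
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run with the project's full runtime classpath; if both lines print a jar
location, both APIs are available and the mismatch is in how they are used
rather than in what is shipped.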

Anyway, the proposed PoC works as expected. Please add your suggestions here.

Thanks and regards

-- 
Lahiru Jayasekara
Batch 15
Faculty of Information Technology
University of Moratuwa
0716492170

Re: POC for hazelcast jet integration

Posted by Lahiru Jayasekera <ml...@gmail.com>.
Sure, will do that.

Thanks

On Mon, May 27, 2019 at 2:41 PM Kevin Ratnasekera <dj...@gmail.com>
wrote:

> Hi Lahiru,
>
> As per the offline chat we had, please start a new mail thread with the
> design details of the approach you have taken (writing custom source and
> sink connectors [1]) to implement Jet execution engine support. That way
> the community can comment on your project work.
>
> [1] https://jet.hazelcast.org/connectors/batch-connectors/
>
> Regards
> Kevin

Re: POC for hazelcast jet integration

Posted by Kevin Ratnasekera <dj...@gmail.com>.
Hi Lahiru,

As per the offline chat we had, please start a new mail thread with the
design details of the approach you have taken (writing custom source and
sink connectors [1]) to implement Jet execution engine support. That way
the community can comment on your project work.

[1] https://jet.hazelcast.org/connectors/batch-connectors/

Regards
Kevin

On Sun, May 19, 2019 at 1:15 PM Lahiru Jayasekera <ml...@gmail.com>
wrote:

> Hi all,
> This is the HDFS connector for Hazelcast Jet [1].
>
> The problem is that Hazelcast Jet uses the `org.apache.hadoop.mapred`
> package, while Gora uses the `org.apache.hadoop.mapreduce` package.
>
> These are two APIs that Hadoop exposes for creating Hadoop jobs and so on.
> Even though the two are similar in functionality, both packages are
> shipped with Hadoop. See this answer [2].
>
> So I thought of writing an InputFormat for Gora that supports the older
> package (`org.apache.hadoop.mapred`).
>
> Please correct me if I'm going in the wrong direction. Your feedback is
> appreciated.
>
> [1] https://docs.hazelcast.org/docs/jet/0.7/manual/#hdfs
> [2] https://stackoverflow.com/a/7600339
>
> Thanks and regards
> Lahiru

Re: POC for hazelcast jet integration

Posted by Lahiru Jayasekera <ml...@gmail.com>.
Hi all,
This is the HDFS connector for Hazelcast Jet [1].

The problem is that Hazelcast Jet uses the `org.apache.hadoop.mapred`
package, while Gora uses the `org.apache.hadoop.mapreduce` package.

These are two APIs that Hadoop exposes for creating Hadoop jobs and so on.
Even though the two are similar in functionality, both packages are
shipped with Hadoop. See this answer [2].

So I thought of writing an InputFormat for Gora that supports the older
package (`org.apache.hadoop.mapred`).
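
One way to picture this is as an adapter between the two reader styles. In
the older API a record reader fills caller-supplied key and value objects
through next(key, value), while in the newer API the caller advances with
nextKeyValue() and then asks for the current key and value. The sketch below
mimics only that difference with JDK-only stand-in interfaces; every type in
it (NewStyleReader, OldStyleReader, Holder, OldStyleAdapter, ListReader) is
hypothetical, and a real implementation would also have to handle splits, job
configuration, and progress reporting.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Stand-in for the newer reader style: advance, then read the current pair.
interface NewStyleReader<K, V> {
    boolean nextKeyValue();
    K getCurrentKey();
    V getCurrentValue();
}

// Mutable holder, because the older style fills objects supplied by the caller.
final class Holder<T> {
    T value;
}

// Stand-in for the older reader style: fill the holders, return false at the end.
interface OldStyleReader<K, V> {
    boolean next(Holder<K> key, Holder<V> value);
}

// Adapter: exposes a new-style reader through the old-style contract.
final class OldStyleAdapter<K, V> implements OldStyleReader<K, V> {
    private final NewStyleReader<K, V> delegate;

    OldStyleAdapter(NewStyleReader<K, V> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean next(Holder<K> key, Holder<V> value) {
        if (!delegate.nextKeyValue()) {
            return false; // end of input; holders keep their previous contents
        }
        key.value = delegate.getCurrentKey();
        value.value = delegate.getCurrentValue();
        return true;
    }
}

// Toy new-style reader over a list, keyed by position.
final class ListReader implements NewStyleReader<Long, String> {
    private final Iterator<String> items;
    private long index = -1;
    private String current;

    ListReader(List<String> items) {
        this.items = items.iterator();
    }

    @Override
    public boolean nextKeyValue() {
        if (!items.hasNext()) {
            return false;
        }
        current = items.next();
        index++;
        return true;
    }

    @Override
    public Long getCurrentKey() {
        return index;
    }

    @Override
    public String getCurrentValue() {
        return current;
    }
}

final class AdapterDemo {
    public static void main(String[] args) {
        OldStyleReader<Long, String> reader =
                new OldStyleAdapter<>(new ListReader(Arrays.asList("first", "second")));
        Holder<Long> key = new Holder<>();
        Holder<String> value = new Holder<>();
        while (reader.next(key, value)) {
            System.out.println(key.value + " -> " + value.value);
        }
    }
}
```

Wrapping the existing reader keeps the current `org.apache.hadoop.mapreduce`
code path untouched and adds the older contract alongside it.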

Please correct me if I'm going in the wrong direction. Your feedback is
appreciated.

[1] https://docs.hazelcast.org/docs/jet/0.7/manual/#hdfs
[2] https://stackoverflow.com/a/7600339

Thanks and regards
Lahiru

On Fri, Apr 26, 2019 at 6:32 PM Lahiru Jayasekera <ml...@gmail.com>
wrote:

> Hi Madhawa,
> Sorry for the late reply. Sure I'll try that and let you know.
>
> Thanks


Re: POC for hazelcast jet integration

Posted by Lahiru Jayasekera <ml...@gmail.com>.
Hi Madhawa,
Sorry for the late reply. Sure I'll try that and let you know.

Thanks

On Tue, Apr 23, 2019 at 2:24 PM Madhawa Kasun Gunasekara <
madhawa30@gmail.com> wrote:

> Hi Lahiru,
>
> Good initiative.
> It seems we need to use hadoop-hdfs and hadoop-common at version 2.8.3.
> Try adding these dependencies and excluding the old dependency from Gora;
> otherwise, we can upgrade the Hadoop version in Gora. At the moment we use
> Hadoop version 2.5.2 in Gora, but I would prefer to upgrade it to 2.8.3.
>
> The Hazelcast developers recently released version 3.0; we can try that
> as well.
>
> Thanks,
> Madhawa


Re: POC for hazelcast jet integration

Posted by Madhawa Kasun Gunasekara <ma...@gmail.com>.
Hi Lahiru,

Good initiative.
It seems we need to use hadoop-hdfs and hadoop-common at version 2.8.3.
Try adding these dependencies and excluding the old dependency from Gora;
otherwise, we can upgrade the Hadoop version in Gora. At the moment we use
Hadoop version 2.5.2 in Gora, but I would prefer to upgrade it to 2.8.3.

The Hazelcast developers recently released version 3.0; we can try that
as well.

Thanks,
Madhawa


On Sat, Apr 6, 2019 at 9:53 PM Lahiru Jayasekera <ml...@gmail.com>
wrote:

> Hi all,
> I have come up with a PoC for Hazelcast Jet execution engine support in
> Gora.
>
> https://github.com/LahiruJayasekara/gora/blob/poc-hazelcast-jet/gora-tutorial/src/main/java/org/apache/gora/tutorial/log/HazelcastJetPOC.java
>
> Here I have written a custom source for Jet. It reads the AccessLog table
> created by the LogManager example and feeds the PageView objects to Jet.
>
> I have also looked into using the HDFS connectors of Jet, but I ran into a
> problem with conflicting Hadoop versions. The InputFormat class used in Gora
> is from the package 'org.apache.hadoop.mapreduce', but Jet uses
> 'org.apache.hadoop.mapred.InputFormat'.
>
> Anyway, the proposed PoC works as expected. Please add your suggestions
> here.
>
> Thanks and regards