Posted to dev@flink.apache.org by Mustafa Elbehery <el...@gmail.com> on 2015/02/27 10:37:43 UTC

Re: Tweets Custom Input Format

Hi,

I am really sorry for the late reply; it was a whole month of projects and
exams, and I was really busy.

@Robert, it is an IF for reading tweets into POJOs. I use an event-driven
parser and read most of the tweet into Java POJOs. It was tested on a 1 TB
dataset for a Flink ETL job, and the performance was pretty good.



On Sun, Jan 25, 2015 at 7:38 PM, Robert Metzger <rm...@apache.org> wrote:

> Hey,
>
> Is it an input format for reading JSON data, or an IF for reading tweets in
> some format into a POJO?
>
> I think a JSON Input Format would be something very useful for our users.
> Maybe you can add that and use the Tweet IF as a concrete example for that?
> Do you have a preview of the code somewhere?
>
> Best,
> Robert
>
> On Sat, Jan 24, 2015 at 11:06 AM, Fabian Hueske <fh...@gmail.com> wrote:
>
> > Hi Mustafa,
> >
> > that would be a nice contribution!
> >
> > We are currently discussing how to add "non-core" API features into Flink
> > [1].
> > I will move this discussion onto the mailing list to decide where to add
> > cool add-ons like yours.
> >
> > Cheers, Fabian
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-1398
> >
> > 2015-01-23 20:42 GMT+01:00 Henry Saputra <he...@gmail.com>:
> >
> > > Contributions are welcomed!
> > >
> > > Here is the link on how to contribute to Apache Flink:
> > > http://flink.apache.org/how-to-contribute.html
> > >
> > > You can start by creating a JIRA ticket [1] to describe what you
> > > want to do and to get feedback from the community.
> > >
> > >
> > > - Henry
> > >
> > > [1] https://issues.apache.org/jira/secure/Dashboard.jspa
> > >
> > > On Fri, Jan 23, 2015 at 10:54 AM, Mustafa Elbehery
> > > <el...@gmail.com> wrote:
> > > > Hi,
> > > >
> > > > > I have created a custom InputFormat for tweets on Flink, based on the
> > > > > JSON-Simple event-driven parser. I would like to contribute my work
> > > > > to Flink.
> > > >
> > > > Regards.
> > > >
> > > > --
> > > > Mustafa Elbehery
> > > > > EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
> > > > +49(0)15218676094
> > > > skype: mustafaelbehery87
> > >
> >
>



-- 
Mustafa Elbehery
EIT ICT Labs Master School <http://www.masterschool.eitictlabs.eu/home/>
+49(0)15218676094
skype: mustafaelbehery87

Re: Tweets Custom Input Format

Posted by Robert Metzger <rm...@apache.org>.
Great. Thank you.

I gave some feedback in the pull request and asked some questions there.

On Fri, Feb 27, 2015 at 5:43 PM, Mustafa Elbehery <elbeherymustafa@gmail.com> wrote:

> @Robert,
>
> I have created the PR: https://github.com/apache/flink/pull/442

Re: Tweets Custom Input Format

Posted by Mustafa Elbehery <el...@gmail.com>.
@Robert,

I have created the PR: https://github.com/apache/flink/pull/442



On Fri, Feb 27, 2015 at 11:58 AM, Mustafa Elbehery <
elbeherymustafa@gmail.com> wrote:

> @Robert,
>
> Thanks, I was asking about the procedure. I have opened a Jira ticket for
> flink-contrib, and I will create a PR following the naming convention on the
> wiki:
>
> https://issues.apache.org/jira/browse/FLINK-1615

Re: Tweets Custom Input Format

Posted by Mustafa Elbehery <el...@gmail.com>.
@Robert,

Thanks, I was asking about the procedure. I have opened a Jira ticket for
flink-contrib, and I will create a PR following the naming convention on the
wiki:

https://issues.apache.org/jira/browse/FLINK-1615



On Fri, Feb 27, 2015 at 11:55 AM, Robert Metzger <rm...@apache.org>
wrote:

> I'm glad you've found the "How to contribute" guide.
>
> I cannot describe the process of opening a pull request any better than the
> guide already does.
> Maybe this link is also helpful for you:
> https://help.github.com/articles/creating-a-pull-request/
>
> Are you facing a particular error message? Knowing it would help me to help
> you better.

Re: Tweets Custom Input Format

Posted by Robert Metzger <rm...@apache.org>.
I'm glad you've found the "How to contribute" guide.

I cannot describe the process of opening a pull request any better than the
guide already does.
Maybe this link is also helpful for you:
https://help.github.com/articles/creating-a-pull-request/

Are you facing a particular error message? Knowing it would help me to help
you better.


On Fri, Feb 27, 2015 at 10:46 AM, Mustafa Elbehery <
elbeherymustafa@gmail.com> wrote:

> Actually, I am reading "How to contribute" now in order to push the code. It
> is working and tested locally and on the cluster, and I have used it for an
> ETL job.
>
> The structure is as follows:
>
> - Java POJOs for the tweet object and its nested objects.
> - A parser class using an event-driven approach.
> - The SimpleTweetInputFormat itself.
>
> Could you guide me on how to push the code, just to save some time? :)

Re: Tweets Custom Input Format

Posted by Mustafa Elbehery <el...@gmail.com>.
Actually, I am reading "How to contribute" now in order to push the code. It
is working and tested locally and on the cluster, and I have used it for an
ETL job.

The structure is as follows:

- Java POJOs for the tweet object and its nested objects.
- A parser class using an event-driven approach.
- The SimpleTweetInputFormat itself.
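
To make that structure a bit more concrete, here is a minimal sketch of the
event-driven idea. It is not the contributed code: the class TweetEventSketch,
the two-field Tweet POJO, and the scan helper are hypothetical names chosen for
illustration. The toy scanner only stands in for a real event-driven parser
(JSON-Simple offers a ContentHandler callback interface for this purpose) and
assumes a flat JSON object whose values are unescaped strings or integers.

```java
import java.util.function.BiConsumer;

class TweetEventSketch {

    /** Hypothetical POJO with two illustrative fields; a real tweet has many more. */
    public static class Tweet {
        public long id;
        public String text;
    }

    /**
     * Toy event source (an assumption, not JSON-Simple): scans a flat JSON
     * object and fires one (key, rawValue) event per entry. It does not
     * handle escapes, nesting, arrays, or malformed input.
     */
    public static void scan(String json, BiConsumer<String, String> onEntry) {
        int i = json.indexOf('{') + 1;
        while (true) {
            int keyStart = json.indexOf('"', i);
            if (keyStart < 0) {
                return; // no further entries
            }
            int keyEnd = json.indexOf('"', keyStart + 1);
            String key = json.substring(keyStart + 1, keyEnd);

            int v = json.indexOf(':', keyEnd) + 1;
            while (Character.isWhitespace(json.charAt(v))) {
                v++;
            }

            String value;
            if (json.charAt(v) == '"') {
                // String value: runs up to the next quote.
                int end = json.indexOf('"', v + 1);
                value = json.substring(v + 1, end);
                i = end + 1;
            } else {
                // Numeric value: runs up to the next ',' or '}'.
                int end = v;
                while (end < json.length()
                        && json.charAt(end) != ',' && json.charAt(end) != '}') {
                    end++;
                }
                value = json.substring(v, end).trim();
                i = end;
            }
            onEntry.accept(key, value);
        }
    }

    /** The handler fills the POJO as events arrive; no intermediate tree is built. */
    public static Tweet parse(String json) {
        Tweet tweet = new Tweet();
        scan(json, (key, value) -> {
            switch (key) {
                case "id":
                    tweet.id = Long.parseLong(value);
                    break;
                case "text":
                    tweet.text = value;
                    break;
                default:
                    break; // ignore fields the POJO does not model
            }
        });
        return tweet;
    }
}
```

Calling TweetEventSketch.parse on one JSON line yields a populated Tweet; an
input format would presumably do something similar for each record it reads.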

Could you guide me on how to push the code, just to save some time? :)


On Fri, Feb 27, 2015 at 10:42 AM, Robert Metzger <rm...@apache.org>
wrote:

> Hi,
>
> Cool! Can you generalize the input format to read JSON into an arbitrary
> POJO?
>
> It would be great if you could contribute the InputFormat to the
> "flink-contrib" module. I've seen many users reading JSON data with Flink,
> so it's good to have a standard solution for that.
> If you want, you can add the "Tweet into POJO" format as an example in
> flink-contrib.

Re: Tweets Custom Input Format

Posted by Robert Metzger <rm...@apache.org>.
Hi,

Cool! Can you generalize the input format to read JSON into an arbitrary
POJO?

It would be great if you could contribute the InputFormat to the
"flink-contrib" module. I've seen many users reading JSON data with Flink,
so it's good to have a standard solution for that.
If you want, you can add the "Tweet into POJO" format as an example in
flink-contrib.
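
One possible way to approach such a generalization is reflection. The sketch
below is only an illustration under stated assumptions, not a proposed design:
GenericPojoSketch, fill, and the User type are hypothetical names, the JSON is
assumed to be parsed already into string key/value pairs, and the target class
is assumed to have a public no-argument constructor and public fields of type
String, long, or int.

```java
import java.lang.reflect.Field;
import java.util.Map;

class GenericPojoSketch {

    /** Hypothetical target type used only to demonstrate the idea. */
    public static class User {
        public String name;
        public long followers;
    }

    /**
     * Creates an instance of the given type and copies each entry into the
     * public field with the same name. Keys without a matching field are
     * skipped; unsupported field types keep their default value.
     */
    public static <T> T fill(Class<T> type, Map<String, String> entries) {
        try {
            T pojo = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, String> entry : entries.entrySet()) {
                Field field;
                try {
                    field = type.getField(entry.getKey());
                } catch (NoSuchFieldException missing) {
                    continue; // JSON key has no counterpart in the POJO
                }
                Class<?> fieldType = field.getType();
                if (fieldType == String.class) {
                    field.set(pojo, entry.getValue());
                } else if (fieldType == long.class) {
                    field.setLong(pojo, Long.parseLong(entry.getValue()));
                } else if (fieldType == int.class) {
                    field.setInt(pojo, Integer.parseInt(entry.getValue()));
                }
            }
            return pojo;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot populate " + type.getName(), e);
        }
    }
}
```

With this shape, the tweet case becomes one concrete target class passed to
fill, which matches the idea of shipping "Tweet into POJO" as an example.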

On Fri, Feb 27, 2015 at 10:37 AM, Mustafa Elbehery <
elbeherymustafa@gmail.com> wrote:

> Hi,
>
> I am really sorry for the late reply; it was a whole month of projects and
> exams, and I was really busy.
>
> @Robert, it is an IF for reading tweets into POJOs. I use an event-driven
> parser and read most of the tweet into Java POJOs. It was tested on a 1 TB
> dataset for a Flink ETL job, and the performance was pretty good.