Posted to dev@flink.apache.org by Gábor Gévay <gg...@gmail.com> on 2015/03/26 19:50:48 UTC

GSoC proposal

Hello,

I will be applying to the Google Summer of Code, and I wrote most of
the proposal:
http://compalg.inf.elte.hu/~ggevay/Proposal.pdf
I would appreciate it if you could comment on it.

Gyula Fora, git blame is telling me that you wrote most of the
relevant parts of the windowing code, so I would be especially
interested in what you think of my improvement ideas.

Best regards,
Gabor

Re: GSoC proposal

Posted by Gábor Gévay <gg...@gmail.com>.
Hello,

Thank you very much for your comments! I will remove the part about
the windowing optimizations (though that was my favourite part :) )
and think about what other statistics could be added. Thank you also
for the link to the collection of relevant algorithms; they are
very interesting!

Best regards,
Gabor



2015-03-26 17:35 GMT-05:00 Paris Carbone <pa...@kth.se>:
> Hi Gabor,
>
> Approximate statistics is a really good topic; I think there is a lot to do if you focus there. One idea would be to include some of your contributions in the incremental machine learning library that will be available by June. From there you will also be able to use sampling and stream mining primitives, among others, out of the box. Regarding window optimisations, as Gyula said, there is not much to do simply because we are already working heavily on it. Good luck and thanks for the proposal!
>
> Paris
>
>> On 26 Mar 2015, at 22:59, Gyula Fóra <gy...@gmail.com> wrote:
>>
>> Hey Gabor,
>>
>> Thank you for the proposal. It has many interesting ideas and good
>> potential.
>>
>> My comments:
>>
>> We already have a large amount of ongoing work on the windowing
>> optimizations, covering your suggestions in section 1. It would be better
>> to drop that part from the project, because that's very heavily on the
>> research side and, as I said, we are working on this at SICS.
>>
>> I like the list that you made for section 2, and this should be the main
>> emphasis of the project. It would indeed be very nice to have a wide range
>> of statistics that we can compute (or approximate, though this should be
>> optional) on streams and windows (maybe we should also add some practical
>> stuff like top-k, distinct, etc.).
>>
>> Here is a list of interesting papers that seem to be related to this
>> project:
>>
>> https://gist.github.com/debasishg/8172796
>>
>> Cheers,
>> Gyula

Re: GSoC proposal

Posted by Paris Carbone <pa...@kth.se>.
Hi Gabor,

Approximate statistics is a really good topic; I think there is a lot to do if you focus there. One idea would be to include some of your contributions in the incremental machine learning library that will be available by June. From there you will also be able to use sampling and stream mining primitives, among others, out of the box. Regarding window optimisations, as Gyula said, there is not much to do simply because we are already working heavily on it. Good luck and thanks for the proposal!

Paris


Re: GSoC proposal

Posted by Gyula Fóra <gy...@gmail.com>.
Hey Gabor,

Thank you for the proposal. It has many interesting ideas and good
potential.

My comments:

We already have a large amount of ongoing work on the windowing
optimizations, covering your suggestions in section 1. It would be better
to drop that part from the project, because that's very heavily on the
research side and, as I said, we are working on this at SICS.

I like the list that you made for section 2, and this should be the main
emphasis of the project. It would indeed be very nice to have a wide range
of statistics that we can compute (or approximate, though this should be
optional) on streams and windows (maybe we should also add some practical
stuff like top-k, distinct, etc.).
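
To make the top-k idea a bit more concrete, here is a minimal sketch of one
well-known approximate approach, the Misra-Gries summary, written in plain
Python. It is only an illustration and not code from Flink or from the
proposal; the function name misra_gries and the sample data are made up for
this example. It keeps at most k - 1 counters, so memory stays bounded no
matter how long the stream is, and the counts it reports can underestimate
the true counts by at most n/k for a stream of n items.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Approximate frequent items using at most k - 1 counters.

    Hypothetical helper for illustration only. Any item that occurs more
    than n/k times in a stream of n items is guaranteed to be in the
    result; reported counts never exceed the true counts.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            # Already tracked: just count it.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the
            # ones that reach zero. The new item itself is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    events = ["a", "b", "a", "c", "a", "b", "a", "d", "a"]
    print(misra_gries(events, k=3))
```

An exact top-k would need one counter per distinct element; the sketch above
trades that exactness for a fixed memory bound, which is the kind of
"approximate, but optional" choice mentioned above.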

Here is a list of interesting papers that seem to be related to this
project:

https://gist.github.com/debasishg/8172796

Cheers,
Gyula
