Posted to common-user@hadoop.apache.org by George Kousiouris <gk...@mail.ntua.gr> on 2012/09/05 18:11:28 UTC

access patterns investigation to dynamically toggle the replication factor in a hadoop cluster

Hi all,

As part of the research for an ongoing project, we are interested in
investigating whether data access patterns on a Hadoop cluster can be
predicted. The purpose is to study file access patterns as a time
series, so that data can be manipulated proactively. For example, this
may involve increasing or decreasing the replication factor in an
Apache Hadoop cluster (and accordingly in HDFS) to deal with a
predicted upcoming increase or decrease in data accesses.
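
To make the idea concrete, here is a rough sketch of the loop we have in
mind. It is only an illustration under stated assumptions: the moving-average
forecast, the reads_per_replica threshold, the min/max bounds, and the example
path are all hypothetical placeholders, not results. The only real Hadoop piece
is the "hadoop fs -setrep" shell command, which the sketch builds but does not
execute.

```python
import math
from typing import List, Sequence


def forecast_next(accesses: Sequence[int], window: int = 3) -> float:
    """Naive forecast: mean of the last `window` observations (assumed model)."""
    if not accesses:
        return 0.0
    recent = accesses[-window:]
    return sum(recent) / len(recent)


def target_replication(predicted: float,
                       reads_per_replica: float = 100.0,
                       min_rep: int = 3,
                       max_rep: int = 10) -> int:
    """Map a predicted access count to a replication factor.

    reads_per_replica, min_rep and max_rep are hypothetical tuning knobs.
    """
    needed = math.ceil(predicted / reads_per_replica)
    return max(min_rep, min(max_rep, needed))


def setrep_command(path: str, replication: int) -> List[str]:
    """Build (but do not run) the shell command that changes replication."""
    return ["hadoop", "fs", "-setrep", str(replication), path]


# Hypothetical hourly read counts for one file.
hourly_reads = [120, 180, 450, 700, 950]
predicted = forecast_next(hourly_reads)
rep = target_replication(predicted)
print(setrep_command("/data/logs/part-00000", rep))
```

In a real study the forecast function would be replaced by whatever time
series model turns out to fit the observed access traces.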

So we would like your advice on some issues:
1) Is this the correct mailing list? :)
2) Would a changed replication factor translate into better performance
of an MR job (either from your own experience or from a report/paper
etc. that has studied this)?
3) Do you find this interesting in general and something we should pursue?
4) Are you aware of any related work on the topic that we could use as a
starting point?

Thanks for your help,
George


Re: access patterns investigation to dynamically toggle the replication factor in a hadoop cluster

Posted by George Kousiouris <gk...@mail.ntua.gr>.
Sorry, something I forgot: are you aware of any available datasets that
could help us in that direction?



-- 

---------------------------

George Kousiouris, PhD
Electrical and Computer Engineer
Division of Communications,
Electronics and Information Engineering
School of Electrical and Computer Engineering
Tel: +30 210 772 2546
Mobile: +30 6939354121
Fax: +30 210 772 2569
Email: gkousiou@mail.ntua.gr
Site: http://users.ntua.gr/gkousiou/

National Technical University of Athens
9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece


Re: access patterns investigation to dynamically toggle the replication factor in a hadoop cluster

Posted by Bruce Durling <bl...@otfrom.com>.
I find this interesting. If this isn't the place to pursue it then I'd
be interested in subscribing to that mailing list. :-D

cheers,
Bruce




-- 
@otfrom | CTO & co-founder @MastodonC | mastodonc.com
