Posted to hdfs-user@hadoop.apache.org by Julian Bui <ju...@gmail.com> on 2013/05/31 18:57:31 UTC

possible to change replication factor at file creation time (with copyFromLocal)?

Hi hadoop users,

I am aware that you can set the replication factor of a file after it's
been created, but can you do it as you copy files to the HDFS?  My
hope/intuition is that if you were able to reduce the replication factor of
a file while copying, the copy time would decrease.  I'm finding it
difficult waiting for large data sets to copy over.

I am currently doing:

hadoop dfs -copyFromLocal "/copy/from/path/" input

and am wondering if it's possible to also specify something like -setrep on
the same line.  -setrep requires you to specify the file, which implies
that it has to exist first, requiring two separate commands.

Thanks in advance,
-Julian

Re: possible to change replication factor at file creation time (with copyFromLocal)?

Posted by Julian Bui <ju...@gmail.com>.
there you are again!

thanks!

On Fri, May 31, 2013 at 10:03 AM, Harsh J <ha...@cloudera.com> wrote:

> Hi Julian,
>
> Yes, "dfs" subcommand accepts config overrides via -D. Just do "hadoop
> dfs -Ddfs.replication=X -copyFromLocal …".
>
> On Fri, May 31, 2013 at 10:27 PM, Julian Bui <ju...@gmail.com> wrote:
> > Hi hadoop users,
> >
> > I am aware that you can set the replication factor of a file after it's been
> > created, but can you do it as you copy files to the HDFS?  My hope/intuition
> > is that if you were able to reduce the replication factor of a file while
> > copying, the copy time would decrease.  I'm finding it difficult waiting for
> > large data sets to copy over.
> >
> > I am currently doing:
> >
> > hadoop dfs -copyFromLocal "/copy/from/path/" input
> >
> > and am wondering if it's possible to also specify something like -setrep on
> > the same line.  -setrep requires you to specify the file, which implies
> > that it has to exist first, requiring two separate commands.
> >
> > Thanks in advance,
> > -Julian
>
>
>
> --
> Harsh J
>

Re: possible to change replication factor at file creation time (with copyFromLocal)?

Posted by Harsh J <ha...@cloudera.com>.
Hi Julian,

Yes, "dfs" subcommand accepts config overrides via -D. Just do "hadoop
dfs -Ddfs.replication=X -copyFromLocal …".
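
For illustration, the override above could be wrapped as in the following
dry-run sketch. The helper name build_copy_cmd and the replication value 1
are assumptions made for this example and do not come from the thread; the
paths are the ones from the original question. The sketch only prints the
command so it can be checked before being run against a real cluster.

```shell
#!/bin/sh
# Dry-run sketch: print a copy command that overrides dfs.replication
# for this one invocation only. build_copy_cmd is a hypothetical helper.
build_copy_cmd() {
    replication="$1"   # desired replication factor for the new files
    src="$2"           # local source path
    dst="$3"           # HDFS destination path
    # The generic -D option is placed before the -copyFromLocal subcommand.
    printf 'hadoop dfs -Ddfs.replication=%s -copyFromLocal "%s" %s\n' \
        "$replication" "$src" "$dst"
}

# Example values: replication factor 1 is an assumption, not a recommendation.
build_copy_cmd 1 "/copy/from/path/" input
```

After running the printed command, listing the destination with
"hadoop dfs -ls input" should show the replication factor of each copied file,
which is one way to confirm the override took effect.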

On Fri, May 31, 2013 at 10:27 PM, Julian Bui <ju...@gmail.com> wrote:
> Hi hadoop users,
>
> I am aware that you can set the replication factor of a file after it's been
> created, but can you do it as you copy files to the HDFS?  My hope/intuition
> is that if you were able to reduce the replication factor of a file while
> copying, the copy time would decrease.  I'm finding it difficult waiting for
> large data sets to copy over.
>
> I am currently doing:
>
> hadoop dfs -copyFromLocal "/copy/from/path/" input
>
> and am wondering if it's possible to also specify something like -setrep on
> the same line.  -setrep requires you to specify the file, which implies
> that it has to exist first, requiring two separate commands.
>
> Thanks in advance,
> -Julian



-- 
Harsh J
