Posted to user@oozie.apache.org by Amit Mittal <am...@gmail.com> on 2014/10/20 12:35:53 UTC

File system permission for copying file through shell script

Hi All,

I have an Oozie workflow that ingests files into HDFS. The source system is
a Windows system, and its source directory is mounted on the Unix box from
which we run the Oozie workflow. Since that is my local single-node setup,
everything works.
Now I need to move to a cluster of 20 nodes. Since I will be calling the
Unix file copy script from Oozie, do I need to mount the Windows source
directory on all data nodes, or only on the node from which I start the
Oozie workflow?

Thanks
Amit

Re: File system permission for copying file through shell script

Posted by Amit Mittal <am...@gmail.com>.
Thanks Shwetha.

On Mon, Oct 20, 2014 at 6:01 PM, Shwetha GS <sh...@inmobi.com> wrote:

> Yes, all the cluster nodes should have access to the source system, as the
> Pig script can run on any of them.

Re: File system permission for copying file through shell script

Posted by Shwetha GS <sh...@inmobi.com>.
Yes, all the cluster nodes should have access to the source system, as the
Pig script can run on any of them.
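
Because the script may land on any node, it can help to fail fast when the share is missing on the node that happens to run it. The following is a minimal sketch of such a guard, not something posted in this thread: the function name `build_ingest_command`, the injectable `is_mounted` check, and the example paths are hypothetical. It assumes the Windows directory is mounted at the same local path on every node and that the `hdfs` command is on the PATH there.

```python
import os
import socket


def build_ingest_command(source_dir, hdfs_target, is_mounted=os.path.ismount):
    """Build an `hdfs dfs -put` argument list for the files in source_dir.

    Raises RuntimeError when source_dir is not a mount point on this node,
    which is what happens if the action runs on a node without the share.
    `is_mounted` is injectable so the check can be replaced in tests.
    """
    if not is_mounted(source_dir):
        raise RuntimeError(
            f"{source_dir} is not mounted on {socket.gethostname()}; "
            "mount the source share on every node that can run the action"
        )
    # Only regular files directly inside the share are copied (no recursion).
    files = sorted(
        os.path.join(source_dir, name)
        for name in os.listdir(source_dir)
        if os.path.isfile(os.path.join(source_dir, name))
    )
    if not files:
        return []  # nothing to ingest
    # -f overwrites files that already exist at the HDFS target.
    return ["hdfs", "dfs", "-put", "-f", *files, hdfs_target]
```

A caller could pass the returned list to `subprocess.run(cmd, check=True)`, for example with the hypothetical paths `/mnt/windows_source` and `/data/incoming`. The error message includes the host name so that the node lacking the mount is easy to spot in the action's log.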

On Mon, Oct 20, 2014 at 5:54 PM, Amit Mittal <am...@gmail.com> wrote:

> Hi Shwetha,
>
> Yes, the Pig script (in my case, a shell script) needs to be in HDFS.
> However, my question is about the script itself, which transfers files
> from an external Windows file system to HDFS. The node on which the script
> runs must have access to the source system so that it can copy the files
> from the source to the data nodes.
> What I understand now is that, since the script can run on any node, all
> data nodes should have the external Windows folder mounted.
>
> Thanks
> Amit

-- 
_____________________________________________________________
The information contained in this communication is intended solely for the 
use of the individual or entity to whom it is addressed and others 
authorized to receive it. It may contain confidential or legally privileged 
information. If you are not the intended recipient you are hereby notified 
that any disclosure, copying, distribution or taking any action in reliance 
on the contents of this information is strictly prohibited and may be 
unlawful. If you have received this communication in error, please notify 
us immediately by responding to this email and then delete it from your 
system. The firm is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt.

Re: File system permission for copying file through shell script

Posted by Amit Mittal <am...@gmail.com>.
Hi Shwetha,

Yes, the Pig script (in my case, a shell script) needs to be in HDFS.
However, my question is about the script itself, which transfers files from
an external Windows file system to HDFS. The node on which the script runs
must have access to the source system so that it can copy the files from the
source to the data nodes.
What I understand now is that, since the script can run on any node, all
data nodes should have the external Windows folder mounted.

Thanks
Amit

On Mon, Oct 20, 2014 at 4:30 PM, Shwetha GS <sh...@inmobi.com> wrote:

> The workflow itself is read by Oozie alone. But if the workflow contains,
> say, a Pig action, which executes on the cluster, the Pig script needs to
> be on the cluster.
>
> -Shwetha

Re: File system permission for copying file through shell script

Posted by Shwetha GS <sh...@inmobi.com>.
The workflow itself is read by Oozie alone. But if the workflow contains,
say, a Pig action, which executes on the cluster, the Pig script needs to be
on the cluster.

-Shwetha

On Mon, Oct 20, 2014 at 4:05 PM, Amit Mittal <am...@gmail.com> wrote:

> Hi All,
>
> I have an Oozie workflow that ingests files into HDFS. The source system
> is a Windows system, and its source directory is mounted on the Unix box
> from which we run the Oozie workflow. Since that is my local single-node
> setup, everything works.
> Now I need to move to a cluster of 20 nodes. Since I will be calling the
> Unix file copy script from Oozie, do I need to mount the Windows source
> directory on all data nodes, or only on the node from which I start the
> Oozie workflow?
>
> Thanks
> Amit
>
