Posted to common-user@hadoop.apache.org by Mohammad Tariq <do...@gmail.com> on 2013/04/02 23:39:23 UTC

Fwd: MapReduce on Local files

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


---------- Forwarded message ----------
From: Mohammad Tariq <do...@gmail.com>
Date: Tue, Apr 2, 2013 at 5:16 PM
Subject: MapReduce on Local files
To: mapreduce-user@hadoop.apache.org


Hello list,

           Is an MR job capable of reading even the hidden temp files
present inside a directory located on my local FS? I noticed this
today for the first time, because until now I had never tried running MR
jobs on local files.

Thank you so much for your time.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com

Re: MapReduce on Local files

Posted by Mohammad Tariq <do...@gmail.com>.
Thank you, Azuryy. It was about the files ending with a tilde "~".
These are backup files that are hidden from users, but my
job was able to see them. I am working on Ubuntu (GNOME DE).

Nothing serious, just out of curiosity :)
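
To illustrate the behaviour discussed in this thread, here is a minimal sketch in plain Java with no Hadoop dependency. It lists a directory through java.io.File, which returns every entry regardless of what a desktop file manager hides, and then applies the "." and "_" prefix rule that the replies attribute to FileInputFormat. The class and method names (LocalListingSketch, passesDefaultRule, listAll) and the sample file names are made up for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; not Hadoop code.
final class LocalListingSketch {

    // The default rule as described in the thread: names starting with
    // "_" or "." are skipped. A "~" suffix is not treated as hidden.
    static boolean passesDefaultRule(String name) {
        return !name.startsWith("_") && !name.startsWith(".");
    }

    // Returns every entry name in dir, sorted. File.list() has no notion
    // of hidden files and returns dotfiles and "~" backups alike.
    static List<String> listAll(File dir) {
        String[] names = dir.list();
        if (names == null) {
            throw new IllegalArgumentException("not a readable directory: " + dir);
        }
        Arrays.sort(names);
        return Arrays.asList(names);
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory with one file of each kind.
        Path dir = Files.createTempDirectory("mr-input");
        for (String name : new String[] {"data.txt", "data.txt~", ".hidden", "_SUCCESS"}) {
            Files.createFile(dir.resolve(name));
        }

        List<String> all = listAll(dir.toFile());
        List<String> kept = all.stream()
                .filter(LocalListingSketch::passesDefaultRule)
                .collect(Collectors.toList());

        System.out.println("listed: " + all);
        System.out.println("kept:   " + kept);
    }
}
```

The backup file survives the prefix rule, which matches what the job in question observed.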

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Wed, Apr 3, 2013 at 3:36 PM, Azuryy Yu <az...@gmail.com> wrote:

> For FileInputFormat, start with "_" is hidden file by default. you can
> write a custom PathFilter, and pass it to the InputFormat.
>
>
> On Wed, Apr 3, 2013 at 5:58 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
>> Environments) consider ~-suffix files as hidden but not the general
>> standards (try ls for example, or even shell expansions, it will
>> ignore . prefixes, but not ~ suffixes) :)
>>
>> To answer specifically though, no, the base FileInputFormat does not
>> recognize ~ today, but if you want it to, you can pass a custom path
>> filter to your InputFormat's implementation for when it calls the
>> listStatus method.
>>
>> On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <do...@gmail.com>
>> wrote:
>> > Hello Harsh,
>> >
>> >         Thank you for the response. I am sorry for being unclear.
>> > Actually I was talking about the backup files which end with "~"
>> > I mean these files are not visible normally, but my job is able to
>> > see them. Does FileInputFormat behave in the same way for "~"
>> > as it does in the case of "." and "_"?
>> >
>> > Thanks.
>> >
>> > Warm Regards,
>> > Tariq
>> > https://mtariq.jux.com/
>> > cloudfront.blogspot.com
>> >
>> >
>> > On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>> >>
>> >> Not quite sure if I got your question. These tidbits may help though,
>> >> from what I can understand:
>> >>
>> >> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
>> >> has no concept of what a hidden file is on its own. It retrieves the
>> >> whole list.
>> >> * MR's FileInputFormat (and normal derivatives) does filter away "."
>> >> and "_" starting path names, from added input paths to the job.
>> >>
>> >> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com>
>> wrote:
>> >> >
>> >> > Warm Regards,
>> >> > Tariq
>> >> > https://mtariq.jux.com/
>> >> > cloudfront.blogspot.com
>> >> >
>> >> >
>> >> > ---------- Forwarded message ----------
>> >> > From: Mohammad Tariq <do...@gmail.com>
>> >> > Date: Tue, Apr 2, 2013 at 5:16 PM
>> >> > Subject: MapReduce on Local files
>> >> > To: mapreduce-user@hadoop.apache.org
>> >> >
>> >> >
>> >> > Hello list,
>> >> >
>> >> >            Is a MR job capable of reading even the hidden temp files
>> >> > present
>> >> > inside a directory located on my local FS?I have noticed this thing
>> >> > today
>> >> > for the first time because till now I never tried running MR jobs on
>> >> > local
>> >> > files.
>> >> >
>> >> > Thank you so much for your time?
>> >> >
>> >> > Warm Regards,
>> >> > Tariq
>> >> > https://mtariq.jux.com/
>> >> > cloudfront.blogspot.com
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: MapReduce on Local files

Posted by Azuryy Yu <az...@gmail.com>.
For FileInputFormat, a file whose name starts with "_" is treated as
hidden by default. You can write a custom PathFilter and pass it to the
InputFormat.
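
A rough sketch of what such a filter's logic could look like follows. To keep it runnable without Hadoop on the classpath, it works on plain file-name strings rather than Hadoop Path objects, and the names TildeFilterSketch, DEFAULT_RULE, NOT_BACKUP and ACCEPT are invented for this example. In a real job the same check would go into the accept(Path) method of a class implementing org.apache.hadoop.fs.PathFilter, registered with FileInputFormat.setInputPathFilter(job, YourFilter.class).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration only; the predicate stands in for PathFilter.accept.
final class TildeFilterSketch {

    // Prefix rule mentioned in the thread: skip "_" and "." names.
    static final Predicate<String> DEFAULT_RULE =
            name -> !name.startsWith("_") && !name.startsWith(".");

    // Extra rule for editor backup files such as "notes.txt~".
    static final Predicate<String> NOT_BACKUP = name -> !name.endsWith("~");

    // A name is accepted only if it passes both rules.
    static final Predicate<String> ACCEPT = DEFAULT_RULE.and(NOT_BACKUP);

    // Keeps the accepted names, preserving their input order.
    static List<String> filter(List<String> names) {
        return names.stream().filter(ACCEPT).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("data.txt", "data.txt~", ".hidden", "_SUCCESS");
        System.out.println(filter(names));
    }
}
```

The prefix rule is repeated in the sketch so that it stands alone; whether a custom filter needs to restate it in a given Hadoop version is worth checking against the FileInputFormat source.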


On Wed, Apr 3, 2013 at 5:58 PM, Harsh J <ha...@cloudera.com> wrote:

> You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
> Environments) consider ~-suffix files as hidden but not the general
> standards (try ls for example, or even shell expansions, it will
> ignore . prefixes, but not ~ suffixes) :)
>
> To answer specifically though, no, the base FileInputFormat does not
> recognize ~ today, but if you want it to, you can pass a custom path
> filter to your InputFormat's implementation for when it calls the
> listStatus method.
>
> On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <do...@gmail.com> wrote:
> > Hello Harsh,
> >
> >         Thank you for the response. I am sorry for being unclear.
> > Actually I was talking about the backup files which end with "~"
> > I mean these files are not visible normally, but my job is able to
> > see them. Does FileInputFormat behave in the same way for "~"
> > as it does in the case of "." and "_"?
> >
> > Thanks.
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
> >
> > On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Not quite sure if I got your question. These tidbits may help though,
> >> from what I can understand:
> >>
> >> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
> >> has no concept of what a hidden file is on its own. It retrieves the
> >> whole list.
> >> * MR's FileInputFormat (and normal derivatives) does filter away "."
> >> and "_" starting path names, from added input paths to the job.
> >>
> >> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com>
> wrote:
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >> >
> >> > ---------- Forwarded message ----------
> >> > From: Mohammad Tariq <do...@gmail.com>
> >> > Date: Tue, Apr 2, 2013 at 5:16 PM
> >> > Subject: MapReduce on Local files
> >> > To: mapreduce-user@hadoop.apache.org
> >> >
> >> >
> >> > Hello list,
> >> >
> >> >            Is a MR job capable of reading even the hidden temp files
> >> > present
> >> > inside a directory located on my local FS?I have noticed this thing
> >> > today
> >> > for the first time because till now I never tried running MR jobs on
> >> > local
> >> > files.
> >> >
> >> > Thank you so much for your time?
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: MapReduce on Local files

Posted by Mohammad Tariq <do...@gmail.com>.
I see.
Thank you so much for the clarification, Harsh :)

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Wed, Apr 3, 2013 at 3:28 PM, Harsh J <ha...@cloudera.com> wrote:

> You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
> Environments) consider ~-suffix files as hidden but not the general
> standards (try ls for example, or even shell expansions, it will
> ignore . prefixes, but not ~ suffixes) :)
>
> To answer specifically though, no, the base FileInputFormat does not
> recognize ~ today, but if you want it to, you can pass a custom path
> filter to your InputFormat's implementation for when it calls the
> listStatus method.
>
> On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <do...@gmail.com> wrote:
> > Hello Harsh,
> >
> >         Thank you for the response. I am sorry for being unclear.
> > Actually I was talking about the backup files which end with "~"
> > I mean these files are not visible normally, but my job is able to
> > see them. Does FileInputFormat behave in the same way for "~"
> > as it does in the case of "." and "_"?
> >
> > Thanks.
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
> >
> > On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Not quite sure if I got your question. These tidbits may help though,
> >> from what I can understand:
> >>
> >> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
> >> has no concept of what a hidden file is on its own. It retrieves the
> >> whole list.
> >> * MR's FileInputFormat (and normal derivatives) does filter away "."
> >> and "_" starting path names, from added input paths to the job.
> >>
> >> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com>
> wrote:
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >> >
> >> > ---------- Forwarded message ----------
> >> > From: Mohammad Tariq <do...@gmail.com>
> >> > Date: Tue, Apr 2, 2013 at 5:16 PM
> >> > Subject: MapReduce on Local files
> >> > To: mapreduce-user@hadoop.apache.org
> >> >
> >> >
> >> > Hello list,
> >> >
> >> >            Is a MR job capable of reading even the hidden temp files
> >> > present
> >> > inside a directory located on my local FS?I have noticed this thing
> >> > today
> >> > for the first time because till now I never tried running MR jobs on
> >> > local
> >> > files.
> >> >
> >> > Thank you so much for your time?
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: MapReduce on Local files

Posted by Azuryy Yu <az...@gmail.com>.
For FileInputFormat, start with "_" is hidden file by default. you can
write a custom PathFilter, and pass it to the InputFormat.


On Wed, Apr 3, 2013 at 5:58 PM, Harsh J <ha...@cloudera.com> wrote:

> You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
> Environments) consider ~-suffix files as hidden but not the general
> standards (try ls for example, or even shell expansions, it will
> ignore . prefixes, but not ~ suffixes) :)
>
> To answer specifically though, no, the base FileInputFormat does not
> recognize ~ today, but if you want it to, you can pass a custom path
> filter to your InputFormat's implementation for when it calls the
> listStatus method.
>
> On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <do...@gmail.com> wrote:
> > Hello Harsh,
> >
> >         Thank you for the response. I am sorry for being unclear.
> > Actually I was talking about the backup files which end with "~"
> > I mean these files are not visible normally, but my job is able to
> > see them. Does FileInputFormat behave in the same way for "~"
> > as it does in the case of "." and "_"?
> >
> > Thanks.
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
> >
> > On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Not quite sure if I got your question. These tidbits may help though,
> >> from what I can understand:
> >>
> >> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
> >> has no concept of what a hidden file is on its own. It retrieves the
> >> whole list.
> >> * MR's FileInputFormat (and normal derivatives) does filter away "."
> >> and "_" starting path names, from added input paths to the job.
> >>
> >> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com> wrote:
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >> >
> >> > ---------- Forwarded message ----------
> >> > From: Mohammad Tariq <do...@gmail.com>
> >> > Date: Tue, Apr 2, 2013 at 5:16 PM
> >> > Subject: MapReduce on Local files
> >> > To: mapreduce-user@hadoop.apache.org
> >> >
> >> >
> >> > Hello list,
> >> >
> >> >            Is a MR job capable of reading even the hidden temp files present
> >> > inside a directory located on my local FS?I have noticed this thing today
> >> > for the first time because till now I never tried running MR jobs on local
> >> > files.
> >> >
> >> > Thank you so much for your time?
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: MapReduce on Local files

Posted by Mohammad Tariq <do...@gmail.com>.
I see.
Thank you so much for the clarification, Harsh :)

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Wed, Apr 3, 2013 at 3:28 PM, Harsh J <ha...@cloudera.com> wrote:

> You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
> Environments) consider ~-suffix files as hidden but not the general
> standards (try ls for example, or even shell expansions, it will
> ignore . prefixes, but not ~ suffixes) :)
>
> To answer specifically though, no, the base FileInputFormat does not
> recognize ~ today, but if you want it to, you can pass a custom path
> filter to your InputFormat's implementation for when it calls the
> listStatus method.
>
> On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <do...@gmail.com> wrote:
> > Hello Harsh,
> >
> >         Thank you for the response. I am sorry for being unclear.
> > Actually I was talking about the backup files which end with "~"
> > I mean these files are not visible normally, but my job is able to
> > see them. Does FileInputFormat behave in the same way for "~"
> > as it does in the case of "." and "_"?
> >
> > Thanks.
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
> >
> > On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
> >>
> >> Not quite sure if I got your question. These tidbits may help though,
> >> from what I can understand:
> >>
> >> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
> >> has no concept of what a hidden file is on its own. It retrieves the
> >> whole list.
> >> * MR's FileInputFormat (and normal derivatives) does filter away "."
> >> and "_" starting path names, from added input paths to the job.
> >>
> >> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com> wrote:
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >> >
> >> > ---------- Forwarded message ----------
> >> > From: Mohammad Tariq <do...@gmail.com>
> >> > Date: Tue, Apr 2, 2013 at 5:16 PM
> >> > Subject: MapReduce on Local files
> >> > To: mapreduce-user@hadoop.apache.org
> >> >
> >> >
> >> > Hello list,
> >> >
> >> >            Is a MR job capable of reading even the hidden temp files present
> >> > inside a directory located on my local FS?I have noticed this thing today
> >> > for the first time because till now I never tried running MR jobs on local
> >> > files.
> >> >
> >> > Thank you so much for your time?
> >> >
> >> > Warm Regards,
> >> > Tariq
> >> > https://mtariq.jux.com/
> >> > cloudfront.blogspot.com
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>

Re: MapReduce on Local files

Posted by Harsh J <ha...@cloudera.com>.
You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
Environments) treat ~-suffixed files as hidden, but that is not a
general convention (try ls, for example, or even shell expansions: they
skip names with a . prefix, but not names with a ~ suffix) :)

To answer specifically though: no, the base FileInputFormat does not
treat ~ as hidden today, but if you want it to, you can pass a custom
path filter to your InputFormat implementation, to be applied when it
calls the listStatus method.
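
A minimal sketch of the rule such a custom filter could apply, written in
plain Java so that it runs without Hadoop on the classpath. The class name
BackupFileRule and its method are hypothetical, not part of Hadoop. In a real
job the same check would presumably live in the accept(Path) method of a class
implementing org.apache.hadoop.fs.PathFilter (testing path.getName()), and
that class would be registered with FileInputFormat.setInputPathFilter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; it captures just the naming rule.
final class BackupFileRule {
    // Returns true when the file should stay in the job input, i.e. when its
    // name does not end with the "~" that many editors append to backup copies.
    static boolean accept(String fileName) {
        return !fileName.endsWith("~");
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("part-00000", "notes.txt", "notes.txt~");
        List<String> kept = names.stream()
                .filter(BackupFileRule::accept)
                .collect(Collectors.toList());
        // Only the names without a trailing "~" remain.
        System.out.println(kept);
    }
}
```

A suffix check on the file name alone is used because only the last path
component decides whether something is an editor backup; the parent
directories are irrelevant to that decision.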

On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <do...@gmail.com> wrote:
> Hello Harsh,
>
>         Thank you for the response. I am sorry for being unclear.
> Actually I was talking about the backup files which end with "~"
> I mean these files are not visible normally, but my job is able to
> see them. Does FileInputFormat behave in the same way for "~"
> as it does in the case of "." and "_"?
>
> Thanks.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>> Not quite sure if I got your question. These tidbits may help though,
>> from what I can understand:
>>
>> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
>> has no concept of what a hidden file is on its own. It retrieves the
>> whole list.
>> * MR's FileInputFormat (and normal derivatives) does filter away "."
>> and "_" starting path names, from added input paths to the job.
>>
>> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com> wrote:
>> >
>> > Warm Regards,
>> > Tariq
>> > https://mtariq.jux.com/
>> > cloudfront.blogspot.com
>> >
>> >
>> > ---------- Forwarded message ----------
>> > From: Mohammad Tariq <do...@gmail.com>
>> > Date: Tue, Apr 2, 2013 at 5:16 PM
>> > Subject: MapReduce on Local files
>> > To: mapreduce-user@hadoop.apache.org
>> >
>> >
>> > Hello list,
>> >
>> >            Is a MR job capable of reading even the hidden temp files present
>> > inside a directory located on my local FS?I have noticed this thing today
>> > for the first time because till now I never tried running MR jobs on local
>> > files.
>> >
>> > Thank you so much for your time?
>> >
>> > Warm Regards,
>> > Tariq
>> > https://mtariq.jux.com/
>> > cloudfront.blogspot.com
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J

Re: MapReduce on Local files

Posted by Mohammad Tariq <do...@gmail.com>.
Hello Harsh,

        Thank you for the response. I am sorry for being unclear.
Actually, I was talking about the backup files which end with "~".
I mean these files are not visible normally, but my job is able to
see them. Does FileInputFormat behave in the same way for "~"
as it does in the case of "." and "_"?

Thanks.

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <ha...@cloudera.com> wrote:

> Not quite sure if I got your question. These tidbits may help though,
> from what I can understand:
>
> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
> has no concept of what a hidden file is on its own. It retrieves the
> whole list.
> * MR's FileInputFormat (and normal derivatives) does filter away "."
> and "_" starting path names, from added input paths to the job.
>
> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com> wrote:
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
> >
> > ---------- Forwarded message ----------
> > From: Mohammad Tariq <do...@gmail.com>
> > Date: Tue, Apr 2, 2013 at 5:16 PM
> > Subject: MapReduce on Local files
> > To: mapreduce-user@hadoop.apache.org
> >
> >
> > Hello list,
> >
> >            Is a MR job capable of reading even the hidden temp files present
> > inside a directory located on my local FS?I have noticed this thing today
> > for the first time because till now I never tried running MR jobs on local
> > files.
> >
> > Thank you so much for your time?
> >
> > Warm Regards,
> > Tariq
> > https://mtariq.jux.com/
> > cloudfront.blogspot.com
> >
>
>
>
> --
> Harsh J
>

Re: MapReduce on Local files

Posted by Harsh J <ha...@cloudera.com>.
Not quite sure if I got your question. These tidbits may help though,
from what I can understand:

* LocalFileSystem's listing uses Java's APIs for file/dir listing, and
has no concept of what a hidden file is on its own. It retrieves the
whole list.
* MR's FileInputFormat (and its usual derivatives) filters out path names
starting with "." or "_" from the input paths added to the job.
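
The two points above can be tried out with a small stand-alone sketch. It
assumes the local listing boils down to java.io.File.list(), as the first
point describes, and it re-implements the "." / "_" prefix rule by hand
rather than calling Hadoop. The class LocalListingDemo and its method names
are made up for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; nothing here comes from Hadoop itself.
final class LocalListingDemo {
    // Hand-written mirror of the default rule: skip names starting with "." or "_".
    static boolean passesDefaultRule(String name) {
        return !name.startsWith(".") && !name.startsWith("_");
    }

    // Creates a scratch directory containing empty files with the given names.
    static File makeScratchDir(String... fileNames) {
        try {
            Path dir = Files.createTempDirectory("mr-local-input");
            for (String name : fileNames) {
                Files.createFile(dir.resolve(name));
            }
            return dir.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Everything java.io.File reports for the directory, sorted for stable output.
    static List<String> rawListing(File dir) {
        String[] names = dir.list();
        List<String> sorted = new ArrayList<>();
        if (names != null) {
            sorted.addAll(Arrays.asList(names));
        }
        Collections.sort(sorted);
        return sorted;
    }

    // The raw listing with the "." / "_" prefix rule applied.
    static List<String> afterDefaultRule(File dir) {
        List<String> kept = new ArrayList<>();
        for (String name : rawListing(dir)) {
            if (passesDefaultRule(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        File dir = makeScratchDir("data.txt", ".data.txt.swp", "_SUCCESS", "data.txt~");
        System.out.println("raw listing:        " + rawListing(dir));
        System.out.println("after default rule: " + afterDefaultRule(dir));
    }
}
```

The raw listing contains all four names, because java.io.File has no notion
of hidden entries when listing. After the prefix rule only data.txt and
data.txt~ remain, which is why an editor backup file still reaches the job.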

On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <do...@gmail.com> wrote:
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> ---------- Forwarded message ----------
> From: Mohammad Tariq <do...@gmail.com>
> Date: Tue, Apr 2, 2013 at 5:16 PM
> Subject: MapReduce on Local files
> To: mapreduce-user@hadoop.apache.org
>
>
> Hello list,
>
>            Is a MR job capable of reading even the hidden temp files present
> inside a directory located on my local FS?I have noticed this thing today
> for the first time because till now I never tried running MR jobs on local
> files.
>
> Thank you so much for your time?
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>



-- 
Harsh J
