Posted to user@spark.apache.org by Andrew Holway <an...@otternetworks.de> on 2016/01/23 13:08:25 UTC

python - list objects in HDFS directory

Hello,

I would like to make a list of files (parquet or json) in a specific
HDFS directory with python so I can do some logic on which files to
load into a dataframe.

Any ideas?

Thanks,

Andrew

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: python - list objects in HDFS directory

Posted by Ted Yu <yu...@gmail.com>.
Is the 'hadoop' or 'hdfs' command accessible to your Python script?

If so, you can call 'hdfs dfs -ls' from Python and parse its output.
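
A minimal sketch of that approach, assuming the 'hdfs' binary is on the PATH of the machine running the script and that 'hdfs dfs -ls' prints its usual eight-column listing (permissions, replication, owner, group, size, date, time, path). The function names and the /data/events path are hypothetical, chosen only for illustration:

```python
import subprocess


def parse_ls_output(text, extensions=(".parquet", ".json")):
    """Return paths from `hdfs dfs -ls` output ending in one of `extensions`."""
    paths = []
    for line in text.splitlines():
        # Entry lines have 8 whitespace-separated fields:
        # permissions, replication, owner, group, size, date, time, path.
        # maxsplit=7 keeps a path containing spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skips the "Found N items" header and blank lines
        if fields[0].startswith("d"):
            continue  # skips directories
        path = fields[7]
        if path.endswith(tuple(extensions)):
            paths.append(path)
    return paths


def list_hdfs_files(directory, extensions=(".parquet", ".json")):
    """Run `hdfs dfs -ls <directory>` and return the matching file paths."""
    # Assumption: `hdfs` is on the PATH; raises CalledProcessError otherwise
    # or if the directory does not exist.
    output = subprocess.check_output(["hdfs", "dfs", "-ls", directory])
    return parse_ls_output(output.decode("utf-8"), extensions)


# Hypothetical usage: pick the files you want, then load them.
# files = list_hdfs_files("/data/events", extensions=(".parquet",))
# df = sqlContext.read.parquet(*files)
```

Keeping the parsing separate from the subprocess call makes the selection logic easy to check without a cluster. Note that Spark often writes a Parquet dataset as a directory of part files, so you may want to drop the directory check if those are what you are after.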

Cheers

On Sat, Jan 23, 2016 at 4:08 AM, Andrew Holway <andrew.holway@otternetworks.de> wrote:

> Hello,
>
> I would like to make a list of files (parquet or json) in a specific
> HDFS directory with python so I can do some logic on which files to
> load into a dataframe.
>
> Any ideas?
>
> Thanks,
>
> Andrew