Posted to issues@spark.apache.org by "Yana Kadiyska (JIRA)" <ji...@apache.org> on 2015/05/07 15:39:01 UTC

[jira] [Comment Edited] (SPARK-3928) Support wildcard matches on Parquet files

    [ https://issues.apache.org/jira/browse/SPARK-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532624#comment-14532624 ] 

Yana Kadiyska edited comment on SPARK-3928 at 5/7/15 1:38 PM:
--------------------------------------------------------------

I am observing the same issue.

I downloaded a pre-built Spark 1.3.1 distribution for CDH4.

{quote}
scala> sc.textFile("/rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet").first
res0: String = PAR1... (remaining unprintable binary Parquet bytes omitted)
scala> hc.parquetFile("/rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet")
java.io.FileNotFoundException: File does not exist: /rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet

{quote}
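
One possible workaround, sketched here under assumptions rather than taken from the Spark code base, is to expand the wildcard yourself with Hadoop's {{FileSystem.globStatus}} and pass concrete paths to {{parquetFile}}. The helper name {{expandGlob}} is hypothetical. The block is meant to be pasted into spark-shell and has not been run against the CDH4 cluster above.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper (not part of Spark): expand a glob pattern into the
// concrete paths that match it, using Hadoop's own glob support.
def expandGlob(pattern: String, conf: Configuration): Seq[String] = {
  val globPath = new Path(pattern)
  // Resolve the file system (HDFS, local, ...) that owns this path.
  val fs: FileSystem = globPath.getFileSystem(conf)
  // globStatus may return null when a non-glob path does not exist,
  // so treat null as "no matches".
  Option(fs.globStatus(globPath))
    .map(_.toSeq)
    .getOrElse(Seq.empty)
    .map(_.getPath.toString)
}
```

In spark-shell, {{expandGlob("/rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet", sc.hadoopConfiguration)}} should return one entry per matching file. Each entry could then be loaded with {{hc.parquetFile(path)}}. Whether a given Spark version accepts several paths in a single call is an assumption to check against its API docs.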


was (Author: yanakad):
I am observing the same issue.

I downloaded a pre-built Spark 1.3.1 distribution for CDH4.

{quote}
scala> sc.textFile("/rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet").first
res0: String = PAR1... (remaining unprintable binary Parquet bytes omitted)
scala> hc.parquetFile("/rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet")
java.io.FileNotFoundException: File does not exist: hdfs://cdh4-21968-nn/rum/warehouse/hive/pkey=0000-2015-04/0000-2015-04-serialno-750/*.parquet

{quote}

> Support wildcard matches on Parquet files
> -----------------------------------------
>
>                 Key: SPARK-3928
>                 URL: https://issues.apache.org/jira/browse/SPARK-3928
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>            Reporter: Nicholas Chammas
>            Priority: Minor
>             Fix For: 1.3.0
>
>
> {{SparkContext.textFile()}} supports patterns like {{part-*}} and {{2014-\?\?-\?\?}}. 
> It would be nice if {{SparkContext.parquetFile()}} did the same.


