Posted to user@spark.apache.org by roryofbyrne <ro...@gmail.com> on 2016/02/18 11:40:59 UTC

How do I stream in Parquet files using fileStream() and ParquetInputFormat

Hi, 

I'm trying to understand how to stream Parquet files into Spark using
StreamingContext.fileStream[Key, Value, Format](). 

I am struggling to understand (a) what should be passed as Key and Value
(assuming ParquetInputFormat; is this the correct format?), and (b) how, if
at all, to configure the ParquetInputFormat with a ReadSupport class,
RecordMaterializer, etc.
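
For concreteness, one possible shape for such a call is sketched below. It is
a minimal, untested sketch rather than a confirmed answer. It assumes the
parquet-avro module is on the classpath, that the files can be read as Avro
GenericRecord values via AvroReadSupport, and that Void is the key type
(ParquetInputFormat does not produce a meaningful key). The input directory,
app name, and batch interval are hypothetical placeholders.

```scala
import org.apache.avro.generic.GenericRecord
import org.apache.hadoop.fs.Path
import org.apache.parquet.avro.AvroReadSupport
import org.apache.parquet.hadoop.ParquetInputFormat
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ParquetStreamSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name, master, and batch interval.
    val sparkConf = new SparkConf().setAppName("parquet-stream-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(sparkConf, Seconds(10))

    // Assumption: tell ParquetInputFormat which ReadSupport to use through the
    // Hadoop configuration (key "parquet.read.support.class").
    val hadoopConf = ssc.sparkContext.hadoopConfiguration
    hadoopConf.set(
      ParquetInputFormat.READ_SUPPORT_CLASS,
      classOf[AvroReadSupport[GenericRecord]].getName)

    // Key = Void, Value = GenericRecord, Format = ParquetInputFormat[GenericRecord].
    val stream = ssc.fileStream[Void, GenericRecord, ParquetInputFormat[GenericRecord]](
      "/tmp/parquet-in",                                  // hypothetical directory to monitor
      (path: Path) => path.getName.endsWith(".parquet"),  // only pick up Parquet files
      true,                                               // newFilesOnly
      hadoopConf)

    // Convert each record to a String right away, since GenericRecord
    // implementations are not java.io.Serializable.
    stream.map { case (_, record) => record.toString }.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

In this sketch the ReadSupport is selected purely through configuration, and
AvroReadSupport supplies its own RecordMaterializer, so none is set up
explicitly. Whether that is the recommended approach is part of the question.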

Any help is appreciated, as I have almost no knowledge of Hadoop, so this
low-level use of Hadoop is very confusing for me.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-do-I-stream-in-Parquet-files-using-fileStream-and-ParquetInputFormat-tp26262.html
