Posted to user@spark.apache.org by unk1102 <um...@gmail.com> on 2015/10/03 15:01:09 UTC

Can we use Spark Streaming to stream data from Hive table partitions?

Hi, I have a couple of Spark jobs which read Hive table partition data and
process it independently in different threads in a driver. The data to
process is now huge, on the order of TBs, and my jobs are not scaling and are
running slowly. So I am thinking of using Spark Streaming to process data as
and when it is added to Hive partitions, so that I need to process only the
newly loaded partitions.

Can we read Hive table partition data directly using Spark Streaming?
Also, please share best practices for processing TBs of data generated
every day. Please guide. Thanks in advance.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-we-using-Spark-Streaming-to-stream-data-from-Hive-table-partitions-tp24915.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
