Posted to mapreduce-user@hadoop.apache.org by siva kumar <si...@gmail.com> on 2016/02/10 13:34:07 UTC
Spark streaming
Hi,
I'm pulling some Twitter data and trying to save it into a
persistent table. This is the code I've written:
case class Tweet(createdAt: Long, text: String)

twt.map(status =>
  Tweet(status.getCreatedAt().getTime() / 1000, status.getText())
).foreachRDD(rdd =>
  rdd.toDF().saveAsTable("stream", SaveMode.Append)
)
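(For context, the snippet above presumably runs inside a Spark Streaming job with a HiveContext in scope. A self-contained sketch of that setup is below; the names `ssc`, `hiveContext`, the 10-second batch interval, and the use of TwitterUtils are my assumptions, not part of the original code.)

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils
import org.apache.spark.{SparkConf, SparkContext}

case class Tweet(createdAt: Long, text: String)

object TweetStream {
  def main(args: Array[String]): Unit = {
    val sc  = new SparkContext(new SparkConf().setAppName("TweetStream"))
    val ssc = new StreamingContext(sc, Seconds(10)) // batch interval is assumed
    val hiveContext = new HiveContext(sc)           // saveAsTable needs Hive support
    import hiveContext.implicits._                  // brings rdd.toDF() into scope

    // DStream of twitter4j Status objects (credentials configured elsewhere)
    val twt = TwitterUtils.createStream(ssc, None)

    twt.map(status =>
      // getTime() returns epoch milliseconds; divide by 1000 for seconds
      Tweet(status.getCreatedAt().getTime() / 1000, status.getText())
    ).foreachRDD { rdd =>
      rdd.toDF().saveAsTable("stream", SaveMode.Append)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```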
When I go to spark-sql and check, I can see that the table was created.
But when I try to retrieve the data, I get the error below:
java.lang.RuntimeException:
file:/user/hive/warehouse/stream/_temporary/0/_temporary/attempt_201602101609_0383_r_000014_0/part-r-00664.parquet
is not a Parquet file (too small)
Is this the correct way to store streaming data in a persistent table?
Any help would be appreciated.
Thanks in Advance
Siva.