Posted to user@spark.apache.org by Ashok Kumar <as...@yahoo.com.INVALID> on 2016/05/31 12:04:25 UTC

processing twitter data

Hi all,
I know very little about this subject.
We would like to get streaming data from Twitter and Facebook.
So, a few questions, if I may:
   
   - What format is the data from Twitter in? Is it JSON?
   - Can I use Spark and Spark Streaming to analyze the data?
   - Can the data be streamed from Twitter via Kafka?
   - What would be the optimum batch interval, window interval, and window sliding interval?
   - What is the best method of storing this data in a database? Can I use Hive tables for it, and which option is most suitable?

Thanking you