Posted to dev@spark.apache.org by anshu shukla <an...@gmail.com> on 2015/04/30 07:13:44 UTC

Event generator for SPARK-Streaming from csv

I have the real DEBS-Taxi data in a CSV file. To process it, how can I
simulate a "Spout"-like event generator that uses the timestamps in the
CSV file?

-- 
SERC-IISC
Thanks & Regards,
Anshu Shukla

Re: Event generator for SPARK-Streaming from csv

Posted by anshu shukla <an...@gmail.com>.
I know these methods, but I need to create events using the timestamps in
the data tuples, meaning that each new tuple is generated at the time given
by its timestamp in the CSV file. This will be useful for simulating the
data rate over time, just like real sensor data.
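
A minimal sketch of this kind of timestamp-driven replay, in plain Python
with no Spark dependency. The function names (replay_delays, replay), the
timestamp column index, the "%Y-%m-%d %H:%M:%S" format and the speedup
parameter are all assumptions made for illustration; they are not taken
from the DEBS data description or from any Spark API. Rows are assumed to
be sorted by timestamp already.

```python
import csv
import io
import time
from datetime import datetime


def replay_delays(rows, ts_column, ts_format="%Y-%m-%d %H:%M:%S", speedup=1.0):
    """Yield (delay_seconds, row) pairs for rows sorted by timestamp.

    delay_seconds is the gap to the previous row's timestamp divided by
    speedup (hypothetical parameter: 2.0 replays twice as fast).
    """
    previous = None
    for row in rows:
        current = datetime.strptime(row[ts_column], ts_format)
        gap = 0.0 if previous is None else (current - previous).total_seconds()
        previous = current
        # Clamp negative gaps (out-of-order rows) to zero instead of failing.
        yield max(gap, 0.0) / speedup, row


def replay(rows, emit, ts_column, speedup=1.0, sleep=time.sleep):
    """Wait for each row's scaled inter-arrival gap, then call emit(row)."""
    for delay, row in replay_delays(rows, ts_column, speedup=speedup):
        if delay > 0:
            sleep(delay)
        emit(row)


if __name__ == "__main__":
    # Made-up sample rows; the real file layout may differ.
    sample = io.StringIO(
        "taxi-1,2013-01-01 00:00:00\n"
        "taxi-2,2013-01-01 00:00:02\n"
        "taxi-3,2013-01-01 00:00:05\n"
    )
    # Replay 10x faster than real time; print stands in for the real sink.
    replay(csv.reader(sample), emit=print, ts_column=1, speedup=10.0)
```

Here emit is only a callback. One possible way to connect it to Spark
Streaming would be to have it write each line to a TCP socket that the
job reads with socketTextStream, but that wiring is left out of the sketch.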

On Fri, May 1, 2015 at 2:52 PM, Juan Rodríguez Hortalá <
juan.rodriguez.hortala@gmail.com> wrote:

> Hi,
>
> Maybe you could use streamingContext.fileStream, as in the example from
> https://spark.apache.org/docs/latest/streaming-programming-guide.html#input-dstreams-and-receivers,
> which can read "from files on any file system compatible with the HDFS API
> (that is, HDFS, S3, NFS, etc.)". You could split the file into several
> smaller files and move them into the target folder one by one, with some
> sleep time in between, to simulate a stream of data with custom granularity.
>
> Hope that helps,
>
> Greetings,
>
> Juan
>
> 2015-05-01 9:30 GMT+02:00 anshu shukla <an...@gmail.com>:
>
>>
>> I have the real DEBS-Taxi data in a CSV file. To process it, how can I
>> simulate a "Spout"-like event generator that uses the timestamps in the
>> CSV file?
>>
>> --
>> Thanks & Regards,
>> Anshu Shukla
>>
>
>


-- 
Thanks & Regards,
Anshu Shukla

Re: Event generator for SPARK-Streaming from csv

Posted by Juan Rodríguez Hortalá <ju...@gmail.com>.
Hi,

Maybe you could use streamingContext.fileStream, as in the example from
https://spark.apache.org/docs/latest/streaming-programming-guide.html#input-dstreams-and-receivers,
which can read "from files on any file system compatible with the HDFS API
(that is, HDFS, S3, NFS, etc.)". You could split the file into several
smaller files and move them into the target folder one by one, with some
sleep time in between, to simulate a stream of data with custom granularity.
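
A rough sketch of the split-and-move idea, in plain Python. Everything
named here (drop_in_chunks, write_chunk, the staging directory, the
chunk-NNNNN.csv naming, the default chunk size and pause) is a hypothetical
choice for illustration. It assumes the CSV has no header row and no
quoted fields containing line breaks, and that the staging and target
directories are on the same file system so that the final move is a
rename. Each chunk is written in the staging directory first because the
Spark guide asks for files to appear in the monitored directory by being
moved or renamed into it.

```python
import itertools
import os
import time


def write_chunk(lines, index, staging_dir, target_dir):
    """Write one chunk in staging_dir, then rename it into target_dir."""
    name = "chunk-%05d.csv" % index
    staged = os.path.join(staging_dir, name)
    with open(staged, "w", newline="") as out:
        out.writelines(lines)
    final = os.path.join(target_dir, name)
    # os.replace is a rename, so the file shows up in target_dir complete.
    os.replace(staged, final)
    return final


def drop_in_chunks(csv_path, staging_dir, target_dir,
                   lines_per_file=1000, pause_seconds=1.0, sleep=time.sleep):
    """Split csv_path by line count and move the pieces in one at a time.

    Pauses pause_seconds between consecutive chunks and returns the list
    of chunk paths created in target_dir.
    """
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(target_dir, exist_ok=True)
    moved = []
    with open(csv_path, newline="") as source:
        while True:
            lines = list(itertools.islice(source, lines_per_file))
            if not lines:
                break
            if moved:
                sleep(pause_seconds)
            moved.append(write_chunk(lines, len(moved), staging_dir, target_dir))
    return moved
```

Smaller values of lines_per_file and pause_seconds give a finer
granularity, but the pacing is fixed and does not follow the timestamps
in the data.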

Hope that helps,

Greetings,

Juan

2015-05-01 9:30 GMT+02:00 anshu shukla <an...@gmail.com>:

>
> I have the real DEBS-Taxi data in a CSV file. To process it, how can I
> simulate a "Spout"-like event generator that uses the timestamps in the
> CSV file?
>
> --
> Thanks & Regards,
> Anshu Shukla
>

Fwd: Event generator for SPARK-Streaming from csv

Posted by anshu shukla <an...@gmail.com>.
I have the real DEBS-Taxi data in a CSV file. To process it, how can I
simulate a "Spout"-like event generator that uses the timestamps in the
CSV file?




-- 
Thanks & Regards,
Anshu Shukla
