You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by Cassa L <lc...@gmail.com> on 2016/01/12 00:09:06 UTC
Regarding sliding window example from Databricks for DStream
Hi,
I'm trying to work with sliding window example given by databricks.
https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/windows.html
It works fine as expected.
My question is how do I determine when the last phase of of slider has
reached. I want to perform final operation and notify other system when end
of the slider has reched to the window duarions. e.g. in below example
from databricks,
JavaDStream<ApacheAccessLog> windowDStream =
accessLogDStream.window(WINDOW_LENGTH, SLIDE_INTERVAL);
windowDStream.foreachRDD(accessLogs -> {
if (accessLogs.count() == 0) {
System.out.println("No access logs in this time interval");
return null;
}
// Insert code verbatim from LogAnalyzer.java or LogAnalyzerSQL.java here.
// Calculate statistics based on the content size.
JavaRDD<Long> contentSizes =
accessLogs.map(ApacheAccessLog::getContentSize).cache();
System.out.println(String.format("Content Size Avg: %s, Min: %s, Max: %s",
contentSizes.reduce(SUM_REDUCER) / contentSizes.count(),
contentSizes.min(Comparator.naturalOrder()),
contentSizes.max(Comparator.naturalOrder())));
//.....
}
I want to check if entire average at the end of window falls below
certain value and send alert. How do I get this?
Thanks,
LCassa
Re: Regarding sliding window example from Databricks for DStream
Posted by Cassa L <lc...@gmail.com>.
Any thoughts over this? I want to know when window duration is complete
and not the sliding window. Is there a way I can catch end of Window
Duration or do I need to keep track of it and how?
LCassa
On Mon, Jan 11, 2016 at 3:09 PM, Cassa L <lc...@gmail.com> wrote:
> Hi,
> I'm trying to work with sliding window example given by databricks.
>
> https://databricks.gitbooks.io/databricks-spark-reference-applications/content/logs_analyzer/chapter1/windows.html
>
> It works fine as expected.
> My question is how do I determine when the last phase of of slider has
> reached. I want to perform final operation and notify other system when end
> of the slider has reched to the window duarions. e.g. in below example
> from databricks,
>
>
> JavaDStream<ApacheAccessLog> windowDStream =
> accessLogDStream.window(WINDOW_LENGTH, SLIDE_INTERVAL);
> windowDStream.foreachRDD(accessLogs -> {
> if (accessLogs.count() == 0) {
> System.out.println("No access logs in this time interval");
> return null;
> }
>
> // Insert code verbatim from LogAnalyzer.java or LogAnalyzerSQL.java here.
>
> // Calculate statistics based on the content size.
> JavaRDD<Long> contentSizes =
> accessLogs.map(ApacheAccessLog::getContentSize).cache();
> System.out.println(String.format("Content Size Avg: %s, Min: %s, Max: %s",
> contentSizes.reduce(SUM_REDUCER) / contentSizes.count(),
> contentSizes.min(Comparator.naturalOrder()),
> contentSizes.max(Comparator.naturalOrder())));
>
> //.....
> }
>
> I want to check if entire average at the end of window falls below certain value and send alert. How do I get this?
>
>
> Thanks,
> LCassa
>
>