Posted to issues@spark.apache.org by "RJ Nowling (JIRA)" <ji...@apache.org> on 2014/12/03 18:24:12 UTC
[jira] [Created] (SPARK-4727) Add "dimensional" RDDs (time series, spatial)
RJ Nowling created SPARK-4727:
---------------------------------
Summary: Add "dimensional" RDDs (time series, spatial)
Key: SPARK-4727
URL: https://issues.apache.org/jira/browse/SPARK-4727
Project: Spark
Issue Type: Brainstorming
Components: Spark Core
Affects Versions: 1.1.0
Reporter: RJ Nowling
Certain types of data (time series, spatial) can benefit from specialized RDDs. I'd like to open a discussion about this.
For example, time series data should be ordered by time and would benefit from operations like:
* Subsampling (taking every n data points)
* Signal processing (correlations, FFTs, filtering)
* Windowing functions
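As a rough illustration of the first and last items, here is a sketch in plain Python on a local list. It is not a proposed Spark API: the (timestamp, value) record layout and the names `subsample` and `sliding_mean` are hypothetical. Both helpers assume the records are already sorted by time. A distributed version would also need to handle windows that span partition boundaries, which this sketch ignores.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple

# Hypothetical record layout: (timestamp, value)
Point = Tuple[float, float]


def subsample(points: Iterable[Point], n: int) -> Iterator[Point]:
    """Yield every n-th point of a time-ordered sequence, starting with the first."""
    if n < 1:
        raise ValueError("n must be >= 1")
    for i, p in enumerate(points):
        if i % n == 0:
            yield p


def sliding_mean(points: Iterable[Point], size: int) -> Iterator[Point]:
    """Yield (timestamp of last point, mean value) for each full window of `size` points."""
    if size < 1:
        raise ValueError("size must be >= 1")
    window = deque(maxlen=size)
    total = 0.0
    for t, v in points:
        if len(window) == size:
            # The oldest point is about to be evicted by append(); drop it from the sum.
            total -= window[0][1]
        window.append((t, v))
        total += v
        if len(window) == size:
            yield (t, total / size)


# Sort by timestamp first; both helpers rely on that ordering.
series = sorted([(3.0, 30.0), (1.0, 10.0), (2.0, 20.0), (4.0, 40.0), (5.0, 50.0)])
every_other = list(subsample(series, 2))
smoothed = list(sliding_mean(series, 3))
```

The point of the sketch is only that both operations become a single ordered pass once the data is sorted by time, which is the property a time series RDD would need to guarantee.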
Spatial data benefits from ordering and partitioning along a 2D or 3D grid. For example, path finding algorithms can be optimized by only comparing points within a set distance, which can be computed more efficiently by partitioning data into a grid.
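To make the grid idea concrete, here is a minimal 2D sketch in plain Python. Again this is not a proposed Spark API, and `build_grid`, `neighbors_within`, and the cell size are hypothetical. It assumes the search radius is no larger than the cell size, so only the 3x3 block of cells around the query point has to be scanned instead of every point.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot
from typing import Dict, List, Tuple

Point2D = Tuple[float, float]
Cell = Tuple[int, int]


def build_grid(points: List[Point2D], cell_size: float) -> Dict[Cell, List[Point2D]]:
    """Bucket points into square cells of side `cell_size`."""
    grid: Dict[Cell, List[Point2D]] = defaultdict(list)
    for x, y in points:
        grid[(floor(x / cell_size), floor(y / cell_size))].append((x, y))
    return grid


def neighbors_within(grid: Dict[Cell, List[Point2D]], cell_size: float,
                     query: Point2D, radius: float) -> List[Point2D]:
    """Return the points within `radius` of `query`, scanning only adjacent cells."""
    if radius > cell_size:
        # Assumption of this sketch: a 3x3 block of cells must cover the search circle.
        raise ValueError("radius must not exceed cell_size")
    qx, qy = query
    cx, cy = floor(qx / cell_size), floor(qy / cell_size)
    found = []
    for dx, dy in product((-1, 0, 1), repeat=2):
        # .get() avoids creating empty cells in the defaultdict.
        for px, py in grid.get((cx + dx, cy + dy), ()):
            if hypot(px - qx, py - qy) <= radius:
                found.append((px, py))
    return found


points = [(0.5, 0.5), (1.2, 0.9), (3.7, 3.9), (0.9, 1.4), (5.0, 5.0)]
grid = build_grid(points, cell_size=1.0)
near = neighbors_within(grid, 1.0, query=(1.0, 1.0), radius=0.6)
```

In a distributed setting the cell index could plausibly serve as the partitioning key, so that nearby points land in the same or adjacent partitions. That mapping is an assumption of this sketch, not something worked out here.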
Although the operations on time series and spatial data may differ, both kinds of data have ordered dimensions, so the implementations may overlap.