Posted to reviews@spark.apache.org by stshruthi <gi...@git.apache.org> on 2017/11/28 19:21:36 UTC
[GitHub] spark pull request #537: SPARK-1795 - Add recursive directory file search to...
Github user stshruthi commented on a diff in the pull request:
https://github.com/apache/spark/pull/537#discussion_r153595926
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala ---
@@ -327,18 +327,18 @@ class StreamingContext private[streaming] (
    * @param directory HDFS directory to monitor for new file
    * @param filter Function to filter paths to process
    * @param newFilesOnly Should process only new files and ignore existing files in the directory
+   * @param recursive Should search through the directory recursively to find new files
    * @tparam K Key type for reading HDFS file
    * @tparam V Value type for reading HDFS file
    * @tparam F Input format for reading HDFS file
    */
   def fileStream[
-    K: ClassTag,
-    V: ClassTag,
-    F <: NewInputFormat[K, V]: ClassTag
-  ] (directory: String, filter: Path => Boolean, newFilesOnly: Boolean): InputDStream[(K, V)] = {
-    new FileInputDStream[K, V, F](this, directory, filter, newFilesOnly)
-  }
-
+    K: ClassTag,
+    V: ClassTag,
+    F <: NewInputFormat[K, V]: ClassTag
+  ] (directory: String, filter: Path => Boolean, newFilesOnly: Boolean, recursive: Boolean): DStream[(K, V)] = {
+    new FileInputDStream[K, V, F](this, directory, filter, newFilesOnly, recursive)
+  }
--- End diff --
In which version of Spark can we get the API with support for nested directory streaming?
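For context on what the proposed `recursive` flag would do, the sketch below shows recursive file discovery under a monitored directory using plain `java.nio.file`. This is a hypothetical illustration, not Spark's actual `FileInputDStream` implementation; the function name `findFiles` and its signature are invented for this example.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object RecursiveSearchSketch {
  // List files under `dir` that pass `filter`; when `recursive` is true,
  // also descend into subdirectories (as the proposed flag would).
  def findFiles(dir: Path, filter: Path => Boolean, recursive: Boolean): Seq[Path] = {
    val entries = Files.list(dir).iterator().asScala.toSeq
    val (dirs, files) = entries.partition(p => Files.isDirectory(p))
    val matched = files.filter(filter)
    if (recursive) matched ++ dirs.flatMap(d => findFiles(d, filter, recursive))
    else matched
  }
}
```

With `recursive = false` only files directly under `dir` are returned, mirroring the pre-patch behavior of `fileStream`; with `recursive = true` the whole subtree is searched.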
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org