Posted to reviews@spark.apache.org by mccheah <gi...@git.apache.org> on 2018/11/02 20:55:20 UTC

[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...

Github user mccheah commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22547#discussion_r230505785
  
    --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaContinuousInputStream.scala ---
    @@ -46,17 +45,22 @@ import org.apache.spark.sql.types.StructType
      *                       scenarios, where some offsets after the specified initial ones can't be
      *                       properly read.
      */
    -class KafkaContinuousReadSupport(
    +class KafkaContinuousInputStream(
    --- End diff --
    
    +1 for this. A lot of the changes right now are moving the streaming code around in particular, which makes it harder to isolate just the proposed API for review.
    
    An alternative is to split this PR into separate commits: individual commits might not compile on their own because of mismatching signatures, but all of the commits taken together would compile, and each commit could be reviewed individually to assess the API first and then the implementation.
    
    For example, I'd propose 3 PRs:
    
    * Batch reading, with a commit for the interface changes and a separate commit for the implementation changes (a rough sketch of that split follows this list)
    * Micro-batch streaming read, with a commit for the interface changes and a separate commit for the implementation changes
    * Continuous streaming read, similar to the above
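    
    For concreteness, here's a very rough, hypothetical sketch of what the "interface commit vs. implementation commit" split could look like for the continuous case. The trait and method names below are purely illustrative and are not the actual signatures proposed in this PR:
    
    ```scala
    // Hypothetical commit 1: interface-only changes. On its own this commit may
    // not compile in the real PR if existing callers still reference the old
    // ReadSupport names; it exists so reviewers can assess the API in isolation.
    trait ContinuousInputStream {
      // Illustrative methods only; the real API lives in the PR diff.
      def initialOffset(): Long
      def planPartitions(startOffset: Long): Seq[Int]
    }
    
    // Hypothetical commit 2: implementation changes, migrating the Kafka source
    // onto the new interface (cf. the KafkaContinuousInputStream rename above).
    class ExampleKafkaContinuousInputStream(topic: String) extends ContinuousInputStream {
      override def initialOffset(): Long = 0L
      override def planPartitions(startOffset: Long): Seq[Int] = Seq(0)
    }
    ```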
    
    Thoughts?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org