Posted to reviews@spark.apache.org by sjbrunst <gi...@git.apache.org> on 2014/08/01 17:29:46 UTC

[GitHub] spark pull request: [STREAMING] SPARK-2788 Add location filtering ...

GitHub user sjbrunst opened a pull request:

    https://github.com/apache/spark/pull/1717

    [STREAMING] SPARK-2788 Add location filtering to Twitter streams

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sjbrunst/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1717.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1717
    
----
commit 27a02ca2b8e8168b7ad9d56730f191174efcf799
Author: Shawn Brunsting <sj...@users.noreply.github.com>
Date:   2014-07-31T18:59:09Z

    Add locations parameter to Twitter streams.

commit 9ad4a9706aae16ca0b548c1221fe11e8e8f39597
Author: Shawn Brunsting <sj...@users.noreply.github.com>
Date:   2014-07-31T19:34:05Z

    Implement location filtering for Twitter streams.

commit 0747cb29dffe8c768d7a5fd105272c0a0265c878
Author: Shawn Brunsting <sj...@users.noreply.github.com>
Date:   2014-08-01T15:12:47Z

    Add documentation for Twitter location filtering.

commit 9dcad31b0169b398bfbc59eabd93ba659843848c
Author: Shawn Brunsting <sj...@users.noreply.github.com>
Date:   2014-08-01T15:19:00Z

    Remove commented out code from previous commit.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58916787
  
    @surendramarupudi Are you using a released version of Spark or a copy of my repo? Either way, using a double[][] will not work. This change has not been pulled into Spark, and the current version of this PR uses a BoundingBox instead of a double[][]. My initial commit in this PR used a double[][], but that has been changed.



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68613002
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25017/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r16729438
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -85,9 +89,14 @@ class TwitterReceiver(
             }
           })
     
    -      val query = new FilterQuery
    -      if (filters.size > 0) {
    -        query.track(filters.toArray)
    +      if ((filters.size > 0) || (locations.size > 0)) {
    +        val query = new FilterQuery
    +        if (filters.size > 0) {
    +          query.track(filters.toArray)
    +        }
    --- End diff --
    
    Yes, text filters and locations can be added simultaneously. If both are added, then Twitter will return a mixture of tweets that satisfy either filter.
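
The "either filter" behaviour described above can be modelled locally. The following is only an illustrative sketch: the real matching happens on Twitter's servers, and `Tweet`, `Box`, and `EitherFilter` are hypothetical names that do not appear in Spark or Twitter4J.

```scala
// Illustrative local model only; Twitter performs the real matching server-side.
// Tweet, Box and EitherFilter are hypothetical names invented for this sketch.
final case class Tweet(text: String, coordinates: Option[(Double, Double)]) // (longitude, latitude)
final case class Box(west: Double, south: Double, east: Double, north: Double)

object EitherFilter {
  // True when the text contains at least one tracked term (case-insensitive).
  def matchesTrack(t: Tweet, terms: Seq[String]): Boolean =
    terms.exists(term => t.text.toLowerCase.contains(term.toLowerCase))

  // True when the tweet has coordinates that fall inside at least one box.
  def matchesLocation(t: Tweet, boxes: Seq[Box]): Boolean =
    t.coordinates.exists { case (lon, lat) =>
      boxes.exists(b => lon >= b.west && lon <= b.east && lat >= b.south && lat <= b.north)
    }

  // A tweet is delivered if it satisfies either filter, not necessarily both.
  def matches(t: Tweet, terms: Seq[String], boxes: Seq[Box]): Boolean =
    matchesTrack(t, terms) || matchesLocation(t, boxes)
}
```

Under this model, a tweet that mentions a tracked term but carries no coordinates is still delivered, as is a geotagged tweet inside a box that mentions none of the terms.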




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53218083
  
    The unit tests failed because these new functions are not binary compatible with previous versions of Spark.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev <gi...@git.apache.org>.
Github user ezhulenev commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57560702
  
    Twitter4j's Authorization has already leaked into the public API, so FilterQuery doesn't make things much worse.
    
    I personally don't like Twitter4j because of its too-Java-ish API (two-dimensional arrays) and its not very clear documentation. But adding a completely new type-based Scala facade while keeping Authorization at the same time doesn't feel right to me.
    
    Building another Twitter client doesn't sound fun either. I think FilterQuery gives enough flexibility at minimal cost, and a Scala-typed "facade" can live outside of Spark, depending on requirements (geo, lang, etc.).




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59415270
  
    @tdas I've built the facade. I tried to add the language parameter, but it failed to compile saying that the language method does not exist. Apparently it hasn't been implemented yet, even though it is in the FilterQuery documentation: http://stackoverflow.com/a/18194658




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59422438
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21814/consoleFull) for   PR 1717 at commit [`59b1194`](https://github.com/apache/spark/commit/59b119422a5bd72e7d107d9ecb3c9f14fe42435a).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57397587
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21075/consoleFull) for   PR 1717 at commit [`3e2ddd7`](https://github.com/apache/spark/commit/3e2ddd7fa5232d69be505d218f995064a00272ea).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57055491
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20918/consoleFull) for   PR 1717 at commit [`79a4870`](https://github.com/apache/spark/commit/79a48703f817f1069b01dd4726a8a72de51e9ab2).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-72789165
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26709/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-61988263
  
    Ping @tdas Have you had a chance to look at this yet?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17217079
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -42,6 +44,7 @@ class TwitterInputDStream(
         @transient ssc_ : StreamingContext,
         twitterAuth: Option[Authorization],
         filters: Seq[String],
    +    locations: Seq[Seq[Double]],
    --- End diff --
    
    Good question. It definitely is confusing. I went with ``Seq[Seq[Double]]`` because the ``FilterQuery`` created in TwitterInputDStream.scala requires a ``double[][]`` (http://twitter4j.org/javadoc/twitter4j/FilterQuery.html#locations-double:A:A-). This way the only change I have to make to the input is to change between Scala sequences and Java arrays.
    
    The ``Location`` case class you described still does not remove all ambiguity, because the ``FilterQuery`` requires the south-west corner then the north-east corner for the boundary, and that would not prevent someone from giving them in the wrong order and getting unexpected results. If we're going to define a ``case class`` anyways, I think it would be better to make something like ``case class Boundary(west: Double, south: Double, east: Double, north: Double)``. Then the locations parameter would be of type ``Seq[Boundary]``, and I can convert it to a ``double[][]`` just before passing it to the ``FilterQuery`` in TwitterInputDStream.scala. Should I go ahead and implement that?
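
A minimal sketch of the conversion proposed above, assuming the case class keeps the four named fields and that each box becomes two (longitude, latitude) points, south-west corner first. `toPoints` and `LocationFilter.toLocations` are hypothetical helper names, and the class is called `BoundingBox` here (the name later revisions of this PR use) rather than `Boundary`.

```scala
// Hypothetical sketch; only the four field names come from the discussion above.
final case class BoundingBox(west: Double, south: Double, east: Double, north: Double) {
  // South-west corner first, then north-east; each point is (longitude, latitude).
  def toPoints: Array[Array[Double]] =
    Array(Array(west, south), Array(east, north))
}

object LocationFilter {
  // Flatten several boxes into one array of points, two points per box,
  // which is the double[][] shape that FilterQuery.locations expects.
  def toLocations(boxes: Seq[BoundingBox]): Array[Array[Double]] =
    boxes.flatMap(_.toPoints).toArray
}
```

Because the fields are named, a caller writes `BoundingBox(west = -180.0, south = -90.0, east = 180.0, north = 90.0)` and the corner ordering is handled in one place instead of at every call site.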




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57164020
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20972/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68615610
  
      [Test build #25019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25019/consoleFull) for   PR 1717 at commit [`77dedcb`](https://github.com/apache/spark/commit/77dedcb2a00027c9119f8dba79f25c84d14d17c8).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56596924
  
    @sjbrunst this is again failing the binary compatibility check. I think it's the same problem as before, the one you fixed in the earlier version of the patch. Do not change the existing `TwitterUtils.createStream` methods; just add new methods to the Scala and Java APIs. That way the examples do not need to change.
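
The approach requested above can be sketched as follows. This is not the actual `TwitterUtils` code: `Ctx`, `TweetStream`, and `TwitterUtilsSketch` are hypothetical stand-ins so the example compiles without Spark or Twitter4J. The point is that adding a parameter to an existing method, even one with a default value, changes its compiled signature, whereas adding an overload leaves the old one in place.

```scala
// Hypothetical stand-ins so the sketch compiles without Spark or Twitter4J.
final class Ctx
final case class TweetStream(filters: Seq[String], locations: Seq[Seq[Double]])

object TwitterUtilsSketch {
  // Existing signature: left untouched so already-compiled callers still link.
  def createStream(ctx: Ctx, filters: Seq[String]): TweetStream =
    createStream(ctx, filters, Nil)

  // New overload: added alongside the old method, carrying the extra parameter.
  def createStream(ctx: Ctx, filters: Seq[String], locations: Seq[Seq[Double]]): TweetStream =
    TweetStream(filters, locations)
}
```

Existing call sites such as `createStream(ctx, filters)` keep compiling and linking unchanged, so neither the examples nor the existing tests need to be edited; only tests for the new overload are added.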




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r16639523
  
    --- Diff: external/twitter/src/test/java/org/apache/spark/streaming/twitter/JavaTwitterStreamSuite.java ---
    @@ -31,16 +31,19 @@
       @Test
       public void testTwitterStream() {
         String[] filters = (String[])Arrays.<String>asList("filter1", "filter2").toArray();
    +    double[][] locations = {{-180.0,-90.0},{180.0,90.0}};
         Authorization auth = NullAuthorization.getInstance();
     
         // tests the API, does not actually test data receiving
         JavaDStream<Status> test1 = TwitterUtils.createStream(ssc);
         JavaDStream<Status> test2 = TwitterUtils.createStream(ssc, filters);
    -    JavaDStream<Status> test3 = TwitterUtils.createStream(
    -      ssc, filters, StorageLevel.MEMORY_AND_DISK_SER_2());
    -    JavaDStream<Status> test4 = TwitterUtils.createStream(ssc, auth);
    -    JavaDStream<Status> test5 = TwitterUtils.createStream(ssc, auth, filters);
    -    JavaDStream<Status> test6 = TwitterUtils.createStream(ssc,
    -      auth, filters, StorageLevel.MEMORY_AND_DISK_SER_2());
    +    JavaDStream<Status> test3 = TwitterUtils.createStream(ssc, filters, locations);
    --- End diff --
    
    These tests should not have to be deleted if the signatures are not changed. Only new tests should be added to cover the new signature.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57103886
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20952/consoleFull) for   PR 1717 at commit [`ee44c29`](https://github.com/apache/spark/commit/ee44c29f4940a15d75cee37b818facc06982cd49).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev-at-pellucid <gi...@git.apache.org>.
Github user ezhulenev-at-pellucid commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57023036
  
    @sjbrunst I tested my branch with standalone Spark 1.1.0 and it works fine, even without the additional constructors, so I removed them. It looks like it's binary compatible now.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53475085
  
    @tdas Thanks for the comments! I'll work on fixing the binary compatibility, though I might not have it done until sometime next week since I'm currently on vacation.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57060929
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20920/consoleFull) for   PR 1717 at commit [`4b5b09d`](https://github.com/apache/spark/commit/4b5b09d5a70a120ebd8f9f13ea3ba77611d06b10).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57103889
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20952/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59422448
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21814/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r16639408
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -85,9 +89,14 @@ class TwitterReceiver(
             }
           })
     
    -      val query = new FilterQuery
    -      if (filters.size > 0) {
    -        query.track(filters.toArray)
    +      if ((filters.size > 0) || (locations.size > 0)) {
    +        val query = new FilterQuery
    +        if (filters.size > 0) {
    +          query.track(filters.toArray)
    +        }
    --- End diff --
    
    Just to confirm, can text filters and location filters be added simultaneously?
    





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57402782
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21073/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68616927
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25018/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57378916
  
    Something has really messed up this PR!! There are 112 changed files, with 51 commits. Please fix this.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59079944
  
    @surendramarupudi A DStream is a series of RDDs, so you can't cast a DStream to an RDD. Try using ``DStream.foreachRDD(func)`` to apply ``func`` to each RDD in the stream (see http://spark.apache.org/docs/latest/streaming-programming-guide.html#output-operations-on-dstreams). You should be able to output to MongoDB that way.
    
    Also, that question is not directly related to this pull request, so I think questions like that are best directed at the Spark User mailing list (http://apache-spark-user-list.1001560.n3.nabble.com/).




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54395588
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19720/consoleFull) for   PR 1717 at commit [`9f35379`](https://github.com/apache/spark/commit/9f35379e4e9ab122d9c985356ced613d7ae8aa67).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57056019
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20919/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57059419
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20920/consoleFull) for   PR 1717 at commit [`4b5b09d`](https://github.com/apache/spark/commit/4b5b09d5a70a120ebd8f9f13ea3ba77611d06b10).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53214500
  
    Jenkins, this is ok to test




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev <gi...@git.apache.org>.
Github user ezhulenev commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57059126
  
    @sjbrunst you need to roll back your changes in TwitterAlgebirdCMS & TwitterAlgebirdHLL (remove Nil for locations); after that the project will compile and should pass all tests. I tried it locally but didn't commit to my repo.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by surendramarupudi <gi...@git.apache.org>.
Github user surendramarupudi commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58956675
  
    @sjbrunst Thanks a lot for your clarification. I am able to get the geotagged tweets, but I found one issue: filtering based on keywords is not working. Even when I pass the filters parameter, the result contains all tweets from the bounding box I specified, instead of only the tweets from that location that match the keywords.
    I am using TwitterUtils.createStream(ssc,words,bb); where words are keywords to filter tweets and bb is bounding box array.
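
The behaviour reported above matches the doc comment proposed in this PR, which says that when locations and filters are both nonempty, a tweet matching either condition may be returned. One way to keep only tweets that satisfy both is to filter a second time on the client. The following is a minimal sketch of such a predicate, not code from the PR: `GeoTweet` is a hypothetical, simplified stand-in for `twitter4j.Status`, and `TweetFilters`, `inBox` and `matchesBoth` are illustrative names.

```scala
// Signature as reported by SparkQA for this PR.
case class BoundingBox(west: Double, south: Double, east: Double, north: Double)

// Hypothetical, simplified stand-in for twitter4j.Status.
case class GeoTweet(text: String, lon: Double, lat: Double)

object TweetFilters {
  // True when the point lies inside the box (edges included).
  // Boxes that cross the antimeridian are not handled in this sketch.
  def inBox(b: BoundingBox, lon: Double, lat: Double): Boolean =
    lon >= b.west && lon <= b.east && lat >= b.south && lat <= b.north

  // Keeps a tweet only if it contains at least one keyword (case-insensitive)
  // AND falls inside at least one bounding box.
  def matchesBoth(t: GeoTweet, words: Seq[String], boxes: Seq[BoundingBox]): Boolean = {
    val text = t.text.toLowerCase
    words.exists(w => text.contains(w.toLowerCase)) &&
      boxes.exists(b => inBox(b, t.lon, t.lat))
  }
}
```

In a streaming job, the same predicate could be applied with `DStream.filter` after extracting the text and coordinates from each status.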




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by surendramarupudi <gi...@git.apache.org>.
Github user surendramarupudi commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59057521
  
    @sjbrunst do you have any idea how to cast JavaDStream<Status> to JavaRDD<Status>? I need to cast this to save the retrieved tweets to MongoDB.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57055518
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20918/consoleFull) for   PR 1717 at commit [`79a4870`](https://github.com/apache/spark/commit/79a48703f817f1069b01dd4726a8a72de51e9ab2).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54474415
  
    Unit tests fail because my changes are not completely binary compatible yet. I'm having some trouble overloading the Scala version of the `createStream` method. See my comment in TwitterUtils.scala.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-55511388
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20285/consoleFull) for   PR 1717 at commit [`8937fc7`](https://github.com/apache/spark/commit/8937fc7406d019452f4b650f284ad47bd9401a23).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57055921
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20919/consoleFull) for   PR 1717 at commit [`1e4b204`](https://github.com/apache/spark/commit/1e4b204d87f84c93d08f1a8535d52849355c793a).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68616923
  
      [Test build #25018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25018/consoleFull) for   PR 1717 at commit [`155de3f`](https://github.com/apache/spark/commit/155de3fe0ec57330b84cb459d1aaf5ef370871fd).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57408771
  
    Aah, I was looking at the wrong docs (twitter4j.Query rather than twitter4j.FilterQuery). Looking at FilterQuery, I find that it has many functionalities (tracking, etc.), not just locations and count, and one may want to use any of the things that FilterQuery provides. It makes me wonder whether, rather than adding location and count, a better, more scalable approach is to let users pass their own filter query object. The user can set whatever filters they want in a FilterQuery object and pass it on to `TwitterUtils.createStream(filterQuery)`. This would prevent bloat in the number of combinations of `TwitterUtils.createStream`. 
    
    What do you think?  @sjbrunst @ezhulenev 




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57538997
  
    I thought about this a little bit more.
    
    @sjbrunst That is a possibility. But I don't want to add two new sets of methods (2 x (4 + 4) new methods). Also, now that I think about it, we don't really want to take on the responsibility of specifying the ordering logic of twitter4j's API. Maybe twitter4j hasn't specified the logic for certain reasons? Maybe the ordering is subject to change in the future? 
    
    As it is, we (the Spark team) think twice about exposing third-party data structures through our API, for the sake of future stability. This is because 
    (i) The API gets tied to a specific implementation. Exposing FilterQuery already sort of violates that principle. What if in the future we want to stop using Twitter4j (say the Twitter API changes and Twitter4j does not get updated)? What if we need to make a completely different implementation of twitterStream that does not use twitter4j? 
    (ii) Binary compatibility. I don't know if Twitter4j maintains binary compatibility across versions. This ties 
    
    Let me think about this a little bit more, and consult my colleagues as well. Any thoughts from you guys?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57536802
  
    @sjbrunst any thoughts?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57464410
  
    @tdas That would be a better long-term solution, as I'm sure there will be some users down the road who want to use the other parts of FilterQuery. That way they won't have to submit a PR like this every time they want to use a feature of FilterQuery that `createStream` doesn't have yet. I can implement that and push it to this PR.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57102225
  
    @ezhulenev I added it in, but I'm having trouble getting the tests to run on my own computer.
    
    Also, I noticed you only added count to the Scala API. Shouldn't it be added to the Java API too?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by surendramarupudi <gi...@git.apache.org>.
Github user surendramarupudi commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58950787
  
    @sjbrunst I am using a copy of your repo. Could you please give me an example of TwitterUtils.createStream(ssc,filters,bounding_box), showing how to use BoundingBox and how to set its bounds?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68617147
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25019/
    Test FAILed.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-55354607
  
    @sjbrunst ping! Any updates on this PR?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54392381
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19720/consoleFull) for   PR 1717 at commit [`9f35379`](https://github.com/apache/spark/commit/9f35379e4e9ab122d9c985356ced613d7ae8aa67).
     * This patch merges cleanly.




[GitHub] spark pull request: [STREAMING] SPARK-2788 Add location filtering ...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-50898276
  
    Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56578914
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20722/consoleFull) for   PR 1717 at commit [`40e78d5`](https://github.com/apache/spark/commit/40e78d505eaf53e34fd2a14e44fad7bc04efbb5c).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54694528
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-55355516
  
    I have the new case class written, I just haven't tested it with an actual stream yet. It should be ready sometime tomorrow or Saturday.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-96770284
  
    Can one of the admins verify this patch?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17972016
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -33,15 +33,38 @@ object TwitterUtils {
        *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        *        twitter4j.oauth.accessTokenSecret
        * @param filters Set of filter strings to get only those tweets that match them
    +   * @param locations   Bounding boxes to get only geotagged tweets within them. Example: 
    +            Seq(BoundingBox(-180.0,-90.0,180.0,90.0)) gives any geotagged tweet. If locations and
    +            filters are both nonempty, then any tweet matching either condition may be returned.
        * @param storageLevel Storage level to use for storing the received objects
        */
       def createStream(
           ssc: StreamingContext,
           twitterAuth: Option[Authorization],
           filters: Seq[String] = Nil,
    +      locations: Seq[BoundingBox] = Nil,
    --- End diff --
    
    It looks like I'm changing the method here, but this whole method is new. The original one is below.
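
The doc comment above uses `Seq(BoundingBox(-180.0,-90.0,180.0,90.0))`, while the earlier revision of this diff used `Seq(Seq(-180.0,-90.0),Seq(180.0,90.0))`. The sketch below shows one way the two forms could relate: each box is expanded into a southwest point and a northeast point, each given as longitude then latitude, which is the order the Twitter streaming API expects for its locations parameter. `BoundingBoxes`, `toLocations` and `world` are hypothetical names, and the mapping is an assumption for illustration, not the PR's actual conversion code.

```scala
// Signature as reported by SparkQA for this PR.
case class BoundingBox(west: Double, south: Double, east: Double, north: Double)

object BoundingBoxes {
  // Hypothetical helper: each box becomes two [longitude, latitude] points,
  // southwest corner first, then northeast corner.
  def toLocations(boxes: Seq[BoundingBox]): Array[Array[Double]] =
    boxes.flatMap(b => Seq(Array(b.west, b.south), Array(b.east, b.north))).toArray

  // The whole globe, which per the doc comment matches any geotagged tweet.
  val world: BoundingBox = BoundingBox(-180.0, -90.0, 180.0, 90.0)
}
```

Calling `BoundingBoxes.toLocations(Seq(BoundingBoxes.world))` yields two points, (-180.0, -90.0) and (180.0, 90.0), the same pair used in the earlier `Seq[Seq[Double]]` example.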




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17205863
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -33,15 +33,20 @@ object TwitterUtils {
        *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        *        twitter4j.oauth.accessTokenSecret
        * @param filters Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: Seq(Seq(-180.0,-90.0),Seq(180.0,90.0))
    +            gives any geotagged tweet. If locations and filters are both nonempty, then any tweet
    +            matching either condition may be returned.
        * @param storageLevel Storage level to use for storing the received objects
        */
       def createStream(
           ssc: StreamingContext,
           twitterAuth: Option[Authorization],
           filters: Seq[String] = Nil,
    +      locations: Seq[Seq[Double]] = Nil,
    --- End diff --
    
    I didn't get you: why are you adding this method? This does not have the locations. Instead, isn't it better to keep the existing function as is and add 
    ```
     def createStream(
          ssc: StreamingContext,
          twitterAuth: Option[Authorization],
          filters: Seq[String],
          locations: Seq[String],
          storageLevel: StorageLevel
        ): ReceiverInputDStream[Status] = {
        createStream(ssc, twitterAuth, filters, Nil, storageLevel)
      }
    ```
    
    Also worth remembering that Scala does not allow two overloaded functions to both have default params. That would give compiler errors.
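
The restriction on default parameters can be illustrated without Spark. This is a minimal sketch under stated assumptions: `StreamFactory` is a hypothetical stand-in for `TwitterUtils`, the storage level is a plain `String`, and the return value only records which arguments were received. Only the first alternative declares default arguments; the new overload takes every argument explicitly, so existing call sites keep compiling against the old signature.

```scala
// Hypothetical stand-in for TwitterUtils; the String result only records
// the arguments that reached the final overload.
object StreamFactory {
  // Original signature, kept so existing callers continue to work.
  // This is the only alternative that declares default arguments.
  def createStream(
      filters: Seq[String] = Nil,
      storageLevel: String = "MEMORY_AND_DISK_SER_2"): String =
    createStream(filters, Nil, storageLevel)

  // New overload with locations: no defaults, every argument is explicit.
  def createStream(
      filters: Seq[String],
      locations: Seq[Seq[Double]],
      storageLevel: String): String =
    s"filters=${filters.size}, locations=${locations.size}, level=$storageLevel"
}
```

If the second `createStream` also declared defaults, the compiler would reject the object with an error along the lines of "multiple overloaded alternatives of method createStream define default arguments".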




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17202369
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -42,9 +44,19 @@ class TwitterInputDStream(
         @transient ssc_ : StreamingContext,
         twitterAuth: Option[Authorization],
         filters: Seq[String],
    +    locations: Seq[Seq[Double]],
         storageLevel: StorageLevel
       ) extends ReceiverInputDStream[Status](ssc_)  {
     
    +  def this(
    --- End diff --
    
    Actually, there is no need to create two constructors. Since this is a non-public class internal to Spark, we don't need to maintain binary compatibility. So one common constructor is fine.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-72783897
  
      [Test build #26709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26709/consoleFull) for   PR 1717 at commit [`250407e`](https://github.com/apache/spark/commit/250407e57ee49779147194533e244f54e3e59a37).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17210353
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -42,9 +44,19 @@ class TwitterInputDStream(
         @transient ssc_ : StreamingContext,
         twitterAuth: Option[Authorization],
         filters: Seq[String],
    +    locations: Seq[Seq[Double]],
         storageLevel: StorageLevel
       ) extends ReceiverInputDStream[Status](ssc_)  {
     
    +  def this(
    --- End diff --
    
    Sounds good. I'll take that out.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57102315
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20952/consoleFull) for   PR 1717 at commit [`ee44c29`](https://github.com/apache/spark/commit/ee44c29f4940a15d75cee37b818facc06982cd49).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53475387
  
    That's cool.
    
    
    On Tue, Aug 26, 2014 at 12:27 PM, Shawn Brunsting <no...@github.com>
    wrote:
    
    > @tdas <https://github.com/tdas> Thanks for the comments! I'll work on
    > fixing the binary compatibility, though I might not have it done until
    > sometime next week since I'm currently on vacation.
    >
    > —
    > Reply to this email directly or view it on GitHub
    > <https://github.com/apache/spark/pull/1717#issuecomment-53475085>.
    >




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17397617
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -42,6 +44,7 @@ class TwitterInputDStream(
         @transient ssc_ : StreamingContext,
         twitterAuth: Option[Authorization],
         filters: Seq[String],
    +    locations: Seq[Seq[Double]],
    --- End diff --
    
    Yes, that makes sense! Please go ahead and do so. Can you make the order of directions the same as the order expected by the twitter4j API?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56578032
  
    Jenkins, test this please.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53217871
  
    @sjbrunst This is a great addition! Thanks for the effort. However, from the patch, I can see that this changes the signature of a few methods, which required the examples to be changed. This is not desirable, as we want to maintain binary compatibility as much as possible across different Spark versions. So I strongly suggest that the existing methods in TwitterUtils not be touched and that new methods with the new location parameter be added. 




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54737322
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19938/consoleFull) for   PR 1717 at commit [`1e88a04`](https://github.com/apache/spark/commit/1e88a043f2efd3e2af89875e4f9ea19e3f15facf).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-55512517
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20285/consoleFull) for   PR 1717 at commit [`8937fc7`](https://github.com/apache/spark/commit/8937fc7406d019452f4b650f284ad47bd9401a23).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`
      * `class RatingDeserializer(FramedSerializer):`
      * `  class Encoder[T <: NativeType](columnType: NativeColumnType[T]) extends compression.Encoder[T] `
      * `  class Encoder[T <: NativeType](columnType: NativeColumnType[T]) extends compression.Encoder[T] `
      * `  class Encoder[T <: NativeType](columnType: NativeColumnType[T]) extends compression.Encoder[T] `
      * `  class Encoder extends compression.Encoder[IntegerType.type] `
      * `  class Decoder(buffer: ByteBuffer, columnType: NativeColumnType[IntegerType.type])`
      * `  class Encoder extends compression.Encoder[LongType.type] `
      * `  class Decoder(buffer: ByteBuffer, columnType: NativeColumnType[LongType.type])`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-55364717
  
    Okie dokie!




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57552179
  
    @tdas The way I understand it, we have two options here. The first is that we allow users to pass in their own FilterQuery object. That will minimize the number of methods we need, but comes with the problems you just described.
    
    The second option is to build the FilterQuery as it is in this PR, but that will require many methods for all the possible parameters one could want. I have an application that would benefit from the location parameter, which inspired this PR. I'm sure @ezhulenev has a use for the count parameter. If we go with this second option, I think it would be best to have one PR that adds parameters for everything that FilterQuery takes, so if anyone wants to use any of the other features (such as language or follow), they won't have to submit another PR that needs extra methods to keep binary compatibility the way this one does.
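
The two options described above can be contrasted in a small sketch. Everything here is hypothetical: `Query` only mimics the role of twitter4j's FilterQuery, and the methods return the query that would be handed to the receiver rather than a real stream.

```scala
// Hypothetical stand-ins; `Query` only mimics the role of twitter4j's FilterQuery.
final case class Query(
    track: Seq[String] = Nil,
    locations: Seq[Seq[Double]] = Nil,
    count: Int = 0)

object ApiShapes {
  // Option 1: the caller builds the query, so one method covers every field.
  def createStream(query: Query): Query = query

  // Option 2: the library builds the query. Each new field needs a new
  // parameter, and therefore a new overload to stay binary compatible.
  def createStream(track: Seq[String]): Query = Query(track = track)
  def createStream(track: Seq[String], locations: Seq[Seq[Double]]): Query =
    Query(track = track, locations = locations)
}
```

Under option 1, a later request for `count` or any other field needs no new method. Under option 2, it would need yet another overload.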




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r16639413
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -33,15 +33,20 @@ object TwitterUtils {
        *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        *        twitter4j.oauth.accessTokenSecret
        * @param filters Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: Seq(Seq(-180.0,-90.0),Seq(180.0,90.0))
    +            gives any geotagged tweet. If locations and filters are both nonempty, then any tweet
    +            matching either condition may be returned.
        * @param storageLevel Storage level to use for storing the received objects
        */
       def createStream(
           ssc: StreamingContext,
           twitterAuth: Option[Authorization],
           filters: Seq[String] = Nil,
    +      locations: Seq[Seq[Double]] = Nil,
    --- End diff --
    
    This changes the method signature, so it is probably not binary compatible. Please add a new method and leave this method untouched. 




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54736339
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19938/consoleFull) for   PR 1717 at commit [`1e88a04`](https://github.com/apache/spark/commit/1e88a043f2efd3e2af89875e4f9ea19e3f15facf).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by surendramarupudi <gi...@git.apache.org>.
Github user surendramarupudi commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58811881
  
    Hi, I am trying to use this new location feature via TwitterUtils.createStream(ssc,filters,double[][]). The filters are working, but filtering by location is not working. 
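
One thing worth checking, going by the `@param locations` doc comment in this PR: when filters and locations are both non-empty, a tweet matching either condition may be returned, so keyword matches from outside the box are expected. The following is a minimal sketch of narrowing the results on the client side. `Tweet` and its fields are hypothetical stand-ins, not the twitter4j `Status` API, and this may not be the cause of the problem reported above.

```scala
// Hypothetical stand-in for a received tweet; not twitter4j's Status.
final case class Tweet(text: String, lonLat: Option[(Double, Double)])

object ClientSideFilter {
  // True when the tweet is geotagged inside the box given by its
  // south-west and north-east corners (longitude first, then latitude).
  def inBox(t: Tweet, sw: (Double, Double), ne: (Double, Double)): Boolean =
    t.lonLat.exists { case (lon, lat) =>
      lon >= sw._1 && lon <= ne._1 && lat >= sw._2 && lat <= ne._2
    }

  // Keep only tweets that satisfy BOTH the keyword and the location condition.
  def both(
      tweets: Seq[Tweet],
      keyword: String,
      sw: (Double, Double),
      ne: (Double, Double)): Seq[Tweet] =
    tweets.filter { t =>
      t.text.toLowerCase.contains(keyword.toLowerCase) && inBox(t, sw, ne)
    }
}
```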




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68613000
  
      [Test build #25017 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25017/consoleFull) for   PR 1717 at commit [`887cf07`](https://github.com/apache/spark/commit/887cf073a0e120bafd5c488761b036784c4e4284).
     * This patch **fails to build**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56588183
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20722/consoleFull) for   PR 1717 at commit [`40e78d5`](https://github.com/apache/spark/commit/40e78d505eaf53e34fd2a14e44fad7bc04efbb5c).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53175414
  
    It looks like this and #2098 are both trying to add geolocation filters to TwitterStream.
    
    /cc @tdas for review.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-72789158
  
      [Test build #26709 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26709/consoleFull) for   PR 1717 at commit [`250407e`](https://github.com/apache/spark/commit/250407e57ee49779147194533e244f54e3e59a37).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53217956
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19129/consoleFull) for   PR 1717 at commit [`9dcad31`](https://github.com/apache/spark/commit/9dcad31b0169b398bfbc59eabd93ba659843848c).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-72748803
  
      [Test build #26676 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26676/consoleFull) for   PR 1717 at commit [`6919b0d`](https://github.com/apache/spark/commit/6919b0d5a03b83cdead4c4814a0b4e964aa68d08).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-54735824
  
    Jenkins, this is ok to test.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev <gi...@git.apache.org>.
Github user ezhulenev commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57071591
  
    @sjbrunst aargh, TwitterStreamSuite.scala:53 required to add count parameter




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57164005
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20972/consoleFull) for   PR 1717 at commit [`a2a03ba`](https://github.com/apache/spark/commit/a2a03ba4a804ae6fc4eb31d75a2b8db32dbffd50).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-72739435
  
      [Test build #26676 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/26676/consoleFull) for   PR 1717 at commit [`6919b0d`](https://github.com/apache/spark/commit/6919b0d5a03b83cdead4c4814a0b4e964aa68d08).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-55511357
  
    @tdas It's ready for another look! I added a BoundingBox class that can be used to pass in the coordinates, which should be much more intuitive.
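
For readers following along, here is a minimal sketch of how such a class might map onto the nested-coordinate form used earlier in this thread. The field order is copied from the class signature reported by SparkQA. The `toCorners` helper is hypothetical (not part of the PR) and assumes the longitude-first, south-west-corner-first order shown in the PR's doc comment example.

```scala
// Field order copied from the class signature reported by SparkQA in this thread.
final case class BoundingBox(west: Double, south: Double, east: Double, north: Double) {
  // Hypothetical helper (not part of the PR): the two corners as
  // (longitude, latitude) pairs, south-west first.
  def toCorners: Array[Array[Double]] =
    Array(Array(west, south), Array(east, north))
}

object BoundingBoxDemo {
  // The whole world, equivalent to Seq(Seq(-180.0, -90.0), Seq(180.0, 90.0)).
  val world: BoundingBox =
    BoundingBox(west = -180.0, south = -90.0, east = 180.0, north = 90.0)
}
```

Named fields make it harder to swap latitude and longitude by accident than with a bare `Seq[Seq[Double]]`.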




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst closed the pull request at:

    https://github.com/apache/spark/pull/1717




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57403344
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21075/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56673018
  
    @tdas The current version of TwitterUtils.scala only has new methods. The diff makes it look like I changed the original methods, but they are all there. The original unit tests from the StreamSuites pass, so I don't know why we're still getting the binary compatibility error.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17212834
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterInputDStream.scala ---
    @@ -42,6 +44,7 @@ class TwitterInputDStream(
         @transient ssc_ : StreamingContext,
         twitterAuth: Option[Authorization],
         filters: Seq[String],
    +    locations: Seq[Seq[Double]],
    --- End diff --
    
    I should have caught and commented on this earlier, but why is this `Seq[Seq[Double]]` and not `Seq[(Double, Double)]`? It's not as if a location will ever be a sequence of more than two doubles, so having a `Seq[Double]` for latitude and longitude is pretty confusing. In fact, `(Double, Double)` is still confusing, as it is not obvious which one is latitude and which one is longitude. Hence, I think it's best to define a `case class Location(latitude: Double, longitude: Double)` (within the `org.apache.spark.streaming.twitter` package) and use that. This should be the most intuitive and least ambiguous option.
    
    What do you think?
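
A minimal sketch of the named-coordinate idea suggested above, assuming the longitude-first point order used in the PR's own `Seq(Seq(-180.0,-90.0),Seq(180.0,90.0))` example. The range checks and the helpers `toLonLat` and `toBoundingBox` are hypothetical additions for illustration, not part of the PR.

```scala
// Hypothetical sketch: a named coordinate type instead of a bare Seq[Double].
case class Location(latitude: Double, longitude: Double) {
  require(latitude >= -90.0 && latitude <= 90.0, s"latitude out of range: $latitude")
  require(longitude >= -180.0 && longitude <= 180.0, s"longitude out of range: $longitude")

  // Assumption: points are emitted longitude first, as in the PR's scaladoc example.
  def toLonLat: Array[Double] = Array(longitude, latitude)
}

object Location {
  // Turn a (south-west, north-east) pair of corners into the nested-array form.
  def toBoundingBox(southWest: Location, northEast: Location): Array[Array[Double]] =
    Array(southWest.toLonLat, northEast.toLonLat)
}
```

With named fields, a call site such as `Location(latitude = 40.0, longitude = -74.0)` cannot silently swap the two values, which is the ambiguity the comment is worried about.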




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59427005
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21817/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57056016
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20919/consoleFull) for   PR 1717 at commit [`1e4b204`](https://github.com/apache/spark/commit/1e4b204d87f84c93d08f1a8535d52849355c793a).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58447539
  
    @tdas Have you had any more thoughts on this? Which direction should we take here?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58725605
  
    Sorry for taking so long to get back on this. I agree with @ezhulenev that the Twitter4J Authorization has already leaked into public API, so we are essentially tied with twitter4j for as long as twitter4j survives. :)
    
    I spoke to @JoshRosen and pwendell, and they agreed that there are two possible options that are equally okay (especially since we are tied to Authorization anyway).
    1. Either take the twitter4j object directly. Low overhead, and an API as nice as twitter4j's FilterQuery, which isn't very good.
    2. Or build a facade. Here we take on all the risk if twitter4j changes subtly, and it is hard to test all the functionality that twitter4j's FilterQuery provides. However, it makes the API much nicer.
    
    Another new aspect that I realized is that we want to expose the twitter stream functionality through the Python API (https://github.com/apache/spark/pull/2538). That will definitely require a facade (maybe Python-only) on FilterQuery, causing the same issues as option 2.
    
    So I propose actually building that facade. What say?
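
A rough sketch of what such a facade might look like, under stated assumptions. `BoundingBox` mirrors the case class reported by SparkQA for this PR; `TwitterFilter`, `trackArray`, and `locationsArray` are hypothetical names. The snippet deliberately stops at plain arrays rather than calling twitter4j, so whether FilterQuery accepts exactly this layout should be checked against its documentation.

```scala
// Mirrors the BoundingBox case class added by this PR.
case class BoundingBox(west: Double, south: Double, east: Double, north: Double)

// Hypothetical facade over the filter options discussed in this thread.
case class TwitterFilter(
    keywords: Seq[String] = Nil,
    locations: Seq[BoundingBox] = Nil,
    count: Int = 0) {

  // Keywords as a plain array, the shape a Java-style API would expect.
  def trackArray: Array[String] = keywords.toArray

  // Each box becomes two (longitude, latitude) points: south-west, then north-east.
  def locationsArray: Array[Array[Double]] =
    locations.flatMap(b => Seq(Array(b.west, b.south), Array(b.east, b.north))).toArray
}
```

The facade keeps the user-facing API in named fields and confines the positional array layout to one conversion method, which is also the part most exposed to the twitter4j changes mentioned in option 2.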




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57402777
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21073/consoleFull) for   PR 1717 at commit [`bf201ff`](https://github.com/apache/spark/commit/bf201ff3c54929f00e02acafd6fefe371d4d2718).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57508615
  
    @sjbrunst It seems that @ezhulenev has submitted #2618 to add the FilterQuery-based stream. I think I prefer that over adding this explicit location support. I know that you have put a fair bit of work into this PR, but would you mind if we switch gears to the FilterQuery-based approach?




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17091665
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -33,15 +33,20 @@ object TwitterUtils {
        *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        *        twitter4j.oauth.accessTokenSecret
        * @param filters Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: Seq(Seq(-180.0,-90.0),Seq(180.0,90.0))
    +            gives any geotagged tweet. If locations and filters are both nonempty, then any tweet
    +            matching either condition may be returned.
        * @param storageLevel Storage level to use for storing the received objects
        */
       def createStream(
           ssc: StreamingContext,
           twitterAuth: Option[Authorization],
           filters: Seq[String] = Nil,
    +      locations: Seq[Seq[Double]] = Nil,
    --- End diff --
    
    I'm having a bit of trouble here. I tried adding the following code to ensure binary compatibility, but it doesn't compile:
    ```scala
      def createStream(
          ssc: StreamingContext,
          twitterAuth: Option[Authorization],
          filters: Seq[String] = Nil,
          storageLevel: StorageLevel = StorageLevel.MEMORY_AND_DISK_SER_2
        ): ReceiverInputDStream[Status] = {
        createStream(ssc, twitterAuth, filters, Nil, storageLevel)
      }
    ```
    
    When I include that, I get the following error:
    ```
    [error] /home/shawn/Research/git-spark/spark/external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala:70: overloaded method value createStream with alternatives:
    [error]   (jssc: org.apache.spark.streaming.api.java.JavaStreamingContext,twitterAuth: twitter4j.auth.Authorization)org.apache.spark.streaming.api.java.JavaReceiverInputDStream[twitter4j.Status] <and>
    [error]   (jssc: org.apache.spark.streaming.api.java.JavaStreamingContext,filters: Array[String])org.apache.spark.streaming.api.java.JavaReceiverInputDStream[twitter4j.Status] <and>
    [error]   (ssc: org.apache.spark.streaming.StreamingContext,twitterAuth: Option[twitter4j.auth.Authorization],filters: Seq[String],storageLevel: org.apache.spark.storage.StorageLevel)org.apache.spark.streaming.dstream.ReceiverInputDStream[twitter4j.Status] <and>
    [error]   (ssc: org.apache.spark.streaming.StreamingContext,twitterAuth: Option[twitter4j.auth.Authorization],filters: Seq[String],locations: Seq[Seq[Double]],storageLevel: org.apache.spark.storage.StorageLevel)org.apache.spark.streaming.dstream.ReceiverInputDStream[twitter4j.Status]
    [error]  cannot be applied to (org.apache.spark.streaming.StreamingContext, None.type)
    [error]     createStream(jssc.ssc, None)
    ```
    And similar errors for three more of the Java functions. @tdas Do you have any suggestions for how I can get this working?
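
One plausible reading of the compiler output above is that both overloads carry default arguments, so a two-argument call such as `createStream(jssc.ssc, None)` no longer selects a single alternative. Below is a minimal sketch of one way around that: keep the defaults on the original signature only and make every parameter of the new overload explicit. `Ctx`, `Auth`, `StreamFactory`, and the `String` storage level are stand-ins invented for illustration so the snippet compiles without Spark or twitter4j; whether this layout also satisfies Spark's binary compatibility check is not verified here.

```scala
// Stand-in types so the sketch compiles without Spark or twitter4j on the classpath.
final case class Ctx(name: String)
final case class Auth(token: String)

object StreamFactory {
  // Original signature, kept unchanged (defaults included) so existing callers still resolve.
  def createStream(
      ctx: Ctx,
      auth: Option[Auth],
      filters: Seq[String] = Nil,
      storageLevel: String = "MEMORY_AND_DISK_SER_2"): String =
    createStream(ctx, auth, filters, Nil, storageLevel)

  // New overload: every parameter is explicit, so it never competes with the
  // original when a caller relies on default arguments.
  def createStream(
      ctx: Ctx,
      auth: Option[Auth],
      filters: Seq[String],
      locations: Seq[Seq[Double]],
      storageLevel: String): String =
    s"${ctx.name}: filters=${filters.size}, locations=${locations.size}, level=$storageLevel"
}
```

A short call like `StreamFactory.createStream(Ctx("a"), None)` then matches only the original method, while callers who want locations pass all five arguments.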




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-72748809
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26676/




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r16639433
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -115,17 +148,45 @@ object TwitterUtils {
     
       /**
        * Create a input stream that returns tweets received from Twitter.
    +   * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
    +   * @param jssc        JavaStreamingContext object
    +   * @param twitterAuth Twitter4J Authorization
    +   * @param filters     Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: {{-180.0,-90.0},{180.0,90.0}} gives any
    +            geotagged tweet. If locations and filters are both nonempty, then any tweet matching
    +            either condition may be returned.
    +   */
    +  def createStream(
    +      jssc: JavaStreamingContext,
    +      twitterAuth: Authorization,
    +      filters: Array[String],
    +      locations: Array[Array[Double]]
    +    ): JavaReceiverInputDStream[Status] = {
    +    // Scala implicitly converts Array[T] to Seq[T], but not Array[Array[T]] to Seq[Seq[T]]
    +    createStream(jssc.ssc, Some(twitterAuth), filters, locations.map(_.toList))
    +  }
    +
    +
    +  /**
    +   * Create a input stream that returns tweets received from Twitter.
    --- End diff --
    
    Rather than changing this signature and adding another one (the one above), it is probably better (in terms of binary compatibility) to add a single new method, that is:
    ```
    def createStream(
          jssc: JavaStreamingContext,
          twitterAuth: Authorization,
          filters: Array[String],
          locations: Array[Array[Double]],
          storageLevel: StorageLevel
        ): JavaReceiverInputDStream[Status] = {
    ```
    
    Same applies to the Scala API. 
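
The diff above notes that Scala converts `Array[T]` to `Seq[T]` implicitly but does not do the same for nested arrays. A small sketch of the explicit conversion a Java-facing method with a `locations: Array[Array[Double]]` parameter would need; `LocationArgs` and `toSeqs` are hypothetical names, and the two-coordinates-per-point check is an added assumption rather than something the PR does.

```scala
object LocationArgs {
  // Convert the Java-friendly nested array into the Seq[Seq[Double]] used on the Scala side.
  def toSeqs(locations: Array[Array[Double]]): Seq[Seq[Double]] = {
    // Assumption: each inner array is one (longitude, latitude) point.
    require(locations.forall(_.length == 2), "each point must be a (longitude, latitude) pair")
    // Convert the inner arrays first, then the outer one.
    locations.map(_.toList).toList
  }
}
```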




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev <gi...@git.apache.org>.
Github user ezhulenev commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57405752
  
    @tdas `count` sets the number of past status updates that will be added to the stream before it switches to "live" mode.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57536861
  
    @ezhulenev @tdas The FilterQuery approach should definitely go in, since that would give users the most control. But @ezhulenev makes a good point in that the BoundingBox makes more sense than a 2D array, especially since FilterQuery's documentation is not very clear on how to put coordinates into that 2D array.
    
    What about a hybrid approach where we submit #2618, but also submit this one (without the count parameter) for users who only want location or text filters? That way they could figure out how to pass in their location parameters without having to go to the unclear FilterQuery documentation.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-53214781
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19129/consoleFull) for   PR 1717 at commit [`9dcad31`](https://github.com/apache/spark/commit/9dcad31b0169b398bfbc59eabd93ba659843848c).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56394656
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20655/consoleFull) for   PR 1717 at commit [`40e78d5`](https://github.com/apache/spark/commit/40e78d505eaf53e34fd2a14e44fad7bc04efbb5c).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68612972
  
      [Test build #25017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25017/consoleFull) for   PR 1717 at commit [`887cf07`](https://github.com/apache/spark/commit/887cf073a0e120bafd5c488761b036784c4e4284).
     * This patch **does not merge cleanly**.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68615337
  
      [Test build #25018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25018/consoleFull) for   PR 1717 at commit [`155de3f`](https://github.com/apache/spark/commit/155de3fe0ec57330b84cb459d1aaf5ef370871fd).
     * This patch merges cleanly.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57396987
  
    @tdas Should be fixed now.
    
    Should @ezhulenev 's changes be part of this PR? They're not part of location filtering, but they require similar changes.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57536764
  
    That is an additional and optional thing, beyond the functionality of creating streams with a FilterQuery object. Defining an additional layer that has to replicate and expose all the FilterQuery methods, just to make the bounding box more convenient, seems like overkill. If we use implicit conversions in Scala to do it, then it's only applicable to Scala, not Java (and could create more problems for Python in the future).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58955712
  
    @surendramarupudi When you create a BoundingBox, you give it longitudes and latitudes (in the order west, south, east, north) to define a rectangular area, and any tweet inside that rectangle may be returned. Here are 3 examples:
    * If you want to request any geotagged tweet, your location parameter would be ``Seq(BoundingBox(-180.0,-90.0,180.0,90.0))``
    * If you want to request geotagged tweets from New York City, your location parameter would be ``Seq(BoundingBox(-74.0,40.0,-73.0,41.0))``
    * If you want to request geotagged tweets from either New York City or San Francisco, your location parameter would be ``Seq(BoundingBox(-74.0,40.0,-73.0,41.0),BoundingBox(-122.75,36.8,-121.75,37.8))``
    
    You should also be aware that if you use both the filters and the location parameters, Twitter will return tweets that match one of the keyword filters OR one of the locations. Not all tweets will satisfy both the filters and locations.
    
    Note that the current version also has a count parameter, so you would need to call ``TwitterUtils.createStream(ssc,None,filters,0,locations)``.
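
The OR behaviour described above can be illustrated with a small local model. This is only a sketch of the semantics, not of how Twitter matches tweets: `Tweet`, `FilterSemantics`, and `contains` are hypothetical, keyword matching is reduced to a case-insensitive substring test, and boxes that cross the 180° meridian are not handled.

```scala
// Same shape as the BoundingBox case class added by this PR, plus a hypothetical helper.
case class BoundingBox(west: Double, south: Double, east: Double, north: Double) {
  // True when the point lies inside the rectangle (edges included).
  def contains(longitude: Double, latitude: Double): Boolean =
    longitude >= west && longitude <= east && latitude >= south && latitude <= north
}

// Hypothetical minimal tweet model: text plus an optional (longitude, latitude) geotag.
case class Tweet(text: String, geo: Option[(Double, Double)])

object FilterSemantics {
  // A keyword hit or a location hit is enough; a tweet need not satisfy both.
  def matches(tweet: Tweet, keywords: Seq[String], boxes: Seq[BoundingBox]): Boolean = {
    val keywordHit = keywords.exists(k => tweet.text.toLowerCase.contains(k.toLowerCase))
    val locationHit = tweet.geo.exists { case (lon, lat) => boxes.exists(_.contains(lon, lat)) }
    keywordHit || locationHit
  }
}
```

Users who need tweets that satisfy both conditions would have to apply the second condition themselves on the resulting stream, for example with a `filter` on the DStream.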




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57056007
  
    @ezhulenev I've pulled in your changes and fixed a small scalastyle error.
    
    I agree that we should avoid having too many methods for all the parameter combinations, but I don't have any experience with making factories so I don't know how we would implement that.




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57403336
  
      [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21075/consoleFull) for   PR 1717 at commit [`3e2ddd7`](https://github.com/apache/spark/commit/3e2ddd7fa5232d69be505d218f995064a00272ea).
     * This patch **fails** unit tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`





[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57059353
  
    @ezhulenev I've rolled back those changes now. Thanks!




[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r16639417
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -75,16 +80,44 @@ object TwitterUtils {
        * OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
        * twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        * twitter4j.oauth.accessTokenSecret.
    +   * Storage level of the data will be the default StorageLevel.MEMORY_AND_DISK_SER_2.
    +   * @param jssc      JavaStreamingContext object
    +   * @param filters   Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: {{-180.0,-90.0},{180.0,90.0}} gives any
    +            geotagged tweet. If locations and filters are both nonempty, then any tweet matching
    +            either condition may be returned.
    +   */
    +  def createStream(
    +      jssc: JavaStreamingContext,
    +      filters: Array[String],
    +      locations: Array[Array[Double]]
    +    ): JavaReceiverInputDStream[Status] = {
    +    // Scala implicitly converts Array[T] to Seq[T], but not Array[Array[T]] to Seq[Seq[T]]
    +    createStream(jssc.ssc, None, filters, locations.map(_.toList))
    +  }
    +
    +  /**
    +   * Create a input stream that returns tweets received from Twitter using Twitter4J's default
    +   * OAuth authentication; this requires the system properties twitter4j.oauth.consumerKey,
    +   * twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
    +   * twitter4j.oauth.accessTokenSecret.
        * @param jssc         JavaStreamingContext object
        * @param filters      Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: {{-180.0,-90.0},{180.0,90.0}} gives any
    +            geotagged tweet. If locations and filters are both nonempty, then any tweet matching
    +            either condition may be returned.
        * @param storageLevel Storage level to use for storing the received objects
        */
       def createStream(
           jssc: JavaStreamingContext,
           filters: Array[String],
    +      locations: Array[Array[Double]],
    --- End diff --
    
    Same comment as above.
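
The code comment in the diff above notes that Scala converts Array[T] to Seq[T] implicitly but does not do the same for nested arrays. A minimal sketch of that conversion follows; the object and method names are hypothetical stand-ins and are not part of TwitterUtils.

```scala
object NestedArrayConversion {
  // Hypothetical stand-in for a Scala-facing method that expects nested Seqs.
  def describe(locations: Seq[Seq[Double]]): String =
    locations.map(_.mkString("(", ", ", ")")).mkString(" -> ")

  // Hypothetical stand-in for a Java-facing overload that receives nested arrays.
  // Mapping each inner array to a List yields Array[List[Double]]; the outer
  // array is then implicitly viewed as a Seq, which satisfies Seq[Seq[Double]].
  def describeFromJava(locations: Array[Array[Double]]): String =
    describe(locations.map(_.toList))

  // Corner points from the scaladoc example: south-west first, then north-east.
  val wholeWorld: Array[Array[Double]] =
    Array(Array(-180.0, -90.0), Array(180.0, 90.0))
}
```

Calling NestedArrayConversion.describeFromJava(NestedArrayConversion.wholeWorld) returns "(-180.0, -90.0) -> (180.0, 90.0)".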


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57397139
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21073/consoleFull) for   PR 1717 at commit [`bf201ff`](https://github.com/apache/spark/commit/bf201ff3c54929f00e02acafd6fefe371d4d2718).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev <gi...@git.apache.org>.
Github user ezhulenev commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57515222
  
    Honestly, I would prefer passing a BoundingBox as the geolocation restriction instead of a two-dimensional array, but passing a Twitter4j object looks more natural to me, especially since Twitter4j's authorization is already part of the public API.
    
    I think it would be better to create a layer on top of FilterQuery, so that restrictions can be defined as bounding boxes and (perhaps implicitly) transformed into a FilterQuery object later.
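
One way to picture the bounding-box layer suggested above is a small case class that knows how to emit the nested-array form used for location filters. This is only a sketch under assumptions: the field order, the range checks, and the toCorners and toLocations helpers are hypothetical, and nothing here calls Twitter4j.

```scala
// Hypothetical helper type; longitudes and latitudes are in degrees.
final case class BoundingBox(west: Double, south: Double, east: Double, north: Double) {
  require(west >= -180.0 && east <= 180.0, "longitude must lie in [-180, 180]")
  require(south >= -90.0 && north <= 90.0, "latitude must lie in [-90, 90]")

  // South-west corner first, then north-east, each as (longitude, latitude).
  def toCorners: Array[Array[Double]] =
    Array(Array(west, south), Array(east, north))
}

object BoundingBox {
  // Covers the whole globe, matching the example in the scaladoc.
  val world: BoundingBox = BoundingBox(-180.0, -90.0, 180.0, 90.0)

  // Flatten several boxes into one nested array of corner points.
  def toLocations(boxes: Seq[BoundingBox]): Array[Array[Double]] =
    boxes.flatMap(_.toCorners.toSeq).toArray
}
```

With this shape, callers describe regions by name (west, south, east, north) and the conversion to raw coordinate arrays happens in one place.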


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57055519
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20918/


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57157005
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20972/consoleFull) for   PR 1717 at commit [`a2a03ba`](https://github.com/apache/spark/commit/a2a03ba4a804ae6fc4eb31d75a2b8db32dbffd50).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-56385915
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20655/consoleFull) for   PR 1717 at commit [`40e78d5`](https://github.com/apache/spark/commit/40e78d505eaf53e34fd2a14e44fad7bc04efbb5c).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r17210396
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -33,15 +33,20 @@ object TwitterUtils {
        *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        *        twitter4j.oauth.accessTokenSecret
        * @param filters Set of filter strings to get only those tweets that match them
    +   * @param locations Set of longitude and latitude coordinates to get only those tweets within the
    +            bounding box defined by those points. Example: Seq(Seq(-180.0,-90.0),Seq(180.0,90.0))
    +            gives any geotagged tweet. If locations and filters are both nonempty, then any tweet
    +            matching either condition may be returned.
        * @param storageLevel Storage level to use for storing the received objects
        */
       def createStream(
           ssc: StreamingContext,
           twitterAuth: Option[Authorization],
           filters: Seq[String] = Nil,
    +      locations: Seq[Seq[Double]] = Nil,
    --- End diff --
    
    Ah, I didn't realize overloaded functions with default params would be an issue (I'm fairly new to Scala). It seems to be working now, thanks!
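
For context on the issue mentioned above: Scala rejects a set of overloaded methods when more than one alternative declares default arguments. A common workaround, sketched here with hypothetical names and not necessarily what this PR ended up doing, is to drop the defaults and spell out each overload explicitly, with the shorter ones delegating to the fullest one.

```scala
object StreamFactory {
  // Hypothetical stand-in for the real stream type.
  final case class StreamSpec(filters: Seq[String], locations: Seq[Seq[Double]])

  // Existing shape of the call, kept so that old callers still compile.
  def createStream(filters: Seq[String]): StreamSpec =
    createStream(filters, Nil)

  // New overload with the extra parameter; no default values on either overload.
  def createStream(filters: Seq[String], locations: Seq[Seq[Double]]): StreamSpec =
    StreamSpec(filters, locations)
}
```

Both StreamFactory.createStream(Seq("spark")) and the two-argument form resolve unambiguously, because overload resolution never has to consider default values.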


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57060931
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/20920/


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by tdas <gi...@git.apache.org>.
Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57402267
  
    @sjbrunst @ezhulenev Can you explain the point of count? If I understood the Twitter4j API correctly, count sets the pagination size. How does that help here? Or is my understanding wrong?



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by agsachin <gi...@git.apache.org>.
Github user agsachin commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-181379130
  
    @srowen Is anyone working on a third-party package for Twitter?
    
    I suggest adding FilterQuery support to the existing API, and moving the Twitter APIs out of Spark whenever a bigger change is made on the twitter4j API side. Until then, we can keep Twitter in Spark, continuing the existing API with some enhancements.


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-59415277
  
      [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/21814/consoleFull) for   PR 1717 at commit [`59b1194`](https://github.com/apache/spark/commit/59b119422a5bd72e7d107d9ecb3c9f14fe42435a).
     * This patch merges cleanly.


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1717#discussion_r18979675
  
    --- Diff: external/twitter/src/main/scala/org/apache/spark/streaming/twitter/TwitterUtils.scala ---
    @@ -32,16 +32,62 @@ object TwitterUtils {
        *        authorization; this uses the system properties twitter4j.oauth.consumerKey,
        *        twitter4j.oauth.consumerSecret, twitter4j.oauth.accessToken and
        *        twitter4j.oauth.accessTokenSecret
    -   * @param filters Set of filter strings to get only those tweets that match them
    --- End diff --
    
    @tdas I renamed "filters" to "track" in the new methods so that it matches the name FilterQuery uses. The name "filters" for the keyword set doesn't make much sense anyway, since locations and follow are also filters.
    
    Is it ok to rename "filters" in the original methods too?


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by JoshRosen <gi...@git.apache.org>.
Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57575122
  
    Do you mind adding "closes #2098" to the description of your PR so that this automatically closes the other PR when merged?  Thanks!!


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by sjbrunst <gi...@git.apache.org>.
Github user sjbrunst commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-58916182
  
    @tdas I agree with building the facade. It's more difficult for us, but it will be easier for users to learn how to use it. Once I have time (hopefully in the next few days) I'll work on building that and push the changes to this PR.


[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-68617146
  
      [Test build #25019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25019/consoleFull) for   PR 1717 at commit [`77dedcb`](https://github.com/apache/spark/commit/77dedcb2a00027c9119f8dba79f25c84d14d17c8).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
      * `case class BoundingBox(west: Double, south: Double, east: Double, north: Double)`



[GitHub] spark pull request: [SPARK-2788] [STREAMING] Add location filterin...

Posted by ezhulenev <gi...@git.apache.org>.
Github user ezhulenev commented on the pull request:

    https://github.com/apache/spark/pull/1717#issuecomment-57012331
  
    @sjbrunst I need to add a count parameter to Twitter streams, so I merged your fork and made small changes. Unfortunately, I can't submit a PR to your repository. It would be awesome to add the changes from my repository to this PR: https://github.com/ezhulenev/
    
    Maybe it's also a good time to introduce a factory instead of createStream for JavaStreamingContext, or we'll end up with tens of methods covering every possible combination of parameters.
    
    @tdas I added constructors with the previous signature to TwitterInputDStream and TwitterReceiver, so I think that should also resolve the binary compatibility problem.
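
The factory idea raised above could look roughly like the following: a single immutable configuration object with chained setters, so that one createStream variant could accept it instead of one overload per parameter combination. All names here (TwitterStreamConfig and its with... methods) are hypothetical, and the sketch deliberately leaves out Spark and Twitter4j.

```scala
// Hypothetical, immutable description of what a Twitter stream should return.
final case class TwitterStreamConfig(
    track: Seq[String] = Nil,          // keywords to match
    follow: Seq[Long] = Nil,           // user ids to follow
    locations: Seq[Seq[Double]] = Nil, // corner points of bounding boxes
    count: Int = 0) {                  // value for the count parameter

  // Each setter returns a modified copy, so configurations can be shared safely.
  def withTrack(keywords: String*): TwitterStreamConfig = copy(track = keywords.toList)
  def withFollow(userIds: Long*): TwitterStreamConfig = copy(follow = userIds.toList)
  def withLocations(corners: Seq[Double]*): TwitterStreamConfig = copy(locations = corners.toList)
  def withCount(n: Int): TwitterStreamConfig = copy(count = n)
}
```

For example, TwitterStreamConfig().withTrack("spark", "scala").withCount(10) sets two fields and leaves follow and locations empty. Adding a new filter later would mean one new field and setter rather than another family of overloads.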

