Posted to dev@spark.apache.org by Marcelo Vanzin <va...@cloudera.com> on 2016/04/19 19:15:31 UTC

RFC: Remote "HBaseTest" from examples?

Hey all,

Two reasons why I think we should remove that from the examples:

- HBase now has Spark integration in its own repo, so that really
should be the template for how to use HBase from Spark, making that
example less useful, even misleading.

- It brings up a lot of extra dependencies that make the size of the
Spark distribution grow.

Any reason why we shouldn't drop that example?

-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: RFC: Remote "HBaseTest" from examples?

Posted by Ignacio Zendejas <iz...@node.io>.
I'm very late to this party and I get hbase-spark... what's the
recommendation for pyspark + hbase? I realize this isn't necessarily a
concern of the spark project, but it'd be nice to at least document it here
with a very short and sweet response because I haven't found anything
useful in the wild besides using the approach in the examples with
pythonconverters, which were dropped in 2.0.

Thanks.

On Thu, Apr 21, 2016 at 1:47 PM, Ted Yu <yu...@gmail.com> wrote:

> Zhan:
> I have mentioned the JIRA numbers in the thread starting with (note the
> typo in subject of this thread):
>
> RFC: Remove ...
>
> On Thu, Apr 21, 2016 at 1:28 PM, Zhan Zhang <zz...@hortonworks.com>
> wrote:
>
>> FYI: There are several pending patches for DataFrame support on top of
>> HBase.
>>
>> Thanks.
>>
>> Zhan Zhang
>>
>> On Apr 20, 2016, at 2:43 AM, Saisai Shao <sa...@gmail.com> wrote:
>>
>> +1, HBaseTest in Spark Example is quite old and obsolete, the HBase
>> connector in HBase repo has evolved a lot, it would be better to guide user
>> to refer to that not here in Spark example. So good to remove it.
>>
>> Thanks
>> Saisai
>>
>> On Wed, Apr 20, 2016 at 1:41 AM, Josh Rosen <jo...@databricks.com>
>> wrote:
>>
>>> +1; I think that it's preferable for code examples, especially
>>> third-party integration examples, to live outside of Spark.
>>>
>>> On Tue, Apr 19, 2016 at 10:29 AM Reynold Xin <rx...@databricks.com>
>>> wrote:
>>>
>>>> Yea in general I feel examples that bring in a large amount of
>>>> dependencies should be outside Spark.
>>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 10:15 AM, Marcelo Vanzin <va...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> Two reasons why I think we should remove that from the examples:
>>>>>
>>>>> - HBase now has Spark integration in its own repo, so that really
>>>>> should be the template for how to use HBase from Spark, making that
>>>>> example less useful, even misleading.
>>>>>
>>>>> - It brings up a lot of extra dependencies that make the size of the
>>>>> Spark distribution grow.
>>>>>
>>>>> Any reason why we shouldn't drop that example?
>>>>>
>>>>> --
>>>>> Marcelo
>>>>>
>>>>>
>>>>>
>>>>
>>
>>
>

Re: RFC: Remote "HBaseTest" from examples?

Posted by Ted Yu <yu...@gmail.com>.
Zhan:
I have mentioned the JIRA numbers in the thread starting with (note the
typo in subject of this thread):

RFC: Remove ...

On Thu, Apr 21, 2016 at 1:28 PM, Zhan Zhang <zz...@hortonworks.com> wrote:

> FYI: There are several pending patches for DataFrame support on top of
> HBase.
>
> Thanks.
>
> Zhan Zhang
>
> On Apr 20, 2016, at 2:43 AM, Saisai Shao <sa...@gmail.com> wrote:
>
> +1, HBaseTest in Spark Example is quite old and obsolete, the HBase
> connector in HBase repo has evolved a lot, it would be better to guide user
> to refer to that not here in Spark example. So good to remove it.
>
> Thanks
> Saisai
>
> On Wed, Apr 20, 2016 at 1:41 AM, Josh Rosen <jo...@databricks.com>
> wrote:
>
>> +1; I think that it's preferable for code examples, especially
>> third-party integration examples, to live outside of Spark.
>>
>> On Tue, Apr 19, 2016 at 10:29 AM Reynold Xin <rx...@databricks.com> wrote:
>>
>>> Yea in general I feel examples that bring in a large amount of
>>> dependencies should be outside Spark.
>>>
>>>
>>> On Tue, Apr 19, 2016 at 10:15 AM, Marcelo Vanzin <va...@cloudera.com>
>>> wrote:
>>>
>>>> Hey all,
>>>>
>>>> Two reasons why I think we should remove that from the examples:
>>>>
>>>> - HBase now has Spark integration in its own repo, so that really
>>>> should be the template for how to use HBase from Spark, making that
>>>> example less useful, even misleading.
>>>>
>>>> - It brings up a lot of extra dependencies that make the size of the
>>>> Spark distribution grow.
>>>>
>>>> Any reason why we shouldn't drop that example?
>>>>
>>>> --
>>>> Marcelo
>>>>
>>>>
>>>>
>>>
>
>

Re: RFC: Remote "HBaseTest" from examples?

Posted by Zhan Zhang <zz...@hortonworks.com>.
FYI: There are several pending patches for DataFrame support on top of HBase.

Thanks.

Zhan Zhang

On Apr 20, 2016, at 2:43 AM, Saisai Shao <sa...@gmail.com> wrote:

+1, HBaseTest in Spark Example is quite old and obsolete, the HBase connector in HBase repo has evolved a lot, it would be better to guide user to refer to that not here in Spark example. So good to remove it.

Thanks
Saisai

On Wed, Apr 20, 2016 at 1:41 AM, Josh Rosen <jo...@databricks.com> wrote:
+1; I think that it's preferable for code examples, especially third-party integration examples, to live outside of Spark.

On Tue, Apr 19, 2016 at 10:29 AM Reynold Xin <rx...@databricks.com> wrote:
Yea in general I feel examples that bring in a large amount of dependencies should be outside Spark.


On Tue, Apr 19, 2016 at 10:15 AM, Marcelo Vanzin <va...@cloudera.com> wrote:
Hey all,

Two reasons why I think we should remove that from the examples:

- HBase now has Spark integration in its own repo, so that really
should be the template for how to use HBase from Spark, making that
example less useful, even misleading.

- It brings up a lot of extra dependencies that make the size of the
Spark distribution grow.

Any reason why we shouldn't drop that example?

--
Marcelo

Re: RFC: Remote "HBaseTest" from examples?

Posted by Saisai Shao <sa...@gmail.com>.
+1. HBaseTest in the Spark examples is quite old and obsolete, and the
HBase connector in the HBase repo has evolved a lot. It would be better to
point users to that connector rather than to the Spark example, so it is
good to remove it.

Thanks
Saisai

On Wed, Apr 20, 2016 at 1:41 AM, Josh Rosen <jo...@databricks.com>
wrote:

> +1; I think that it's preferable for code examples, especially third-party
> integration examples, to live outside of Spark.
>
> On Tue, Apr 19, 2016 at 10:29 AM Reynold Xin <rx...@databricks.com> wrote:
>
>> Yea in general I feel examples that bring in a large amount of
>> dependencies should be outside Spark.
>>
>>
>> On Tue, Apr 19, 2016 at 10:15 AM, Marcelo Vanzin <va...@cloudera.com>
>> wrote:
>>
>>> Hey all,
>>>
>>> Two reasons why I think we should remove that from the examples:
>>>
>>> - HBase now has Spark integration in its own repo, so that really
>>> should be the template for how to use HBase from Spark, making that
>>> example less useful, even misleading.
>>>
>>> - It brings up a lot of extra dependencies that make the size of the
>>> Spark distribution grow.
>>>
>>> Any reason why we shouldn't drop that example?
>>>
>>> --
>>> Marcelo
>>>
>>>
>>>
>>

Re: RFC: Remote "HBaseTest" from examples?

Posted by Josh Rosen <jo...@databricks.com>.
+1; I think that it's preferable for code examples, especially third-party
integration examples, to live outside of Spark.

On Tue, Apr 19, 2016 at 10:29 AM Reynold Xin <rx...@databricks.com> wrote:

> Yea in general I feel examples that bring in a large amount of
> dependencies should be outside Spark.
>
>
> On Tue, Apr 19, 2016 at 10:15 AM, Marcelo Vanzin <va...@cloudera.com>
> wrote:
>
>> Hey all,
>>
>> Two reasons why I think we should remove that from the examples:
>>
>> - HBase now has Spark integration in its own repo, so that really
>> should be the template for how to use HBase from Spark, making that
>> example less useful, even misleading.
>>
>> - It brings up a lot of extra dependencies that make the size of the
>> Spark distribution grow.
>>
>> Any reason why we shouldn't drop that example?
>>
>> --
>> Marcelo
>>
>>
>>
>

Re: RFC: Remote "HBaseTest" from examples?

Posted by Reynold Xin <rx...@databricks.com>.
Yea in general I feel examples that bring in a large amount of dependencies
should be outside Spark.


On Tue, Apr 19, 2016 at 10:15 AM, Marcelo Vanzin <va...@cloudera.com>
wrote:

> Hey all,
>
> Two reasons why I think we should remove that from the examples:
>
> - HBase now has Spark integration in its own repo, so that really
> should be the template for how to use HBase from Spark, making that
> example less useful, even misleading.
>
> - It brings up a lot of extra dependencies that make the size of the
> Spark distribution grow.
>
> Any reason why we shouldn't drop that example?
>
> --
> Marcelo
>
>
>