Posted to dev@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2017/01/15 00:52:25 UTC

Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

I agree with Devaraj's assessment w.r.t. hbase-spark module in master
(which is becoming branch-2).

Cheers



On Mon, Nov 21, 2016 at 11:46 AM, Devaraj Das <dd...@hortonworks.com> wrote:

> Hi Sean, I did a quick check with someone from the Spark team here and his
> opinion was that the hbase-spark module as it currently stands can be used
> by downstream users to do basic stuff and to try some simple things out,
> etc. The integration is improving.
> I think we should get what we have in 2.0 (which is the default action
> anyways).
> Thanks
> Devaraj
> ________________________________________
> From: Sean Busbey <bu...@apache.org>
> Sent: Wednesday, November 16, 2016 9:49 AM
> To: dev
> Subject: [DISCUSS] hbase-spark module in branch-1 and branch-2
>
> Hi folks!
>
> With 2.0 releases coming up, I'd like to revive our prior discussion
> on the readiness of the hbase-spark module for downstream users.
>
> We've had a ticket for tracking the milestones set up for inclusion in
> branch-1 releases for about 1.5 years:
>
> https://issues.apache.org/jira/browse/HBASE-14160
>
> We still haven't gotten all of the blocker issues completed, AFAIK.
>
> Is anyone interested in volunteering to knock the rest of these out?
>
> If they aren't, shall we plan to leave hbase-spark in master and
> revert it from branch-2 once it forks for the HBase 2.0 release line?
>
> This feature isn't a blocker for 2.0; just as we've been planning to
> add the hbase-spark module to some 1.y release we can also include it
> in a 2.1+ release.
>
> This does appear to be a feature our downstream users could benefit
> from, so I'd hate to continue the current situation where no official
> releases include it. This is especially true now that we're looking at
> ways to handle changes between Spark 1.6 and Spark 2.0 in HBASE-16179.
>
> -
> busbey
>
>

Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

Posted by Jerry He <je...@gmail.com>.
Hi, Andrew

Stack was talking to me about this area when I met him at the HBase Meetup
last December.
Let me take a shot at HBASE-14375.

Thanks,

Jerry

On Sat, Jan 14, 2017 at 9:22 PM, Andrew Purtell <an...@gmail.com>
wrote:

>
>
> > On Jan 14, 2017, at 9:07 PM, Jerry He <je...@gmail.com> wrote:
> >
> > I think it will be a big disappointment for the community if the
> > hbase-spark module is not going into 2.0.
> > I understand there are still a few blockers, including HBASE-16179.
>
> Patches welcome. :-)
>
>
> > We have it in our distribution, and it is probably in other vendors' as
> > well. It is a little easier for us because we can be flexible on the
> > supported Spark/Scala version combinations and the APIs.
> > But a major release still without a good Spark story for the HBase open
> > source community does not look good.
> >
> > Jerry

Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

Posted by Andrew Purtell <an...@gmail.com>.

> On Jan 14, 2017, at 9:07 PM, Jerry He <je...@gmail.com> wrote:
> 
> I think it will be a big disappointment for the community if the
> hbase-spark module is not going into 2.0.
> I understand there are still a few blockers, including HBASE-16179.

Patches welcome. :-) 


> We have it in our distribution, and it is probably in other vendors' as
> well. It is a little easier for us because we can be flexible on the
> supported Spark/Scala version combinations and the APIs.
> But a major release still without a good Spark story for the HBase open
> source community does not look good.
> 
> Jerry

Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

Posted by Jerry He <je...@gmail.com>.
I think it will be a big disappointment for the community if the
hbase-spark module is not going into 2.0.
I understand there are still a few blockers, including HBASE-16179.
We have it in our distribution, and it is probably in other vendors' as
well. It is a little easier for us because we can be flexible on the
supported Spark/Scala version combinations and the APIs.
But a major release still without a good Spark story for the HBase open
source community does not look good.

Jerry


Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

Posted by Ted Yu <yu...@gmail.com>.
After HBASE-16179 gets reviewed / committed, I should be able to take on
other high priority Spark connector issues.

Cheers

On Wed, Jan 18, 2017 at 12:30 PM, Sean Busbey <bu...@apache.org> wrote:

> I don't doubt that downstream users could "try out" our integration
> using what currently exists in branch-2. However, we already had
> community consensus on what is necessary for our downstream folks to
> have a good experience with a ready-for-production feature. I don't
> see why we should subject them to a lower bar in a branch-2 release
> than we would have in a branch-1 release just because we're starting
> up a new major version.
>
> The work in HBASE-16179 is certainly a blocker given the rising
> popularity of Spark 2.0 (thanks Ted for getting that work under way, I
> hope we get sufficient review bandwidth to get it finished), but it's not
> everything; e.g. we don't have regression checks in place for the
> things that show up in our docs.
>
> -
> busbey

Re: [DISCUSS] hbase-spark module in branch-1 and branch-2

Posted by Sean Busbey <bu...@apache.org>.
I don't doubt that downstream users could "try out" our integration
using what currently exists in branch-2. However, we already had
community consensus on what is necessary for our downstream folks to
have a good experience with a ready-for-production feature. I don't
see why we should subject them to a lower bar in a branch-2 release
than we would have in a branch-1 release just because we're starting
up a new major version.

The work in HBASE-16179 is certainly a blocker given the rising
popularity of Spark 2.0 (thanks Ted for getting that work under way, I
hope we get sufficient review bandwidth to get it finished), but it's not
everything; e.g. we don't have regression checks in place for the
things that show up in our docs.

-
busbey
