Posted to dev@kudu.apache.org by Jean-Daniel Cryans <jd...@apache.org> on 2016/04/18 17:31:48 UTC

What's coming in Kudu 0.9.0, getting involved

Hey Kudu devs,

With 0.8.0 out of the way, let's talk about 0.9.0!

But first, let me link back to an email[1] I sent 3 months ago to this list
in which I proposed a plan for 1.0 and a time-based release cadence (every
2 months) to get there. I also volunteered to act as the release manager
during that period, and so I'm aiming to cut 0.9.0 RC1 on June 1st with
branching happening about two weeks prior.

So, here's what I expect we'll see in our next release.

Multi-master incremental improvements:
Adar Dembo wrote a really good design document [2] that describes in depth
what we currently have and a solution for getting to a reliable
multi-master. Give it a read if tough distributed system management
problems interest you. Also, keep an eye out for patches.

Scan Token API:
This is the name Dan Burkert gave to what was previously referred to as
the "high-level scan API". I was hoping to get parts of it in 0.8.0, but at
the last minute Dan and Todd Lipcon found some issues with the new
partition pruning code and it couldn't make it in. Dan has a design
document [3] with a protobuf message format [4] and implementations in
Java [5] and C++ [6].
This enables projects like Impala, Drill, Spark, and others to provide the
client with a scan description and receive "tokens" that can be passed
along to worker processes and hydrated into scanners.
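
To make the token idea a bit more concrete, here is a minimal Python sketch
of the flow. It is not the Kudu client API: ScanDescription, TABLETS,
make_tokens and hydrate are hypothetical names, integer keys are assumed,
and JSON stands in for the protobuf format [4]. A planner splits one scan
into a serialized token per tablet, and a worker turns a token back into
something it can scan.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ScanDescription:
    """Hypothetical scan description: a table, a projection, a key range."""
    table: str
    columns: List[str]
    lower_bound: int  # inclusive
    upper_bound: int  # exclusive


# Assumed tablet layout for illustration: (start_key, end_key) per tablet.
TABLETS = [(0, 100), (100, 200), (200, 300)]


def make_tokens(scan: ScanDescription) -> List[bytes]:
    """Split a scan into one serialized token per tablet it touches."""
    tokens = []
    for start, end in TABLETS:
        lo = max(start, scan.lower_bound)
        hi = min(end, scan.upper_bound)
        if lo >= hi:
            continue  # the scan does not touch this tablet, so prune it
        piece = ScanDescription(scan.table, scan.columns, lo, hi)
        tokens.append(json.dumps(asdict(piece)).encode("utf-8"))
    return tokens


def hydrate(token: bytes) -> ScanDescription:
    """Runs on a worker: rebuild the per-tablet scan from its token."""
    return ScanDescription(**json.loads(token.decode("utf-8")))
```

The point of the design is that tokens are opaque bytes, so a framework
only has to ship them to its workers and does not need to understand
partitioning itself.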

Foundations for the replay cache:
If you've used Kudu in a serious way you've probably encountered the need
to ignore duplicate row key errors since those aren't so reliable. Well,
we're almost past that! David Alves posted a rough first cut of a design
document[7] where we've been discussing the different solutions. It's still
way too early to tell exactly what will be in 0.9.0, so I wouldn't get too
excited yet.
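
For anyone new to the idea, here is a rough Python sketch of what a replay
cache does in general. It is not taken from the design document [7], which
is still under discussion; TabletServer, the (client_id, seq_no) request ID
and the result strings are all assumptions made up for illustration. The
server remembers the result of each request, so a retry of a write that
already succeeded gets the original answer back instead of a duplicate key
error.

```python
class TabletServer:
    """Toy server that deduplicates retried inserts (hypothetical names)."""

    def __init__(self):
        self.rows = {}
        # Assumed request ID: (client_id, seq_no) -> result already returned.
        self.replay_cache = {}

    def insert(self, client_id, seq_no, key, value):
        request_id = (client_id, seq_no)
        if request_id in self.replay_cache:
            # A retry of a request we already handled: replay the old result.
            return self.replay_cache[request_id]
        if key in self.rows:
            result = "ALREADY_PRESENT"  # a genuine duplicate row key
        else:
            self.rows[key] = value
            result = "OK"
        self.replay_cache[request_id] = result
        return result
```

With this in place, a client whose first attempt succeeded but whose
response was lost can retry with the same sequence number and still see
"OK", while a different request inserting the same key still gets the
duplicate key error.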

Some support for tables with non-covering key-ranges:
Dan Burkert just posted a design document[8]; also see this Jira[9].
Basically it allows for range-partitioned tables to be bounded, and adds
new APIs to add/remove tablets in and out of those bounds.
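
As a rough illustration of what "non-covering" means, here is a small
Python sketch. It is not the API proposed in the design document [8];
RangePartitionedTable, add_range, drop_range and find_range are
hypothetical names, and integer keys are assumed. The table only covers the
ranges that were explicitly added, ranges can be added or dropped later,
and a key that falls outside every range has no tablet.

```python
import bisect


class RangePartitionedTable:
    """Toy table whose range partitions need not cover the whole key space."""

    def __init__(self):
        # Sorted, non-overlapping (lower_inclusive, upper_exclusive) pairs.
        self._ranges = []

    def add_range(self, lower, upper):
        if lower >= upper:
            raise ValueError("empty range")
        i = bisect.bisect_left(self._ranges, (lower, upper))
        # Reject a range that overlaps its left or right neighbour.
        if i > 0 and self._ranges[i - 1][1] > lower:
            raise ValueError("overlaps an existing range")
        if i < len(self._ranges) and self._ranges[i][0] < upper:
            raise ValueError("overlaps an existing range")
        self._ranges.insert(i, (lower, upper))

    def drop_range(self, lower, upper):
        # Raises ValueError if that exact range was never added.
        self._ranges.remove((lower, upper))

    def find_range(self, key):
        """Return the range holding key, or None if no range covers it."""
        i = bisect.bisect_right(self._ranges, (key, float("inf"))) - 1
        if i >= 0 and self._ranges[i][0] <= key < self._ranges[i][1]:
            return self._ranges[i]
        return None
```

A typical use would be time-series data: add a range for each new period
and drop the oldest one, rather than keeping tablets around for key ranges
that will never hold data.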

So, that's what's on my radar. It should be a pretty exciting release. If
you are looking to get involved, I'd suggest helping out with the design
documents and seeing where you can make good contributions. We also have
some holes in the stuff that was implemented for 0.8.0; for example,
partition pruning didn't make it to the Java client. Then we have the Flume
and Spark connectors, both of which could use more improvements, especially
the latter, which is bare-bones.

Finally, a new venue for contributions is the blog[10] and of course
more/better documentation is always welcome.

Cheers,

J-D

1. http://mail-archives.apache.org/mod_mbox/kudu-dev/201602.mbox/%3CCAGpTDNcMBWwX8p+yGKzHfL2xcmKTScU-rhLcQFSns1UVSbrXhw@mail.gmail.com%3E

2. http://gerrit.cloudera.org:8080/#/c/2527/
3. http://gerrit.cloudera.org:8080/#/c/2443/
4. http://gerrit.cloudera.org:8080/#/c/2622/
5. http://gerrit.cloudera.org:8080/#/c/2592/
6. http://gerrit.cloudera.org:8080/#/c/2757/
7. http://gerrit.cloudera.org:8080/#/c/2642/
8. http://gerrit.cloudera.org:8080/#/c/2772/
9. https://issues.apache.org/jira/browse/KUDU-1306
10. http://getkudu.io/blog/