Posted to dev@arrow.apache.org by Neville Dipale <ne...@gmail.com> on 2020/08/08 11:43:00 UTC

[Rust] Creating a separate branch for Arrow Parquet writer until next release

Good day,

This relates to https://github.com/apache/arrow/pull/7319 (ARROW-8289)

In the past few months we haven't had enough review bandwidth on Rust's
Parquet implementation (mostly relying on Chao for non-trivial reviews),
and given the amount of work needed for an Arrow writer + the interest so
far (I think a few people are already using the draft PR), I'd like to propose:

* We create a temporary branch in the apache/arrow repo, where the arrow
writer can temporarily live
* We can merge changes into the branch, esp if there aren't enough
reviewers at the time
* When we're close to a release, we merge what's on the temp branch into
the branch that's currently called `master` but will be renamed soon 😉

In terms of the Arrow Parquet writer PR, I think I've gotten arbitrary
nesting covered, but there's a lot more work that we can now divide more
easily so that others can contribute.
I'm also unsure of how to test deeply nested arrays directly in the code (I
had to use Spark because the Arrow reader doesn't yet support that).

Given that we have a linear git timeline where each commit corresponds
roughly to one JIRA ticket, I don't know if this would mess up the timeline,
or whether we'd still be able to merge into the temporary branch and then
rebase onto the main branch later.

Any thoughts and suggestions?

Neville

Re: [Rust] Creating a separate branch for Arrow Parquet writer until next release

Posted by Wes McKinney <we...@gmail.com>.
Having a feature branch for large projects is perfectly manageable as long
as it's rebased fairly frequently (just be SURE not to accidentally force
push master) and PRs are careful to target the feature branch instead of
master (our PR merge tool will respect the target branch of the PR), so
this sounds fine to me.
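
A minimal sketch of the workflow described above, run against a throwaway
local repository rather than apache/arrow. The branch name
`rust-parquet-arrow-writer`, the file names, and the `TICKET-n` commit
messages are all hypothetical placeholders. It shows the feature branch being
rebased onto an updated `master`, with the force push naming only the feature
branch so that `master` is never rewritten.

```shell
#!/bin/sh
set -eu

# Throwaway bare "upstream" plus a working repo (stand-ins, not apache/arrow).
work=$(mktemp -d)
cd "$work"
git init -q --bare upstream.git
git init -q repo
cd repo
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git remote add origin ../upstream.git
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"

# One commit per ticket on master.
echo base > README
git add README
git commit -q -m "TICKET-1: base"
git push -q origin master

# Create the feature branch (hypothetical name) and publish it.
git checkout -q -b rust-parquet-arrow-writer
echo writer > writer.rs
git add writer.rs
git commit -q -m "TICKET-2: writer draft"
git push -q origin rust-parquet-arrow-writer

# Meanwhile master moves on.
git checkout -q master
echo more >> README
git commit -q -am "TICKET-3: unrelated change"
git push -q origin master

# Rebase the feature branch onto the updated master.
git checkout -q rust-parquet-arrow-writer
git fetch -q origin
git rebase -q origin/master

# Force-push ONLY the feature branch; naming it explicitly keeps master safe.
git push -q --force-with-lease origin rust-parquet-arrow-writer

# History stays linear, with the feature commit on top of master.
git log --format=%s
```

After the rebase the feature commit sits directly on top of `master`, so the
history stays linear when the branch is eventually merged back.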
