Posted to dev@spot.apache.org by Vivienne Pustell <vi...@yellingviv.com> on 2020/04/02 23:09:35 UTC

Re: Hive Partition Names for numbers (SPOT-19 and SPOT-239)

Hi Jeremy,

My personal vote would be to align on two-digit zero-padded strings. That
format is more standardized, and the consistency makes it easier to run
scripts as needed. It should also be viable to run a one-time script that
converts the partition values currently stored, and then stay consistent
going forward. We've been running into issues like this at $DAY_JOB, and
it's definitely worth getting consistent sooner rather than later, IMO.
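
To make the conversion idea concrete, here is a rough sketch of the string
rewrite such a script might do. It only assumes the y=/m=/d= layout from the
examples in this thread; the function name is made up for illustration, and
this is not taken from the Spot codebase.

```python
def pad_partition_path(path):
    """Zero-pad the m= and d= partition values in a path to two digits.

    Hypothetical helper: the m/d key names come from the example paths
    in this thread (y=2020/m=3/d=4); everything else is passed through.
    """
    parts = []
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        # Only touch month/day segments whose value is purely numeric.
        if sep and key in ("m", "d") and value.isdigit():
            value = value.zfill(2)
        parts.append(key + sep + value)
    return "/".join(parts)


if __name__ == "__main__":
    print(pad_partition_path("y=2020/m=3/d=4"))
```

Values that are already padded pass through unchanged, so the rewrite is
safe to re-run. Renaming the stored directories and updating the Hive
metastore to match would be a separate step that this sketch does not cover.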

Cheers,
-Vivienne

On Mon, Mar 30, 2020 at 12:47 PM Jeremy Nelson <je...@digitalminion.com>
wrote:

> Greetings,
>
> I noticed that SPOT-19 and SPOT-239 describe two aspects of the same
> problem: Hive partition names are stored as numbers (e.g.,
> y=2020/m=3/d=4), but some software (like SPOT-ML) expects these
> partition names to be stored as two-digit zero-padded strings (e.g.,
> y=2020/m=03/d=04).
>
> We should identify which approach is preferable (break the schema, or break
> the software?) and then harmonize all affected systems to do the same
> thing, one way or the other.
>
> Thanks,
>
> Jeremy
>