Posted to dev@accumulo.apache.org by Christopher <ct...@apache.org> on 2015/01/26 23:23:08 UTC

Re: Planning for (eventual) removal of instance.dfs.{uri,dir}

Revisiting this, I've created ACCUMULO-3535.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Fri, Dec 19, 2014 at 5:37 PM, Christopher <ct...@apache.org> wrote:

> I spoke with Keith a bit offline, and he suggested a different path:
>
> Have an intermediate release which requires use of instance.volumes,
> instead of allowing users to continue using the old properties. Use the old
> properties for resolving old relative paths, and re-write them on upgrade.
>
> This solution creates a boundary for upgrades, though. One could only
> (safely) upgrade to a version without these properties from a version
> in which this metadata upgrade had already been forced. So, if we did this in
> 1.7.0, and dropped the old properties in 2.0 (for example), we could not
> upgrade from 1.6 to 2.0 (unless a user was certain they didn't have any
> relative paths left to resolve).
>
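To illustrate the upgrade boundary described above, here is a minimal sketch in Python. It is not Accumulo code: the function name, the version tuples, and the `relative_paths_remain` flag are all hypothetical, and the 1.7/2.0 split is only the example from the message above.

```python
# Hypothetical versions, taken from the example in the thread (not a decided plan).
REWRITE_FORCED_IN = (1, 7)       # release that would force the metadata rewrite
PROPERTIES_DROPPED_IN = (2, 0)   # release that would drop instance.dfs.{uri,dir}


def upgrade_is_safe(from_version, to_version, relative_paths_remain=True):
    """Return True if upgrading cannot strand unresolved relative paths."""
    # The target still has the old properties, so relative paths stay resolvable.
    if to_version < PROPERTIES_DROPPED_IN:
        return True
    # The source release already forced the rewrite, so no relative paths remain.
    if from_version >= REWRITE_FORCED_IN:
        return True
    # Otherwise it is only safe if the user is certain nothing is left to resolve.
    return not relative_paths_remain
```

Under these assumptions, 1.6 to 2.0 is rejected unless the caller passes `relative_paths_remain=False`, while 1.6 to 1.7 and 1.7 to 2.0 are both allowed.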
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
> On Thu, Dec 18, 2014 at 12:42 PM, Christopher <ct...@apache.org> wrote:
>
>> Talked with Dave a bit offline. I think I've convinced myself that
>> whenever we do drop instance.dfs.{uri,dir}, what we can do when we see a
>> relative path is resolve it with the first item in instance.volumes instead
>> of all the complicated logic of resolving using instance.dfs.{uri,dir} with
>> fallback to the Hadoop default fs. I think this will provide a reasonable
>> user experience and resolve all the annoyingly confusing logic of relative
>> path resolution today.
>>
>> We can also spit out a warning when we see a relative path, which
>> displays instructions for compacting the file away. We could actually
>> introduce this warning sooner, like say 1.7.0, rather than wait until we
>> drop the old properties.
>>
>> So, in summary, I think:
>>
>> 1.7.0: continue current complicated resolution of old properties, warn
>> about not using instance.volumes, warn with compact instructions when
>> resolving relative paths
>>
>> 2.0.0: drop old properties and resolve using first item in
>> instance.volumes, require instance.volumes to have at least one volume,
>> warn with compact instructions when resolving relative paths
>>
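The 2.0.0 behavior proposed above could be sketched roughly as follows, in Python for brevity. This is not the actual Accumulo implementation: `resolve_path`, its parameters, the warning text, and the `tables/<table id>` layout under each volume are assumptions made for illustration.

```python
import warnings


def resolve_path(path, volumes, table_id):
    """Return an absolute URI for a file path stored in the metadata.

    `volumes` stands in for the parsed value of instance.volumes.
    """
    if not volumes:
        # Proposed 2.0.0 rule: instance.volumes must name at least one volume.
        raise ValueError("instance.volumes must list at least one volume")
    if "://" in path:
        # Already absolute (carries a scheme), so nothing to resolve.
        return path
    first = volumes[0].rstrip("/")
    warnings.warn(
        f"relative path {path!r} resolved against {first}; "
        f"compact table {table_id} to rewrite it as an absolute path",
        stacklevel=2,
    )
    # Assumed layout: <volume>/tables/<table id>/<relative path>
    return f"{first}/tables/{table_id}/{path.lstrip('/')}"
```

For example, `resolve_path("/t-0001/F0001.rf", ["hdfs://nn1/accumulo"], "2")` would warn and return `hdfs://nn1/accumulo/tables/2/t-0001/F0001.rf`, with no fallback to instance.dfs.{uri,dir} or the Hadoop default fs.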
>>
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>> On Thu, Dec 11, 2014 at 11:33 AM, <dl...@comcast.net> wrote:
>>>
>>> I think we crossed the streams. I'll talk to you offline.
>>>
>>>
>>> ----- Original Message -----
>>>
>>> From: "Christopher" <ct...@apache.org>
>>> To: "Accumulo Dev List" <de...@accumulo.apache.org>
>>> Sent: Thursday, December 11, 2014 11:27:48 AM
>>> Subject: Re: Planning for (eventual) removal of instance.dfs.{uri,dir}
>>>
>>> On Thu, Dec 11, 2014 at 10:52 AM, <dl...@comcast.net> wrote:
>>>
>>> > "Making it required is annoying if users don't have relative paths."
>>> >
>>> > But this property is used to determine the location of new files.....
>>> >
>>> >
>>> No, it's not. Not if you're using instance.volumes. It should only be
>>> used for resolving old files in that case.
>>>
>>>
>>> > ----- Original Message -----
>>> >
>>> > From: "Christopher" <ct...@apache.org>
>>> > To: "Accumulo Dev List" <de...@accumulo.apache.org>
>>> > Sent: Thursday, December 11, 2014 10:03:56 AM
>>> > Subject: Re: Planning for (eventual) removal of instance.dfs.{uri,dir}
>>> >
>>> > Well, no, the database will start if we rely on instance.volumes, but we
>>> > may be unable to find files that have relative paths, if we incorrectly
>>> > assume /accumulo. Making it required is annoying if users don't have
>>> > relative paths.
>>> >
>>> >
>>> > --
>>> > Christopher L Tubbs II
>>> > http://gravatar.com/ctubbsii
>>> >
>>> > On Thu, Dec 11, 2014 at 8:15 AM, <dl...@comcast.net> wrote:
>>> >
>>> > > How so? If someone upgrades from another version and is using a
>>> > > different dir, doesn't specify it in the configuration, and we assume
>>> > > /accumulo, then their database won't start. The other option, which may
>>> > > be safer than making any assumptions, is to make instance.volumes a
>>> > > required parameter with no defaults.
>>> > >
>>> > > ----- Original Message -----
>>> > >
>>> > > From: "Christopher" <ct...@apache.org>
>>> > > To: "Accumulo Dev List" <de...@accumulo.apache.org>
>>> > > Sent: Wednesday, December 10, 2014 11:51:39 PM
>>> > > Subject: Re: Planning for (eventual) removal of instance.dfs.{uri,dir}
>>> > >
>>> > > The URI is probably reasonable, but the dir is potentially problematic
>>> > > if you weren't using the default.
>>> > >
>>> > >
>>> > > --
>>> > > Christopher L Tubbs II
>>> > > http://gravatar.com/ctubbsii
>>> > >
>>> > > On Wed, Dec 10, 2014 at 10:03 PM, dlmarion <dl...@comcast.net> wrote:
>>> > >
>>> > > > Looks like VolumeConfiguration falls back to fs.defaultFS for the uri
>>> > > > and /accumulo for the dir. You could remove both properties and still
>>> > > > keep this as the documented default behavior if instance.volumes is
>>> > > > not specified.
>>> > > >
>>> > > >
>>> > > >
>>> > > > -------- Original message --------
>>> > > > From: Christopher <ctubbsii@apache.org>
>>> > > > Date: 12/10/2014 9:13 PM (GMT-05:00)
>>> > > > To: Accumulo Dev List <de...@accumulo.apache.org>
>>> > > > Subject: Re: Planning for (eventual) removal of instance.dfs.{uri,dir}
>>> > > >
>>> > > > My ACCUMULO-2589 branch in github
>>> > > > (https://github.com/ctubbsii/accumulo/tree/ACCUMULO-2589) does have a
>>> > > > commit that drops a bunch of stuff (which may or may not be accepted as
>>> > > > is for 2.0). The instance.dfs.{uri,dir} properties aren't so easy and
>>> > > > require more planning, because it's not just removing a property... it's
>>> > > > also dealing with updating internal data by making relative paths
>>> > > > absolute.
>>> > > >
>>> > > > For what it's worth, I'm also looking at what changes if we drop
>>> > > > Hadoop 1 support.
>>> > > >
>>> > > > As for the validation of config, I think we do a sanity check on
>>> > > > startup already, which we can always extend. Doesn't solve this issue,
>>> > > > though.
>>> > > >
>>> > > >
>>> > > > --
>>> > > > Christopher L Tubbs II
>>> > > > http://gravatar.com/ctubbsii
>>> > > >
>>> > > > On Wed, Dec 10, 2014 at 8:59 PM, dlmarion <dl...@comcast.net> wrote:
>>> > > >
>>> > > > > We should schedule a bunch of deprecated things for removal in 2.0 to
>>> > > > > ease maintenance. Do we have a way to validate the site.xml and
>>> > > > > zookeeper settings before startup and fail with an appropriate error
>>> > > > > message?
>>> > > > >
>>> > > > >
>>> > > > > -------- Original message --------
>>> > > > > From: Christopher <ctubbsii@apache.org>
>>> > > > > Date: 12/10/2014 8:44 PM (GMT-05:00)
>>> > > > > To: Accumulo Dev List <de...@accumulo.apache.org>
>>> > > > > Subject: Planning for (eventual) removal of instance.dfs.{uri,dir}
>>> > > > >
>>> > > > > So,
>>> > > > >
>>> > > > > instance.volumes replaces instance.dfs.uri and instance.dfs.dir in
>>> > > > > 1.6. But, what's our long-term plan for these? I ask, because we still
>>> > > > > have internal code that uses instance.dfs.uri to resolve relative
>>> > > > > paths.
>>> > > > >
>>> > > > > Should we force these to be re-written at some point (maybe on upgrade
>>> > > > > to 1.7)? Should we continue to support the deprecated properties
>>> > > > > indefinitely and continue the lazy re-write-on-compact? Do we
>>> > > > > transition by requiring instance.volumes to specify the volumes, and
>>> > > > > only use the old properties to resolve relative paths?
>>> > > > >
>>> > > > > My personal view is that we should provide an upgrade-prep/check tool
>>> > > > > prior to upgrade to 2.0, which checks and/or re-writes paths and
>>> > > > > verifies that instance.volumes is set.
>>> > > > >
>>> > > > > Does anybody have a different opinion on this?
>>> > > > >
>>> > > > > --
>>> > > > > Christopher L Tubbs II
>>> > > > > http://gravatar.com/ctubbsii
>>> > > > >
>>> > > >
>>> > >
>>> > >
>>> >
>>> >
>>>
>>>
>