Posted to dev@kudu.apache.org by Andrew Wong <aw...@cloudera.com.INVALID> on 2018/06/26 22:56:36 UTC

Re: Kudu: potential metadata changes

Hi folks,

I'd like to give an update on this thread, noting some metadata changes that
have gone in over the past months and some further metadata changes that I
think are worth making. The considerations from the previous document are
still valid; I'd like to paint a slightly more concrete picture of the changes
I think can go in.

- the previously-discussed option to allow users to configure the placement
  of metadata has landed; the new default metadata directory is the WALs
  directory
- the optimization to store min/max keys in the rowset metadata is exposed
  to users via a flag
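
For reference, the lookup order proposed earlier in this thread (quoted
below) can be sketched roughly as follows. This is only an illustration in
Python, not the actual Kudu code: the function name, its parameters, and the
has_metadata predicate are all hypothetical, and the behavior when no
metadata is found anywhere (use the preferred directory, as for a fresh
cluster) is my assumption rather than something the thread specifies.

```python
from typing import Callable, Sequence


def resolve_metadata_dir(
    metadata_dir_flag: str,
    wal_dir: str,
    data_dirs: Sequence[str],
    has_metadata: Callable[[str], bool],
) -> str:
    """Pick the directory to read metadata from (hypothetical sketch).

    metadata_dir_flag: value of the metadata directory flag; an empty
        string means "use the WAL directory".
    has_metadata: hypothetical predicate that reports whether a directory
        already contains metadata.
    """
    # An empty flag value defaults to the WAL directory.
    preferred = metadata_dir_flag or wal_dir
    if has_metadata(preferred):
        return preferred

    # Fall back to the first data directory if the metadata lives there.
    if data_dirs and has_metadata(data_dirs[0]):
        return data_dirs[0]

    # Assumption: nothing exists yet, so metadata goes in the preferred
    # location.
    return preferred


# Example: the flag is unset and metadata exists only in the first data
# directory, so the fallback is taken.
dirs_with_metadata = {"/data/1"}
print(resolve_metadata_dir("", "/wal", ["/data/1", "/data/2"],
                           dirs_with_metadata.__contains__))  # -> /data/1
```

Passing the existence check in as a predicate keeps the sketch independent
of any particular on-disk layout.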

What's left is, well, the rest of it. We still have no easy way to make
storage changes to metadata, nor do we have the abstractions to address all
of the issues mentioned last time. I've put together a doc
<https://docs.google.com/document/d/1ijh3SGFAR2VXAvT5L1stVPgadLLm1aGWCfJjTdijpX0/edit?usp=sharing>
that outlines a couple of potential steps in that direction and some other
noteworthy issues with our current metadata story. I would appreciate some
feedback.

Thanks,
Andrew


On Wed, Jan 10, 2018 at 3:11 PM Todd Lipcon <to...@cloudera.com> wrote:

> On Wed, Jan 10, 2018 at 11:57 AM, Adar Lieber-Dembo <ad...@cloudera.com>
> wrote:
>
> >
> > As far as specific bottlenecks are concerned, the spreading of LBM
> > metadata across multiple files is a main contributor to our long
> > startup times, and that's one of the biggest scalability bottlenecks
> > AFAIK.
> >
>
> I think the context of the doc Andrew posted is more about tablet-meta and
> consensus-meta and less about LBM meta. Improving LBM meta to be more
> efficiently stored (perhaps rolled into the same storage as other bits of
> metadata) is a nice idea as well, and maybe there are some shared bits of
> infrastructure here though?
>
> (eg if we imported rocksdb, lmdb, or built a simple KV-store abstraction
> maybe we could use it for all these use cases equally)
>
>
> >
> > > With these points in mind, it seems the reasonable path forward is to go
> > > with 1. and introduce a flag for users to colocate WALs and metadata.
> >
> > Makes sense. Will colocation be enabled by default for new clusters?
> > How about using the flag to define an explicit metadata directory,
> > with the default being empty ("use the WAL directory")? That'd make it
> > similar to fs_data_dirs, which is nice. If the metadata directory
> > can't be found in the directory specified by this gflag (or in the WAL
> > directory, if blank), we can fall back to looking in the first data
> > directory.
> >
>
> That sounds reasonable to me.
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


-- 
Andrew Wong