You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@druid.apache.org by Abhishek Agarwal <ab...@apache.org> on 2024/02/01 13:48:44 UTC

Re: [Discuss] deprecating and eventually removing non-sql compatible configs

@clint - Those configs are already in SQL compatible mode by default since
28. Aren't they?

To the original question, I am ok to deprecate these configs given that we
have enough releases for folks to migrate over their queries.

On Fri, Jan 26, 2024 at 4:55 AM Clint Wylie <cw...@apache.org> wrote:

> >Are there any performance implications in making those the defaults?
> For numeric columns without null values there should be no impact. For
> numeric columns with null values, I believe there is currently a very
> minor performance impact for scans/aggregations since the null value
> bitmap is iterated to check which rows are null. String performance
> should not be changed afaik. I will try to find some time to do some
> benchmarking to ensure this is true and measure the stuff to confirm
> this, and look for opportunities to improve things where possible in
> order to make this happen, since performance is important. Longer
> term, I have a hunch that if we actually delete all of the default
> value mode code there might also be some performance improvements due
> to smaller function bytecode sizes and the lack of 'if sql compatible
> mode do this, else do this' being littered everywhere across the
> codebase.
>
> >Would this break compatibility for any existing native queries?
> There are a handful of query shapes which might have some differences,
> the most common are probably with strings since in default value mode
> the empty string and null are treated interchangeably, so these kinds
> of queries will need to be modified.
>
> There are also the possibility for some subtle differences in some
> numeric expressions due to the 'pretend nulls do not exist' behavior
> of the default value mode config, in default value mode you can write
> arguably nonsense expressions that end up evaluating stuff like "3 +
> 'hello'" and it will evaluate to 3, while in SQL compatible mode the
> 'hello' will be unable to be implicitly cast to a number and become
> zero, resulting in null.
>
> Strict boolean mode also has some subtle differences if selecting the
> outputs, since it homogenizes all true/false values to 1/0 instead of
> the javascript style behavior when it is off, where the truthy values
> are just passed through.
>
> https://druid.apache.org/docs/latest/querying/math-expr#logical-operator-modes
>
> The other main change in behavior is due to the SQL three-value logic,
> where a filter like "x != 'hello'" today will in default value mode
> match rows with null values, but in SQL compatible mode will not match
> nulls. These queries will need to be rewritten into the form "x !=
> 'hello' OR x is null", or there is also some native filters added
> recently which could be used to wrap to make the SQL equivalent of "(x
> = 'hello') is not true", https://github.com/apache/druid/pull/15182,
> or "not(istrue(eq(x, 'hello'))". I'm actually open to the idea of
> keeping the toggle config for 2vl/3vl since it would need checked in
> very few places to control the behavior (basically in not filter), my
> main goal here is to eventually get rid of
> druid.generic.useDefaultValueForNull.
>
> I plan to dig into all of this in greater depth in the migration guide
> mentioned in the first email to make the transition as painless as
> possible before we remove it.
>
> On Mon, Jan 22, 2024 at 9:54 AM Xavier Léauté
> <xa...@confluent.io.invalid> wrote:
> >
> > Two questions:
> > - Are there any performance implications in making those the defaults?
> > - Would this break compatibility for any existing native queries?
> >
> > On Wed, Jan 17, 2024 at 5:53 PM Clint Wylie <cw...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I wanted to discuss the deprecation and removal of the Druid configs
> > > related to SQL compatibility, with the eventual goal of always running
> > > in SQL compatible mode. As of Druid 28, these are the default, but in
> > > the interest of dramatically reducing complexity and removing a ton of
> > > code, and also cutting a lot of our CI time in half, I would
> > > eventually like to remove the related configs and the code that
> > > handles the now non-default behaviors completely.
> > >
> > > The related configs are:
> > >
> > > runtime.properties:
> > > druid.generic.useDefaultValueForNull  - (will become always false)
> > > druid.expressions.useStrictBooleans - (will become always true)
> > > druid.generic.useThreeValueLogicForNativeFilters - (will become always
> > > true)
> > > druid.generic.ignoreNullsForStringCardinality - (irrelevant if
> > > druid.generic.useDefaultValueForNull=false)
> > >
> > > query context:
> > > sqlUseBoundAndSelectors - (this is moderately related, and defaults to
> > > value of druid.generic.useDefaultValueForNull, but enhancements to
> > > expressions and sql planning for lookups make this totally unnecessary
> > > to keep around)
> > >
> > > other things to dump while we are at it:
> > > druid.expressions.allowNestedArrays - (will become always true)
> > >
> > > There might be additional configs which can also be removed, we can
> > > add to this thread if we can think of them.
> > >
> > > I would like to get the official deprecation process started now with
> > > Druid 29, and imagine actually removing them sometime towards the end
> > > of the year, so maybe Druid 32 or so?
> > >
> > > Before completely removing them I think I would like to get a more in
> > > depth migration guide in place, to help any hold-outs that are
> > > overriding the now default SQL compatible configs so that things still
> > > run in the legacy mode.
> > >
> > > Thoughts? Concerns?
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@druid.apache.org
> > > For additional commands, e-mail: dev-help@druid.apache.org
> > >
> > >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@druid.apache.org
> For additional commands, e-mail: dev-help@druid.apache.org
>
>