Posted to dev@drill.apache.org by Ted Dunning <te...@gmail.com> on 2015/07/21 02:00:44 UTC

question about UDF optimization

*Summary:*

Drill is very aggressive about optimizing away calls to functions with
constant arguments. I worry that this could extend to per-record-batch
optimization if I accidentally have constant values. Even if that doesn't
happen, it is a pain in the ass now, largely because Drill is clever enough
to see through my attempts to hide the constant nature of my parameters.

*Question:*

Is there a way to mark a UDF as not being a pure function?

*Details:*

I have written a UDF to generate a random number.  It takes parameters that
define the distribution.  All seems well and good.

I find, however, that the function is only called once (twice, actually,
apparently due to pipeline warmup) and then Drill optimizes away later
calls, apparently because the parameters to the function are constant and
Drill thinks my function is a pure function.  If I make up some bogus data
to pass in as a parameter, all is well and the function is called as often
as I want.

For instance, with the uniform distribution, my function takes two
arguments, those being the minimum and maximum value to return.  Here is
what I see with constants for the min and max:

0: jdbc:drill:zk=local> select random(0,10) from (values 5,5,5,5) as tbl(x);
into eval
into eval
+---------------------+
|       EXPR$0        |
+---------------------+
| 1.7787372583008298  |
| 1.7787372583008298  |
| 1.7787372583008298  |
| 1.7787372583008298  |
+---------------------+


If I include an actual value, we see more interesting behavior even if the
value is effectively constant:

0: jdbc:drill:zk=local> select random(0,x) from (values 5,5,5,5) as tbl(x);
into eval
into eval
into eval
into eval
+----------------------+
|        EXPR$0        |
+----------------------+
| 3.688377805419459    |
| 0.2827056410711032   |
| 2.3107479622644918   |
| 0.10813788169218574  |
+----------------------+
4 rows selected (0.088 seconds)


Even if I make the max value come from the sub-query, I get the evil
behavior, although the function is now, surprisingly, called three times,
apparently something to do with warming up the pipeline:

0: jdbc:drill:zk=local> select random(0,max_value) from (select 14 as
max_value,x from (values 5,5,5,5) as tbl(x)) foo;
into eval
into eval
into eval
+---------------------+
|       EXPR$0        |
+---------------------+
| 13.404462063773702  |
| 13.404462063773702  |
| 13.404462063773702  |
| 13.404462063773702  |
+---------------------+
4 rows selected (0.121 seconds)

The UDF itself is boring and can be found at
https://gist.github.com/tdunning/0c2cc2089e6cd8c030c0

So how can I defeat this behavior?
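
For readers following along, the behavior described above can be imitated
with a toy model. This is only a sketch of constant folding under a "pure
function" assumption, not Drill's planner: the class ConstantFoldingSketch,
the uniform() stand-in for the UDF, and the maxIsLiteral/assumePure flags
are all hypothetical names made up for illustration. It also does not
reproduce the extra warmup calls seen in the traces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Toy model of constant folding. Hypothetical; this is NOT Drill's planner.
class ConstantFoldingSketch {
    // Counts how often the function body runs, like the "into eval" trace.
    static int evalCalls = 0;
    static final Random RNG = new Random();

    // Hypothetical stand-in for the uniform random UDF: a value in [min, max).
    static double uniform(double min, double max) {
        evalCalls++;
        return min + (max - min) * RNG.nextDouble();
    }

    // Evaluates uniform(min, max) once per row, unless the max argument is a
    // literal and the function is assumed pure. In that case the call is
    // evaluated a single time and the result is reused for every row.
    static List<Double> project(double min, double[] maxPerRow,
                                boolean maxIsLiteral, boolean assumePure) {
        List<Double> out = new ArrayList<>();
        if (maxIsLiteral && assumePure && maxPerRow.length > 0) {
            double folded = uniform(min, maxPerRow[0]);
            for (int i = 0; i < maxPerRow.length; i++) {
                out.add(folded);
            }
            return out;
        }
        for (double max : maxPerRow) {
            out.add(uniform(min, max));
        }
        return out;
    }

    public static void main(String[] args) {
        double[] tens = {10, 10, 10, 10};
        double[] fives = {5, 5, 5, 5};

        evalCalls = 0; // like random(0,10): literal arguments, assumed pure
        System.out.println(project(0, tens, true, true) + " calls=" + evalCalls);

        evalCalls = 0; // like random(0,x): argument comes from a column
        System.out.println(project(0, fives, false, true) + " calls=" + evalCalls);

        evalCalls = 0; // literal arguments, but function not assumed pure
        System.out.println(project(0, tens, true, false) + " calls=" + evalCalls);
    }
}
```

In this model the first case runs the function body once and repeats the
value, while the other two run it once per row, which matches the shape of
the results above.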

Re: question about UDF optimization

Posted by Chris Westin <ch...@gmail.com>.

Yep, I would expect pure to be the majority and default, and that makes
sense, because these functions are not class members that could have a
(implicit) "this" pointer that references member variables whose state
would change, leading to implementations with side-effects (and impure
results).

On Tue, Jul 21, 2015 at 5:25 PM, Ted Dunning <te...@gmail.com> wrote:

> Even in my own warped experience, the vast majority of UDFs I have written
> or considered writing have been pure.

Re: question about UDF optimization

Posted by Ted Dunning <te...@gmail.com>.

Even in my own warped experience, the vast majority of UDFs I have written
or considered writing have been pure.



On Tue, Jul 21, 2015 at 4:27 PM, Jacques Nadeau <ja...@dremio.com> wrote:

> I don't think so.  There are something like 1500 functions that are pure
> (the default) and one or two that are not.

Re: question about UDF optimization

Posted by Jacques Nadeau <ja...@dremio.com>.

I don't think so.  There are something like 1500 functions that are pure
(the default) and one or two that are not.

On Tue, Jul 21, 2015 at 4:25 PM, Daniel Barclay <db...@maprtech.com>
wrote:

>
> Should Drill be defaulting the other way?
>
> That is, instead of assuming pure unless declared otherwise (leading to
> wrong results in the case that the assumption is wrong (or the annotation
> was forgotten)), should Drill be assuming not pure unless declared pure
> (leading to only lower performance in the wrong-assumption case)?
>
> Daniel

Re: question about UDF optimization

Posted by Daniel Barclay <db...@maprtech.com>.

Should Drill be defaulting the other way?

That is, instead of assuming pure unless declared otherwise (leading to
wrong results in the case that the assumption is wrong (or the annotation
was forgotten)), should Drill be assuming not pure unless declared pure
(leading to only lower performance in the wrong-assumption case)?

Daniel



Jacques Nadeau wrote:
> There is an annotation on the function template.  I don't have a laptop
> close but I believe it is something similar to isRandom. It basically tells
> Drill that this is a nondeterministic function. I will be more specific
> once I get back to my machine if you don't find it sooner.
>
> Jacques


-- 
Daniel Barclay
MapR Technologies

Re: question about UDF optimization

Posted by Ted Dunning <te...@gmail.com>.

I found the option.  Works a charm.  I will add it to the FAQ list.

On Mon, Jul 20, 2015 at 5:11 PM, Jacques Nadeau <ja...@dremio.com> wrote:

> There is an annotation on the function template.  I don't have a laptop
> close but I believe it is something similar to isRandom. It basically tells
> Drill that this is a nondeterministic function. I will be more specific
> once I get back to my machine if you don't find it sooner.
>
> Jacques

Re: question about UDF optimization

Posted by Jacques Nadeau <ja...@dremio.com>.

There is an annotation on the function template.  I don't have a laptop
close but I believe it is something similar to isRandom. It basically tells
Drill that this is a nondeterministic function. I will be more specific
once I get back to my machine if you don't find it sooner.

Jacques
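
To make the idea concrete, here is a minimal sketch of how a flag on a
function annotation could gate constant folding. The FunctionTemplate
annotation below is a local stand-in declared inside the example, not
Drill's real annotation class, and the attribute name isRandom is taken
from the recollection above ("something similar to isRandom"), so check the
annotation in your Drill version for the exact name. UniformRandomFunction,
AddFunction, and PurityCheck are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in annotation for illustration only. It is NOT Drill's
// FunctionTemplate; the isRandom attribute name is an assumption.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface FunctionTemplate {
    String name();
    boolean isRandom() default false; // pure unless declared otherwise
}

// Hypothetical nondeterministic function: opts out of folding.
@FunctionTemplate(name = "random", isRandom = true)
class UniformRandomFunction {}

// Hypothetical pure function: relies on the default.
@FunctionTemplate(name = "add")
class AddFunction {}

class PurityCheck {
    // A call with constant arguments may be folded only when the function
    // carries the annotation and is not marked as random.
    static boolean mayFoldConstantCall(Class<?> fn) {
        FunctionTemplate template = fn.getAnnotation(FunctionTemplate.class);
        return template != null && !template.isRandom();
    }

    public static void main(String[] args) {
        // prints "random foldable: false"
        System.out.println("random foldable: "
                + mayFoldConstantCall(UniformRandomFunction.class));
        // prints "add foldable: true"
        System.out.println("add foldable: "
                + mayFoldConstantCall(AddFunction.class));
    }
}
```

The default of false keeps the common pure case free of extra annotation
work, at the cost that a forgotten flag on a nondeterministic function
yields repeated values rather than an error.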