Posted to derby-dev@db.apache.org by Satheesh Bandaram <sa...@Sourcery.Org> on 2005/02/03 19:19:53 UTC

Re: derby's function system


shahbaz chaudhary wrote:

> A question about how some functionality is implemented.  It looks like
> several functions (actually all functions other than the aggregate
> functions) are implemented along with their data types.  For example
> hour/day extractors seem to be part of SQLTime/SQLTimestamp/etc. data
> types.  LENGTH/RTRIM/LTRIM seem to be part of the string datatype. 
> Isn't there a disconnect? While the aggregate functions are laid out
> in a hierarchy, scalar functions are not.
>
> If I wanted a new datatype varcharx, will I have to reimplement all
> the string related scalar functions?
>
This would depend on what is in your varcharx datatype. If it is a minor
variation of an existing type, you might be able to extend the existing
type's implementation. For example, the substring() builtin is defined in
SQLChar.java, which represents the CHAR datatype. It is inherited by
SQLVarchar.java, which is used for VARCHAR, and likewise SQLClob.java
extends this VARCHAR implementation.

But if your varcharx has totally different semantics and manages a
different type of data, you may have to implement your own substring()
for your datatype.
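
As a rough sketch of the inheritance idea (the class names below are
hypothetical stand-ins, not Derby's actual SQLChar/SQLVarchar classes or
their real method signatures), a scalar function defined once on the base
string type is picked up unchanged by a subtype that is only a minor
variation:

```java
// Hypothetical base string type; stands in for the role SQLChar plays.
class CharValue {
    protected final String value;

    CharValue(String value) {
        this.value = value;
    }

    // Defined once here; start is 1-based, as in SQL's SUBSTR.
    String substring(int start, int length) {
        int begin = Math.max(start - 1, 0);
        if (begin >= value.length() || length <= 0) {
            return "";
        }
        // Use long arithmetic so a very large length cannot overflow.
        int end = (int) Math.min((long) begin + length, value.length());
        return value.substring(begin, end);
    }
}

// Hypothetical VARCHAR-like type: inherits substring() as-is.
class VarcharValue extends CharValue {
    VarcharValue(String value) {
        super(value);
    }
}

// Hypothetical "varcharx": a minor variation that only adds what differs.
class VarcharXValue extends VarcharValue {
    VarcharXValue(String value) {
        super(value);
    }

    String typeName() {
        return "VARCHARX";
    }
}
```

A type with genuinely different semantics would override substring()
instead of inheriting it.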

> What if I wanted to build a bridge to use an existing library of
> (scalar) functions, won't this setup make that more difficult (think
> various regex libraries)?
>
> Regarding aggregators, it looks like they are implemented by operating
> on two values at a time (looking at SumAggregator): a value to be
> added and a value which was populated previously (presumably
> containing aggregate result of column already traversed).
>
> Again, if I wanted to build a bridge to existing libraries which might
> operate on vectors rather than a scalar value, I would not be able to
> do that since most such libraries expect to receive a whole set of
> values at once (in an array or List form).  I'm specifically thinking
> of the scientific COLT library.
>
> How would I implement a typical 'approximating' function: an AVG which
> doesn't sum up everything but SUMs x% of the values (randomly
> selected) and divides by x% of rows?  In other words, now I have to
> pass around another parameter or some sort of context object which
> contains some extra information. (just an example of a possible problem)
>
You should be able to keep more context information if you need to. I am
not familiar with the COLT library, but if you want to implement an
AvgTopThree() aggregate that returns the average of the top three
numbers, you should be able to.
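
For the approximating AVG asked about above, a minimal sketch might keep
the sampling rate and running totals as the aggregator's own state. The
class and method names here are assumptions for illustration only and are
not Derby's aggregator interface:

```java
import java.util.Random;

// Hypothetical sampled-average aggregator; the extra "context" is simply
// the fields of this object.
class SampledAvgAggregator {
    private final double sampleRate; // fraction of rows to include, in (0, 1]
    private final Random random;     // seeded so runs are repeatable
    private double sum;
    private long sampled;

    SampledAvgAggregator(double sampleRate, long seed) {
        if (sampleRate <= 0.0 || sampleRate > 1.0) {
            throw new IllegalArgumentException("sampleRate must be in (0, 1]");
        }
        this.sampleRate = sampleRate;
        this.random = new Random(seed);
    }

    // Called once per incoming row value.
    void accumulate(double value) {
        // nextDouble() is in [0, 1), so a rate of 1.0 includes every row.
        if (random.nextDouble() < sampleRate) {
            sum += value;
            sampled++;
        }
    }

    // Returns null when no rows were sampled, mirroring SQL AVG over no rows.
    Double result() {
        return sampled == 0 ? null : sum / sampled;
    }
}
```

Dividing by the number of rows actually sampled, rather than by x% of the
total row count, avoids needing the total up front.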

> Just trying to make sure the mental model of the code I've studied so
> far is an accurate one.  Thanks.
>
I think you got it... :-)

> Falcon
>
Satheesh

Re: derby's function system

Posted by Daniel John Debrunner <dj...@debrunners.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


> shahbaz chaudhary wrote:

>>
>>Regarding aggregators, it looks like they are implemented by operating
>>on two values at a time (looking at SumAggregator): a value to be
>>added and a value which was populated previously (presumably
>>containing aggregate result of column already traversed).

To be a little clearer, the aggregator operates on a single incoming
value and a context object specific to the aggregator function. This
context object is where you would maintain your list of previous values
if required.
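
One way to bridge to a library that expects a whole vector, sketched
under the assumption that the library call can be wrapped as a function
from double[] to double (the class below and its method names are
hypothetical, not part of Derby or COLT), is to buffer values in the
context and hand them over once at the end:

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

// Hypothetical aggregator whose context is a growing buffer of all values.
class BufferingAggregator {
    private double[] buffer = new double[16];
    private int count;
    private final ToDoubleFunction<double[]> vectorFunction;

    // vectorFunction stands in for a call into a vector-oriented library.
    BufferingAggregator(ToDoubleFunction<double[]> vectorFunction) {
        this.vectorFunction = vectorFunction;
    }

    // Called once per incoming row value; grows the buffer when full.
    void accumulate(double value) {
        if (count == buffer.length) {
            buffer = Arrays.copyOf(buffer, count * 2);
        }
        buffer[count++] = value;
    }

    // Passes exactly the values seen so far to the vector function.
    double finish() {
        return vectorFunction.applyAsDouble(Arrays.copyOf(buffer, count));
    }
}
```

The cost is memory proportional to the number of rows in the group, which
the incremental aggregators avoid.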

E.g., for the max aggregator the context object would simply contain the
current maximum value; for a top-three average it would maintain the top
three values and calculate the final value at finish time.
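
A minimal sketch of the top-three average described above, with the
context held as a small min-heap (the class and method names are
hypothetical and do not reflect Derby's actual aggregator classes):

```java
import java.util.PriorityQueue;

// Hypothetical top-three average: the context is the three largest values.
class TopThreeAvgAggregator {
    // Min-heap, so the smallest of the retained values is evicted first.
    private final PriorityQueue<Double> topThree = new PriorityQueue<>();

    // Called once per incoming row value.
    void accumulate(double value) {
        topThree.add(value);
        if (topThree.size() > 3) {
            topThree.poll(); // drop the smallest, keeping the top three
        }
    }

    // The final value is computed only at finish time.
    // Returns null if no values were seen.
    Double finish() {
        if (topThree.isEmpty()) {
            return null;
        }
        double sum = 0;
        for (double v : topThree) {
            sum += v;
        }
        return sum / topThree.size();
    }
}
```

With fewer than three input values, this sketch averages whatever it has;
a different policy would be an equally reasonable choice.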

Dan.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFCAnAEIv0S4qsbfuQRAq+6AKC4hHKjzNb7kRkCif3jMWSF48gRawCfX80b
0wti7Le5a0dylpqaWjqgn0c=
=4U4m
-----END PGP SIGNATURE-----