You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by "Rick Hillegas (JIRA)" <ji...@apache.org> on 2012/07/16 20:04:36 UTC

[jira] [Commented] (DERBY-672) Re-enable user defined aggregates

    [ https://issues.apache.org/jira/browse/DERBY-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415462#comment-13415462 ] 

Rick Hillegas commented on DERBY-672:
-------------------------------------

Here is a proposal for creating/dropping user defined aggregates in Derby. If this seems reasonable, I will write a functional spec. I would appreciate feedback on:

1) Whether the syntax is acceptable.

2) Whether the restriction to Java 5 (and above) is acceptable.

Thanks,
-Rick

--------------------------------

Because there is no SQL Standard syntax for user defined aggregates, I see only two ways to offer this frequently requested feature:

i) Introduce new system procedures to create and drop user defined aggregates.

ii) Introduce Derby-specific syntax.

Approach (i) might look something like this:

sys.create_aggregate
(
    aggregateSchema varchar( 128 ),
    aggregateName varchar( 128 ),
    valueDataTypeSchema varchar( 128 ),
    valueDataTypeName varchar( 128 ),
    returnTypeSchema varchar( 128 ),
    returnTypeName varchar( 128 ),
    aggregateClassName varchar( 32672 )
)

sys.drop_aggregate
(
    aggregateSchema varchar( 128 ),
    aggregateName varchar( 128 )
)

Approach (ii) might look something like the following. Note that the extra "derby" keyword flags these statements as Derby extensions and protects us from syntax conflicts in case ANSI/ISO decide to introduce standard syntax in the future:

create derby aggregate [<schemaName>.]<aggregateName> for <argumentDataType>
returns <returnDataType>
external name <className>

drop derby aggregate [<schemaName>.]<aggregateName>


The two approaches would look like this to the user:

    call sys.create_aggregate ( 'APP', 'MODE', null, 'int', null, 'int', 'com.mycompany.myapp.aggs.Mode' );

    call sys.drop_aggregate ( 'APP', 'MODE' );

vs.:

    create derby aggregate MODE for int
    returns int
    external name 'com.mycompany.myapp.aggs.Mode';

    drop derby aggregate MODE;


Here's how I rate these two approaches:

(i):

+ Compact
- Cryptic
- Suffers the same identifier casing problems which mar our other system procedures.

(ii):

+ Readable
- Verbose


I think that approach (ii) is more elegant and attractive.

--------------------------------

Regardless of how we declare and drop user defined aggregates, they would be invoked just like builtin aggregates. For example:

    select age, mode( salary )
    from employee
    group by age;


--------------------------------

User defined aggregates look like parameterized types to me. That means that they would be available on platforms operating at level Java 5 or higher. They would not be available on CDC platforms. To run them on small devices, you would need Java Embedded SE.

I think that a user defined aggregate is a class which implements the following interface. A little mapping code would be necessary to map between this interface and the Java implementations expected by Postgres, IBM, and Oracle. Regardless of the Java api we require, some (probably trivial) re-coding would be necessary to port an aggregate between Derby and Microsoft's .NET apis.

Note that the interface extends Serializable. That is because Derby may have to serialize these objects when sorts spill to disk.


package org.apache.derby.agg;

import java.io.Serializable;
import java.sql.SQLException;

/**
 * <p>
 * Behavior of a user-defined Derby aggregator. Aggregates values
 * of type V and returns a result of type R. In addition to the methods
 * in the interface, implementing classes must have a 0-arg public
 * constructor.
 * </p>
 */
public interface Aggregator<V,R>    extends Serializable
{
    /** Initialize the Aggregator */
    public void init()  throws SQLException;

    /** Accumulate the next scalar value */
    public  void    accumulate( V value )   throws SQLException;

    /**
     * For merging another partial result into this Aggregator.
     * This lets the SQL interpreter divide the incoming rows into
     * subsets, aggregating each subset in isolation, and then merging
     * the partial results together.
     */
    public  void    merge( Aggregator<V,R> otherAggregator )    throws SQLException;

    /** Return the result scalar value */
    public  R   terminate() throws SQLException;
}

                
> Re-enable user defined aggregates
> ---------------------------------
>
>                 Key: DERBY-672
>                 URL: https://issues.apache.org/jira/browse/DERBY-672
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Rick Hillegas
>
> Nicolas Dufour in an email thread titled "functions and list" started on November 2, 2005 requests the ability to create user defined aggregates.
> This functionality used to be in Cloudscape. It was disabled presumably because it was considered non-standard. However, most of the machinery needed for this feature is still in the code. We should re-enable user defined aggregates after we agree on acceptable syntax.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira