You are viewing a plain text version of this content. The canonical link for it is here.
Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2022/03/23 17:17:42 UTC

[GitHub] [cassandra] adelapena commented on a change in pull request #1510: CASSANDRA-11871: Allow to aggregate by time intervals

adelapena commented on a change in pull request #1510:
URL: https://github.com/apache/cassandra/pull/1510#discussion_r833496195



##########
File path: src/java/org/apache/cassandra/cql3/Duration.java
##########
@@ -395,6 +409,113 @@ private static long append(StringBuilder builder, long dividend, long divisor, S
         return dividend % divisor;
     }
 
+    /**
+     * Rounds a timestamp down to the closest multiple of a duration.
+     *
+     * @param timeInMillis the time to round in millisecond
+     * @param duration the duration
+     * @param startingTimeInMillis the time offset in milliseconds
+     * @return the timestamp rounded down to the closest multiple of the duration
+     */
+    public static long floorTimestamp(long timeInMillis, Duration duration, long startingTimeInMillis)
+    {
+        checkFalse(startingTimeInMillis > timeInMillis, "The floor function starting time is greater than the provided time");
+        checkFalse(duration.isNegative(), "Negative durations are not supported by the floor function");
+
+        // If the duration does not contains any months we can ignore daylight saving,
+        // as time zones are not supported, and simply look at the milliseconds
+        if (duration.months == 0)
+        {
+            // We can ignore daylight saving as time zones are not supported
+            long durationInMillis = (duration.days * MILLIS_PER_DAY) + (duration.nanoseconds / NANOS_PER_MILLI);
+
+            // If the duration is smaller than millisecond
+            if (durationInMillis == 0)
+                return timeInMillis;
+
+            long delta = (timeInMillis - startingTimeInMillis) % durationInMillis;
+            return timeInMillis - delta;
+        }
+
+        /*
+         * Otherwise, we resort to Calendar for the computation.
+         * What we're trying to compute is the largest integer 'multiplier' value such that
+         *   startingTimeMillis + (multiplier * duration) <= timeInMillis
+         * at which point we want to return 'startingTimeMillis + (multiplier * duration)'.
+         *
+         * One option would be to add 'duration' to 'statingTimeMillis' in a loop until we
+         * cross 'timeInMillis' and return how many iterator we did. But this might be slow if there is very many
+         * steps.
+         *
+         * So instead we first estimate 'multiplier' using the number of months between 'startingTimeMillis'
+         * and 'timeInMillis' ('durationInMonths' below) and the duration months. As the real computation
+         * should also take the 'days' and 'nanoseconds' parts of the duration, this multiplier may overshoot,
+         * so we detect it and work back from that, decreasing the multiplier until we find the proper one.
+         */
+
+        Calendar calendar = CALENDAR_PROVIDER.get();
+
+        calendar.setTimeInMillis(timeInMillis);
+        int year = calendar.get(Calendar.YEAR);
+        int month = calendar.get(Calendar.MONTH);
+
+        calendar.setTimeInMillis(startingTimeInMillis);
+        int startingYear = calendar.get(Calendar.YEAR);
+        int startingMonth = calendar.get(Calendar.MONTH);
+
+        int durationInMonths = (year - startingYear) * MONTHS_PER_YEAR + (month - startingMonth);
+        int multiplier = durationInMonths / duration.months;
+
+        calendar.add(Calendar.MONTH, multiplier * duration.months);
+
+        // If the duration was only containing months, we are done.
+        if (duration.days == 0 && duration.nanoseconds == 0)
+            return calendar.getTimeInMillis();
+
+        long durationInMillis = (duration.days * MILLIS_PER_DAY) + (duration.nanoseconds / NANOS_PER_MILLI);
+        long floor = calendar.getTimeInMillis() + (multiplier * durationInMillis);
+
+        // Once the milliseconds have been added we might have gone too far. If it is the case we will reduce the
+        // multiplier until the floor value is smaller than time in millis.
+        while (floor > timeInMillis)
+        {
+            multiplier--;
+            calendar.add(Calendar.MONTH, -duration.months);
+            floor = calendar.getTimeInMillis() + (multiplier * durationInMillis);
+        }
+
+        return floor < startingTimeInMillis ? startingTimeInMillis : floor;

Review comment:
       We can return `Math.max(floor, startingTimeInMillis)`

##########
File path: src/java/org/apache/cassandra/cql3/functions/NativeScalarFunction.java
##########
@@ -38,4 +41,16 @@ public final boolean isAggregate()
     {
         return false;
     }
+
+    /**
+     * Checks if a partial application of the function is monotonic.
+     *
+     *<p>A function is monotonic if it is either entirely nonincreasing or nondecreasing.</p>

Review comment:
       ```suggestion
        * <p>A function is monotonic if it is either entirely nonincreasing or nondecreasing.</p>
   ```
   ```suggestion
        *<p>A function is monotonic if it is either entirely nonincreasing or nondecreasing.</p>
   ```

##########
File path: src/java/org/apache/cassandra/cql3/functions/ScalarFunction.java
##########
@@ -39,4 +51,49 @@
      * @throws InvalidRequestException if this function cannot not be applied to the parameter
      */
     public ByteBuffer execute(ProtocolVersion protocolVersion, List<ByteBuffer> parameters) throws InvalidRequestException;
+
+    /**
+     * Does a partial application of the function. That is, given only some of the parameters of the function provided,
+     * return a new function that only expect the parameters not provided.
+     * <p>
+     * To take an example, if you consider the function
+     * <pre>
+     *     text foo(int a, text b, text c, int d)
+     * </pre>
+     * then {@code foo.partialApplication([3, <ommitted>, 'bar', <omitted>])} will return a function {@code bar} of signature:
+     * <pre>
+     *     text bar(text b, int d)
+     * </pre>
+     * and such that for any value of {@code b} and {@code d}, {@code bar(b, d) == foo(3, b, 'bar', d)}.
+     *
+     * @param protocolVersion protocol version used for parameters
+     * @param partialParameters a list of input parameters for the function where some parameters can be {@link #UNRESOLVED}.
+     *                          The input <b>must</b> be of size {@code this.argsType().size()}. For convenience, it is
+     *                          allowed both to pass a list with all parameters being {@link #UNRESOLVED} (the function is
+     *                          then returned directy) and with none of them unresolved (in which case the function is computed
+     *                          and a dummy no-arg function returning the result is returned).

Review comment:
       Maybe we could mention that that dummy no-arg function will be precomputed if it's pure?

##########
File path: src/java/org/apache/cassandra/cql3/functions/ScalarFunction.java
##########
@@ -39,4 +51,49 @@
      * @throws InvalidRequestException if this function cannot not be applied to the parameter
      */
     public ByteBuffer execute(ProtocolVersion protocolVersion, List<ByteBuffer> parameters) throws InvalidRequestException;
+
+    /**
+     * Does a partial application of the function. That is, given only some of the parameters of the function provided,
+     * return a new function that only expect the parameters not provided.
+     * <p>
+     * To take an example, if you consider the function
+     * <pre>
+     *     text foo(int a, text b, text c, int d)
+     * </pre>
+     * then {@code foo.partialApplication([3, <ommitted>, 'bar', <omitted>])} will return a function {@code bar} of signature:
+     * <pre>
+     *     text bar(text b, int d)
+     * </pre>
+     * and such that for any value of {@code b} and {@code d}, {@code bar(b, d) == foo(3, b, 'bar', d)}.
+     *
+     * @param protocolVersion protocol version used for parameters
+     * @param partialParameters a list of input parameters for the function where some parameters can be {@link #UNRESOLVED}.
+     *                          The input <b>must</b> be of size {@code this.argsType().size()}. For convenience, it is
+     *                          allowed both to pass a list with all parameters being {@link #UNRESOLVED} (the function is
+     *                          then returned directy) and with none of them unresolved (in which case the function is computed
+     *                          and a dummy no-arg function returning the result is returned).
+     * @return a function corresponding to the partial application of this function to the parameters of
+     * {@code partialParameters} that are not {@link #UNRESOLVED}.
+     */
+    public default ScalarFunction partialApplication(ProtocolVersion protocolVersion, List<ByteBuffer> partialParameters)

Review comment:
       Not strictly related to the changes, but the class-level JavaDoc for `ScalarFunction` says `Determines a single output value based on a single input value.`, when a scalar function can actually be applied to any number of values, as we do here. Maybe we can change that description to something like `Determines a single output value based on any number of input values.`

##########
File path: src/java/org/apache/cassandra/cql3/functions/Function.java
##########
@@ -40,6 +49,13 @@
      */
     public boolean isNative();
 
+    /**
+     * Checks whether the function is a pure function (as in doesn't depend on, nor produce side effects) or not.
+     *
+     * @return <code>true</code> if the function is a pure function, <code>false</code> otherwise.

Review comment:
       Nit: we could use the more concise `{@code true} if the function is a pure function, {@code false} otherwise.`

##########
File path: src/java/org/apache/cassandra/cql3/functions/NativeScalarFunction.java
##########
@@ -38,4 +41,16 @@ public final boolean isAggregate()
     {
         return false;
     }
+
+    /**
+     * Checks if a partial application of the function is monotonic.
+     *
+     *<p>A function is monotonic if it is either entirely nonincreasing or nondecreasing.</p>

Review comment:
       ```suggestion
        * <p>A function is monotonic if it is either entirely nonincreasing or nondecreasing.</p>
        *
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org
For additional commands, e-mail: pr-help@cassandra.apache.org