Posted to dev@drill.apache.org by abdelhakim deneche <ad...@gmail.com> on 2015/06/03 05:05:43 UTC

Review Request 34977: DRILL-3200: Add Window functions: ROW_NUMBER, RANK, PERCENT_RANK, DENSE_RANK and CUME_DIST

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34977/
-----------------------------------------------------------

Review request for drill and Steven Phillips.


Bugs: DRILL-3200
    https://issues.apache.org/jira/browse/DRILL-3200


Repository: drill-git


Description
-------

This is a non-final patch: all required window functions have been implemented, but the code still needs to be cleaned up and properly commented. Here is a list of the changes made:

- enum WindowFrameRecordBatch.WindowFunction to handle the supported window functions and their corresponding output MajorType
- renamed WindowFrameTemplate -> DefaultFrameTemplate, cleaned the template to handle the default frame efficiently:
  . a batch can be processed as soon as we find the last peer row of its last row
  . once a batch is processed it can be safely released => we can transfer its value vectors to the container instead of copying them
- DefaultFrameTemplate.Partition tracks the current window frame and computes the following window functions automatically: row_number, rank, dense_rank, percent_rank, cume_dist. It doesn't need to aggregate the value vectors to compute these window functions
- updated TestWindowFrame to check the results of row_number, rank, dense_rank, percent_rank and cume_dist in various cases
  . added a debug config option to MSorter to control the size of batches. This is needed by TestWindowFrame so it can use small test data files (20 rows per batch)
  . removed contrib/data/window-test-data
- WindowFrameRecordBatch properly releases saved batches if the query stops prematurely
- GenerateTestData can be used to generate test data for the window function unit tests [it's a work in progress and can either be improved to make it more developer-friendly or removed from the final patch]
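
To illustrate why the ranking functions need no aggregation over the value vectors, here is a minimal sketch of how the five values can be derived from row positions and peer-group boundaries alone. It is not the code from the patch: the class RankingSketch, its Ranks holder and the compute method are hypothetical names, and it assumes a single partition whose rows are already sorted on one int ORDER BY key.

```java
// Hypothetical sketch, not the patch's DefaultFrameTemplate.Partition.
// Assumes one partition, already sorted on a single int ORDER BY key.
class RankingSketch {

    /** The five ranking values for one row. */
    static final class Ranks {
        final long rowNumber;
        final long rank;
        final long denseRank;
        final double percentRank;
        final double cumeDist;

        Ranks(long rowNumber, long rank, long denseRank,
              double percentRank, double cumeDist) {
            this.rowNumber = rowNumber;
            this.rank = rank;
            this.denseRank = denseRank;
            this.percentRank = percentRank;
            this.cumeDist = cumeDist;
        }
    }

    /** Computes the ranking values for every row of a sorted partition. */
    static Ranks[] compute(int[] sortedKeys) {
        int n = sortedKeys.length;
        Ranks[] out = new Ranks[n];
        long denseRank = 0;
        int start = 0;
        while (start < n) {
            // Find the last peer row: the last row sharing this row's key.
            int end = start;
            while (end + 1 < n && sortedKeys[end + 1] == sortedKeys[start]) {
                end++;
            }
            denseRank++;                 // one step per peer group
            long rank = start + 1;       // position of the first peer
            // percent_rank = (rank - 1) / (rows - 1), defined as 0 for one row
            double percentRank = n > 1 ? (double) (rank - 1) / (n - 1) : 0.0;
            // cume_dist = rows up to and including the last peer / rows
            double cumeDist = (double) (end + 1) / n;
            for (int i = start; i <= end; i++) {
                out[i] = new Ranks(i + 1, rank, denseRank, percentRank, cumeDist);
            }
            start = end + 1;
        }
        return out;
    }
}
```

In this sketch only cume_dist depends on where the peer group ends, which is consistent with processing a batch once the last peer row of its last row has been found; percent_rank and cume_dist also need the partition's total row count.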


Diffs
-----

  contrib/data/pom.xml d1def76 
  contrib/data/window-test-data/pom.xml 6d195da 
  exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java 91793f5 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/DefaultFrameTemplate.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/Partition.java PRE-CREATION 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java 428632f 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameTemplate.java 78bab54 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java 23a2b53 
  exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/MSortTemplate.java 9b21ae3 
  exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractRecordBatch.java 330ec79 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java PRE-CREATION 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java 2b8bd64 
  exec/java-exec/src/test/resources/window/b1.p1.subs.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b1.p1.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b1.p1/0.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b1.p2.subs.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b1.p2.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b1.p2/0.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p2.subs.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p2.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p2/0.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p2/1.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4.subs.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4/0.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4/1.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b3.p2.subs.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b3.p2.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b3.p2/0.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b3.p2/1.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b3.p2/2.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.subs.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4/0.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4/1.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4/2.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4/3.data.json PRE-CREATION 
  exec/java-exec/src/test/resources/window/mediumData.json ad86627 
  exec/java-exec/src/test/resources/window/oneKeyCount.json fa5cd8c 
  exec/java-exec/src/test/resources/window/oneKeyCountData.json 3c0115e 
  exec/java-exec/src/test/resources/window/oneKeyCountMultiBatch.json 09a405c 
  exec/java-exec/src/test/resources/window/twoKeys.json f3ef4a5 
  exec/java-exec/src/test/resources/window/twoKeysData.json fd09236 

Diff: https://reviews.apache.org/r/34977/diff/


Testing
-------


Thanks,

abdelhakim deneche