You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/08/01 23:36:21 UTC

[GitHub] [druid] kaplanmaxe opened a new pull request #10230: Adding bitwise operation expressions

kaplanmaxe opened a new pull request #10230:
URL: https://github.com/apache/druid/pull/10230


   Related to #8560 
   
   This feature adds several bitwise expressions to be used on ingestion.
   
   **Use Case**
   
   Say you're doing CDC and your source table has a column of binary flags. For OLAP queries, it's extremely useful to have these flags extracted out into their own dimensions which can be done now on ingestion via the new expressions.
   
   Example:
   
   - You might have a column in your source DB called `flags` that represents a series of binary flags. Bit 1 might mean an order was placed on mobile, bit 2 might mean it was purchased on web, bit 4 might mean it was the users first purchase.
   - Instead of ingesting the raw integer value of the `flags` column, you can break these columns out into something like `mobile_ind`, `web_ind`, `first_purchase_ind` where your expression can be `bitwiseAnd(flags, 1)`, `bitwiseAnd(flags, 2)`, `bitwiseAnd(flags, 4)`.
   
   Druid SQL currently does not support bitwise operations (#8560) which makes these even more valuable IMO.
   
   **Tested on a local cluster**
   
   Ingestion spec:
   
   ```
   {
     "type": "index_parallel",
     "spec": {
       "ioConfig": {
         "type": "index_parallel",
         "inputSource": {
           "type": "inline",
           "data": "\"x\",\"y\"\n4,2\n8,4\n16,8\n3,2\n5,2"
         },
         "inputFormat": {
           "type": "csv",
           "findColumnsFromHeader": true
         }
       },
       "tuningConfig": {
         "type": "index_parallel",
         "partitionsSpec": {
           "type": "dynamic"
         }
       },
       "dataSchema": {
         "dataSource": "bitwise_test",
         "granularitySpec": {
           "type": "uniform",
           "queryGranularity": "NONE",
           "rollup": false,
           "segmentGranularity": "YEAR"
         },
         "timestampSpec": {
           "column": "!!!_no_such_column_!!!",
           "missingValue": "2010-01-01T00:00:00Z"
         },
         "transformSpec": {
           "transforms": [
             {
               "type": "expression",
               "name": "zBitwiseAnd",
               "expression": "bitwiseAnd(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             },
             {
               "type": "expression",
               "name": "zBitwiseOr",
               "expression": "bitwiseOr(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             },
             {
               "type": "expression",
               "name": "zBitwiseComplement",
               "expression": "bitwiseComplement(CAST(x, 'LONG'))"
             },
             {
               "type": "expression",
               "expression": "bitwiseShiftLeft(CAST(x, 'LONG'), CAST(y, 'LONG'))",
               "name": "zBitwiseShiftLeft"
             },
             {
               "type": "expression",
               "name": "zBitwiseShiftRight",
               "expression": "bitwiseShiftRight(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             },
             {
               "type": "expression",
               "name": "zBitwiseXor",
               "expression": "bitwiseXor(CAST(x, 'LONG'), CAST(y, 'LONG'))"
             }
           ]
         },
         "dimensionsSpec": {
           "dimensions": [
             {
               "type": "long",
               "name": "x"
             },
             {
               "type": "long",
               "name": "y"
             },
             {
               "type": "long",
               "name": "zBitwiseAnd"
             },
             {
               "type": "long",
               "name": "zBitwiseComplement"
             },
             {
               "type": "long",
               "name": "zBitwiseOr"
             },
             {
               "type": "long",
               "name": "zBitwiseShiftLeft"
             },
             {
               "type": "long",
               "name": "zBitwiseShiftRight"
             },
             {
               "type": "long",
               "name": "zBitwiseXor"
             }
           ]
         }
       }
     }
   }
   ```
   
   ![Screenshot from 2020-08-01 18-27-54](https://user-images.githubusercontent.com/3860450/89112251-f8bd6500-d42d-11ea-8aff-33f761a34f61.png)
   
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added documentation for new or modified features or behaviors.
   - [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [x] been tested in a test Druid cluster.
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist above are strictly necessary, but it would be very helpful if you at least self-review the PR. -->


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] stale[bot] commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-751238544


   This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-670280819


   @kaplanmaxe thank you for the contribution!
   
   There's another patch with a similar feature: #10084. I prefer the approach you took in this patch — of defining named functions — because it will be easier to integrate into the SQL layer. It is pretty quick to write a SQL operator that is backed by a native expr function. You also have a few more functions added (like bit shifts), which is great.
   
   Some comments:
   
   1. Please add some tests. You could start by copying the tests from #10084 and then adding new ones for the additional functions you have. That patch has a pretty good range of tests for a variety of cases. It's a bit more complicated because it also modified the grammar. Your tests would mostly be going in "FunctionTest".
   2. The functions should be named like other native expr functions, using `snake_case` rather than `camelCase`.
   3. If you're interested in adding SQL functions too, or as a follow up, check out IPv4AddressMatchOperatorConversion for an example of how to implement a SQL function that maps cleanly onto a native expr function. It is mostly just declaring the proper type and operand checking info for the SQL layer.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-670277230


   > Looks like the CI failed on spellcheck of the expression names. Let me know if you want me to change the names
   
   Generally you'd just add them to the spell checker exclusions file: `website/.spelling`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] stale[bot] commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-707627425


   This issue is no longer marked as stale.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis closed pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
clintropolis closed pull request #10230:
URL: https://github.com/apache/druid/pull/10230


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] kaplanmaxe commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
kaplanmaxe commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-667604860


   Looks like the CI failed on spellcheck of the expression names. Let me know if you want me to change the names


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-756536305


   Ah, this one can be closed actually, i picked up the commit from here and added tests in #10605


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] kaplanmaxe commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
kaplanmaxe commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-670702325


   Thank you @gianm and @clintropolis for your very detailed reviews and feedback. As a first time contributor and someone unfamiliar with the code base, this was extremely helpful.
   
   I will update this over the weekend likely. Unfortunately, i am effected by a major power outage atm, but hope it comes back soon. Druid has been amazing for our use case, and I'm really excited this is conceptually approved as it's the biggest thing affecting us atm. Expect to see some revisions soon within the next few days!
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] kaplanmaxe commented on a change in pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
kaplanmaxe commented on a change in pull request #10230:
URL: https://github.com/apache/druid/pull/10230#discussion_r464011328



##########
File path: docs/misc/math-expr.md
##########
@@ -117,6 +117,12 @@ See javadoc of java.lang.Math for detailed explanation for each function.
 |acos|acos(x) would return the arc cosine of x|
 |asin|asin(x) would return the arc sine of x|
 |atan|atan(x) would return the arc tangent of x|
+|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y|
+|bitwiseComplement|bitwiseComplement(x) would return the result of ~x|
+|bitwiseOr|bitwiseOr(x,y) would return the result of x [PIPE] y |

Review comment:
       I added `[PIPE]` as I had trouble escaping `|` within the table. Can play with it though if necessary




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on a change in pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
gianm commented on a change in pull request #10230:
URL: https://github.com/apache/druid/pull/10230#discussion_r466777031



##########
File path: docs/misc/math-expr.md
##########
@@ -117,6 +117,12 @@ See javadoc of java.lang.Math for detailed explanation for each function.
 |acos|acos(x) would return the arc cosine of x|
 |asin|asin(x) would return the arc sine of x|
 |atan|atan(x) would return the arc tangent of x|
+|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y|
+|bitwiseComplement|bitwiseComplement(x) would return the result of ~x|
+|bitwiseOr|bitwiseOr(x,y) would return the result of x [PIPE] y |

Review comment:
       `&#124;` should work (it's done elsewhere in the file, for logical OR).




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
clintropolis commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-713905702


   > I'm just going to be honest, our team found a workaround for this one, and I likely don't have the capacity to finish this up. I know this feature is going to be really valuable to end users, and I don't mind if someone else picks up this PR and finishes it up. Really sorry about that, but just want to be honest with intentions so you guys aren't waiting on me to pick it up. In the end, just want what is best for Druid
   
   @kaplanmaxe I should be able to pick this up sometime soon and get it finished up so we can get this feature in, thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] kaplanmaxe commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
kaplanmaxe commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-708611381


   Hey guys. Sorry I haven't responded recently.
   
   I'm just going to be honest, our team found a workaround for this one, and I likely don't have the capacity to finish this up. I know this feature is going to be really valuable to end users, and I don't mind if someone else picks up this PR and finishes it up. Really sorry about that, but just want to be honest with intentions so you guys aren't waiting on me to pick it up. In the end, just want what is best for Druid


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-756507167


   Let's keep this one open, stalebot.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] teyeheimans commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
teyeheimans commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-707627393


   I would love to see this functionality in druid. Any chance this could be fixed/merged? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] stale[bot] commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-756507178


   This issue is no longer marked as stale.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] stale[bot] commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
stale[bot] commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-706785995


   This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] gianm commented on pull request #10230: Adding bitwise expressions

Posted by GitBox <gi...@apache.org>.
gianm commented on pull request #10230:
URL: https://github.com/apache/druid/pull/10230#issuecomment-672897772


   Looking forward to it, @kaplanmaxe.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org