Posted to commits@datasketches.apache.org by GitBox <gi...@apache.org> on 2019/11/11 05:40:16 UTC

[GitHub] [incubator-datasketches-cpp] ravindra-wagh commented on issue #68: Can we add an enhancement to merge update_theta_sketch?

URL: https://github.com/apache/incubator-datasketches-cpp/issues/68#issuecomment-552300890
 
 
   Say we have data for two columns, **X** and **Y**, and we want to perform the three standard set operations on them: **union, intersection and difference**.
   The column data is large and is distributed among nodes for performance. Say we have 4 nodes, and each node creates its own _update_theta_sketch_ for the data it receives. Finally, the leader node combines all the sketches into one sketch.
   
   **Column X computation:**
   
                    X
                    |
        --------------------------
        |       |       |        |
        sk1    sk2     sk3      sk4   ==> All are update_theta_sketch
        |       |       |        |
         -------------------------
                   |
                 merge()
                   |
          update_theta_sketch_X
   	
   **Column Y computation:**
   
                    Y
                    |
        --------------------------
        |       |       |        |
        sk1    sk2     sk3      sk4   ==> All are update_theta_sketch
        |       |       |        |
         -------------------------
                   |
                 merge()
                   |
          update_theta_sketch_Y
   
   With the **update_theta_sketch_X** and **update_theta_sketch_Y** sketches, we can easily perform union, intersection and difference on them.
   If we use **theta_union** instead of merge() to combine them, we get **theta_union_X** and **theta_union_Y**. With these, we can perform only union operations directly, not intersection or difference. To perform intersection and difference, we first have to convert _theta_union_X_ and _theta_union_Y_ to **compact_theta_sketch_X** and **compact_theta_sketch_Y** respectively, and then apply intersection and difference to those.
   If **merge()** were a function of the **update_theta_sketch** class, we could easily perform all the operations using the base class only.
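
To make the flow above concrete, here is a minimal, self-contained sketch of it in standard C++. It is a toy model and does not use the datasketches-cpp API: `ToySketch`, `SetOp`, `combine` and `column_sketch` are hypothetical names invented for this illustration, and the hash is a plain splitmix64 mixer chosen only because it is deterministic. It assumes the usual theta-sketch idea of keeping the hashes below a threshold theta, and shows that combining the per-node sketches is itself a union, after which intersection and difference can be applied to the combined results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <limits>
#include <set>
#include <vector>

// Toy stand-in for a theta sketch. NOT the datasketches-cpp API.
struct ToySketch {
    static constexpr std::uint64_t kMaxTheta =
        std::numeric_limits<std::uint64_t>::max();

    std::size_t k;                    // nominal number of retained hashes
    std::uint64_t theta = kMaxTheta;  // only hashes strictly below theta are kept
    std::set<std::uint64_t> hashes;   // retained hashes, in ascending order

    explicit ToySketch(std::size_t nominal = 4096) : k(nominal) {}

    // splitmix64 finalizer, used here as a deterministic 64-bit hash.
    static std::uint64_t hash(std::uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }

    // When more than k hashes are held, lower theta to the (k+1)-th smallest
    // hash and drop that hash and everything above it.
    void trim() {
        if (hashes.size() <= k) return;
        auto it = std::next(hashes.begin(), static_cast<std::ptrdiff_t>(k));
        theta = *it;
        hashes.erase(it, hashes.end());
    }

    void update(std::uint64_t item) {
        const std::uint64_t h = hash(item);
        if (h < theta) {
            hashes.insert(h);
            trim();
        }
    }

    // Retained count divided by the fraction of the hash space below theta.
    double estimate() const {
        const double fraction =
            static_cast<double>(theta) / static_cast<double>(kMaxTheta);
        return static_cast<double>(hashes.size()) / fraction;
    }
};

enum class SetOp { Union, Intersection, Difference };

// Combine two sketches. The result always uses the smaller of the two thetas.
ToySketch combine(const ToySketch& a, const ToySketch& b, SetOp op) {
    ToySketch out(std::min(a.k, b.k));
    out.theta = std::min(a.theta, b.theta);
    for (std::uint64_t h : a.hashes) {
        if (h >= out.theta) continue;
        const bool in_b = b.hashes.count(h) > 0;
        if (op == SetOp::Union ||
            (op == SetOp::Intersection && in_b) ||
            (op == SetOp::Difference && !in_b)) {
            out.hashes.insert(h);
        }
    }
    if (op == SetOp::Union) {
        for (std::uint64_t h : b.hashes) {
            if (h < out.theta) out.hashes.insert(h);
        }
        out.trim();
    }
    return out;
}

// Spread the items [first, last) over 4 per-node sketches, then let the
// "leader" combine them into one sketch, as in the diagrams above.
ToySketch column_sketch(std::uint64_t first, std::uint64_t last) {
    assert(first <= last);
    std::vector<ToySketch> nodes(4);
    for (std::uint64_t v = first; v < last; ++v) nodes[v % 4].update(v);
    ToySketch merged;
    for (const ToySketch& sk : nodes) merged = combine(merged, sk, SetOp::Union);
    return merged;
}
```

For example, `column_sketch(0, 100)` and `column_sketch(50, 150)` play the roles of the combined X and Y sketches, and `combine(x, y, SetOp::Intersection).estimate()` then estimates the overlap. With these small inputs the toy sketches never exceed `k` retained hashes, so the estimates are exact counts; with larger inputs they become approximations.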
