You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Mostafa Mokhtar <mm...@hortonworks.com> on 2015/05/22 07:44:11 UTC

Review Request 34586: HIVE-10704

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
-----------------------------------------------------------

Review request for hive.


Repository: hive-git


Description
-------

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c5dd03abe9ff57bf64d87be0f3ef34aa7a 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
-------


Thanks,

Mostafa Mokhtar


Re: Review Request 34586: HIVE-10704

Posted by Alexander Pivovarov <ap...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/#review87672
-----------------------------------------------------------

Ship it!


Ship It!

- Alexander Pivovarov


On May 27, 2015, 6:33 a.m., Mostafa Mokhtar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34586/
> -----------------------------------------------------------
> 
> (Updated May 27, 2015, 6:33 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> fix biggest small table selection when table sizes are 0
> fallback to dividing memory equally if any tables have invalid size
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 
> 
> Diff: https://reviews.apache.org/r/34586/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mostafa Mokhtar
> 
>


Re: Review Request 34586: HIVE-10704

Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/#review85870
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java
<https://reviews.apache.org/r/34586/#comment137716>

    The check for "totalSize <= 0"  is there to terminate the loop early, incases where these are two tables that add up to <= 0.
    
    And the check for negative valaues is covered here 
    
    if (tableSize <= 0) {
              fallbackToEqualProportions = true;
              break;
            }


- Mostafa Mokhtar


On May 27, 2015, 6:33 a.m., Mostafa Mokhtar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34586/
> -----------------------------------------------------------
> 
> (Updated May 27, 2015, 6:33 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> fix biggest small table selection when table sizes are 0
> fallback to dividing memory equally if any tables have invalid size
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 
> 
> Diff: https://reviews.apache.org/r/34586/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mostafa Mokhtar
> 
>


Re: Review Request 34586: HIVE-10704

Posted by Jason Dere <jd...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/#review87522
-----------------------------------------------------------


- Jason Dere


On May 27, 2015, 6:33 a.m., Mostafa Mokhtar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34586/
> -----------------------------------------------------------
> 
> (Updated May 27, 2015, 6:33 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> fix biggest small table selection when table sizes are 0
> fallback to dividing memory equally if any tables have invalid size
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 
> 
> Diff: https://reviews.apache.org/r/34586/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mostafa Mokhtar
> 
>


Re: Review Request 34586: HIVE-10704

Posted by Jason Dere <jd...@hortonworks.com>.

> On May 30, 2015, 2:57 a.m., Alexander Pivovarov wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java, line 201
> > <https://reviews.apache.org/r/34586/diff/2/?file=972516#file972516line201>
> >
> >     The comment above says - "if any table has bad size estimate...".
> >     But why you check "totalSize <= 0" then?
> >     Should you iterate over all small tables and check that they all have good size estimate.
> >     
> >     What if you have table sizes (100, -4, 0)
> >     totalSize is 96. But table #2 size is -4, which is bad size.
> >     
> >     To make code clear I recommend to add new boolean variable isAnyTableHasBadSize and set its value it in the place where you calc totalSize, biggest and maxSize

The logic here does still check each table individually to make sure that the table has valid size (lines 201-214). It just uses the initial check (totalSize <= 0) to see whether iterating over the tables is even necessary, if the size is non-positive we don't even have to bother checking each table and we will automatically fallback to equal proportions.


- Jason


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/#review85853
-----------------------------------------------------------


On May 27, 2015, 6:33 a.m., Mostafa Mokhtar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34586/
> -----------------------------------------------------------
> 
> (Updated May 27, 2015, 6:33 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> fix biggest small table selection when table sizes are 0
> fallback to dividing memory equally if any tables have invalid size
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 
> 
> Diff: https://reviews.apache.org/r/34586/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mostafa Mokhtar
> 
>


Re: Review Request 34586: HIVE-10704

Posted by Alexander Pivovarov <ap...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/#review85853
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java
<https://reviews.apache.org/r/34586/#comment137688>

    The comment above says - "if any table has bad size estimate...".
    But why you check "totalSize <= 0" then?
    Should you iterate over all small tables and check that they all have good size estimate.
    
    What if you have table sizes (100, -4, 0)
    totalSize is 96. But table #2 size is -4, which is bad size.
    
    To make code clear I recommend to add new boolean variable isAnyTableHasBadSize and set its value it in the place where you calc totalSize, biggest and maxSize


- Alexander Pivovarov


On May 27, 2015, 6:33 a.m., Mostafa Mokhtar wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34586/
> -----------------------------------------------------------
> 
> (Updated May 27, 2015, 6:33 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> fix biggest small table selection when table sizes are 0
> fallback to dividing memory equally if any tables have invalid size
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 
> 
> Diff: https://reviews.apache.org/r/34586/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mostafa Mokhtar
> 
>


Re: Review Request 34586: HIVE-10704

Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
-----------------------------------------------------------

(Updated May 27, 2015, 6:33 a.m.)


Review request for hive.


Repository: hive-git


Description
-------

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
-------


Thanks,

Mostafa Mokhtar


Re: Review Request 34586: HIVE-10704

Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
-----------------------------------------------------------

(Updated May 27, 2015, 6:32 a.m.)


Review request for hive.


Repository: hive-git


Description
-------

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c5dd03abe9ff57bf64d87be0f3ef34aa7a 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
-------


Thanks,

Mostafa Mokhtar


Re: Review Request 34586: HIVE-10704

Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34586/
-----------------------------------------------------------

(Updated May 27, 2015, 6:30 a.m.)


Review request for hive.


Repository: hive-git


Description
-------

fix biggest small table selection when table sizes are 0
fallback to dividing memory equally if any tables have invalid size


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 536b92c5dd03abe9ff57bf64d87be0f3ef34aa7a 

Diff: https://reviews.apache.org/r/34586/diff/


Testing
-------


File Attachments (updated)
----------------

HIVE-10704.3.patch
  https://reviews.apache.org/media/uploaded/files/2015/05/27/4a999c9c-1c3f-44dd-a321-a4157a067300__HIVE-10704.3.patch


Thanks,

Mostafa Mokhtar