Posted to dev@oozie.apache.org by Ryota Egashira <eg...@yahoo-inc.com> on 2014/11/20 19:54:20 UTC

Review Request 28290: OOZIE-2070 reduce number of HCatalog access for coord latest check

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28290/
-----------------------------------------------------------

Review request for oozie.


Bugs: OOZIE-2070
    https://issues.apache.org/jira/browse/OOZIE-2070


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2070


Diffs
-----

  core/src/main/java/org/apache/oozie/coord/CoordELFunctions.java 7f59186 
  core/src/main/java/org/apache/oozie/dependency/FSURIHandler.java 7c1aadf 
  core/src/main/java/org/apache/oozie/dependency/HCatURIHandler.java 0e690a0 
  core/src/main/java/org/apache/oozie/dependency/URIHandler.java 6e54d4b 
  core/src/main/resources/oozie-default.xml 19cae9d 
  core/src/test/java/org/apache/oozie/command/coord/TestCoordActionInputCheckXCommand.java f79c9a0 
  core/src/test/java/org/apache/oozie/dependency/TestFSURIHandler.java 75d5429 
  core/src/test/java/org/apache/oozie/dependency/TestHCatURIHandler.java a49eba5 
  core/src/test/java/org/apache/oozie/test/MiniHCatServer.java 8699ff8 
  core/src/test/java/org/apache/oozie/test/XHCatTestCase.java 85ee1f2 

Diff: https://reviews.apache.org/r/28290/diff/


Testing
-------


Thanks,

Ryota Egashira


Re: Review Request 28290: OOZIE-2070 reduce number of HCatalog access for coord latest check

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28290/#review64100
-----------------------------------------------------------



core/src/main/java/org/apache/oozie/coord/CoordELFunctions.java
<https://reviews.apache.org/r/28290/#comment106524>

    This is inefficient. Start with a batch size of min(4, actual range count), then increase it exponentially, but keep the limit fixed at 200.
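
    A minimal sketch of the sizing scheme suggested above, not code from the patch. It assumes "exponentially" means doubling and reads "limit" as a cap on the size of a single batch. BatchPlan and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Oozie: plans batch sizes for a latest check.
final class BatchPlan {
    static final int INITIAL_SIZE = 4; // starting size suggested in the review
    static final int MAX_SIZE = 200;   // assumed to be a cap on one batch

    /** Returns the batch sizes needed to cover rangeCount instances. */
    static List<Integer> sizes(int rangeCount) {
        List<Integer> sizes = new ArrayList<>();
        int remaining = rangeCount;
        int size = Math.min(INITIAL_SIZE, rangeCount);
        while (remaining > 0) {
            int batch = Math.min(size, remaining); // last batch may be short
            sizes.add(batch);
            remaining -= batch;
            size = Math.min(size * 2, MAX_SIZE);   // double, but never exceed the cap
        }
        return sizes;
    }
}
```

    With this scheme a latest instance that sits near the top of the range is found after checking only a handful of instances, while a deep search still needs few round trips.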



core/src/main/java/org/apache/oozie/dependency/FSURIHandler.java
<https://reviews.apache.org/r/28290/#comment106525>

    Do not do this for fs. Unless we use fs liststatus or globbing, batching only adds unnecessary checks. Add a supportsBatchExists() method that returns false for FSUriHandler, and when it returns false, go with the old logic.
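
    One way the suggested fallback could look, as a sketch only. SketchUriHandler is a simplified stand-in and not Oozie's real URIHandler interface. Only the method name supportsBatchExists() comes from the comment above; the batchExists signature and the other names are assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-in for a URI handler; NOT Oozie's real URIHandler interface.
interface SketchUriHandler {
    boolean exists(String uri);
    boolean supportsBatchExists();
    List<String> batchExists(List<String> uris); // hypothetical signature
}

// File-system style handler: opts out of batching, as the review suggests.
final class FsLikeHandler implements SketchUriHandler {
    private final Set<String> present;
    int existsCalls = 0; // exposed only to make the fallback visible

    FsLikeHandler(Set<String> present) {
        this.present = new HashSet<>(present);
    }

    public boolean exists(String uri) {
        existsCalls++;
        return present.contains(uri);
    }

    public boolean supportsBatchExists() {
        return false;
    }

    public List<String> batchExists(List<String> uris) {
        throw new UnsupportedOperationException("batch exists not supported");
    }
}

final class LatestCheck {
    /** Uses the batch call when the handler offers it, else the old per-URI check. */
    static List<String> existing(SketchUriHandler handler, List<String> uris) {
        if (handler.supportsBatchExists()) {
            return handler.batchExists(uris);
        }
        List<String> found = new ArrayList<>();
        for (String uri : uris) {
            if (handler.exists(uri)) {
                found.add(uri);
            }
        }
        return found;
    }
}
```

    The caller asks the handler about its capability instead of checking the handler type, so a handler that later gains a cheap batch check only has to flip the return value.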



core/src/main/java/org/apache/oozie/dependency/HCatURIHandler.java
<https://reviews.apache.org/r/28290/#comment106533>

    Use > or < when possible (YYYYMMDD, YYYYMMDDHH, etc. patterns). OR is not efficient and will still put a lot of load on the hcat server.
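
    To illustrate the difference, a sketch that builds both kinds of filter string. PartitionFilters is a hypothetical helper, and the quoting style is an assumption rather than the exact syntax the HCat client requires.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of Oozie or HCatalog.
final class PartitionFilters {
    /** One equality clause per candidate value; grows with the batch size. */
    static String orFilter(String key, List<String> values) {
        StringJoiner joiner = new StringJoiner(" or ");
        for (String value : values) {
            joiner.add(key + " = '" + value + "'");
        }
        return joiner.toString();
    }

    /** Two comparisons regardless of how many instances the batch covers. */
    static String rangeFilter(String key, String oldest, String newest) {
        return key + " <= '" + newest + "' and " + key + " >= '" + oldest + "'";
    }
}
```

    The range form relies on string comparison matching time order, which holds for zero-padded values written from the largest unit to the smallest, such as YYYYMMDD or YYYYMMDDHH.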



core/src/test/java/org/apache/oozie/dependency/TestHCatURIHandler.java
<https://reviews.apache.org/r/28290/#comment106551>

    Have an extra partition "state" which is not used in the uri template. Users have a lot of partitions, but mostly use only 1 or 2 of them in the uri template.


- Rohini Palaniswamy




Re: Review Request 28290: OOZIE-2070 reduce number of HCatalog access for coord latest check

Posted by Ryota Egashira <eg...@yahoo-inc.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28290/
-----------------------------------------------------------

(Updated Dec. 11, 2014, 6:19 a.m.)


Review request for oozie.


Changes
-------

Added range query optimization. For example, with

ts=${YEAR}-${MONTH}-${DAY}
if the latest check performs a batch exists from 2013-02-01 back to 2013-01-15,
the query looks like  ts <= '2013-02-01' and ts >= '2013-01-15'

A range query is not used when ts=${MONTH}-${DAY}-${YEAR}: the time variables are out of sequence, which messes up the range, so an OR query is used instead.
A range query is also not used when time variables are spread across multiple partitions, like ts1=${YEAR};ts2=${MONTH};ts3=${DAY}. One approach is to apply the range to the largest scale, ${YEAR} in this case, but there is a risk that the range is too large and ends up checking too many partitions even when not needed, depending on the partition structure (for example, if partitions are structured down to the minute or second level, 2013<=year<=2014 might check too many). This type of optimization could therefore be a future item.
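
The two exclusions above can be sketched as an eligibility check. This is an illustration under assumptions, not the code in the patch: RangeEligibility is a hypothetical name, and the check is stricter than the text requires in that the time variables must start at ${YEAR} and step down one scale at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Oozie.
final class RangeEligibility {
    // Time variables from largest to smallest scale.
    private static final List<String> SCALE =
            Arrays.asList("YEAR", "MONTH", "DAY", "HOUR", "MINUTE");
    private static final Pattern VAR = Pattern.compile("\\$\\{(\\w+)\\}");

    /** Time variables of one partition value template, in order of appearance. */
    static List<String> timeVars(String template) {
        List<String> vars = new ArrayList<>();
        Matcher matcher = VAR.matcher(template);
        while (matcher.find()) {
            if (SCALE.contains(matcher.group(1))) {
                vars.add(matcher.group(1));
            }
        }
        return vars;
    }

    /**
     * True when exactly one partition carries time variables and they run
     * YEAR, MONTH, DAY, ... in sequence, so string order matches time order.
     */
    static boolean canUseRange(List<String> partitionValueTemplates) {
        int timePartitions = 0;
        for (String template : partitionValueTemplates) {
            List<String> vars = timeVars(template);
            if (vars.isEmpty()) {
                continue; // partition without time variables, e.g. a fixed value
            }
            timePartitions++;
            for (int i = 0; i < vars.size(); i++) {
                if (SCALE.indexOf(vars.get(i)) != i) {
                    return false; // out of sequence: fall back to the OR query
                }
            }
        }
        return timePartitions == 1;
    }
}
```

For the templates mentioned above, ${YEAR}-${MONTH}-${DAY} qualifies for a range query, while ${MONTH}-${DAY}-${YEAR} and the three-partition layout do not.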

This patch also includes fixes for two test case failures against -P hadoop-2:
1) org.apache.oozie.command.wf.TestSubmitXCommand.testProtoConfStorage
2) org.apache.oozie.service.TestLiteWorkflowAppService.testCreateprotoConfWithMulipleLibPath


Bugs: OOZIE-2070
    https://issues.apache.org/jira/browse/OOZIE-2070


Repository: oozie-git


Description
-------

https://issues.apache.org/jira/browse/OOZIE-2070


Diffs (updated)
-----

  core/src/main/java/org/apache/oozie/coord/CoordELFunctions.java 7f59186 
  core/src/main/java/org/apache/oozie/dependency/FSURIHandler.java 7c1aadf 
  core/src/main/java/org/apache/oozie/dependency/HCatURIHandler.java 0e690a0 
  core/src/main/java/org/apache/oozie/dependency/URIHandler.java 6e54d4b 
  core/src/main/resources/oozie-default.xml 3d07c6f 
  core/src/test/java/org/apache/oozie/command/coord/TestCoordActionInputCheckXCommand.java f79c9a0 
  core/src/test/java/org/apache/oozie/command/wf/TestSubmitXCommand.java 42fa198 
  core/src/test/java/org/apache/oozie/dependency/TestFSURIHandler.java 75d5429 
  core/src/test/java/org/apache/oozie/dependency/TestHCatURIHandler.java a49eba5 
  core/src/test/java/org/apache/oozie/service/TestLiteWorkflowAppService.java 5977c8f 

Diff: https://reviews.apache.org/r/28290/diff/


Testing
-------


Thanks,

Ryota Egashira