You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@nifi.apache.org by PuspenduBanerjee <gi...@git.apache.org> on 2016/02/02 22:12:21 UTC

[GitHub] nifi pull request: Performance improvement. test Timeout Mitigatio...

GitHub user PuspenduBanerjee opened a pull request:

    https://github.com/apache/nifi/pull/202

    Performance improvement. test Timeout Mitigation. less IO, less depen…

    Performance improvement. test Timeout Mitigation. less IO, less dependency on /dev/(u)random.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PuspenduBanerjee/nifi orgin_master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/202.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #202
    
----
commit 0657b3a908910241cbf709a7b1387766d1e12e8b
Author: puspendu.banerjee@gmail.com <pu...@gmail.com>
Date:   2016-02-02T21:04:37Z

    Performance improvement. test Timeout Mitigation. less IO, less dependency on /dev/(u)random.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/202


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by trkurc <gi...@git.apache.org>.
Github user trkurc commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178942251
  
    I did a quick experiment (code below), generated a 10^20 "UUID"'s using a few mechanisms
    
    1. as you did in this pr (~120ms)
    2. an slightly different version of what is in the PR  (~55ms)
    3. using UUID with the atomic long as the low order bits (~800ms)
    4. what was there originally (~3400ms)
    5. Using JUG [1] to generate a time-based UUID (~800ms)
    
    So, I think it is a net win with changing the method you created to return a String rather than a Long (based on 1 vs 2). On the long as a string vs UUID as a string, although it is faster by a factor of  6, I do think returning a UUID as a string may better fit the contract of the UUID core attribute. Although pretty much everywhere it is stored, it is done so as a String, the comments clearly say UUID [2]. I'm not sure the technical debt accrued for a Long vs UUID performance increase is a good trade in this instance. 
    
    Side note - the JUG github page is an interesting read. There are some good benchmarks talking about how awesome it is.
    
    [1] https://github.com/cowtowncoder/java-uuid-generator
    [2] https://github.com/apache/nifi/blob/master/nifi-commons/nifi-utils/src/main/java/org/apache/nifi/flowfile/attributes/CoreAttributes.java#L34
    ```java
        public static void main(String args[]){
            long iter = 1<<20;
            AtomicLong ai = new AtomicLong(0);
            long now = System.currentTimeMillis();
            for(int i=0; i < iter; i++) {
                Long x = ai.getAndIncrement();
                String x2 = x.toString();
            }
            System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now ));
    
            now = System.currentTimeMillis();
            for(int i=0; i < iter; i++) {
                
                String x2 = Long.toString(ai.getAndIncrement());
            }
            System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now ));
    
            now = System.currentTimeMillis();
            for(int i=0; i < iter; i++) {
                UUID u = new UUID(0, ai.getAndIncrement());
                String x2 = u.toString();
            }
            System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now ));
            
            now = System.currentTimeMillis();
            for(int i=0; i < iter; i++) {
                UUID u = UUID.randomUUID();
                String x2 = u.toString();
            }
            System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now ));
        
            now = System.currentTimeMillis();
            TimeBasedGenerator g = Generators.timeBasedGenerator();
            for(int i=0; i < iter; i++) {
                UUID u = g.generate();
                String x2 = u.toString();
            }
            System.out.printf("%d elapsed\n", (System.currentTimeMillis() - now ));
        }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by taftster <gi...@git.apache.org>.
Github user taftster commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179495566
  
    1) StringBuilder should probably be preferred to StringBuffer in this case.  Though it probably won't matter for performance, the intention is that this is the unsychronized (thread safe) usage.
    
    2)  I'm opting for readability over speed.  What's wrong with something like:
    
    ```
    private static String createFakeUUID() {
        return new StringBuilder()
            .append("00000000-0000-0000-")
            .append(String.format("%016x", id)
            .insert(23, '-')
            .toString();
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179211849
  
    @trkurc Yes , updated ..


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178958929
  
    @trkurc :+1:  That's a good catch. Yes Long.toString(long) is faster but the difference is only about ~60 milliSeconds for 10^20 iteration because:
    ```java
        /**
         * Returns a {@code String} object representing this
         * {@code Long}'s value.  The value is converted to signed
         * decimal representation and returned as a string, exactly as if
         * the {@code long} value were given as an argument to the
         * {@link java.lang.Long#toString(long)} method.
         *
         * @return  a string representation of the value of this object in
         *          base&nbsp;10.
         */
        public String toString() {
            return toString(value);
        }
    ```
    But, do you think that should be accounted against the manageability of the method in context of the provided test scenario?
    And, please try the following code, try to run it &  you will  something noticeable:
    ```java
        public static void main(String args[]){
            for (int i = 0; i < 100; i++) {
                main1();
            }
        }
        
        public static void main1(){
            long iter = 1<<20;
            AtomicLong ai = new AtomicLong(0);
            long now = System.currentTimeMillis();
            for(int i=0; i < iter; i++) {
                Long x = ai.getAndIncrement();
                String x2 = x.toString();
            }
            System.out.printf("1 ===== %d elapsed\n", (System.currentTimeMillis() - now ));
    
            now = System.currentTimeMillis();
            for(int i=0; i < iter; i++) {
    
                String x2 = Long.toString(ai.getAndIncrement());
            }
            System.out.printf("2 ===== %d elapsed\n", (System.currentTimeMillis() - now ));
    
           
            System.out.println("=============================DONE========================");
        }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by trkurc <gi...@git.apache.org>.
Github user trkurc commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178965345
  
    > But, do you think that should be accounted against the manageability of the method in context of the provided test scenario?
    
    I'm not sure I'm following. To give you an idea of my thought process with 2, "if we're going for maximum speed, changing the method signature, and avoiding some boxing would be ideal". 
    
    However, it was the latter part of my claim about 3 being "performant enough" and not dependent on a RNG, and still a "UUID" which I thought more thought provoking - fast enough, potentially less tech debt, everyone wins?
    
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178874441
  
    Raised issue :  https://issues.apache.org/jira/browse/NIFI-1460


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179265990
  
    Probably a basic question, but am I missing any process to request a PR to merge?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178875032
  
    @trkurc  Thanks for reviewing. let me know in case can be merged or if I need to create a patch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: Test Performance improvement. test Timeout Miti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178869385
  
    @oleg Sure. Raising one now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: Test Performance improvement. test Timeout Miti...

Posted by olegz <gi...@git.apache.org>.
Github user olegz commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178865279
  
    @PuspenduBanerjee could you please raise JIRAs before submitting PRs. This is kind of the process that these projects follow and its easier for planning as well as compelling release notes during the actual release. Also it helps traceability since (as you have seen) the convention that everyone follows is to include JIRA number in the commit message. It allows us to trace any discussions straight down to the code.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by trkurc <gi...@git.apache.org>.
Github user trkurc commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179209898
  
    I am not able to verify right now (no compiler at hand), but I like this approach. Do you plan to update the PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: Test Performance improvement. test Timeout Miti...

Posted by trkurc <gi...@git.apache.org>.
Github user trkurc commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178872670
  
    Reviewing now. Anything that makes tests faster makes me happy!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: Test Performance improvement. test Timeout Miti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-178847399
  
    @joewitt Another chance of improving test performance & less IO  by replacing  `UUID.randomUUID()` with` AtomicLong#incrementAndGet()`, which is sufficient for our test cases at org.apache.nifi.controller#TestStandardFlowFileQueue.
    
    UUID.randomUUID() affects linux/unix systems heavily with Blocking IO from /dev/(u)random, entropy starvation etc., so let's not use that unless necessary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by trkurc <gi...@git.apache.org>.
Github user trkurc commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179413462
  
    @PuspenduBanerjee - not missing anything, I should be able to merge in your patch tonight


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by PuspenduBanerjee <gi...@git.apache.org>.
Github user PuspenduBanerjee commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179166551
  
    @trkurc  I think, if UUID format is of concern the following piece will run more than 2 times faster compared to `UUID(0,long)`, could you please re-verify:
    ```java
    private static String createFakeUUIDString(final AtomicLong atomicLong){
             final String s=Long.toHexString(atomicLong.incrementAndGet());
            return new StringBuffer("00000000-0000-0000-0000000000000000"
                                          .substring(0,(35-s.length()))+s)
                                           .insert(23, '-').toString();
        }
    ```
    I hope, everyone wins that way too along with more performance benefit.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] nifi pull request: NIFI-1460 Test Performance improvement. test Ti...

Posted by trkurc <gi...@git.apache.org>.
Github user trkurc commented on the pull request:

    https://github.com/apache/nifi/pull/202#issuecomment-179579922
  
    @taftster  - code readability is subjective, I prefer actually prefer what is in the PR. The String.format makes it an order of magnitude slower based on my initial tests. I see no appreciable difference between StringBuffer and StringBuilder
    
    The results of the below code:
    ``
    1 ===== 304 elapsed
    2 ===== 1717 elapsed
    3 ===== 248 elapsed
    ```
    The StringBuffer is faster, but it is likely due to some of the JVM string magic.
    
    ```java
            int iter = 1<<20;
            AtomicLong ai = new AtomicLong(0);
    
            long now = System.currentTimeMillis();
    
            for(int i=0; i < iter; i++) {
                final String s=Long.toHexString(ai.getAndIncrement());
                String x2 = new StringBuilder("00000000-0000-0000-0000000000000000".substring(0,(35-s.length()))+s).insert(23, '-').toString();
                
            }
            System.out.printf("1 ===== %d elapsed\n", (System.currentTimeMillis() - now ));
            
            now = System.currentTimeMillis();
    
            for(int i=0; i < iter; i++) {
                String x2 =        new StringBuilder()
                .append("00000000-0000-0000-")
                .append(String.format("%016x", ai.getAndIncrement()))
                 .insert(23, '-')
                .toString();     
            }
            System.out.printf("2 ===== %d elapsed\n", (System.currentTimeMillis() - now ));
            
            
            now = System.currentTimeMillis();
    
            for(int i=0; i < iter; i++) {
                final String s=Long.toHexString(ai.getAndIncrement());
    
                String x2 = new StringBuffer("00000000-0000-0000-0000000000000000".substring(0,(35-s.length()))+s).insert(23, '-').toString();
    
            }
            System.out.printf("3 ===== %d elapsed\n", (System.currentTimeMillis() - now ));
    ``` 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---