Posted to dev@gora.apache.org by cloudysunny14 <gi...@git.apache.org> on 2017/02/16 14:34:25 UTC

[GitHub] gora pull request #95: Fixed issue GORA-443

GitHub user cloudysunny14 opened a pull request:

    https://github.com/apache/gora/pull/95

    Fixed issue GORA-443

    https://github.com/apache/gora/pull/86
    > All of tests pass when run individually, but when run as a whole some of them fail.
    
    I also encountered the same problem.
    I think this happens because AsyncProcessor processes a series of mutations asynchronously.
    I made this fix as a possible solution.
    Anyway, if there is anything I can contribute, I would definitely like to work on GORA or NUTCH :)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudysunny14/gora GORA-443-fixed-issue-mutation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/gora/pull/95.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #95
    
----
commit 5664dc6791d97116f7613e9bb8407583b411b457
Author: Kiyonari Harigae <la...@cloudysunny14.org>
Date:   2017-02-16T14:24:40Z

    Fixed issue GORA-443

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] gora pull request #95: Fixed issue GORA-443

Posted by renato2099 <gi...@git.apache.org>.
Github user renato2099 commented on a diff in the pull request:

    https://github.com/apache/gora/pull/95#discussion_r101905632
  
    --- Diff: gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseTableConnection.java ---
    @@ -112,9 +112,9 @@ public void flushCommits() throws IOException {
         for (ConcurrentLinkedQueue<Mutation> buffer : bPool) {
           for (Mutation m: buffer) {
             bufMutator.mutate(m);
    +        bufMutator.flush();
           }
         }
    -    bufMutator.flush();
         bufMutator.close();
    --- End diff --
    
    Yeah, I tried this as well, but it would mean that if we have millions of operations buffered, we would flush each and every one of them one by one. Even with this change we couldn't get the tests to pass. IMO the bufMutator.flush() should remain where it is, and we should find out why the mutations get applied asynchronously when we call flush.
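
As a rough illustration of the cost concern raised above, here is a minimal sketch in plain Java. `CountingBuffer` is a hypothetical stand-in, not the HBase `BufferedMutator`: it only counts how many non-empty flushes happen, and it ignores any automatic flushing a real client may perform when its write buffer fills up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlushPlacementSketch {

    /** Hypothetical stand-in for a buffered writer; counts flushes that actually send data. */
    static final class CountingBuffer {
        private final List<String> pending = new ArrayList<>();
        int roundTrips = 0;

        void mutate(String mutation) {
            pending.add(mutation);
        }

        void flush() {
            if (pending.isEmpty()) {
                return; // nothing buffered, nothing to send
            }
            roundTrips++;
            pending.clear();
        }
    }

    /** Flush inside the loop, as in the proposed diff. */
    static int flushPerMutation(List<String> mutations) {
        CountingBuffer buf = new CountingBuffer();
        for (String m : mutations) {
            buf.mutate(m);
            buf.flush();
        }
        return buf.roundTrips;
    }

    /** Flush once after the loop, as in the code before the diff. */
    static int flushOnce(List<String> mutations) {
        CountingBuffer buf = new CountingBuffer();
        for (String m : mutations) {
            buf.mutate(m);
        }
        buf.flush();
        return buf.roundTrips;
    }

    public static void main(String[] args) {
        List<String> mutations = Arrays.asList("m1", "m2", "m3", "m4");
        System.out.println("flush per mutation: " + flushPerMutation(mutations)); // 4
        System.out.println("flush once: " + flushOnce(mutations));                // 1
    }
}
```

In this simplified model the number of flushes grows linearly with the number of buffered mutations when the flush sits inside the loop, which is the scaling concern for millions of operations.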



[GitHub] gora issue #95: Fixed issue GORA-443

Posted by alfonsonishikawa <gi...@git.apache.org>.
Github user alfonsonishikawa commented on the issue:

    https://github.com/apache/gora/pull/95
  
    I know this thread is old, but I don't understand why this has not surfaced before (the HBase issue is from 2014). It seems that HBase will not fix it until 2.0.0: https://issues.apache.org/jira/browse/HBASE-8770
    Thank you for the hack-fix!  :+1: 



[GitHub] gora issue #95: Fixed issue GORA-443

Posted by lewismc <gi...@git.apache.org>.
Github user lewismc commented on the issue:

    https://github.com/apache/gora/pull/95
  
    I tried this locally and I am still getting error messages
    ```
    Results :
    
    Failed tests:   testUpdate(org.apache.gora.hbase.store.TestHBaseStore)
      testGetWebPage(org.apache.gora.hbase.store.TestHBaseStore)
      testQuery(org.apache.gora.hbase.store.TestHBaseStore)
      testQueryStartKey(org.apache.gora.hbase.store.TestHBaseStore)
      testQueryWebPageSingleKey(org.apache.gora.hbase.store.TestHBaseStore)
      testDeleteByQueryFields(org.apache.gora.hbase.store.TestHBaseStore)
    
    Tests run: 42, Failures: 6, Errors: 0, Skipped: 1
    ```



[GitHub] gora issue #95: Fixed issue GORA-443

Posted by cloudysunny14 <gi...@git.apache.org>.
Github user cloudysunny14 commented on the issue:

    https://github.com/apache/gora/pull/95
  
    Thank you for the comments.
    
    I think the reason is as follows:
    
    First, BufferedMutator#flush is processed synchronously,
    and a batch request (MultiRequest) is created from the buffered mutations.
    
    Then, the RegionServer processes the MultiRequest as a mini-batch (HRegion#doMiniBatchMutation), which sets the timestamp of each cell to the current time if the Mutation has HConstants.LATEST_TIMESTAMP (the default).
    This operation applies all mutations in the mini-batch, so all cells get the same timestamp.
    
    Since HBaseStore#put creates the Delete and the Put in the same MultiRequest, the Delete masks the Put that carries the same timestamp, and the Puts become invisible.
    
    See Also:
    https://issues.apache.org/jira/browse/HBASE-2256
    https://hbase.apache.org/book.html#version.delete
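
The masking behaviour described above can be sketched with a small, self-contained simulation. This is a simplified model written for illustration, not HBase code: `Column`, `Cell`, and the rule "a put is hidden when a delete marker has a timestamp greater than or equal to the put's timestamp" are assumptions that mirror the behaviour documented in the linked HBase book section, and the timestamps are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteMasksPutSketch {

    /** Simplified cell: either a value (put) or a tombstone (delete) at a timestamp. */
    static final class Cell {
        final long timestamp;
        final boolean isDelete;
        final String value;

        Cell(long timestamp, boolean isDelete, String value) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    /** Simplified column: keeps every cell and resolves visibility at read time. */
    static final class Column {
        private final List<Cell> cells = new ArrayList<>();

        void put(long timestamp, String value) {
            cells.add(new Cell(timestamp, false, value));
        }

        void delete(long timestamp) {
            cells.add(new Cell(timestamp, true, null));
        }

        /** Returns the newest visible value, or null if every put is masked by a tombstone. */
        String read() {
            long newestDelete = Long.MIN_VALUE;
            for (Cell c : cells) {
                if (c.isDelete) {
                    newestDelete = Math.max(newestDelete, c.timestamp);
                }
            }
            Cell newest = null;
            for (Cell c : cells) {
                // Arrival order does not matter here, only timestamps do.
                if (c.isDelete || c.timestamp <= newestDelete) {
                    continue;
                }
                if (newest == null || c.timestamp > newest.timestamp) {
                    newest = c;
                }
            }
            return newest == null ? null : newest.value;
        }
    }

    /** Delete and put stamped with the same time, as in a single mini-batch. */
    static String sameTimestamp() {
        Column col = new Column();
        long batchTime = 1000L;
        col.delete(batchTime);
        col.put(batchTime, "v1");
        return col.read();
    }

    /** Put stamped strictly later than the delete, as if sent in a later batch. */
    static String laterTimestamp() {
        Column col = new Column();
        col.delete(1000L);
        col.put(1001L, "v1");
        return col.read();
    }

    public static void main(String[] args) {
        System.out.println("same timestamp: " + sameTimestamp());   // same timestamp: null
        System.out.println("later timestamp: " + laterTimestamp()); // later timestamp: v1
    }
}
```

In this model the put survives only when its timestamp is strictly newer than the delete marker, which is consistent with the tests passing individually but failing when a Delete and a Put end up in one batch.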
    
    I'm sorry for my poor English :(
    
    I made this fix as a possible solution. (HACK)
    https://github.com/apache/gora/pull/95/commits/b0cd1950c978181890213e1c85e437e44421405e
    
    However, it does not pass testDeleteByQueryFields yet.
    This is the known issue GORA-472. I will create (reopen) the pull request for GORA-472 later,
    and I am trying to run all the tests.
    
    Kiyonari Harigae




[GitHub] gora issue #95: Fixed issue GORA-443

Posted by renato2099 <gi...@git.apache.org>.
Github user renato2099 commented on the issue:

    https://github.com/apache/gora/pull/95
  
    Thanks for finding this out @alfonsonishikawa! This makes much more sense now!
    I do agree with you that it is weird that this hasn't surfaced before. My guesses would be: (1) we were using a slower machine before, which would make the deletes come after the inserts, whereas with a faster machine they could be swapped (we are also using the same writer, btw); (2) a multi-threaded HBase client? I think this came later than our previous releases, but I don't know whether the client we are using is actually multithreaded or not. Maybe another thing to try would be [limiting HBase client](https://stackoverflow.com/questions/42919819/hbase-multithreading-client-performance) by reducing the connection pool, to see what happens.



[GitHub] gora pull request #95: Fixed issue GORA-443

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/gora/pull/95



[GitHub] gora issue #95: Fixed issue GORA-443

Posted by lewismc <gi...@git.apache.org>.
Github user lewismc commented on the issue:

    https://github.com/apache/gora/pull/95
  
    OK thank you @cloudysunny14 

