You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by Gordon Sim <gs...@redhat.com> on 2015/04/28 12:57:39 UTC

Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/
-----------------------------------------------------------

(Updated April 28, 2015, 10:57 a.m.)


Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.


Bugs: PROTON-858
    https://issues.apache.org/jira/browse/PROTON-858


Repository: qpid-proton-git


Description
-------

The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.

This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.

(I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).


Diffs
-----

  proton-c/src/object/map.c 1a758a1 

Diff: https://reviews.apache.org/r/33623/diff/


Testing
-------


Thanks,

Gordon Sim


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 11:07 a.m., Rafael Schloming wrote:
> > proton-c/src/object/map.c, line 335
> > <https://reviews.apache.org/r/33623/diff/1/?file=943957#file943957line335>
> >
> >     These decrefs can trigger finalizers which could indirectly cause more entries to be deleted from the map thereby causing the code to become reentrant. To make that safe, you want the decrefs to be the last thing you do so that all the structural modifications to the entries are complete before that can happen.

Thanks, I'll fix that.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81797
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Rafael Schloming <rh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81797
-----------------------------------------------------------



proton-c/src/object/map.c
<https://reviews.apache.org/r/33623/#comment132288>

    These decrefs can trigger finalizers which could indirectly cause more entries to be deleted from the map thereby causing the code to become reentrant. To make that safe, you want the decrefs to be the last thing you do so that all the structural modifications to the entries are complete before that can happen.


- Rafael Schloming


On April 28, 2015, 10:57 a.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 10:57 a.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Rafael Schloming <rh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81958
-----------------------------------------------------------

Ship it!


Ship It!

- Rafael Schloming


On April 29, 2015, 11:53 a.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 29, 2015, 11:53 a.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81975
-----------------------------------------------------------

Ship it!


Ship It!

- Andrew Stitcher


On April 29, 2015, 11:53 a.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 29, 2015, 11:53 a.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Alan Conway <ac...@redhat.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81973
-----------------------------------------------------------

Ship it!


Ship It!

- Alan Conway


On April 29, 2015, 11:53 a.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 29, 2015, 11:53 a.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/
-----------------------------------------------------------

(Updated April 29, 2015, 11:53 a.m.)


Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.


Changes
-------

Fixed tests, added another.


Bugs: PROTON-858
    https://issues.apache.org/jira/browse/PROTON-858


Repository: qpid-proton-git


Description
-------

The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.

This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.

(I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).


Diffs (updated)
-----

  proton-c/src/object/map.c 1a758a1 
  proton-c/src/tests/object.c 2b213c5 

Diff: https://reviews.apache.org/r/33623/diff/


Testing
-------


Thanks,

Gordon Sim


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 10:08 p.m., Rafael Schloming wrote:
> > proton-c/src/tests/object.c, line 804
> > <https://reviews.apache.org/r/33623/diff/3/?file=944252#file944252line804>
> >
> >     The pn_string_t type is mutable, so the hash you are setting up here has a lot of different keys all pointing to a single mutable string value. 
> >     
> >     The reason this appears to manifest as a valgrind error is that your assertion checking the value associated with each key is failing since all the keys map to the same value. When this happens the map (and string value) are not freed. The relevant error though is not the memory leak but the assertion failure.
> >     
> >     This is a bit hard to spot in the test output since the valgrind output is noiser, but the process is also aborting with the assert failure.
> >     
> >     It occurs to me if we install a handler for the abort signal and print a stack trace then it would be a lot more obvious what is going on here.

Sorry, I had indeed missed that. Fixed now though.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81890
-----------------------------------------------------------


On April 29, 2015, 11:53 a.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 29, 2015, 11:53 a.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Rafael Schloming <rh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81890
-----------------------------------------------------------



proton-c/src/tests/object.c
<https://reviews.apache.org/r/33623/#comment132414>

    The pn_string_t type is mutable, so the hash you are setting up here has a lot of different keys all pointing to a single mutable string value. 
    
    The reason this appears to manifest as a valgrind error is that your assertion checking the value associated with each key is failing since all the keys map to the same value. When this happens the map (and string value) are not freed. The relevant error though is not the memory leak but the assertion failure.
    
    This is a bit hard to spot in the test output since the valgrind output is noiser, but the process is also aborting with the assert failure.
    
    It occurs to me if we install a handler for the abort signal and print a stack trace then it would be a lot more obvious what is going on here.


- Rafael Schloming


On April 28, 2015, 6:35 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 6:35 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/
-----------------------------------------------------------

(Updated April 28, 2015, 6:35 p.m.)


Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.


Changes
-------

This adds a unit test showing a problem case. At present the test passes, but shows up a couple of valgrind errors for the map and value string. Clearly I'm doing something wrong but have been bashing my head against this for ages without seeing the issue. Any pointers? (I use the pn_hash interface in order to be able to use plain ints as keys, which makes figuring out how they hash simpler).


Bugs: PROTON-858
    https://issues.apache.org/jira/browse/PROTON-858


Repository: qpid-proton-git


Description
-------

The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.

This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.

(I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).


Diffs (updated)
-----

  proton-c/src/object/map.c 1a758a1 
  proton-c/src/tests/object.c 2b213c5 

Diff: https://reviews.apache.org/r/33623/diff/


Testing
-------


Thanks,

Gordon Sim


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.

> On April 28, 2015, 4:42 p.m., Andrew Stitcher wrote:
> > I can't see anything obviously wrong with this code, but that doesn't mean too much - so I agree some unit tests are pretty essential.
> > 
> > Incidentally, I think the strategy I would have used would be to mark the entry deleted, and then only actually discard it when and if the table gets resized, but it sounds like you maty have tried that approach. And it certainly has the problem of expanding the table perhaps unnecessarily.
> 
> Gordon Sim wrote:
>     I initially tried introducing an extra 'deleted' state, but still rehashed chains (rather arbitrarily if they ever contained 2 or more 'deleted' entries). Something like that is I think workable, but is a more complex change. (I'll revisit, but I'd like to get a simpler fix first).
>     
>     If you don't reuse or free deleted entries then the chains can get longer which can then (in theory) impact lookup. If you only rehash when expanding then I think this is exarcerbated as the length of chains increases with increased capacity.

Well you can reuse the deleted entries as if they were empty and that won't affect the chains. But you are correct that leaving them in the chains without reinserting the following entries will make the chains longer and longer thus impacting lookup and insertion time.

In any case the whole coalesced strategy is not going to work well for a hash table that has a lot of deletes compared to inserts and lookups. I think this is generally true as a comparison of the efficiency of hash tables against some form of tree structure.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81828
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 4:42 p.m., Andrew Stitcher wrote:
> > I can't see anything obviously wrong with this code, but that doesn't mean too much - so I agree some unit tests are pretty essential.
> > 
> > Incidentally, I think the strategy I would have used would be to mark the entry deleted, and then only actually discard it when and if the table gets resized, but it sounds like you maty have tried that approach. And it certainly has the problem of expanding the table perhaps unnecessarily.

I initially tried introducing an extra 'deleted' state, but still rehashed chains (rather arbitrarily if they ever contained 2 or more 'deleted' entries). Something like that is I think workable, but is a more complex change. (I'll revisit, but I'd like to get a simpler fix first).

If you don't reuse or free deleted entries then the chains can get longer which can then (in theory) impact lookup. If you only rehash when expanding then I think this is exarcerbated as the length of chains increases with increased capacity.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81828
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81828
-----------------------------------------------------------


I can't see anything obviously wrong with this code, but that doesn't mean too much - so I agree some unit tests are pretty essential.

Incidentally, I think the strategy I would have used would be to mark the entry deleted, and then only actually discard it when and if the table gets resized, but it sounds like you maty have tried that approach. And it certainly has the problem of expanding the table perhaps unnecessarily.

- Andrew Stitcher


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.
> 
> Andrew Stitcher wrote:
>     I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.
> 
> Gordon Sim wrote:
>     I don't understand your point.
> 
> Andrew Stitcher wrote:
>     I'm trying to say that Alan's suggested optimisation is incorrect.
> 
> Gordon Sim wrote:
>     Can you elaborate as to why not? Above you state the nodes can be in the addressable region as well as outside it, which is obvious, but you don't say why it is incorrect to move an entry out of the non-addressable region. I believe it is moving addressable entries that is the problem (as this is what breaks the link to the hashing).
> 
> Andrew Stitcher wrote:
>     It's because the entries addessable and nonaddressable can be interleaved. If all the nonaddressable entries came before any addressable entry in the chain then the optimisation would be safe.
>     
>     But in the presence of deletion you could use up all the cellar entries; create an entry in the addressable region; then have a cellar entry removed; and then have that linked into the chain after the addressable entry. This would mean that only checking the next entry is not sufficient to be sure that no coalescing has happened further down the chain (so you need to walk the entire chain anyway).
>     
>     If you can make it an invariant that *all* entries in the chain that are addressable (hence possibly coalesced) after after *all* entries in the chain that are in the cellar then this optimisation is safe. It is possible to ensure this invariant, but the current code doesn't so it.
>     
>     Does this make sense, or did I misunderstand the optimisation suggestion?

The optimisation proposed would only affect the next entry, not any chain following that. The test of whether the next entry is in the cellar merely indicates whether that entry can be safely moved, not whether it is part of any coalesced chain (which it well might be).

So when Alan says it 'is a very cheap way to check if there is a coalesced chain to worry about', I believe the 'to worry about' part is the key. Its not that it is not a coalesced chain, its that moving the entry doesn't affect the addressability of any entry within it.

Do you agree?


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 6:35 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 6:35 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.
> 
> Andrew Stitcher wrote:
>     I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.
> 
> Gordon Sim wrote:
>     I don't understand your point.
> 
> Andrew Stitcher wrote:
>     I'm trying to say that Alan's suggested optimisation is incorrect.

Can you elaborate as to why not? Above you state the nodes can be in the addressable region as well as outside it, which is obvious, but you don't say why it is incorrect to move an entry out of the non-addressable region. I believe it is moving addressable entries that is the problem (as this is what breaks the link to the hashing).


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.

Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.
> 
> Andrew Stitcher wrote:
>     I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.
> 
> Gordon Sim wrote:
>     I don't understand your point.
> 
> Andrew Stitcher wrote:
>     I'm trying to say that Alan's suggested optimisation is incorrect.
> 
> Gordon Sim wrote:
>     Can you elaborate as to why not? Above you state the nodes can be in the addressable region as well as outside it, which is obvious, but you don't say why it is incorrect to move an entry out of the non-addressable region. I believe it is moving addressable entries that is the problem (as this is what breaks the link to the hashing).

It's because the entries addessable and nonaddressable can be interleaved. If all the nonaddressable entries came before any addressable entry in the chain then the optimisation would be safe.

But in the presence of deletion you could use up all the cellar entries; create an entry in the addressable region; then have a cellar entry removed; and then have that linked into the chain after the addressable entry. This would mean that only checking the next entry is not sufficient to be sure that no coalescing has happened further down the chain (so you need to walk the entire chain anyway).

If you can make it an invariant that *all* entries in the chain that are addressable (hence possibly coalesced) after after *all* entries in the chain that are in the cellar then this optimisation is safe. It is possible to ensure this invariant, but the current code doesn't so it.

Does this make sense, or did I misunderstand the optimisation suggestion?


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 6:35 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 6:35 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.

I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.
> 
> Andrew Stitcher wrote:
>     I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.
> 
> Gordon Sim wrote:
>     I don't understand your point.

I'm trying to say that Alan's suggested optimisation is incorrect.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Andrew Stitcher <as...@apache.org>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.
> 
> Andrew Stitcher wrote:
>     I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.
> 
> Gordon Sim wrote:
>     I don't understand your point.
> 
> Andrew Stitcher wrote:
>     I'm trying to say that Alan's suggested optimisation is incorrect.
> 
> Gordon Sim wrote:
>     Can you elaborate as to why not? Above you state the nodes can be in the addressable region as well as outside it, which is obvious, but you don't say why it is incorrect to move an entry out of the non-addressable region. I believe it is moving addressable entries that is the problem (as this is what breaks the link to the hashing).
> 
> Andrew Stitcher wrote:
>     It's because the entries addessable and nonaddressable can be interleaved. If all the nonaddressable entries came before any addressable entry in the chain then the optimisation would be safe.
>     
>     But in the presence of deletion you could use up all the cellar entries; create an entry in the addressable region; then have a cellar entry removed; and then have that linked into the chain after the addressable entry. This would mean that only checking the next entry is not sufficient to be sure that no coalescing has happened further down the chain (so you need to walk the entire chain anyway).
>     
>     If you can make it an invariant that *all* entries in the chain that are addressable (hence possibly coalesced) after after *all* entries in the chain that are in the cellar then this optimisation is safe. It is possible to ensure this invariant, but the current code doesn't so it.
>     
>     Does this make sense, or did I misunderstand the optimisation suggestion?
> 
> Gordon Sim wrote:
>     The optimisation proposed would only affect the next entry, not any chain following that. The test of whether the next entry is in the cellar merely indicates whether that entry can be safely moved, not whether it is part of any coalesced chain (which it well might be).
>     
>     So when Alan says it 'is a very cheap way to check if there is a coalesced chain to worry about', I believe the 'to worry about' part is the key. Its not that it is not a coalesced chain, its that moving the entry doesn't affect the addressability of any entry within it.
>     
>     Do you agree?

Yes, I do agree. I think that as long as you avoid cutting a non-cellar entry out of the chain you can't break any coalesced chain.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 6:35 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 6:35 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 2:33 p.m., Alan Conway wrote:
> > proton-c/src/object/map.c, line 322
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line322>
> >
> >     Just to see if I understand the problem:
> >     
> >     The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
> >     
> >     So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.
> 
> Gordon Sim wrote:
>     Yes, I believe that is right, you can move the next entry as long as it is outside the addressable region. I'll make (and test) that change.
> 
> Andrew Stitcher wrote:
>     I don't think that test is correct: nodes in the hash chain can be in both the "addressable" and "non-addressable" region, it's just that you don't use the addressable region unless there is no more space in the "non-addressable" region. The scan for a free table entry starts at the last entry but can reach all the way to the first table entry not just the first entry in the non-addressable region.

I don't understand your point.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Alan Conway <ac...@redhat.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81814
-----------------------------------------------------------



proton-c/src/object/map.c
<https://reviews.apache.org/r/33623/#comment132308>

    Looks correct but absolutely requires unit tests, at least for this bug and ideally to give some reasonable coverage of the map code esp. in situations where coalesced chains are created.



proton-c/src/object/map.c
<https://reviews.apache.org/r/33623/#comment132310>

    Just to see if I understand the problem:
    
    The old code moves the "next" entry into the current slot. This breaks if the next entry is in allocated space and is the head of a coalesced chain as the latter part of that chain is now under the wrong bucket.
    
    So a simple optimization is to keep the original solution if (next > allocated) That is a very cheap way to check if there is a coalesced chain to worry about.


- Alan Conway


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 2:32 p.m., Rafael Schloming wrote:
> > proton-c/src/object/map.c, line 266
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line266>
> >
> >     This isn't related to this line, I just don't know where to put general comments...
> >     
> >     It occurs to me that with this sort of thing it is extremely valuable to capture an actual test case that reproduces the corruption. There are a number of existing tests for maps in proton-c/src/tests/object.c, but obviously they don't hit this delete scenario. It would be good to get some coverage of this case. If it would be helpful I can throw together a test class that lets you create objects consisting of an explicitly specified hashcode and value. That way you should be able to deterministically recreate the collisions resulting in this bug.
> 
> Gordon Sim wrote:
>     I agree. I'll have a look at the existing tests and see if I can create one demonstrating the problem with the current code.

"If it would be helpful I can throw together a test class that lets you create objects consisting of an explicitly specified hashcode and value. That way you should be able to deterministically recreate the collisions resulting in this bug."

This would indeed be helpful. I tried using pn_hash_t, but can't seem tocorrectly free that.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81816
-----------------------------------------------------------


On April 28, 2015, 6:35 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 6:35 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
>   proton-c/src/tests/object.c 2b213c5 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.

> On April 28, 2015, 2:32 p.m., Rafael Schloming wrote:
> > proton-c/src/object/map.c, line 266
> > <https://reviews.apache.org/r/33623/diff/2/?file=944061#file944061line266>
> >
> >     This isn't related to this line, I just don't know where to put general comments...
> >     
> >     It occurs to me that with this sort of thing it is extremely valuable to capture an actual test case that reproduces the corruption. There are a number of existing tests for maps in proton-c/src/tests/object.c, but obviously they don't hit this delete scenario. It would be good to get some coverage of this case. If it would be helpful I can throw together a test class that lets you create objects consisting of an explicitly specified hashcode and value. That way you should be able to deterministically recreate the collisions resulting in this bug.

I agree. I'll have a look at the existing tests and see if I can create one demonstrating the problem with the current code.


- Gordon


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81816
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Rafael Schloming <rh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81816
-----------------------------------------------------------



proton-c/src/object/map.c
<https://reviews.apache.org/r/33623/#comment132311>

    This isn't related to this line, I just don't know where to put general comments...
    
    It occurs to me that with this sort of thing it is extremely valuable to capture an actual test case that reproduces the corruption. There are a number of existing tests for maps in proton-c/src/tests/object.c, but obviously they don't hit this delete scenario. It would be good to get some coverage of this case. If it would be helpful I can throw together a test class that lets you create objects consisting of an explicitly specified hashcode and value. That way you should be able to deterministically recreate the collisions resulting in this bug.


- Rafael Schloming


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>


Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Posted by Gordon Sim <gs...@redhat.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/
-----------------------------------------------------------

(Updated April 28, 2015, 12:50 p.m.)


Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.


Changes
-------

Move decref calls to end of the pn_map_del function, following Rafi's comments.


Bugs: PROTON-858
    https://issues.apache.org/jira/browse/PROTON-858


Repository: qpid-proton-git


Description
-------

The internal map implementation for proton-c uses coalesced hashing, but the current deletion algorithm does not take account of this. On deletion of entries, the map can lose its ability to locate other entries by key.

This patch is the simplest possible fix. It frees the deleted entry, makes any previous entry that points to this tha tail of its chain, and then relocates any entries in a chain following the deleted entry. Though I believe this to be correct and safe, it does involve more work on some deletions. The approach can be optimised further, either be reducing the amount of chain that needs relocated, or by marking records as deleted in the first instance and only rehashing later on some criteria.

(I started off trying to do the latter, but it is more complicated and I still haven't got my patch fully correct. When I get that working I'll post that also. It is a more involved change though, so perhaps we may want to consider this simpler approach for 0.9.1).


Diffs (updated)
-----

  proton-c/src/object/map.c 1a758a1 

Diff: https://reviews.apache.org/r/33623/diff/


Testing
-------


Thanks,

Gordon Sim