Posted to dev@trafficserver.apache.org by "John Plevyak (JIRA)" <ji...@apache.org> on 2009/11/14 20:16:40 UTC

[jira] Commented: (TS-39) prior BZ59274 "fix" can result in a partition being cleared unnecessarily

    [ https://issues.apache.org/jira/browse/TS-39?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12777985#action_12777985 ] 

John Plevyak commented on TS-39:
--------------------------------


Here is the failure scenario:

... write document 1 to cache ...
Part::aggWrite: we get unlucky and write_pos + agg_buf_pos + writelen == skip + len
Part::aggWriteDone: write_pos += write size, last_write_pos = write_pos
CacheSync::mainEvent ... snap of header where write_pos == skip + len
.... write another document 2 to cache
Part::aggWrite
Part::agg_wrap() is called, write_pos = start
.. Sync complete ...
EXIT, disk now contains last_write_pos == start


.. recovery ..
initial recovery_pos == skip + len
read of 0 bytes
repeat
prev_recovery_pos == recovery_pos, cache partition cleared.
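
The recovery steps above can be sketched as a small model. This is only an illustration under stated assumptions, not the Traffic Server code: the PartModel struct, readable_bytes(), recover(), and the stop_pos and chunk parameters are hypothetical names introduced here. Only start, skip, len, recovery_pos and prev_recovery_pos come from the comment.

```cpp
#include <cstdint>

// Hypothetical, simplified model of a cache partition (not the real Part class).
struct PartModel {
  std::int64_t start; // first byte of the data area
  std::int64_t skip;  // offset of the partition on the device
  std::int64_t len;   // length of the partition; skip + len is its end
};

enum class RecoveryResult { Recovered, Cleared };

// Number of bytes a read at pos can return before hitting the partition end.
std::int64_t readable_bytes(const PartModel &p, std::int64_t pos, std::int64_t chunk) {
  std::int64_t remaining = (p.skip + p.len) - pos;
  if (remaining <= 0) {
    return 0; // at the end of the partition: a read of 0 bytes
  }
  return remaining < chunk ? remaining : chunk;
}

// Scan forward from recovery_pos until stop_pos (assumed here to mark the end
// of valid data). If a pass makes no progress, the loop check fires and the
// partition is treated as cleared.
RecoveryResult recover(const PartModel &p, std::int64_t recovery_pos,
                       std::int64_t stop_pos, std::int64_t chunk) {
  std::int64_t prev_recovery_pos = -1;
  while (recovery_pos != stop_pos) {
    if (recovery_pos == prev_recovery_pos) {
      return RecoveryResult::Cleared; // no progress since the last pass
    }
    prev_recovery_pos = recovery_pos;
    std::int64_t n = readable_bytes(p, recovery_pos, chunk);
    if (stop_pos > recovery_pos && n > stop_pos - recovery_pos) {
      n = stop_pos - recovery_pos; // do not read past the end of valid data
    }
    recovery_pos += n;
  }
  return RecoveryResult::Recovered;
}
```

In this model, starting with recovery_pos == skip + len reads 0 bytes, repeats, and ends in RecoveryResult::Cleared, while starting at start with valid data ahead ends in RecoveryResult::Recovered.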

Also, I was wrong about prev_recover_pos being local.

> prior BZ59274  "fix" can result in a partition being cleared unnecessarily
> --------------------------------------------------------------------------
>
>                 Key: TS-39
>                 URL: https://issues.apache.org/jira/browse/TS-39
>             Project: Traffic Server
>          Issue Type: Bug
>          Components: Cache
>         Environment: All
>            Reporter: John Plevyak
>            Priority: Minor
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The prior fix for BZ59274 clears the cache partition if recovery gets into a loop.  This can occur if last_write_pos == skip + len
> (the end of the cache partition), because the code which wraps the write_pos does so when it attempts
> the next write.  The solution is to check for this at the top of recover and wrap recovery.  Also, the variable which the prior patch used,
> "prev_recover_pos", is stored in the CachePart when it is a purely local variable.  I would suggest leaving in the check (it doesn't hurt
> if it never detects a problem).  Patch forthcoming.
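
The check suggested in the description (wrap at the top of recover) might look roughly like the sketch below. It is not the forthcoming patch: the PartBounds struct and wrap_recovery_pos() are hypothetical names, and only start, skip and len are taken from the issue text.

```cpp
#include <cstdint>

// Hypothetical stand-in for the partition fields named in the issue.
struct PartBounds {
  std::int64_t start; // first byte of the data area
  std::int64_t skip;  // offset of the partition on the device
  std::int64_t len;   // length of the partition; skip + len is its end
};

// Intended to run once at the top of recovery: if the saved position sits at
// the end of the partition, wrap it to start, as the writer would have done
// on its next write.
std::int64_t wrap_recovery_pos(const PartBounds &p, std::int64_t recovery_pos) {
  // The issue only describes the equality case; >= is a defensive assumption.
  if (recovery_pos >= p.skip + p.len) {
    return p.start;
  }
  return recovery_pos;
}
```

With this in place, a saved position of skip + len becomes start before the first read, so the first read is no longer a read of 0 bytes. Any other position is returned unchanged.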
