Posted to dev@trafficserver.apache.org by John Plevyak <jp...@acm.org> on 2009/12/10 01:15:21 UTC

cache partition size patch commit vote request

I would like to ask for a vote on whether or not to commit the cache 
partition size patch (+1, -1, 0).

Background:

In the current cache, the disk is broken up into 8GB partitions where
   - objects are hashed to partitions
   - an object must fit entirely in a partition, effectively limiting
      object size and introducing potential competition
   - each partition has its own write pointer, resulting in lots of
      seeks on large disks to write in different places
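
To make the constraints above concrete, here is a minimal sketch (not
Traffic Server code) of hashing objects to fixed-size partitions. The
function names and the use of SHA-256 are assumptions made purely for
illustration; only the 8GB partition size comes from the message.

```python
import hashlib

GiB = 1024 ** 3
PARTITION_SIZE = 8 * GiB  # partition size stated in the message


def partition_count(disk_bytes: int, partition_bytes: int = PARTITION_SIZE) -> int:
    """Number of whole partitions that fit on a disk of the given size."""
    return disk_bytes // partition_bytes


def pick_partition(key: bytes, n_partitions: int) -> int:
    """Map an object key to a partition index (hypothetical hash choice)."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % n_partitions


def fits(object_bytes: int, partition_bytes: int = PARTITION_SIZE) -> bool:
    """An object must fit entirely inside a single partition."""
    return object_bytes <= partition_bytes


if __name__ == "__main__":
    n = partition_count(1024 * GiB)  # a 1 TiB disk, chosen as an example
    print("partitions (one write pointer each):", n)
    print("partition for key:", pick_partition(b"http://example.com/a", n))
    print("10 GiB object fits:", fits(10 * GiB))
```

In this sketch a 1 TiB disk yields 128 partitions, so writes are spread
over 128 separate positions on the disk, and any object larger than 8GB
cannot be stored at all.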

The patch adds:
   - support for partitions up to .5PB (500TB) with the default
      512-byte block size
   - a larger aggregation buffer: 2MB (since we have fewer, we can have
      larger ones)
   - a larger top fast IOBuffer size (64kB); this should probably be
      increased further, and this patch fixes bugs which prevent
      increasing it in the current code
   - internal cache support for large objects (still requires
      VIO/IOBuffer changes + HTTP changes)
   - experimental support for do_io_pread
   - a new on-disk format which will support do_io_pread as well as
      non-HTTP header access
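
A rough back-of-the-envelope check of the sizes above. This assumes
binary units (so .5PB is 2^49 bytes, i.e. 512 TiB, which the message
rounds to 500TB) and assumes that a partition is addressed in units of
512-byte blocks. The helper names are hypothetical, and the offset
widths are only what the arithmetic implies, not a statement about the
actual on-disk directory layout.

```python
BLOCK_SIZE = 512            # bytes; default block size from the message
OLD_PARTITION = 8 * 2**30   # 8GB partitions in the current cache
NEW_PARTITION = 2**49       # .5PB, assuming binary units (512 TiB)


def blocks_in(partition_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of addressable blocks in a partition."""
    return partition_bytes // block_size


def offset_bits(partition_bytes: int, block_size: int = BLOCK_SIZE) -> int:
    """Bits needed to address every block in the partition."""
    return (blocks_in(partition_bytes, block_size) - 1).bit_length()


if __name__ == "__main__":
    for name, size in (("old", OLD_PARTITION), ("new", NEW_PARTITION)):
        print(name, blocks_in(size), "blocks,", offset_bits(size), "offset bits")
```

Under these assumptions an 8GB partition needs a 24-bit block offset
and a .5PB partition needs a 40-bit one, which is one way to see why a
larger maximum partition size goes together with a new on-disk format.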

Potential downside:
   This patch changes the on-disk format, which will require a cache wipe.
   I have tried to include all the changes necessary to implement large
   objects, pread and pluggable-protocol header usage, but one can never
   be sure.

The patch is available under TS-46 in JIRA.

Thank you,
john

Re: cache partition size patch commit vote request

Posted by Bryan Call <bc...@yahoo-inc.com>.
+1 for committing the patch
-1 for having this in the first release

I think there needs to be more testing of the cache changes and the cache in general before these changes make it to a release branch.  I am open for discussion on how we can better test the cache, so I would feel more confident in the changes.

This isn't specific to changes to the cache alone; I always feel nervous when making fundamental changes to Traffic Server.  A couple of weeks ago I spent some time trying to write a unit test for hostdb and just about went crazy trying to figure out all the different libraries I needed to link in (I walked away from it before I went mad)...

I have an interesting dependency graph for Traffic Server.  I had to write a program to graph out all the dependencies; it was too much to do by hand.  Cleaning up our dependency hierarchy (well, making it a hierarchy by removing the circular dependencies) would go a long way toward making unit tests writable by mere mortals.

-Bryan

On Dec 9, 2009, at 4:15 PM, John Plevyak wrote:

> 
> I would like to ask for a vote on whether or not to commit the cache partition size patch (+1, -1, 0).
> 
> Background:
> 
> In the current cache, the disk is broken up into 8GB partitions where
>  - objects are hashed to partitions
>  - an object must fit entirely in a partition effectively limiting object size and
>     introducing potential competition
>  - each partition has it's own write pointer, resulting in lots of seeks on large
>     disks to write in different places
> 
> The patch adds:
>  - support of partitions up to .5PB (500TB) with the default 512 block size
>  - larger aggregation buffer: 2MB (since we have fewer we can have larger ones)
>  - larger top fast IOBuffer size (64kB), this should probably be increased but this
>     patch fixes bugs which prevent increasing this in the current code
>  - internal cache support for large objects (still requires VIO/IOBuffer changes + HTTP changes)
>  - experimental support for do_io_pread
>  - new on disk format which will support do_io_pread as well as non-HTTP header access
> 
> Potential downsize:
>  This patch changes the on disk format which will require a cache wipe.
>  I have tried to include all the changes necessary to implement large object,
>  pread and pluggable-protocol header usage, but one can never be sure.
> 
> The patch is available for under TS-46 in jira.
> 
> Thank you,
> john



Re: cache partition size patch commit vote request

Posted by John Plevyak <jp...@acm.org>.
I suppose the question is whether the remap+Y! changes are merged
immediately into the branch (cutting 2.0) or lazily (wait for 2.0).
As long as the changes are relatively isolated, or if they come as a
package rather than trickle in, the lazy option is easier.

Do you know how much the Y! internal changes affect the
cache/event/net systems and what the timeframe on getting those in
might be?  I'd rather not diverge too much if they overlap.

john  


On 12/15/2009 9:01 AM, Leif Hedstrom wrote:
> On 12/15/2009 09:48 AM, John Plevyak wrote:
>>
>> Why not cut the 2.0 branch now?
>>    
>
> We're still waiting to land changes that were made in the Y! internal
> tree after we moved the source to ASF. We need those into the 2.0
> branch, I'm pretty sure. We're also still discussing the changes for
> the "remap" APIs, to decide what we should do. After we cut the 2.0
> branch, I'd assume we do bug-fixes only (and no new features) on that
> branch (kinda defeats the purpose of branching otherwise).
>
> If the consensus is to branch now instead, and merge all these changes
> both into trunk and branch, we can do that (but my vote would be -1).
>
>> The existing codebase has some serious issues in the event scheduling
>> system
>> (for example large cache miss latencies, an issue I am submitting)
>> along with performance problems handling medium size objects (more than
>> 100k)
>> in addition to the limits handing large objects.   The API should be
>> updated to
>> support large objects as well.  Taken as a whole these issues restrict
>> the applications
>> that the current codebase can be applied to, so if we are not going to
>> address them
>> all in 2.0 we might as well fork 2.0 now and restrict ourselves to bugs
>> with usability and
>> packaging in 2.0 and move main development to a new 2.X branch.
>>    
>
>
> All that sounds good. I think we generally want to have "trunk" be new
> development, and we branch when we have a feature set that we think is
> "good" to branch on. I don't think trunk is quite there yet, since
> we're waiting to land a few things that we do want in 2.0 (as I
> mentioned above).
>
> The reason I suggested making a "dev" branch for your cache partition
> changes was to let people test it right now. I don't have an ETA when
> we think trunk weill be ready to branch for 2.0 (assuming we stick to
> that methodology, and not do feature development on multiple branches).
>
> Cheers,
>
> -- Leif


Re: cache partition size patch commit vote request

Posted by Leif Hedstrom <zw...@apache.org>.
On 12/15/2009 09:48 AM, John Plevyak wrote:
>
> Why not cut the 2.0 branch now?
>    

We're still waiting to land changes that were made in the Y! internal 
tree after we moved the source to ASF. We need those in the 2.0 
branch, I'm pretty sure. We're also still discussing the changes for the 
"remap" APIs, to decide what we should do. After we cut the 2.0 branch, 
I'd assume we do bug-fixes only (and no new features) on that branch 
(kinda defeats the purpose of branching otherwise).

If the consensus is to branch now instead, and merge all these changes 
both into trunk and branch, we can do that (but my vote would be -1).

> The existing codebase has some serious issues in the event scheduling system
> (for example large cache miss latencies, an issue I am submitting)
> along with performance problems handling medium size objects (more than
> 100k)
> in addition to the limits handing large objects.   The API should be
> updated to
> support large objects as well.  Taken as a whole these issues restrict
> the applications
> that the current codebase can be applied to, so if we are not going to
> address them
> all in 2.0 we might as well fork 2.0 now and restrict ourselves to bugs
> with usability and
> packaging in 2.0 and move main development to a new 2.X branch.
>    


All that sounds good. I think we generally want to have "trunk" be new 
development, and we branch when we have a feature set that we think is 
"good" to branch on. I don't think trunk is quite there yet, since we're 
waiting to land a few things that we do want in 2.0 (as I mentioned above).

The reason I suggested making a "dev" branch for your cache partition 
changes was to let people test it right now. I don't have an ETA for 
when we think trunk will be ready to branch for 2.0 (assuming we stick 
to that methodology, and not do feature development on multiple branches).

Cheers,

-- Leif


Re: cache partition size patch commit vote request

Posted by John Plevyak <jp...@acm.org>.

Why not cut the 2.0 branch now?

The existing codebase has some serious issues in the event scheduling
system (for example large cache miss latencies, an issue I am
submitting), along with performance problems handling medium-size
objects (more than 100k), in addition to the limits on handling large
objects.  The API should be updated to support large objects as well.
Taken as a whole, these issues restrict the applications that the
current codebase can be applied to, so if we are not going to address
them all in 2.0 we might as well fork 2.0 now, restrict ourselves to
bugs with usability and packaging in 2.0, and move main development to
a new 2.X branch.

john


On 12/15/2009 7:17 AM, Leif Hedstrom wrote:
> On 12/09/2009 05:15 PM, John Plevyak wrote:
>>
>> I would like to ask for a vote on whether or not to commit the cache
>> partition size patch (+1, -1, 0).
>
> I think we're gonna have to "call" this vote, and with only 2 +1 and
> one -1,  we need to postpone this checkin until the 2.0 "branch" is
> cut (and we open up trunk again for new features). In the mean time,
> my suggestion would be that you create a branch in SVN (call it
> "TS-46" or "cache_partition" or something like that), and commit it
> there. This way, people can easily make builds testing it, and we can
> later just merge it back into trunk.
>
> Cheers,
>
> -- leif


Re: cache partition size patch commit vote request

Posted by Leif Hedstrom <zw...@apache.org>.
On 12/09/2009 05:15 PM, John Plevyak wrote:
>
> I would like to ask for a vote on whether or not to commit the cache 
> partition size patch (+1, -1, 0).

I think we're gonna have to "call" this vote, and with only 2 +1 and one 
-1,  we need to postpone this checkin until the 2.0 "branch" is cut (and 
we open up trunk again for new features). In the meantime, my 
suggestion would be that you create a branch in SVN (call it "TS-46" or 
"cache_partition" or something like that), and commit it there. This 
way, people can easily make builds testing it, and we can later just 
merge it back into trunk.

Cheers,

-- leif


Re: cache partition size patch commit vote request

Posted by John Plevyak <jp...@acm.org>.
OK, I have a git repository with the cache partition patch set up, which
is up-to-date as of today.
If you are a git wiz you can always merge to keep up-to-date with any
last-minute changes right before a benchmark test.

git clone git://jplevyak.homeip.net/trafficserver

This has been tested and should automagically wipe your cache when run 
the first time.

john



On 12/9/2009 8:24 PM, Leif Hedstrom wrote:
> On 12/09/2009 05:15 PM, John Plevyak wrote:
>>
>> I would like to ask for a vote on whether or not to commit the cache 
>> partition size patch (+1, -1, 0).
>>
>> Background:
>
> Before I vote, I guess the concern that will come up is how stable 
> this is? How can we verify that the new cache layout is at least as 
> stable as the old one? We just got access to a pretty serious proxy / 
> cache tool, it'd be interesting to running it against a build with and 
> without this patch, and see if that discovers any anomalies (it might 
> detect other bugs though in the code :).
>
> -- Leif


Re: cache partition size patch commit vote request

Posted by Leif Hedstrom <le...@ogre.com>.
On 12/09/2009 10:02 PM, John Plevyak wrote:
>
> That would be great, particularly for a big change like this (of 
> course it would be nice
> to have it automatic and continuous....).
>
> I'll build a tracking git branch and post a URL.

Ok, cool. So, I'll vote a +1 on having this in the 2.0 release, for two 
reasons:

1) The sooner we get it in, the more testing we get done, and the less 
impact we have on people using it than if we delay the merge until a 
later release.

2) It's good stuff.


My +1 is contingent on this being reviewed by someone familiar with the 
cache code (maybe VJ or Steve), and a review from Bryan.

Please, everyone, make sure you voice your concerns (pros/cons etc.), 
and cast your votes. IMO, we should do the normal ASF voting process 
here, so in 72 hours from the initial request for vote, John should call 
it. The patch has been available for review for quite some time now.

Cheers,

-- Leif


Re: cache partition size patch commit vote request

Posted by John Plevyak <jp...@acm.org>.
That would be great, particularly for a big change like this (of course
it would be nice to have it automatic and continuous....).

I'll build a tracking git branch and post a URL.

john


On 12/9/2009 8:24 PM, Leif Hedstrom wrote:
> On 12/09/2009 05:15 PM, John Plevyak wrote:
>>
>> I would like to ask for a vote on whether or not to commit the cache 
>> partition size patch (+1, -1, 0).
>>
>> Background:
>
> Before I vote, I guess the concern that will come up is how stable 
> this is? How can we verify that the new cache layout is at least as 
> stable as the old one? We just got access to a pretty serious proxy / 
> cache tool, it'd be interesting to running it against a build with and 
> without this patch, and see if that discovers any anomalies (it might 
> detect other bugs though in the code :).
>
> -- Leif


Re: cache partition size patch commit vote request

Posted by Leif Hedstrom <le...@ogre.com>.
On 12/09/2009 05:15 PM, John Plevyak wrote:
>
> I would like to ask for a vote on whether or not to commit the cache 
> partition size patch (+1, -1, 0).
>
> Background:

Before I vote, I guess the concern that will come up is how stable this 
is. How can we verify that the new cache layout is at least as stable as 
the old one? We just got access to a pretty serious proxy / cache tool; 
it'd be interesting to run it against a build with and without this 
patch, and see if that discovers any anomalies (it might detect other 
bugs in the code though :).

-- Leif


Re: cache partition size patch commit vote request

Posted by George Paul <ge...@yahoo.com>.
Sorry I'm late to this vote since I've been reviewing and testing out this patch. Preliminary testing indicates good stability with marked improvement in performance.

+1 to commit and have in 2.0 release provided the cache has proper regression tests. The code base sorely needs to be updated for the last decade of advancement in storage. 

regards,

-George


--- On Wed, 12/9/09, John Plevyak <jp...@acm.org> wrote:

> From: John Plevyak <jp...@acm.org>
> Subject: cache partition size patch commit vote request
> To: trafficserver-dev@incubator.apache.org
> Date: Wednesday, December 9, 2009, 4:15 PM
> 
> I would like to ask for a vote on whether or not to commit
> the cache partition size patch (+1, -1, 0).
> 
> Background:
> 
> In the current cache, the disk is broken up into 8GB
> partitions where
>   - objects are hashed to partitions
>   - an object must fit entirely in a partition
> effectively limiting object size and
>      introducing potential competition
>   - each partition has it's own write pointer,
> resulting in lots of seeks on large
>      disks to write in different
> places
> 
> The patch adds:
>   - support of partitions up to .5PB (500TB) with the
> default 512 block size
>   - larger aggregation buffer: 2MB (since we have
> fewer we can have larger ones)
>   - larger top fast IOBuffer size (64kB), this should
> probably be increased but this
>      patch fixes bugs which prevent
> increasing this in the current code
>   - internal cache support for large objects (still
> requires VIO/IOBuffer changes + HTTP changes)
>   - experimental support for do_io_pread
>   - new on disk format which will support do_io_pread
> as well as non-HTTP header access
> 
> Potential downsize:
>   This patch changes the on disk format which will
> require a cache wipe.
>   I have tried to include all the changes necessary to
> implement large object,
>   pread and pluggable-protocol header usage, but one
> can never be sure.
> 
> The patch is available for under TS-46 in jira.
> 
> Thank you,
> john
>