Posted to dev@httpd.apache.org by Brandon Fosdick <bf...@bfoz.net> on 2005/10/28 07:18:00 UTC

Bucket brigades and large files

I'm still working on mod_dav_userdir, so naturally I have more questions.

It's passing most of the tests in litmus, with the only exceptions being some locking stuff that neither Windows nor OS X seem to care about. Now I'm testing other stuff, namely large file uploading. I have two large files that I'm working with: one is about 150MB and the other is just over 4GB. My test client is OS X Tiger 10.4.2 and the server is 2.0.55 (Sempron 3000+ with 1GB of RAM).

With the 4GB file I eventually get the errors:

[Thu Oct 27 21:15:19 2005] [error] [client 192.168.0.1] Invalid Content-Length
[Thu Oct 27 21:15:19 2005] [error] [client 192.168.0.1] (34)Result too large: Could not get next bucket brigade  [500, #0]

At this point nothing has been written to the database, even though the client has been transferring for several minutes.

I can't find any mention of the bucket brigade error in the httpd source code, so I have no idea what to do about it. Any ideas?

I have to admit I don't really understand how to interface with the bucket brigade stuff; I blindly followed the example in mod_dav_fs. Even though the brigade handling appears to break large files into pieces, it looks like the entire file is loaded before the repository handler gets it. If that's the case, maybe the above errors are related to memory starvation?

FWIW, mod_dav_userdir, unlike Catacomb, breaks files into 64KB rows in the database. Consequently I could easily benefit from being given the incoming bits in chunks, but I have no idea how to accomplish that. Any suggestions?
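
For reference, here's my rough understanding of the generic chunk-at-a-time read loop for a request body, pieced together from the input filter docs. It's only a sketch, the staging-buffer part is hand-waved, and how to get mod_dav to hand my provider the data in chunks like this is exactly what I don't know:

#include <httpd.h>
#include <http_protocol.h>
#include <apr_buckets.h>
#include <util_filter.h>

static apr_status_t read_body_in_chunks(request_rec *r)
{
    apr_bucket_brigade *bb = apr_brigade_create(r->pool,
                                                r->connection->bucket_alloc);
    int seen_eos = 0;

    do {
        apr_status_t rv = ap_get_brigade(r->input_filters, bb,
                                         AP_MODE_READBYTES, APR_BLOCK_READ,
                                         HUGE_STRING_LEN);
        if (rv != APR_SUCCESS) {
            return rv;
        }

        for (apr_bucket *b = APR_BRIGADE_FIRST(bb);
             b != APR_BRIGADE_SENTINEL(bb);
             b = APR_BUCKET_NEXT(b)) {
            if (APR_BUCKET_IS_EOS(b)) {
                seen_eos = 1;
                break;
            }

            const char *data;
            apr_size_t len;
            apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
            if (len > 0) {
                // ...append the len bytes at data to a 64KB staging buffer...
            }
        }

        apr_brigade_cleanup(bb);   // only the current chunk stays in memory
    } while (!seen_eos);

    return APR_SUCCESS;
}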

During my travails I've noticed that the progress bars on the client machines (OS X or Win2k) advance significantly faster than rows are inserted into the database. I'm not entirely sure what that means or what to do about it, but it would be nice to keep things in sync.

BTW, if any of you feel like pounding my server for me, feel free. The test server is at http://site46.com (that's a reverse proxy to the real test server...let me know if it doesn't work). The real server is partially at http://terran-bank.com. How was that for a shameless plug? :)

BTW2, if anyone cares, I can make the source for mod_dav_userdir available. It's BSD-licensed, but I haven't gotten around to posting it yet.

Thanks

Re: Bucket brigades and large files

Posted by Brandon Fosdick <bf...@bfoz.net>.
Joe Orton wrote:
> 2.0.x doesn't support large request bodies (for some definition of
> "large" depending on the platform, >2GB for 32-bit hosts); this works
> with 2.1.x, as Paul says.

The limit actually seems to be around 750MB, even on an AMD64x2 with 4GB of RAM. That's using 2.0.x on a FreeBSD 5.4-S box. The database is MySQL 4.1.14 on a 2.5TB volume.

I'm about to give 2.1.8 a try, just as soon as I figure out how to install and uninstall it cleanly. The FreeBSD port for 2.1.x is still at 2.1.4.


Re: Bucket brigades and large files

Posted by Joe Orton <jo...@redhat.com>.
On Thu, Oct 27, 2005 at 10:18:00PM -0700, Brandon Fosdick wrote:
> I'm still working on mod_dav_userdir, so naturally I have more questions.
> 
> It's passing most of the tests in litmus, with the only exceptions
> being some locking stuff that neither Windows nor OS X seem to care
> about. Now I'm testing other stuff, namely large file uploading. I 
> have two large files that I'm working with, one is about 150MB and the 
> other is just over 4GB. My test client is OS X Tiger 10.4.2 and the 
> server is 2.0.55 (Sempron 3000+ with 1GB of RAM).
> 
> With the 4GB file I eventually get the errors:

2.0.x doesn't support large request bodies (for some definition of
"large" depending on the platform, >2GB for 32-bit hosts); this works
with 2.1.x, as Paul says.

Regards,

joe

Re: Bucket brigades and large files

Posted by Paul Querna <ch...@force-elite.com>.
Brandon Fosdick wrote:
> I'm still working on mod_dav_userdir, so naturally I have more questions.
> 
> It's passing most of the tests in litmus, with the only exceptions being some locking stuff that neither Windows nor OS X seem to care about. Now I'm testing other stuff, namely large file uploading. I have two large files that I'm working with: one is about 150MB and the other is just over 4GB. My test client is OS X Tiger 10.4.2 and the server is 2.0.55 (Sempron 3000+ with 1GB of RAM).

What OS on the server?

Can you try 2.1.8-beta? Lots of Large File Support fixes are present in
2.1.xx.

-Paul



Re: Bucket brigades and large files

Posted by Brandon Fosdick <bf...@bfoz.net>.
Nick Kew wrote:
> Will you be moving to the DBD framework and mod_dbd for that?
> Your application is nontrivial but wasn't an input to the original
> DBD design.  That makes it exactly the kind of thing that'll prove
> the API and/or highlight what needs adding or reviewing.

It had occurred to me, but I haven't really looked into it yet.

Re: Bucket brigades and large files

Posted by Nick Kew <ni...@webthing.com>.
On Saturday 29 October 2005 06:02, Brandon Fosdick wrote:

> All of the database magic happens in class ServerConfig. It's an enormous
> mess of a class. Cleaning it up is on the ToDo list, after switching to
> prepared statements.

Will you be moving to the DBD framework and mod_dbd for that?
Your application is nontrivial but wasn't an input to the original
DBD design.  That makes it exactly the kind of thing that'll prove
the API and/or highlight what needs adding or reviewing.
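
To give a flavour of the API, the insert path might look something like
this (only a sketch; the statement label, SQL and argument handling here
are made up for illustration):

#include <httpd.h>
#include <apr_hash.h>
#include <apr_dbd.h>
#include <mod_dbd.h>

// Assumes a statement prepared at startup with something like:
//   DBDPrepareSQL "INSERT INTO blocks (owner, data) VALUES (%s, %s)" userdir_insert_block
static apr_status_t insert_block(request_rec *r, const char *owner,
                                 const char *block)
{
    ap_dbd_t *dbd = ap_dbd_acquire(r);   // pooled connection from mod_dbd
    if (dbd == NULL) {
        return APR_EGENERAL;
    }

    apr_dbd_prepared_t *stmt = static_cast<apr_dbd_prepared_t *>(
        apr_hash_get(dbd->prepared, "userdir_insert_block", APR_HASH_KEY_STRING));
    if (stmt == NULL) {
        return APR_EGENERAL;
    }

    int nrows = 0;
    int rc = apr_dbd_pvquery(dbd->driver, r->pool, dbd->handle, &nrows,
                             stmt, owner, block, NULL);
    return (rc == 0) ? APR_SUCCESS : APR_EGENERAL;
}

(A real 64KB binary block would need proper binary handling rather than a
plain string argument, but that's the general shape.)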

-- 
Nick Kew

Re: Bucket brigades and large files

Posted by Brandon Fosdick <bf...@bfoz.net>.
Brandon Fosdick wrote:
> http://terran-bank.com/mod_dav_userdir_20051028.tar

I forgot to include the SQL for creating the tables.

http://terran-bank.com/create_terranbank.sql

Sorry about that.

Re: Bucket brigades and large files

Posted by Brandon Fosdick <bf...@bfoz.net>.
Brandon Fosdick wrote:
> http://terran-bank.com/mod_dav_userdir_20051028.tar

So maybe I should explain the code a bit. First off, it's all C++. Now we wait a sec for the C folks to run away screaming... OK, good.

All of the database magic happens in class ServerConfig. It's an enormous mess of a class. Cleaning it up is on the ToDo list, after switching to prepared statements. (Really, I have a list; it's in OmniOutliner on my PowerBook.) As the name implies, this class is created by the usual create-server-config handler. All of the config directive handlers call set_X() methods of ServerConfig.
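
In outline it looks something like this (simplified; the directive handler, method name, and module symbol here are illustrative rather than lifted from the tarball):

#include <new>
#include <httpd.h>
#include <http_config.h>
#include <apr_pools.h>

extern module AP_MODULE_DECLARE_DATA dav_userdir_module;   // assumed module symbol

class ServerConfig {
public:
    explicit ServerConfig(apr_pool_t *pool) : pool_(pool), db_host_(0) {}
    void set_db_host(const char *host) { db_host_ = host; }
    // ...one set_X() per configuration directive...
private:
    apr_pool_t *pool_;
    const char *db_host_;
};

// The usual create-server-config handler: construct the object in pool memory.
static void *userdir_create_server_config(apr_pool_t *pool, server_rec *s)
{
    void *mem = apr_pcalloc(pool, sizeof(ServerConfig));
    return new (mem) ServerConfig(pool);
}

// A directive handler that just forwards to the matching set_X() method.
static const char *cmd_db_host(cmd_parms *cmd, void *dummy, const char *arg)
{
    ServerConfig *conf = static_cast<ServerConfig *>(
        ap_get_module_config(cmd->server->module_config, &dav_userdir_module));
    conf->set_db_host(arg);
    return NULL;   // NULL tells the config parser the directive was accepted
}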

repository.cc is what you think it is if you're familiar with mod_dav, but it mostly passes stuff off to class resource_t, which then passes database requests to ServerConfig.
For those not familiar with mod_dav, every request starts with a call to get_resource(), which is responsible for creating a structure that represents each DAV resource involved in the request. In this case that structure is resource_t. 
deliver() is where the downloading happens. It creates a bucket brigade and dumps all of the blocks from the database into the brigade. I have no idea if I did that right.
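
For the curious, the shape I was aiming for is roughly the sketch below (fetch_next_block() is a stand-in for the database read, and the names are illustrative); the idea is to pass each block down the filter chain as it's read instead of building the whole file in one brigade:

#include <httpd.h>
#include <http_protocol.h>
#include <apr_buckets.h>
#include <util_filter.h>
#include <mod_dav.h>

// Stand-in for the database read: fills buf with the next block, sets *len (0 at EOF).
static int fetch_next_block(const dav_resource *resource, char *buf,
                            apr_size_t bufsize, apr_size_t *len)
{
    *len = 0;   // the real version would read the next 64KB row for this resource
    return 0;
}

static dav_error *userdir_deliver(const dav_resource *resource, ap_filter_t *output)
{
    apr_pool_t *pool = output->r->pool;
    apr_bucket_alloc_t *ba = output->c->bucket_alloc;
    apr_bucket_brigade *bb = apr_brigade_create(pool, ba);
    char buf[65536];
    apr_size_t len = 0;

    while (fetch_next_block(resource, buf, sizeof(buf), &len) == 0 && len > 0) {
        apr_brigade_write(bb, NULL, NULL, buf, len);   // copies the block into heap buckets

        // Passing the brigade per block keeps only one block in memory at a time.
        if (ap_pass_brigade(output, bb) != APR_SUCCESS) {
            return dav_new_error(pool, HTTP_INTERNAL_SERVER_ERROR, 0,
                                 "Could not write block to output filters.");
        }
        apr_brigade_cleanup(bb);
    }

    APR_BRIGADE_INSERT_TAIL(bb, apr_bucket_eos_create(ba));
    if (ap_pass_brigade(output, bb) != APR_SUCCESS) {
        return dav_new_error(pool, HTTP_INTERNAL_SERVER_ERROR, 0,
                             "Could not write EOS to output filters.");
    }
    return NULL;
}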

hooks.cc is where all of the hooks are registered. Nothing fancy there.

Locks, properties, etc. are handled by the appropriately named files. Most of the handlers pass through to a class method of some sort. That sounds like it would be slow, but almost all of the class methods are inline, so it's no worse than C. Most of these classes are fairly straightforward. Locks are poorly implemented ATM.

stream_t in stream.h is where the upload magic happens. It serves as a buffer between mod_dav and the database. Incoming bytes are buffered into 64K blocks before being written to MySQL. This would be a great place to use prepared statements, but I haven't gotten around to it. When I started this project I had never used them outside of PHP, and my focus was on getting something working as quickly as possible.
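
The buffering amounts to something like the sketch below (simplified; the struct members are illustrative and flush_block_to_db() is a stand-in for the MySQL INSERT, which is exactly where a prepared statement would go):

#include <cstring>
#include <apr.h>
#include <mod_dav.h>

static const apr_size_t BLOCK_SIZE = 65536;

// mod_dav leaves struct dav_stream for the provider to define.
struct dav_stream {
    const dav_resource *resource;
    char block[BLOCK_SIZE];   // 64KB staging buffer
    apr_size_t used;          // bytes currently buffered
    // ...database handle, row counter, etc...
};

// Stand-in for the INSERT of one 64KB row (s->block, s->used bytes).
static dav_error *flush_block_to_db(dav_stream *s)
{
    return NULL;
}

static dav_error *userdir_write_stream(dav_stream *stream, const void *buf,
                                       apr_size_t bufsize)
{
    const char *p = static_cast<const char *>(buf);

    while (bufsize > 0) {
        apr_size_t room = BLOCK_SIZE - stream->used;
        apr_size_t n = (bufsize < room) ? bufsize : room;

        std::memcpy(stream->block + stream->used, p, n);
        stream->used += n;
        p += n;
        bufsize -= n;

        if (stream->used == BLOCK_SIZE) {
            dav_error *err = flush_block_to_db(stream);   // one 64KB row per INSERT
            if (err != NULL) {
                return err;
            }
            stream->used = 0;
        }
    }
    return NULL;   // close_stream() flushes any final partial block
}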

apr_pool_base.h has a base class and a new() operator that help with using pools. I can't tell if the destructors are being called properly, but I haven't had any problems with memory leaks, so maybe it's working.
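
The idea is roughly the sketch below (simplified; the class name is illustrative): allocate the object out of the pool, and register a pool cleanup that runs the destructor, since apr_pool_destroy() knows nothing about C++ destructors.

#include <cstddef>
#include <new>
#include <apr_pools.h>

class PoolObject {
public:
    // Register a cleanup so the (virtual) destructor runs when the pool is destroyed.
    explicit PoolObject(apr_pool_t *pool) {
        apr_pool_cleanup_register(pool, this, run_dtor, apr_pool_cleanup_null);
    }
    virtual ~PoolObject() {}

    // Allocate instances out of the pool instead of the heap.
    void *operator new(std::size_t size, apr_pool_t *pool) {
        return apr_palloc(pool, size);
    }
    // Matching placement delete, used only if a constructor throws.
    void operator delete(void *, apr_pool_t *) {}

private:
    static apr_status_t run_dtor(void *obj) {
        static_cast<PoolObject *>(obj)->~PoolObject();
        return APR_SUCCESS;
    }
};

// Usage: derive from PoolObject, then e.g. resource_t *res = new (pool) resource_t(pool, ...);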

The sharp-eyed will notice that I have a copy of mod_dav.h in the source. That's because the official copy uses "namespace" as an identifier in two places, and therefore barfs in C++. I sent an email to the list about this several months ago but didn't get a response, so I use a modified copy.

I should point out that the usernames are constrained to be positive integers, but only because that's what Terran Bank needs. At one point I was maintaining a fork that allowed real usernames, but it fell by the wayside, mainly because it was pointless; I think the only difference was two functions in resource_t. Some day I'll add a compile-time config option.

That's the high level overview. Let me know if you want more.

Re: Bucket brigades and large files

Posted by Brandon Fosdick <bf...@bfoz.net>.
Paul Querna wrote:
> Posting the code would help us with the problems and let us reproduce
> them here too.

http://terran-bank.com/mod_dav_userdir_20051028.tar

Bear in mind that this is a snapshot and not really release-ready. The build setup is a bit funky right now and may not work everywhere. I made a few scripts to avoid some typing, but the downside is that a few paths are hard-coded. YMMV.

To build:
./doauto.sh
./doconfig.sh
make

To install:
./doapxs.sh

You might need to run the code through a reformatter. I used 3-character tabs all over the place, and since I have three monitors, most of my editor windows are rather wide (long-line warning).

Comments and suggestions are most welcome.

Enjoy

Re: Bucket brigades and large files

Posted by Paul Querna <ch...@force-elite.com>.
> BTW2, if anyone cares, I can make the source for mod_dav_userdir available. It's BSD-licensed, but I haven't gotten around to posting it yet.

Posting the code would help us with the problems and let us reproduce
them here too.