Posted to dev@apr.apache.org by Ben Collins-Sussman <su...@collab.net> on 2005/05/31 18:18:52 UTC

confusion about largefile support

[Posting to both apr and subversion dev lists.]

I know that largefile support is kludgey in APR 0.9.x, but
unfortunately, thousands of Subversion users are still using that  
branch (and httpd 2.0.x) because of the binary compatibility issue.

Someone privately pointed out to me today that Subversion isn't  
passing the APR_LARGEFILE flag to *any* apr_file_io calls anywhere.   
Are we sitting on a time bomb?

(Subversion issue 1819
(http://subversion.tigris.org/issues/show_bug.cgi?id=1819) discusses
some problems we had with apr_file_copy(), but we worked around it by
writing our own copy implementation that doesn't use offsets.)

We've not seen any issues reported, but something came into our  
users@ list over the weekend.  A woman was doing an 'svnadmin load'  
of a large dumpfile into an FSFS repository, and got this error:

<<< Started new transaction, based on original revision 1046
      * adding path : trunk/some folder ... done.
      * adding path : trunk/some folder/some_file.zip ...File size limit exceeded.

The phrase "File size limit exceeded" comes from libc, so now I'm  
wondering if the largefile flag is the problem here.  Perhaps the  
fsfs revision file is > 2GB, and apr_file_open() is tripping over  
it?  (This woman is using the Fedora 3 apr-0.9.4-24.2 rpm, by the way.)
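
For illustration, here is a minimal sketch of one way a process on a Unix-like system ends up with that message: the kernel delivers SIGXFSZ when a write would exceed a file size limit, and the shell prints libc's description of the signal. The sketch uses an artificially small RLIMIT_FSIZE instead of a 2GB file, so it only shows the mechanism. Whether the svnadmin failure above took exactly this path is an assumption. The name run_over_limit and the 1024-byte limit are made up for the example.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Code run in a child process so the lowered limit never affects the caller.
CHILD = """
import os, resource, signal, sys
# Python ignores SIGXFSZ by default; restore the default action so the
# child dies the way a plain C program would.
signal.signal(signal.SIGXFSZ, signal.SIG_DFL)
soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
resource.setrlimit(resource.RLIMIT_FSIZE, (1024, hard))  # hypothetical tiny limit
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT, 0o600)
for _ in range(8):  # bounded: at most 4096 bytes even if the limit is not enforced
    os.write(fd, b"x" * 512)
"""


def run_over_limit():
    """Run the child and return its exit status (negative means killed by a signal)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "big.bin")
        proc = subprocess.run([sys.executable, "-c", CHILD, path])
        return proc.returncode
```

If the limit is enforced, the return value should be the negated SIGXFSZ number, and signal.strsignal(signal.SIGXFSZ) should give the text libc uses to describe it on that platform.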

In any case:  I'm wondering if we should be passing APR_LARGEFILE to  
all apr_file_io calls.  Is it necessary?  Should we expect problems  
if we don't?

Here's a link to the original users@subversion.tigris.org thread:

     http://svn.haxx.se/users/archive-2005-05/1801.shtml



Re: confusion about largefile support

Posted by "C. Michael Pilato" <cm...@collab.net>.
Erik Huelsmann <eh...@gmail.com> writes:

> On 5/31/05, Ben Collins-Sussman <su...@collab.net> wrote:
> > 
> > On May 31, 2005, at 11:49 AM, Ben Collins-Sussman wrote:
> > >
> > > Funny, KDE is using fsfs, and I would have expected them to run
> > > into a >2GB revision file.
> > >
> > 
> > Well, whattya know.  Now Timothee Besset (ttimo) in IRC has just
> > reported the same "File size limit exceeded" error that we saw on
> > users@ earlier today.  In both cases, the users were loading a
> > dumpfile into an fsfs repository.  And ttimo verified my fear.
> > There's a >2GB file being assembled in db/txns/.
> > 
> > So, um, maybe we should write a FAQ?  One which tells folks that the
> > only workaround here is to recompile subversion against apr 1.x?
> > (And to upgrade to httpd 2.1 if necessary.)
> 
> or use a BDB repos.

In TTimo's case, I seem to recall that the use of a BDB repos was the
cause of an entirely different source of pain, and therefore not as
viable an option.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: confusion about largefile support

Posted by Erik Huelsmann <eh...@gmail.com>.
On 5/31/05, Ben Collins-Sussman <su...@collab.net> wrote:
> 
> On May 31, 2005, at 11:49 AM, Ben Collins-Sussman wrote:
> >
> > Funny, KDE is using fsfs, and I would have expected them to run
> > into a >2GB revision file.
> >
> 
> Well, whattya know.  Now Timothee Besset (ttimo) in IRC has just
> reported the same "File size limit exceeded" error that we saw on
> users@ earlier today.  In both cases, the users were loading a
> dumpfile into an fsfs repository.  And ttimo verified my fear.
> There's a >2GB file being assembled in db/txns/.
> 
> So, um, maybe we should write a FAQ?  One which tells folks that the
> only workaround here is to recompile subversion against apr 1.x?
> (And to upgrade to httpd 2.1 if necessary.)

or use a BDB repos.

bye,

Erik.

Re: confusion about largefile support

Posted by Ben Collins-Sussman <su...@collab.net>.
On May 31, 2005, at 11:49 AM, Ben Collins-Sussman wrote:
>
> Funny, KDE is using fsfs, and I would have expected them to run  
> into a >2GB revision file.
>

Well, whattya know.  Now Timothee Besset (ttimo) in IRC has just  
reported the same "File size limit exceeded" error that we saw on  
users@ earlier today.  In both cases, the users were loading a  
dumpfile into an fsfs repository.  And ttimo verified my fear.   
There's a >2GB file being assembled in db/txns/.

So, um, maybe we should write a FAQ?  One which tells folks that the  
only workaround here is to recompile subversion against apr 1.x?   
(And to upgrade to httpd 2.1 if necessary.)


Re: confusion about largefile support

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-05-31 at 11:49 -0500, Ben Collins-Sussman wrote:
> Okay, so then there really is a risk here for very large  
> repositories, particularly ones using FSFS.

It doesn't matter how large the repository is, only how large the
commits to it are.

> Should we start recommending that such projects upgrade to APR 1.X  
> (and httpd 2.1, if they're depending on mod_dav_svn?)  Should we  
> write a FAQ about it?

There's no particularly strong reason not to wait until you have a
problem or are likely to have a problem before going to such lengths.

> Funny, KDE is using fsfs, and I would have expected them to run into  
> a >2GB revision file.

Why would a KDE developer commit more than 2GB of changes in one go?


Re: confusion about largefile support

Posted by Stephan Kulow <co...@kde.org>.
Am Dienstag 31 Mai 2005 19:02 schrieb Ben Collins-Sussman:
> On May 31, 2005, at 11:56 AM, Garrett Rooney wrote:
> > Well, it's only >2GB in a single revision, so depending on how they
> > did the conversion it's quite possible they just haven't hit one.
> > Did they do a cvs2svn conversion or did they just 'svn import'
> > everything?
>
> They did a *huge* cvs2svn conversion.  Of a really gigantic
> codebase.  But you're right, it's possible that no single revision
> file of theirs is more than 2GB.  The deltas could be spread nicely
> over a half-million revision files.

The top revisions:
-rw-rw-r--  1 kde         kde   103M May 11 18:35 168975
-rw-rw-r--  1 scripty     kde    81M May 20 13:01 416056
-rw-rw-r--  1 kde         kde    49M May 11 18:57 190304
-rw-rw-r--  1 scripty     kde    46M May 21 09:39 416347
-rw-rw-r--  1 kde         kde    38M May 11 23:32 406763
-rw-rw-r--  1 kde         kde    36M May 11 23:40 410608
-rw-rw-r--  1 kde         kde    36M May 11 20:16 272973
-rw-rw-r--  1 scripty     kde    35M May 22 15:03 416912
-rw-rw-r--  1 kde         kde    30M May 11 22:57 385696

Greetings, Stephan
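
A listing like the one above can also be produced with a short script. This is a rough sketch, assuming all revision files sit in one directory that you pass in; the function names largest_files and over_limit are made up for the example, and the 2GB threshold assumes a signed 32-bit file offset.

```python
import os

TWO_GIB = 2**31  # sizes at or above this do not fit in a signed 32-bit offset


def largest_files(directory, top=10):
    """Return (size_in_bytes, name) pairs for regular files, largest first."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                entries.append((size, entry.name))
    entries.sort(reverse=True)
    return entries[:top]  # top=None returns every file


def over_limit(directory, limit=TWO_GIB):
    """Return names of files whose size is at or above the limit."""
    return [name for size, name in largest_files(directory, top=None)
            if size >= limit]
```

Calling over_limit() on the directory holding the revision files would return an empty list for a repository like the one above, where the largest file is about 103M.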

Re: confusion about largefile support

Posted by Ben Collins-Sussman <su...@collab.net>.
On May 31, 2005, at 11:56 AM, Garrett Rooney wrote:
>
> Well, it's only >2GB in a single revision, so depending on how they  
> did the conversion it's quite possible they just haven't hit one.   
> Did they do a cvs2svn conversion or did they just 'svn import'  
> everything?
>

They did a *huge* cvs2svn conversion.  Of a really gigantic  
codebase.  But you're right, it's possible that no single revision  
file of theirs is more than 2GB.  The deltas could be spread nicely  
over a half-million revision files.



Re: confusion about largefile support

Posted by Garrett Rooney <ro...@electricjellyfish.net>.
Ben Collins-Sussman wrote:

> Okay, so then there really is a risk here for very large repositories,
> particularly ones using FSFS.
>
> Should we start recommending that such projects upgrade to APR 1.X (and
> httpd 2.1, if they're depending on mod_dav_svn?)  Should we write a FAQ
> about it?

That seems reasonable to me.

> Funny, KDE is using fsfs, and I would have expected them to run into
> a >2GB revision file.

Well, it's only >2GB in a single revision, so depending on how they did 
the conversion it's quite possible they just haven't hit one.  Did they 
do a cvs2svn conversion or did they just 'svn import' everything?

-garrett

Re: confusion about largefile support

Posted by Ben Collins-Sussman <su...@collab.net>.
On May 31, 2005, at 11:45 AM, Greg Hudson wrote:
>
> However, we most definitely do need to seek around in FSFS rev files,
> which are the most common thing to go above 2GB.  APR_LARGEFILE won't
> help us for that case.
>

Okay, so then there really is a risk here for very large  
repositories, particularly ones using FSFS.

Should we start recommending that such projects upgrade to APR 1.X  
(and httpd 2.1, if they're depending on mod_dav_svn?)  Should we  
write a FAQ about it?

Funny, KDE is using fsfs, and I would have expected them to run into  
a >2GB revision file.



Re: confusion about largefile support

Posted by Greg Hudson <gh...@MIT.EDU>.
On Tue, 2005-05-31 at 11:18 -0500, Ben Collins-Sussman wrote:
> In any case:  I'm wondering if we should be passing APR_LARGEFILE to  
> all apr_file_io calls.  Is it necessary?  Should we expect problems  
> if we don't?

Passing APR_LARGEFILE does not magically change the size of apr_off_t;
it does not gain us the ability to seek around in >2GB files.  It's
essentially a promise to APR that we don't need to seek, so it's okay to
go above the 2GB limit.

However, we most definitely do need to seek around in FSFS rev files,
which are the most common thing to go above 2GB.  APR_LARGEFILE won't
help us for that case.
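
To make the 2GB figure concrete, here is a small sketch of the arithmetic, assuming the offset type is a signed 32-bit integer (which is what the remark about the size of apr_off_t implies for these builds; that is an assumption here, not something checked against APR). The helper name fits_in_signed_32 is made up for the example.

```python
import struct

INT32_MAX = 2**31 - 1  # largest signed 32-bit value: 2147483647 bytes


def fits_in_signed_32(offset):
    """Return True if offset can be stored in a signed 32-bit integer."""
    try:
        struct.pack("<i", offset)  # "<i" is a little-endian signed 32-bit int
        return True
    except struct.error:
        return False
```

Any seek target past INT32_MAX simply cannot be expressed in such a type, no matter which flags the file was opened with, which is the point being made above.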

