Posted to dev@subversion.apache.org by Laurent CHASTEL <lc...@hotmail.com> on 2004/12/15 15:10:29 UTC

Status very long for big repository and status caching.

Hello,

I'm the "source code manager" at the firm I work for.
I have implemented a solution using SVN.
Part of our project was under CVS; I moved everything to SVN: a PowerBuilder 
application (2500 objects) and a database (5000 objects: tables, stored 
procedures, ...).
We use tools that are SCC-compliant. As there was no SCC DLL for SVN (the 
project began in August), I developed one.
SCC forces us to keep all files in one folder (one for the database, one 
for PB).

We tried to use the solution this week, and it is unusable.
We need 10 to 15 minutes to get the list of objects under SVN control for 
the database folder, and another 10 minutes when performing an operation.

I implemented the retrieval of the list using the function 
svn_wc_status.
I only want the local status: is the file under SVN control, and is it 
locally modified?
Executing this function is the longest part of the processing.
Did I use the right function? (The source code I wrote is at the end of 
this message.)


As I understand it, the execution time is due to the number of files in 
the folder.
When getting the status of one file, the status of every file in the folder 
is retrieved, so there are many disk accesses, several for each file.
Have you thought about "caching" the status in a file in the working copy?
Then, instead of several accesses to several files, there would be only one 
file to access, twice per action:
once at the beginning of the command (read the file) and once when the 
action is done (write the file).

The server:
svn 1.1.1 (source archive) under Linux Red Hat Enterprise 3 (HP hardware).
Repository in FSFS
The client:
svn 1.1.1 under Windows (same version as server)
Visual Studio .NET 2003
Computer: Celeron 1.8 GHz, 256 MB RAM

Best regards,
Laurent



PS: For those who are interested, I had the same idea as TortoiseSVNSCC.
I didn't want to develop the visual interface, so I used TortoiseSVN to 
perform checkout, update and commit actions.
I will give my work back to the TortoiseSVN and TortoiseSVNSCC projects 
as soon as possible.


The source code :

svn_wc_status_t *getSVNStatus(CString fileName)
{
	AFX_MANAGE_STATE(AfxGetStaticModuleState()); //LC
	CsccsvnApp *theApp = (CsccsvnApp *) AfxGetApp( );
	CSCCEditor *editor = theApp->getEditor(); // get editor

	apr_hash_t * statushash;
	apr_array_header_t * statusarray;
	const sort_item * item;
	svn_error_t * m_err;
	svn_wc_status_t * status;
	apr_pool_t * m_pool = svn_pool_create (NULL);
	CString	internalpath;
	CString resToken;
	int curPos= 0;
	svn_client_ctx_t ctx;

	bool update = false; // do not change: true would create a connection to the server,
	// which fails because no authentication is defined
	bool noignore = false;

	if (fileName.GetLength() <= 0) return NULL;

	memset (&ctx, 0, sizeof (ctx));
	m_err = svn_config_get_config (&(ctx.config), NULL, m_pool);
	if (m_err)
	{
		svn_pool_destroy (m_pool);					// free the allocated memory
		return NULL;
	} // if (m_err)

	//we need to convert the path to subversion internal format
	//the internal format uses '/' instead of the windows '\'
	internalpath = fileName.GetString();
	internalpath.Replace("\\", "/");

	statushash = apr_hash_make(m_pool);
	svn_revnum_t youngest = SVN_INVALID_REVNUM;
	svn_opt_revision_t rev;
	rev.kind = svn_opt_revision_unspecified;
	struct hashbaton_t hashbaton;
	hashbaton.hash = statushash;
	hashbaton.pool = m_pool;
	m_err = svn_client_status (&youngest, internalpath, &rev, getstatushash,
			      &hashbaton, FALSE, TRUE, update, noignore, &ctx, m_pool);

	// Error present if the file is not under version control
	if ((m_err != NULL) || (apr_hash_count(statushash) == 0))
	{
		return NULL;
		// delete => bug in pb7
	}

	// Convert the unordered hash to an ordered, sorted array
	statusarray = sort_hash (statushash,  sort_compare_items_as_paths, m_pool);

	//only the first entry is needed (no recurse)
	item = &APR_ARRAY_IDX (statusarray, 0, const sort_item);

	status = (svn_wc_status_t *) item->value;
	svn_pool_destroy (m_pool);

	return status;
}



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Status very long for big repository and status caching.

Posted by Max Bowsher <ma...@ukf.net>.
Laurent CHASTEL wrote:
> Hello,
>
> I'm the "source code manager" at the firm I work for.
> I have implemented a solution using SVN.
> Part of our project was under CVS; I moved everything to SVN: a PowerBuilder
> application (2500 objects) and a database (5000 objects: tables, stored
> procedures, ...).
> We use tools that are SCC-compliant. As there was no SCC DLL for SVN (the
> project began in August), I developed one.
> SCC forces us to keep all files in one folder (one for the database, one
> for PB).
>
> We tried to use the solution this week, and it is unusable.
> We need 10 to 15 minutes to get the list of objects under SVN control for
> the database folder, and another 10 minutes when performing an operation.
>
> I implemented the retrieval of the list using the function
> svn_wc_status.
> I only want the local status: is the file under SVN control, and is it
> locally modified?
> Executing this function is the longest part of the processing.
> Did I use the right function? (The source code I wrote is at the end of
> this message.)
...

Above you claim to be using svn_wc_status, but your code below is using 
svn_client_status.

You are using the svn API in a horribly inefficient way.
For every file, you are:

1. Locking the WC
2. XML-parsing .svn/entries
3. Getting the status for every item in the directory.
4. Throwing away all but one of the statuses.
5. Unlocking the WC.

So, basically you are doing the amount of work that needs to be done, 
squared.

As you have found out, 5000 squared is quite a large number.

You should be using svn_wc_adm_open2, then calling svn_wc_status for each 
item you want the status of for the operation in progress, then calling 
svn_wc_adm_close, OR you should call svn_client_status just *once*, and 
fetch the status for each item from the hash you built in the 
svn_client_status callback.
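
To make the second option concrete, here is a minimal, self-contained C++ 
sketch of the "scan once, then look up" pattern. It does not call the 
Subversion API: LocalStatus, StatusCache and the in-memory directory map are 
hypothetical stand-ins, and scan() only plays the role that a single 
svn_client_status call and its callback would play. With 5000 files, one 
scan per lookup means 5000 x 5000 = 25 million status evaluations; one scan 
in total means 5000.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the two answers the SCC layer needs per file.
struct LocalStatus {
    bool versioned;
    bool modified;
};

// Hypothetical cache: walks the directory once, then answers every
// per-file query from memory instead of re-walking the directory.
class StatusCache {
public:
    // 'directory' simulates a working copy folder: file name -> modified flag.
    explicit StatusCache(const std::map<std::string, bool>& directory)
        : directory_(directory), loaded_(false), scans_(0) {}

    // Returns the cached status; triggers the expensive scan only when
    // the cache is empty or has been invalidated.
    LocalStatus lookup(const std::string& name) {
        if (!loaded_) scan();
        assert(loaded_);
        std::map<std::string, LocalStatus>::const_iterator it = cache_.find(name);
        if (it == cache_.end()) {
            LocalStatus unversioned = { false, false };
            return unversioned;
        }
        return it->second;
    }

    // Call after anything that changes the working copy.
    void invalidate() { cache_.clear(); loaded_ = false; }

    // Number of full directory scans performed so far.
    int scans() const { return scans_; }

private:
    // Stand-in for the single directory-wide status call.
    void scan() {
        cache_.clear();
        std::map<std::string, bool>::const_iterator it;
        for (it = directory_.begin(); it != directory_.end(); ++it) {
            LocalStatus s = { true, it->second };
            cache_[it->first] = s;
        }
        loaded_ = true;
        ++scans_;
    }

    std::map<std::string, bool> directory_;
    std::map<std::string, LocalStatus> cache_;
    bool loaded_;
    int scans_;
};
```

In this sketch any number of lookup() calls costs one scan until 
invalidate() is called. In a real SCC DLL the cache would need to be 
invalidated after every update, commit or revert, otherwise it returns 
stale answers.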

Max.


> The source code :
>
> svn_wc_status_t *getSVNStatus(CString fileName)
> {
> AFX_MANAGE_STATE(AfxGetStaticModuleState()); //LC
> CsccsvnApp *theApp = (CsccsvnApp *) AfxGetApp( );
> CSCCEditor *editor = theApp->getEditor(); // get editor
>
> apr_hash_t * statushash;
> apr_array_header_t * statusarray;
> const sort_item * item;
> svn_error_t * m_err;
> svn_wc_status_t * status;
> apr_pool_t * m_pool = svn_pool_create (NULL);
> CString internalpath;
> CString resToken;
> int curPos= 0;
> svn_client_ctx_t ctx;
>
> bool update = false; // do not change: true would create a connection to the server,
> // which fails because no authentication is defined
> bool noignore = false;
>
> if (fileName.GetLength() <= 0) return NULL;
>
> memset (&ctx, 0, sizeof (ctx));

BAD! Read the API docs for svn_client_create_context()

> m_err = svn_config_get_config (&(ctx.config), NULL, m_pool);
> if (m_err)
> {
> svn_pool_destroy (m_pool); // free the allocated memory
> return NULL;
> } // if (m_err)
>
> //we need to convert the path to subversion internal format
> //the internal format uses '/' instead of the windows '\'
> internalpath = fileName.GetString();
> internalpath.Replace("\\", "/");

Also bad. See svn_path_internal_style().

> statushash = apr_hash_make(m_pool);
> svn_revnum_t youngest = SVN_INVALID_REVNUM;
> svn_opt_revision_t rev;
> rev.kind = svn_opt_revision_unspecified;
> struct hashbaton_t hashbaton;
> hashbaton.hash = statushash;
> hashbaton.pool = m_pool;
> m_err = svn_client_status (&youngest, internalpath, &rev, getstatushash,
>       &hashbaton, FALSE, TRUE, update, noignore, &ctx, m_pool);
>
> // Error present if the file is not under version control
> if ((m_err != NULL) || (apr_hash_count(statushash) == 0))
> {
> return NULL;

Bad! Didn't destroy the pool!

> // delete => bug in pb7
> }
>
> // Convert the unordered hash to an ordered, sorted array
> statusarray = sort_hash (statushash,  sort_compare_items_as_paths, 
> m_pool);
>
> //only the first entry is needed (no recurse)
> item = &APR_ARRAY_IDX (statusarray, 0, const sort_item);
>
> status = (svn_wc_status_t *) item->value;
> svn_pool_destroy (m_pool);
>
> return status;
> }
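
On the pool handling flagged above: one way to make sure the pool is 
destroyed on every return path is a small scope guard. The sketch below is 
self-contained and uses hypothetical stand-ins (Pool, pool_create, 
pool_destroy, PoolGuard, StatusCopy) rather than the real APR functions. It 
also copies the result out by value, on the assumption, suggested by the 
baton setup in the quoted code, that the status lives in m_pool; if so, 
returning a pointer to it after svn_pool_destroy would leave the caller 
with a dangling pointer.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an APR pool; it only tracks how many are live.
struct Pool {
    int* live_count;
};

Pool* pool_create(int* live_count) {
    Pool* p = new Pool;
    p->live_count = live_count;
    ++*live_count;
    return p;
}

void pool_destroy(Pool* p) {
    --*p->live_count;
    delete p;
}

// Destroys the pool when the guard leaves scope, whatever the return path.
class PoolGuard {
public:
    explicit PoolGuard(Pool* p) : pool_(p) { assert(p != 0); }
    ~PoolGuard() { pool_destroy(pool_); }
    Pool* get() const { return pool_; }
private:
    PoolGuard(const PoolGuard&);            // not copyable
    PoolGuard& operator=(const PoolGuard&); // not assignable
    Pool* pool_;
};

// Plain value type: safe to hand to the caller after the pool is gone.
struct StatusCopy {
    bool versioned;
    bool modified;
};

// Returns false when no status is available. The result is written into
// *out by value, so nothing the caller sees points into the pool.
bool getStatus(const std::string& fileName, int* live_pools, StatusCopy* out) {
    PoolGuard guard(pool_create(live_pools));
    if (fileName.empty())
        return false;           // early return: the guard still frees the pool
    // A real implementation would query the status here, allocating in
    // guard.get(); this sketch just fabricates an answer.
    out->versioned = true;
    out->modified = false;
    return true;
}
```

The error path and the success path both leave zero live pools, which is 
the property the original function lacks.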

