You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@vcl.apache.org by Junaid Ali <al...@iit.edu> on 2016/10/19 20:22:44 UTC

vcl database cleanup

Hello,
We are currently using vcl version 2.3.2 in our environment. We use Active
Directory for LDAP Authentication and user accounts get added to specific
groups in VCL based on user access rights. Since its deployment, the VCL
MySQL database has not been purged of historical data. Curerntly the
querylog table is using 1.5 Gb and continuations table is using 750 Mb
storage. We are interested in cleaning the user accounts that exist in the
database and are not active (during the current academic year). Is there a
recommended procedure for purging user accounts from the VCL database? I
understand there is user data referenced in other VCL tables (e.g. log)
that needs to be deleted before the actual user account can be purged.

Thanks

Junaid Ali
Systems & Virtualization Engineer,
Office of Technology Services/IIT,
Chicago, IL - 60616

Re: vcl database cleanup

Posted by Junaid Ali <al...@iit.edu>.
Thanks for the info Josh.
Let me know if you need any help refactoring.

Thanks.
Junaid.

On Thu, Oct 20, 2016 at 8:18 AM, Josh Thompson <jo...@ncsu.edu>
wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Junaid,
>
> Good to hear from you - we've just been working on incorporating your AD
> work
> this week.  Thanks for contributing it - sorry it's taken so long to
> incorporate it.
>
> I'd recommend against cleaning out the user entries because they are tied
> to
> so many other table entries.  If you have concerns of having old user data
> in
> there that could potentially be exposed in the event of a security breach,
> I'd recommend to anonymize the unityid, firstname, lastname, preferredname,
> and email fields for the old accounts.
>
> To help with the space usage, cleaning up the continuations and querylog
> tables will be the most helpful.  I'd actually recommend having a
> maintenance
> window once or twice a year to clean those tables.  You can safely delete
> any
> entries from the continuations table with expiretime < NOW().  The querylog
> table is never read from - it is only written to to allow for auditing in
> the
> event of a problem or security incident.  All queries by the web frontend
> that are INSERT, UPDATE, or DELETE are logged to the table.  You can delete
> as many entries from querylog as you'd like based on the timestamp.  If you
> know you'd never look at data in the querylog table, you can disable it by
> setting QUERYLOGGING to 0 in conf.php (that may have been added in 2.4.2).
>
> That said, because the tables are in the innodb format, deleting entries
> will
> not decrease the amount of space consumed on disk.  It will free up space
> for
> future database entries that will be added without increasing the disk
> usage
> further.  It's kind of like a thin provisioned VM disk file.  The only way
> to
> actually reclaim the space is to backup the database by dumping it,
> deleting/recreating the database, and then doing a restore.  You can also
> reconfigure your database to use individual files per innodb table and then
> run an optimize query on the table (which creates a new table, transfers
> the
> data, and deletes the old table).
>
> I hope that helps!
>
> Josh
>
> On Wednesday, October 19, 2016 3:22:44 PM Junaid Ali wrote:
> > Hello,
> > We are currently using vcl version 2.3.2 in our environment. We use
> Active
> > Directory for LDAP Authentication and user accounts get added to specific
> > groups in VCL based on user access rights. Since its deployment, the VCL
> > MySQL database has not been purged of historical data. Curerntly the
> > querylog table is using 1.5 Gb and continuations table is using 750 Mb
> > storage. We are interested in cleaning the user accounts that exist in
> the
> > database and are not active (during the current academic year). Is there
> a
> > recommended procedure for purging user accounts from the VCL database? I
> > understand there is user data referenced in other VCL tables (e.g. log)
> > that needs to be deleted before the actual user account can be purged.
> >
> > Thanks
> >
> > Junaid Ali
> > Systems & Virtualization Engineer,
> > Office of Technology Services/IIT,
> > Chicago, IL - 60616
> - --
> - -------------------------------
> Josh Thompson
> VCL Developer
> North Carolina State University
>
> my GPG/PGP key can be found at pgp.mit.edu
>
> All electronic mail messages in connection with State business which
> are sent to or received by this account are subject to the NC Public
> Records Law and may be disclosed to third parties.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2
>
> iEYEARECAAYFAlgIxEQACgkQV/LQcNdtPQO6IQCdHsj3kLw769IFH7c6zS/cHaI0
> t/8An13UtK+iT1wHCIV0NdW06Oss3Uau
> =Yj8G
> -----END PGP SIGNATURE-----
>
>

Re: vcl database cleanup

Posted by Josh Thompson <jo...@ncsu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Junaid,

Good to hear from you - we've just been working on incorporating your AD work 
this week.  Thanks for contributing it - sorry it's taken so long to 
incorporate it.

I'd recommend against cleaning out the user entries because they are tied to 
so many other table entries.  If you have concerns of having old user data in 
there that could potentially be exposed in the event of a security breach, 
I'd recommend to anonymize the unityid, firstname, lastname, preferredname, 
and email fields for the old accounts.

To help with the space usage, cleaning up the continuations and querylog 
tables will be the most helpful.  I'd actually recommend having a maintenance 
window once or twice a year to clean those tables.  You can safely delete any 
entries from the continuations table with expiretime < NOW().  The querylog 
table is never read from - it is only written to to allow for auditing in the 
event of a problem or security incident.  All queries by the web frontend 
that are INSERT, UPDATE, or DELETE are logged to the table.  You can delete 
as many entries from querylog as you'd like based on the timestamp.  If you 
know you'd never look at data in the querylog table, you can disable it by 
setting QUERYLOGGING to 0 in conf.php (that may have been added in 2.4.2).

That said, because the tables are in the innodb format, deleting entries will 
not decrease the amount of space consumed on disk.  It will free up space for 
future database entries that will be added without increasing the disk usage 
further.  It's kind of like a thin provisioned VM disk file.  The only way to 
actually reclaim the space is to backup the database by dumping it, 
deleting/recreating the database, and then doing a restore.  You can also 
reconfigure your database to use individual files per innodb table and then 
run an optimize query on the table (which creates a new table, transfers the 
data, and deletes the old table).

I hope that helps!

Josh

On Wednesday, October 19, 2016 3:22:44 PM Junaid Ali wrote:
> Hello,
> We are currently using vcl version 2.3.2 in our environment. We use Active
> Directory for LDAP Authentication and user accounts get added to specific
> groups in VCL based on user access rights. Since its deployment, the VCL
> MySQL database has not been purged of historical data. Curerntly the
> querylog table is using 1.5 Gb and continuations table is using 750 Mb
> storage. We are interested in cleaning the user accounts that exist in the
> database and are not active (during the current academic year). Is there a
> recommended procedure for purging user accounts from the VCL database? I
> understand there is user data referenced in other VCL tables (e.g. log)
> that needs to be deleted before the actual user account can be purged.
> 
> Thanks
> 
> Junaid Ali
> Systems & Virtualization Engineer,
> Office of Technology Services/IIT,
> Chicago, IL - 60616
- -- 
- -------------------------------
Josh Thompson
VCL Developer
North Carolina State University

my GPG/PGP key can be found at pgp.mit.edu

All electronic mail messages in connection with State business which
are sent to or received by this account are subject to the NC Public
Records Law and may be disclosed to third parties.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iEYEARECAAYFAlgIxEQACgkQV/LQcNdtPQO6IQCdHsj3kLw769IFH7c6zS/cHaI0
t/8An13UtK+iT1wHCIV0NdW06Oss3Uau
=Yj8G
-----END PGP SIGNATURE-----