Posted to dev@hbase.apache.org by "Jim Kellerman (JIRA)" <ji...@apache.org> on 2008/04/02 05:31:24 UTC
[jira] Updated: (HBASE-469) Streamline HStore startup and compactions
[ https://issues.apache.org/jira/browse/HBASE-469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Kellerman updated HBASE-469:
--------------------------------
Attachment: patch.txt
> Here's a couple of comments:
>
> Why catch the exception? Why not just let out?
{code}
+    if (fs == null) {
+      try {
+        this.fs = FileSystem.get(conf);
+      } catch (IOException e) {
+        LOG.fatal("error getting file system", e);
+        throw e;
+      }
{code}
Beats me; that's what we did before. All I really did was wrap the existing code inside the if. I have removed the try/catch.
> Regards the below:
{code}
+  protected void deleteLocal(File f) {
+    if (f.exists()) {
+      if (f.isDirectory()) {
+        File[] children = f.listFiles();
+        if (children != null && children.length > 0) {
+          for (int i = 0; i < children.length; i++) {
+            deleteLocal(children[i]);
+          }
+        }
+      }
+      f.delete();
+    }
+  }
{code}
> It's called deleteLocalFile - does it only work against the local filesystem?
Yes. It is used to clean up the cruft that MapReduce leaves in the local
file system when it runs a test (because it does not clean up after itself).
> Should this method be in file utils instead?
> Is there a method that does this up in hadoop?
Yes, duh! Removed.
> What if test is run on DFS?
It is only used to clean up the cruft that MapReduce leaves in the local file system.
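For the record, Hadoop ships an equivalent helper (org.apache.hadoop.fs.FileUtil.fullyDelete), which is presumably the method used after removing deleteLocal. A minimal stdlib-only sketch of the same recursive delete (the class and method names here are illustrative, not HBase code) looks like:

```java
import java.io.File;

public class FullyDeleteSketch {
    /** Recursively deletes a local file or directory; returns true on success. */
    static boolean fullyDelete(File f) {
        if (f.isDirectory()) {
            File[] children = f.listFiles();
            if (children != null) {
                for (File child : children) {
                    if (!fullyDelete(child)) {
                        return false;
                    }
                }
            }
        }
        // Delete the (now empty) directory, or the plain file itself.
        return f.delete() || !f.exists();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"), "fullyDeleteSketch");
        new File(dir, "a/b").mkdirs();
        System.out.println(fullyDelete(dir) && !dir.exists());
    }
}
```

As noted above, this only makes sense against the local filesystem; DFS paths go through the FileSystem API instead.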
> Should the System.out.* logging be changed in
> src/test/org/apache/hadoop/hbase/TestHBaseCluster.java?
Yes also in TestHRegion.
> What's happening with TestMemcache? Is there a renaming going on?
> It's different from TestHMemcache?
It was a new test I wrote while debugging. I folded it into TestHMemcache.
> Why do the below:
{code}
+    } catch (IOException e) {
+      e.printStackTrace();
+      throw e;
{code}
> Doesn't a stacktrace come out anyways?
Yes, removed.
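A quick demonstration of why the catch/print/rethrow pattern is redundant: the exception object carries its stack trace from the point where it was created, so rethrowing (or never catching) loses nothing, while printStackTrace() just emits the same trace a second time. Names here are illustrative only:

```java
public class RethrowSketch {
    static void failingOperation() throws Exception {
        // The stack trace is captured here, at construction time.
        throw new Exception("boom");
    }

    public static void main(String[] args) {
        try {
            failingOperation(); // no intermediate catch/print/rethrow needed
        } catch (Exception e) {
            // The original throw site is still the top frame of the trace.
            System.out.println(e.getStackTrace()[0].getMethodName());
        }
    }
}
```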
> Log instead? + System.err.println("setup completed.");
> Change all the System.out* to use log?
Yes. Done.
> Just let the below out?
{code}
+    try {
+      r.flushcache();
+    } catch (IOException e) {
+      e.printStackTrace();
+      throw e;
+    }
+    try {
+      r.compactStores();
+    } catch (IOException e) {
+      e.printStackTrace();
+      throw e;
+    }
{code}
> And here:
{code}
+    try {
+      r.flushcache();
+      r.compactStores();
+    } catch (IOException e) {
+      e.printStackTrace();
+      throw e;
+    }
{code}
Done.
> Should this run in tearDown?
{code}
     } finally {
       mrCluster.shutdown();
+      if (jobConf != null) {
+        deleteLocal(new File(jobConf.get("hadoop.tmp.dir")));
+      }
{code}
Yes. Done.
> Why is this necessary?
{code}
+  void close() {
+    this.lock.writeLock().lock();
+    try {
+      this.memcache.clear();
+      this.snapshot.clear();
+    } finally {
+      this.lock.writeLock().unlock();
+    }
+  }
{code}
> It's just busy work and it will mask real memory retention issues - e.g.
> if the host of this MemCache is being held on to, then this clearing will
> make that condition harder to detect (if the host is not being
> referenced, the memcache and snapshot have no path to root and GC will
> clean them up). I see that we used to do this elsewhere... and you're just
> tidying stuff.
I was trying to give the garbage collector some help, but you're right, it's not necessary.
> Can this be final? + protected Memcache memcache = new Memcache();
Yes. Done.
> Why remove the final in below?
{code}
* final FileSystem fs;
* private final HBaseConfiguration conf;
+ FileSystem fs;
+ private HBaseConfiguration conf;
{code}
> Because not set on construction?
No; close() at one point nulled out these references to try to help the garbage collector. They are final now.
> This is a nice addition -> + private volatile long storeSize;
Yes. HStore for the most part knows how big it is, so why recompute the size every time we want to know whether the store should be split?
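The idea in sketch form (class and method names are illustrative, not the actual HStore code): maintain the size incrementally as store files are flushed or compacted, so the split check becomes a cheap field read instead of re-listing and summing files:

```java
public class StoreSizeSketch {
    // Written under the store's write lock; volatile so readers see fresh values.
    private volatile long storeSize;

    /** A memcache flush added a new store file of the given length. */
    void fileFlushed(long lengthInBytes) {
        storeSize += lengthInBytes;
    }

    /** A compaction replaced several files with one new file. */
    void filesCompacted(long removedBytes, long newFileBytes) {
        storeSize = storeSize - removedBytes + newFileBytes;
    }

    /** O(1) check instead of recomputing the store's total size. */
    boolean wantsToBeSplit(long desiredMaxFileSize) {
        return storeSize > desiredMaxFileSize;
    }
}
```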
> I think one of my patches yesterday might make it so this part doesn't apply:
{code}
@@ -208,29 +220,18 @@
if(LOG.isDebugEnabled()) {
LOG.debug("starting " + storeName +
((reconstructionLog == null || !fs.exists(reconstructionLog)) ?
- " (no reconstruction log)" :
- " with reconstruction log: " + reconstructionLog.toString()));
+ " (no reconstruction log)" :
+ " with reconstruction log: " + reconstructionLog.toString()));
{code}
Yes, I resolved the conflict.
> What you think our policy on data member access should be? Sometimes we make
> things default access. Other times we add accessors:
{code}
+
+  HColumnDescriptor getFamily() {
+    return this.family;
+  }
{code}
I don't have a clear opinion on this. If access is package scoped, usually I don't go through an accessor. But if you'd like to formalize this, that's cool with me too.
> Should we log debug something here just in case things are not working as
> we'd hope? Just suggesting...
{code}
+    } catch (IOException e) {
+      // If the HSTORE_LOGINFOFILE doesn't contain a number, just ignore it.
+      // That means it was built prior to the previous run of HStore, and so
+      // it cannot contain any updates also contained in the log.
+    }
{code}
Done.
> You want to remove this?
{code}
+// this.basedir = null;
+// this.info = null;
+// this.family = null;
+// this.fs = null;
{code}
Done.
> Would suggest this log not necessary
{code}
+    if (LOG.isDebugEnabled()) {
+      LOG.debug("Not compacting " + this.storeName +
+        " because no store files to compact.");
+    }
{code}
I think this is useful at debug level. If we ever get our INFO/DEBUG messages right, then this should only show up if we are trying to trace a problem.
> Lets log events and changes in state - not the fact that they did not
> happen (there are a few other logs after this one). Would suggest that as
> part of logging compactions, you add rationale.
Like I said above, if we make INFO level logging provide us with enough information, then this, as a debug level log, should disappear.
> Should the below check be done inside the checkSplit method for the sake of
> better coherence?
{code}
+    if (storeSize > this.desiredMaxFileSize) {
+      return checkSplit();
+    }
{code}
Yes. Done.
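With that change, the size guard lives inside checkSplit() itself, so every caller gets it for free. A sketch of the shape (names and the returned key are stand-ins, not the actual HStore code):

```java
public class CheckSplitSketch {
    private final long desiredMaxFileSize;
    private volatile long storeSize;

    CheckSplitSketch(long desiredMaxFileSize) {
        this.desiredMaxFileSize = desiredMaxFileSize;
    }

    void setStoreSize(long size) {
        storeSize = size;
    }

    /** Returns a split key when the store is big enough, else null. */
    String checkSplit() {
        // The size check now lives here, for better coherence.
        if (storeSize <= desiredMaxFileSize) {
            return null;
        }
        // Stand-in: the real code would pick the middle key of the largest file.
        return "midkey";
    }
}
```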
> This is intentional (Moving where reportOpen happens?)?
{code}
-      reportOpen(regionInfo);
     }
+    reportOpen(regionInfo);
{code}
Yes. Sometimes the master sends multiple open region requests to the region server; why, I don't know. If it does, and the region server already has the region open, it can keep the master happy by telling it that the region has been opened.
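In other words, the move makes the open idempotent from the master's point of view. A toy sketch of the control flow (all names here are illustrative, not the region server's actual code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OpenRegionSketch {
    final Map<String, Object> onlineRegions = new HashMap<>();
    final List<String> reportsToMaster = new ArrayList<>();

    void openRegion(String regionName) {
        if (!onlineRegions.containsKey(regionName)) {
            onlineRegions.put(regionName, new Object()); // stand-in for the HRegion
        }
        // reportOpen moved outside the "already open" guard: a duplicate open
        // request still gets acknowledged, so the master stops re-sending it.
        reportsToMaster.add(regionName);
    }
}
```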
> I don't think the HRegion cleanup you've added helps (See above note on
> how I think it could make memleaks harder to detect).
Agreed. Removed.
> Streamline HStore startup and compactions
> -----------------------------------------
>
> Key: HBASE-469
> URL: https://issues.apache.org/jira/browse/HBASE-469
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 0.2.0
> Reporter: Jim Kellerman
> Assignee: Jim Kellerman
> Fix For: 0.2.0
>
> Attachments: patch.txt, patch.txt, patch.txt
>
>
> Several useful patches that streamline HStore startup and compactions that were a part of the abandoned changes in HBASE-69 should be incorporated.