You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Arun C Murthy (JIRA)" <ji...@apache.org> on 2009/09/27 06:37:16 UTC

[jira] Created: (HADOOP-6287) Support units for configuration knobs

Support units for configuration knobs
-------------------------------------

                 Key: HADOOP-6287
                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
             Project: Hadoop Common
          Issue Type: Improvement
          Components: conf
            Reporter: Arun C Murthy


We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760199#action_12760199 ] 

Hadoop QA commented on HADOOP-6287:
-----------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12420690/hadoop-6287-1.patch
  against trunk revision 818543.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/66/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/66/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/66/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-h4.grid.sp2.yahoo.net/66/console

This message is automatically generated.

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760145#action_12760145 ] 

eric baldeschwieler commented on HADOOP-6287:
---------------------------------------------

We could add range checking and target units too.

Target units would let us deal with variables in the system like mapred.job.memory.mb.
We could make the whole thing a lot more self documenting at the same time...

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760281#action_12760281 ] 

Nigel Daley commented on HADOOP-6287:
-------------------------------------

Can GB, MB, etc be lower case? 
Can there be a space between the number and the units?
Should *-default.xml be updated so that comments say something like "If no units are specified, the default unit is MB."?

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760375#action_12760375 ] 

Chris Douglas commented on HADOOP-6287:
---------------------------------------

The return type is not entirely clear. If one specifies 500k and the caller asks for the spec in megabytes, what does it return? While {{double}} return values would work, most contexts prefer {{long}}. We should probably just truncate residue (following java.util.concurrent.TimeUnit, which should also have get/set in this issue).

bq. For time we could have Configuration#getMilliseconds()

I'd rather follow Java and use Configuration::getTime(String, TimeUnit, long). I wouldn't object to a convenience getMilliseconds, though.

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6287) Support units for configuration knobs

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HADOOP-6287:
--------------------------------

    Status: Patch Available  (was: Open)

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760547#action_12760547 ] 

Steve Loughran commented on HADOOP-6287:
----------------------------------------

This is getting very complex, and I am starting to thing -1 is the correct reaction to this problem. There is a risk that this ends up a bit like the XSD type system.

I'm also worried about the impact this has on my plans to add support for >1 configuration back end, including LDAP.

FWIW, in the SmartFrog syntax we have basic math symbols, stops the caller having to care about stuff. But it does mean the type system needs to differentiate between string and numeric values, 

{code}
  dfs.balance.bandwidthPerSec  (1024 * 1024);
  dfs.block.size  (65536 * 1024);
  dfs.blockreport.intervalMsec  (60 * 60 * 1000);
  mapred.tasktracker.expiry.interval (10 * 60 * 1000);
  mapred.jobtracker.retirejob.check (60 * 1000);
{code}
Of course, then you need functions, evaluation order rules, etc, etc, which are bonus complexity, especially if you start worrying about where that work takes place, in the getInt/getLong() operation, or everywhere




> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760258#action_12760258 ] 

Arun C Murthy commented on HADOOP-6287:
---------------------------------------

Eli, thanks for the patch, but I concur with Steve... I don't think we want to support units for Configuration.get{Int|Long}.

I was planning on the lines of adding a new method:

{code}
Configuration.getStorage(String key, Unit unit);
{code}

where

{code}
enum Unit {
  B,
  KB,
  MB,
  GB,
  TB,
  PB,
  ...
}
{code}

Thus, if we are setting ulimit we can do:
{code}Configuration.getStorage("mapreduce.map.child.ulimit", KB);{code}

File i/o buffer sizes:
{code}Configuration.getStorage("io.file.buffer.size", KB);{code}

Memory limits for child jvms:
{code}Configuration.getStorage("mapreduce.job.map.memory", MB);{code}

Thoughts?
                                                          

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6287) Support units for configuration knobs

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HADOOP-6287:
--------------------------------

    Attachment: hadoop-6287-1.patch

Here's a patch which supports units for integer/long parameters. 

+1 for checking ranges and target units, though this patch doesn't.

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6287) Support units for configuration knobs

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-6287:
----------------------------------

    Status: Open  (was: Patch Available)

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760208#action_12760208 ] 

Steve Loughran commented on HADOOP-6287:
----------------------------------------

I think units are good, but they might need to be restricted to those conf values that are asking for disk/memory capacity, rather than any integer. 

# if you are worrying about network rates, Gb for gigabit is the unit to consider, anyone passing in PB is really in error
#  in {{dfs.blockreport.intervalMsec}}, and other time ranges, the value should be mS, S, minutes, hours, etc.

While I am not proposing anyone rushes to do a complete type-safe framework, maybe a {{conf.getStorageValue()}} might be the right method to return a storage value; other units of measurement could be patched in for different types later

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760211#action_12760211 ] 

Steve Loughran commented on HADOOP-6287:
----------------------------------------

Also, what happens when you try to getInt() a value that is out of range of an integer? I believe historically Integer.parseInt() would throw a NumberFormatException which would lead to the default. Trying to cast a long to an int may be a different behaviour.

# Tests need to establish what the current behaviour is
# New code needs to maintain that behaviour -if everyone agrees that the old behaviour was the correct one.

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760499#action_12760499 ] 

eric baldeschwieler commented on HADOOP-6287:
---------------------------------------------

So what happens if I ask for the same config property, once as a time and once as storage?
It would seem clearer to spec what the unit is, valid ranges, etc in one place, where the property is defined...

Can the current code be bent to support that style?  We could then put config documentation in the same place and auto-generate all kinds of useful information.

It seems like we should allow long text (IE gigabytes) as well as GB.  We need to be careful of bits vs bytes.  gB and gb are not the same thing.  Taking the shorthand seems error prone in networking contexts where bits are the default.  Given that we know this is byes, taking just the scale [pgmk] seems fine, but we would need to know that.  

We should probably throw an exception, not default or truncate when a value is out of range or uses a surprising unit (bits for bytes for example).

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760628#action_12760628 ] 

Philip Zeyliger commented on HADOOP-6287:
-----------------------------------------

I'm +1 eric's suggestion: I think config documentation and code should be co-located, and should generate things.  

I'd like to do something like this:

{code}
@ConfigVariableDeclaration
  public final static ConfigVariable<URI> fsDefaultName = 
    ConfigVariableBuilder.newURI()
      .setDefault(null)
      .setHelp("Default filesystem")
      .setVisibility(Visibility.PUBLIC)
      .addAccessor("fs.default.name")
      .addDeprecatedAccessor("core.default.fs", "Use foo instead")
      .addValidator(new ValidateSupportedFilesystem());
{code}
(See more at https://issues.apache.org/jira/browse/HADOOP-6105#action_12727143)

Yes, this imposes a type system: fsDefaultName now has to be a URI.  There could be other units/validators for time, disk, etc.  You then should access the variable by using "fsDefaultName.get(Configuration)", which means that it's easy to find all usages of the configuration.  I'd very much like Hadoop to find configuration errors early, and for those errors to be sensible.  (It would help both with NPEs deep inside of things that aren't obviously looking at configuration, but also at the wide array of ways we store comma-delimited values in Configuration: we should only be writing that logic correctly once, and, you know, we should support filenames with commas in them.)

Steve, I actually think having a schema on configuration would make LDAP support easier, not harder.  For reasons of backwards compatibility, we're going to support toString/fromString for a long time, and if you know what you're storing, the mapping functions to/from LDAP are going to be considerably easier to write.

-- Philip

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760660#action_12760660 ] 

eric baldeschwieler commented on HADOOP-6287:
---------------------------------------------

I'm not a deep java guy, but I like the basic sound of that.  
Could you expand on the URI thing?
The annotations?

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-6287) Support units for configuration knobs

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HADOOP-6287:
--------------------------------

    Assignee: Philip Zeyliger  (was: Eli Collins)

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Philip Zeyliger
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Eli Collins (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760361#action_12760361 ] 

Eli Collins commented on HADOOP-6287:
-------------------------------------

@Steve - currently the code always catches NumberFormatExceptions and returns the default value. The above patch truncates it. The right thing to do is support ranges.

@Arun - is the "KB" in getStorage the way to interpret the value if no unit is specified or a gurantee that the return value will be in KB (regardless of what units the user specified) or both?

@Nigel - yes "gb" etc can be lower case. Currently no spaces between number and unit are accepted, that's reasonable though. 

bq. For time we could have Configuration#getMilliseconds(). If no units are spec'd or "ms" is spec'd then it would return the same value as getLong(), but if it ended in in "s", "m", or "h" or it had colons then it could be treated differently. Example values might be "10ms", "2.5h", "2:30:00", "1d".

@Doug - The above sounds good. I also think a typed approach similar to what Philip pointed out in HADOOP-6288 makes sense, not sure how many legs we want the current code to grow. One of the things Ada did right was encourage the use of variables typed with units so not everything was stored in an integer. 

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760335#action_12760335 ] 

Doug Cutting commented on HADOOP-6287:
--------------------------------------

> Configuration.getStorage("io.file.buffer.size", KB);

What would be in the config?  "128" or "128KB"?  Would the specified units just be the default, or would it always multiple the value by 1024?

If a parameter is always in kilobytes, then its name should perhaps state that, e.g., "io.file.buffer.kb" or somesuch.  Otherwise it seems wise to include multipliers in the text of the value, e.g., "128k".

So we could leave off the "B" and add this to getInt() and getLong().  The documentation for a parameter can then indicate whether its bits or bytes or milliseconds.  The value is just a number, and we permit multipliers for numbers, k=1024, m=k*k, etc.

Alternately, we might require the "B" and use a 'long getBytes(String)' method, e.g. "1024", "1kB", "10GB".

For bandwidth we could then use a separate method, "float getBytesPerSecond(String)" that accepts things like "10MB/s", "100Mb/s", and "1Gbps".

For time we could have Configuration#getMilliseconds().  If no units are spec'd or "ms" is spec'd then it would return the same value as getLong(), but if it ended in in "s", "m", or "h" or it had colons then it could be treated differently.  Example values might be "10ms", "2.5h", "2:30:00", "1d".

Thoughts?

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760665#action_12760665 ] 

Philip Zeyliger commented on HADOOP-6287:
-----------------------------------------

The URI is the type that you'll get if you get the value of the configuration variable with "fsDefaultName.get(Configuration)".  Different configuration variables will have different types there.  Today, Configuration.get() returns String, and there are helper methods to get integers, etc.  The basic types we'd want to support right off are Boolean, List<URI>, URI, String, List<String>, Integer, and probably a few others.  In the context of this ticket, we might want to support a Size type, which has a Size.getValue(SizeUnit.BYTES) accessor.

The annotation is used to process the code to generate the documentation.  There's more than one way to do this, but it's pretty easy to ask a variant of javac to spit out all the annotated variables.

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-6287) Support units for configuration knobs

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760833#action_12760833 ] 

eric baldeschwieler commented on HADOOP-6287:
---------------------------------------------

Thanks!  Makes sense.  I was not following that URI was an example of a possible configuration type.
I like the approach.

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-6287) Support units for configuration knobs

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-6287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy reassigned HADOOP-6287:
-------------------------------------

    Assignee: Eli Collins

> Support units for configuration knobs
> -------------------------------------
>
>                 Key: HADOOP-6287
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6287
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>            Reporter: Arun C Murthy
>            Assignee: Eli Collins
>         Attachments: hadoop-6287-1.patch
>
>
> We should add support for units in our Configuration system so that we can specify values to be *1GB* or *1MB* rather than forcing every component which consumes such values to be aware of these.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.