Posted to user@hbase.apache.org by Tom Nichols <tm...@gmail.com> on 2008/11/25 14:53:23 UTC

DSL for Groovy

Hi --

I've created a builder-style HBase client DSL for Groovy -- currently
it just wraps the client API to make inserts and row scans a little
easier.  There is probably plenty of room for improvement, so I wanted
to submit it to the community.  Any feedback or suggestions are welcome.

Here's an example:

def hbase = HBase.connect() // may optionally pass host name
/* Create: this will create the table if it does not exist, or disable it
   and update its column families if it already exists.  The table will
   be enabled when the create statement returns. */
hbase.create( 'myTable' ) {
  family( 'familyOne' ) {
    inMemory = true
    bloomFilter = false
  }
}
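
For readers curious how a block like `family( 'familyOne' ) { inMemory = true }` can work, here is a minimal sketch of the usual Groovy closure-delegation idiom. It is not the DSL's actual code: `FamilySketch`, `TableSketch` and `describeTable` are hypothetical names, and nothing here talks to HBase. It only records the settings in plain objects.

```groovy
// Hypothetical stand-ins; the real DSL classes are not shown in this thread.
class FamilySketch {
    String name
    boolean inMemory = false
    boolean bloomFilter = false
}

class TableSketch {
    String name
    List<FamilySketch> families = []

    // Runs the closure with a FamilySketch as its delegate, so that
    // assignments such as `inMemory = true` set properties on that object.
    void family(String familyName, Closure body) {
        def f = new FamilySketch(name: familyName)
        body.delegate = f
        body.resolveStrategy = Closure.DELEGATE_FIRST
        body()
        families << f
    }
}

// Hypothetical entry point that only builds the description in memory.
def describeTable(String tableName, Closure body) {
    def t = new TableSketch(name: tableName)
    body.delegate = t
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body()
    t
}

def table = describeTable('myTable') {
    family('familyOne') {
        inMemory = true
        bloomFilter = false
    }
}
```

`DELEGATE_FIRST` matters for the assignments: with the default owner-first strategy, `inMemory = true` inside a script would be stored as a script variable instead of reaching the delegate.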

/* Insert/update rows: */
hbase.update( 'myTable' ) {
  row( 'rowOne' ) {
    family( 'familyOne' ) {
      col 'one', 'someValue'
      col 'two', 'anotherValue'
      col 'three', 1234
    }
    // alternate form that doesn't use nested family name:
    col 'familyOne:four', 12345
  }
  row( 'rowTwo' ) { /* more column values */ }
  // etc
}
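
The two ways of naming a column (a bare qualifier inside `family { }`, or a full 'family:qualifier' string directly under `row { }`) can be illustrated with a small in-memory sketch. This is an assumption about how such a builder might be written, not the DSL's implementation; `RowSketch`, `UpdateSketch` and `buildUpdate` are hypothetical names, and the result is a plain nested Map rather than anything written to HBase.

```groovy
// Hypothetical in-memory stand-in for the row builder.
class RowSketch {
    String key
    Map<String, Object> cells = [:]   // 'family:qualifier' -> value
    private String currentFamily      // set only while inside family { ... }

    void family(String name, Closure body) {
        currentFamily = name
        try {
            body.delegate = this
            body.resolveStrategy = Closure.DELEGATE_FIRST
            body()
        } finally {
            currentFamily = null
        }
    }

    void col(String name, Object value) {
        // Inside family { } the name is a bare qualifier;
        // outside it is expected to be a full 'family:qualifier' string.
        String full = currentFamily ? "${currentFamily}:${name}" : name
        cells[full] = value
    }
}

class UpdateSketch {
    Map<String, Map<String, Object>> rows = [:]

    void row(String key, Closure body) {
        def r = new RowSketch(key: key)
        body.delegate = r
        body.resolveStrategy = Closure.DELEGATE_FIRST
        body()
        rows[key] = r.cells
    }
}

// Hypothetical entry point: collects the rows instead of sending them anywhere.
def buildUpdate(Closure body) {
    def u = new UpdateSketch()
    body.delegate = u
    body.resolveStrategy = Closure.DELEGATE_FIRST
    body()
    u.rows
}

def result = buildUpdate {
    row('rowOne') {
        family('familyOne') {
            col 'one', 'someValue'
            col 'three', 1234
        }
        col 'familyOne:four', 12345
    }
}
```

Both forms end up under the same kind of key, so the rest of such a builder would only need to handle full 'family:qualifier' names.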

A more realistic example, where you iterate through some data and
insert it, would look like this:

hbase.update( 'myTable' ) {
  new URL( someCSV ).eachLine { line ->
    def values = line.split(',')
    row( values[0] ) {
      col 'fam1:val1', values[1]
      // etc...
    }
  }
}

There is also a wrapper for the scanner API:

hbase.scan( cols : ['fam:col1', 'fam:col2'],
            start : '001', end : '200',
            // any timestamp args may be long, Date or Calendar
            timestamp : Date.parse( 'yy/MM/dd HH:mm:ss', '08/11/25 05:00:00' )
            ) { row ->
  // each row is a RowResult instance -- which is a Map!  So all map
  // operations are valid here:
  row.each { println "${it.key} : ${it.value}" }
}
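
Since the timestamp argument may be a long, a Date or a Calendar, a wrapper like this presumably has to normalise all three to one representation. Below is a minimal sketch of that step, assuming epoch milliseconds as the target; `toMillis` is a hypothetical helper, not part of the DSL. The example also parses the date with `SimpleDateFormat`, where `MM` means month and `mm` means minutes.

```groovy
import java.text.SimpleDateFormat

// Hypothetical helper: normalise a timestamp argument to epoch milliseconds.
long toMillis(Object ts) {
    switch (ts) {
        case Number:   return ((Number) ts).longValue()
        case Date:     return ((Date) ts).time
        case Calendar: return ((Calendar) ts).timeInMillis
        default:
            throw new IllegalArgumentException(
                "Unsupported timestamp type: ${ts?.getClass()?.name}")
    }
}

// Fix the time zone so the parsed value does not depend on the local machine.
def utc = TimeZone.getTimeZone('UTC')
def fmt = new SimpleDateFormat('yy/MM/dd HH:mm:ss')
fmt.timeZone = utc

Date d = fmt.parse('08/11/25 05:00:00')
Calendar cal = Calendar.getInstance(utc)
cal.time = d

// All three forms describe the same instant.
long fromLong = toMillis(d.time)
long fromDate = toMillis(d)
long fromCalendar = toMillis(cal)
```

Matching on the argument's class keeps the caller's side of the DSL flexible while the scanner call underneath only ever sees a single numeric type.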

Re: DSL for Groovy

Posted by stack <st...@duboce.net>.
Tom Nichols wrote:
> Yup, it does roughly follow a builder pattern.  I'd be happy to attach
> the code to a wiki page with some example documentation.
>   

That should do for now.  If you find that your Groovy DSL starts to 
become your full time job -- smile -- then it would probably make sense 
to host it somewhere.
> It seems silly to create a Google code project for what is unlikely to
> grow beyond a small handful of classes.  But at the same time I don't
> expect you'd want it in the core project as it adds a dependency on
> the Groovy API -- there are ways to embed Groovy but I don't expect
> you'd want to go that far, right?
>
>   
Probably not.  We're currently a little light-headed from the stench of 
ruby underpinning our shell; adding another dynamic language might 
confuse (or set off an uncontrollable chain reaction).  We could add it 
under src/contrib but then you'd be dependent on a bunch of lazy fellas 
for keeping your stuff up to date.  You wouldn't want that.

St.Ack
P.S. Have you tried hooking your builder up with Cascading's?



Re: DSL for Groovy

Posted by Tom Nichols <tm...@gmail.com>.
Yup, it does roughly follow a builder pattern.  I'd be happy to attach
the code to a wiki page with some example documentation.

I'm sure I'll improve on it as I find more use cases.  But
it's in no way comprehensive, so I'd be happy to receive improvements
that other people come up with.

It seems silly to create a Google code project for what is unlikely to
grow beyond a small handful of classes.  But at the same time I don't
expect you'd want it in the core project as it adds a dependency on
the Groovy API -- there are ways to embed Groovy but I don't expect
you'd want to go that far, right?

-Tom


Re: DSL for Groovy

Posted by stack <st...@duboce.net>.
This looks great, Tom (code as well as the DSL).  I'm not a groovy-head 
but it looks like the builder pattern to me, and the example that parses 
and uploads a URL fetch is nicely succinct.

How do you want to proceed?  Do you think it's worth sticking it up in a 
Google Code project somewhere?  Do you think it will evolve at all?  Or, 
if you want, add the class to an issue -- perhaps call it something else 
-- and then make a wiki page linking to the issue with your explanation 
and examples in it (see the main hbase page -- the pattern seems to be a 
link off it to a page per language; e.g. jython, jruby, etc.).

Good stuff,
St.Ack

