Posted to user@hbase.apache.org by Roger Maillist <da...@gmail.com> on 2014/10/04 23:15:31 UTC

Anyone good with JRuby?

Hi out there

I am trying to read a (binary) file from the local FS and store it in HBase
using JRuby.

But I fail with the byte-array InputStream needed for the Put-Method:

require "java"

java_import "java.io.File"
java_import "java.io.FileInputStream"

java_import "org.apache.hadoop.hbase.client.HTable"
java_import "org.apache.hadoop.hbase.client.Put"

def jbytes(*args)
  args.map { |arg| arg.to_s.to_java_bytes }
end

puts "Hello from Ruby"

inFile = File.new("/home/roger/Downloads/test.jpg")
inputStream = FileInputStream.new(inFile)

length = inFile.length()
buffer = Java::byte[length].new

inputStream.read(buffer)

table = HTable.new(@hbase.configuration, "emails")
p = Put.new(*jbytes("roger3.pdf"))

p.add(*jbytes("inhalt", "", buffer))

table.put(p)

inputStream.close()
table.close()



Has anyone done this right?

I tried and googled....no breakthrough :-/

Thanks
Roger

Re: Anyone good with JRuby?

Posted by Sean Busbey <bu...@cloudera.com>.
Presuming the file can fit in memory, you can just use the Ruby IO/File
methods to read the entire thing. Note that the version of JRuby used with
the HBase shell is fixed at Ruby 1.8:

buffer = IO.read("/home/roger/Downloads/test.jpg")

ref:

http://ruby-doc.org/core-1.8.7/IO.html#method-c-read
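
A minimal sketch of that approach, assuming the script runs inside the
HBase shell's JRuby. read_binary and to_hbase_bytes are hypothetical helper
names, not part of HBase or JRuby; to_java_bytes is the JRuby String method
already used earlier in this thread. Opening the file in binary mode is an
added precaution, and the sketch assumes File still refers to Ruby's own
File class (that is, java.io.File has not been imported under that name).

```ruby
# Read the whole file into a Ruby String of raw bytes.
# "rb" (binary mode) prevents newline or encoding translation.
# Assumes File is Ruby's File, not an imported java.io.File.
def read_binary(path)
  File.open(path, "rb") { |f| f.read }
end

# Under JRuby, String#to_java_bytes yields a Java byte[] for Put.
# Under plain Ruby that method does not exist, so the String is
# returned unchanged (useful only for trying the helper outside JRuby).
def to_hbase_bytes(str)
  str.respond_to?(:to_java_bytes) ? str.to_java_bytes : str
end
```

The value handed to the Put would then be to_hbase_bytes(read_binary(path)),
instead of sending a Java array through the jbytes helper from the original
post, which calls to_s on every argument.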

On Sat, Oct 4, 2014 at 4:33 PM, Roger Maillist <da...@gmail.com>
wrote:

> Well, I see they are using .to_java_bytes to cast a string and pass it to
> the put-method, that's ok. But I am having trouble calling
> the inputStream.read method.
>
> I tried this:
>
> inFile = File.new("/home/roger/Downloads/test.jpg")
> inputStream = FileInputStream.new(inFile)
>
> length = inFile.length()
> buffer = ""
>
> inputStream.read(buffer)
>
> But that won't work. It's probably more of a JRuby question than actually
> an HBase issue...



-- 
Sean

Re: Anyone good with JRuby?

Posted by Roger Maillist <da...@gmail.com>.
Thank you guys for this great discussion. I got my first JRuby script
running in the HBase shell. It scans a directory and loads each file of a
given type into a table :-)
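
The script itself was not posted, so what follows is only a guess at its
shape. each_file_of_type is a hypothetical helper name; the table name
"emails", the column family "inhalt", and the HTable/Put calls in the
comment are taken from the original post, not verified against any
particular HBase version.

```ruby
# Yield the base name and raw bytes of every regular file in +dir+
# whose name ends in ".<ext>", in sorted order.
# Assumes File is Ruby's File, not an imported java.io.File.
def each_file_of_type(dir, ext)
  Dir.glob(File.join(dir, "*.#{ext}")).sort.each do |path|
    next unless File.file?(path)
    bytes = File.open(path, "rb") { |f| f.read }
    yield File.basename(path), bytes
  end
end

# Inside the HBase shell (JRuby), each file could then become one row,
# reusing the calls from the original post:
#
#   table = HTable.new(@hbase.configuration, "emails")
#   each_file_of_type("/home/roger/Downloads", "jpg") do |name, bytes|
#     put = Put.new(name.to_java_bytes)
#     put.add("inhalt".to_java_bytes, "".to_java_bytes, bytes.to_java_bytes)
#     table.put(put)
#   end
#   table.close
```

Using the file name as the row key mirrors the original post; whether that
is a good key design depends on the data.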

2014-10-05 1:43 GMT+02:00 Ted Yu <yu...@gmail.com>:

> Take a look at Bytes.readByteArray():
>
>   public static byte [] readByteArray(final DataInput in)
>
> ...
>
>     byte [] result = new byte[len];
>
>     in.readFully(result, 0, len);
>
> In your case, you have 'buffer' so you don't need to allocate 'result'.
>
> Just plug buffer in the call to readFully().
>
> Cheers

Re: Anyone good with JRuby?

Posted by Ted Yu <yu...@gmail.com>.
Take a look at Bytes.readByteArray():

  public static byte [] readByteArray(final DataInput in)

...

    byte [] result = new byte[len];

    in.readFully(result, 0, len);

In your case, you have 'buffer' so you don't need to allocate 'result'.

Just plug buffer in the call to readFully().
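
The point of readFully() is that it keeps reading until the buffer is
full, whereas read(byte[]) may return after fewer bytes. The loop below
sketches that contract in plain Ruby so it can be tried without a JVM;
read_fully is a hypothetical helper, not an HBase or JRuby API. In the
JRuby script, the Java equivalent would be to wrap the FileInputStream in a
java.io.DataInputStream and call readFully(buffer) on it.

```ruby
# Read exactly +length+ bytes from +io+, looping because a single
# read call may hand back fewer bytes than were requested.
# Raises EOFError if the stream ends before +length+ bytes arrive.
def read_fully(io, length)
  data = String.new
  while data.size < length
    chunk = io.read(length - data.size)
    if chunk.nil? || chunk.empty?
      raise EOFError, "stream ended after #{data.size} of #{length} bytes"
    end
    data << chunk
  end
  data
end
```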

Cheers

On Sat, Oct 4, 2014 at 4:37 PM, Roger Maillist <da...@gmail.com>
wrote:

> I still don't see how they would read into a byte[] buffer. This method
> seems to read an integer value, which is simple. The read(byte[]) method
> returns the number of bytes read and copies the stream into the
> out-parameter. That's unclear to me for JRuby...

Re: Anyone good with JRuby?

Posted by Roger Maillist <da...@gmail.com>.
I still don't see how they would read into a byte[] buffer. This method
seems to read an integer value, which is simple. The read(byte[]) method
returns the number of bytes read and copies the stream into the
out-parameter. That's unclear to me for JRuby...

2014-10-05 0:29 GMT+02:00 Ted Yu <yu...@gmail.com>:

> Take a look at readFile() method in bin/region_mover.rb
>
> Cheers

Re: Anyone good with JRuby?

Posted by Ted Yu <yu...@gmail.com>.
Take a look at readFile() method in bin/region_mover.rb

Cheers

On Sat, Oct 4, 2014 at 2:33 PM, Roger Maillist <da...@gmail.com>
wrote:

> Well, I see they are using .to_java_bytes to cast a string and pass it to
> the put-method, that's ok. But I am having trouble calling
> the inputStream.read method.
>
> I tried this:
>
> inFile = File.new("/home/roger/Downloads/test.jpg")
> inputStream = FileInputStream.new(inFile)
>
> length = inFile.length()
> buffer = ""
>
> inputStream.read(buffer)
>
> But that won't work. It's probably more of a JRuby question than actually
> an HBase issue...

Re: Anyone good with JRuby?

Posted by Roger Maillist <da...@gmail.com>.
Well, I see they are using .to_java_bytes to cast a string and pass it to
the put-method, that's ok. But I am having trouble calling
the inputStream.read method.

I tried this:

inFile = File.new("/home/roger/Downloads/test.jpg")
inputStream = FileInputStream.new(inFile)

length = inFile.length()
buffer = ""

inputStream.read(buffer)

But that won't work. It's probably more of a JRuby question than actually
an HBase issue...

2014-10-04 23:22 GMT+02:00 Ted Yu <yu...@gmail.com>:

> Take a look at _put_internal() method of
> hbase-shell/src/main/ruby/hbase/table.rb

Re: Anyone good with JRuby?

Posted by Ted Yu <yu...@gmail.com>.
Take a look at _put_internal() method of
hbase-shell/src/main/ruby/hbase/table.rb

On Sat, Oct 4, 2014 at 2:15 PM, Roger Maillist <da...@gmail.com>
wrote:

> Hi out there
>
> I am trying to read a (binary) file from the local FS and store it in HBase
> using JRuby.
>
> But I fail with the byte-array InputStream needed for the Put-Method:
>
> require "java"
>
> java_import "java.io.File"
> java_import "java.io.FileInputStream"
>
> java_import "org.apache.hadoop.hbase.client.HTable"
> java_import "org.apache.hadoop.hbase.client.Put"
>
> def jbytes(*args)
>   args.map { |arg| arg.to_s.to_java_bytes }
> end
>
> puts "Hello from Ruby"
>
> inFile = File.new("/home/roger/Downloads/test.jpg")
> inputStream = FileInputStream.new(inFile)
>
> length = inFile.length()
> buffer = Java::byte[length].new
>
> inputStream.read(buffer)
>
> table = HTable.new(@hbase.configuration, "emails")
> p = Put.new(*jbytes("roger3.pdf"))
>
> p.add(*jbytes("inhalt", "", buffer))
>
> table.put(p)
>
> inputStream.close()
> table.close()
>
>
>
> Has anyone done this right?
>
> I tried and googled....no breakthrough :-/
>
> Thanks
> Roger
>