You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by "Mathias Herberts (JIRA)" <ji...@apache.org> on 2011/01/17 22:12:04 UTC

[jira] Created: (THRIFT-1039) Refactor methods accessing binary data

Refactor methods accessing binary data
--------------------------------------

                 Key: THRIFT-1039
                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
             Project: Thrift
          Issue Type: Improvement
          Components: Java - Compiler
         Environment: All
            Reporter: Mathias Herberts
            Priority: Minor


Since THRIFT-830, binary fields are implemented using ByteBuffer.

struct A {
  1: binary bin_field,
}

will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.

This has several implications.

First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.

Thus the following will fail:

byte[] b = new byte[2];
b[0] = 0x01;
A a = new A();
a.setBin_field(ByteBuffer.wrap(b, 0, 1));
byte[] bb = a.getBin_field();
b[0] = 0x02;
Assert.assertEquals(0x02, bb[0]);

Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.

I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.

Such a method could be generated for each structure, with the following signature:

public byte[] copyBytes(ByteBuffer b);

This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Updated] (THRIFT-1039) Refactor methods accessing binary data

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-1039:
----------------------------------

    Fix Version/s:     (was: 0.7)
                   0.8

> Refactor methods accessing binary data
> --------------------------------------
>
>                 Key: THRIFT-1039
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Java - Compiler
>         Environment: All
>            Reporter: Mathias Herberts
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: THRIFT-1039.patch
>
>
> Since THRIFT-830, binary fields are implemented using ByteBuffer.
> struct A {
>   1: binary bin_field,
> }
> will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.
> This has several implications.
> First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.
> Thus the following will fail:
> byte[] b = new byte[2];
> b[0] = 0x01;
> A a = new A();
> a.setBin_field(ByteBuffer.wrap(b, 0, 1));
> byte[] bb = a.getBin_field();
> b[0] = 0x02;
> Assert.assertEquals(0x02, bb[0]);
> Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.
> I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.
> Such a method could be generated for each structure, with the following signature:
> public byte[] copyBytes(ByteBuffer b);
> This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (THRIFT-1039) Refactor methods accessing binary data

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987279#action_12987279 ] 

Mathias Herberts commented on THRIFT-1039:
------------------------------------------

How far down the road do you foresee 0.7.0?

> Refactor methods accessing binary data
> --------------------------------------
>
>                 Key: THRIFT-1039
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Java - Compiler
>         Environment: All
>            Reporter: Mathias Herberts
>            Priority: Minor
>             Fix For: 0.7
>
>         Attachments: THRIFT-1039.patch
>
>
> Since THRIFT-830, binary fields are implemented using ByteBuffer.
> struct A {
>   1: binary bin_field,
> }
> will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.
> This has several implications.
> First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.
> Thus the following will fail:
> byte[] b = new byte[2];
> b[0] = 0x01;
> A a = new A();
> a.setBin_field(ByteBuffer.wrap(b, 0, 1));
> byte[] bb = a.getBin_field();
> b[0] = 0x02;
> Assert.assertEquals(0x02, bb[0]);
> Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.
> I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.
> Such a method could be generated for each structure, with the following signature:
> public byte[] copyBytes(ByteBuffer b);
> This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-1039) Refactor methods accessing binary data

Posted by "Mathias Herberts (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mathias Herberts updated THRIFT-1039:
-------------------------------------

    Attachment: THRIFT-1039.patch

Attached patch that replaces the 'byte[] getXxx()' method with 'ByteBuffer getXxx()' and adds a 'byte[] copyBytes(ByteBuffer buf)' method to the generated Java classes.

The change in getXxx will break existing code using structs with binary fields, but the introduction of copyBytes will clarify the semantics of retrieving byte arrays from ByteBuffers.

I voluntarly added copyBytes to the generated code instead of adding it to TBaseHelper as it made the uses more clear (IMHO):

struct A {
  1: binary bin_field,
}

A a = new A();
a.setBin_field(new byte[2]);
byte[] b = a.copyBytes(a.getBin_field());

> Refactor methods accessing binary data
> --------------------------------------
>
>                 Key: THRIFT-1039
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Java - Compiler
>         Environment: All
>            Reporter: Mathias Herberts
>            Priority: Minor
>         Attachments: THRIFT-1039.patch
>
>
> Since THRIFT-830, binary fields are implemented using ByteBuffer.
> struct A {
>   1: binary bin_field,
> }
> will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.
> This has several implications.
> First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.
> Thus the following will fail:
> byte[] b = new byte[2];
> b[0] = 0x01;
> A a = new A();
> a.setBin_field(ByteBuffer.wrap(b, 0, 1));
> byte[] bb = a.getBin_field();
> b[0] = 0x02;
> Assert.assertEquals(0x02, bb[0]);
> Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.
> I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.
> Such a method could be generated for each structure, with the following signature:
> public byte[] copyBytes(ByteBuffer b);
> This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-1039) Refactor methods accessing binary data

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12987281#action_12987281 ] 

Bryan Duxbury commented on THRIFT-1039:
---------------------------------------

Realistically, at least a month. Mostly it'll be determined by how quickly we accumulate enough changes to merit a release.

> Refactor methods accessing binary data
> --------------------------------------
>
>                 Key: THRIFT-1039
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Java - Compiler
>         Environment: All
>            Reporter: Mathias Herberts
>            Priority: Minor
>             Fix For: 0.7
>
>         Attachments: THRIFT-1039.patch
>
>
> Since THRIFT-830, binary fields are implemented using ByteBuffer.
> struct A {
>   1: binary bin_field,
> }
> will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.
> This has several implications.
> First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.
> Thus the following will fail:
> byte[] b = new byte[2];
> b[0] = 0x01;
> A a = new A();
> a.setBin_field(ByteBuffer.wrap(b, 0, 1));
> byte[] bb = a.getBin_field();
> b[0] = 0x02;
> Assert.assertEquals(0x02, bb[0]);
> Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.
> I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.
> Such a method could be generated for each structure, with the following signature:
> public byte[] copyBytes(ByteBuffer b);
> This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (THRIFT-1039) Refactor methods accessing binary data

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/THRIFT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983315#action_12983315 ] 

Bryan Duxbury commented on THRIFT-1039:
---------------------------------------

Would we remove the old getBin_field accessor, or maybe replace it with a method that returns the ByteBuffer? This would be the forward-moving approach.

> Refactor methods accessing binary data
> --------------------------------------
>
>                 Key: THRIFT-1039
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Java - Compiler
>         Environment: All
>            Reporter: Mathias Herberts
>            Priority: Minor
>
> Since THRIFT-830, binary fields are implemented using ByteBuffer.
> struct A {
>   1: binary bin_field,
> }
> will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.
> This has several implications.
> First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.
> Thus the following will fail:
> byte[] b = new byte[2];
> b[0] = 0x01;
> A a = new A();
> a.setBin_field(ByteBuffer.wrap(b, 0, 1));
> byte[] bb = a.getBin_field();
> b[0] = 0x02;
> Assert.assertEquals(0x02, bb[0]);
> Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.
> I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.
> Such a method could be generated for each structure, with the following signature:
> public byte[] copyBytes(ByteBuffer b);
> This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (THRIFT-1039) Refactor methods accessing binary data

Posted by "Bryan Duxbury (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/THRIFT-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Duxbury updated THRIFT-1039:
----------------------------------

    Fix Version/s: 0.7

I think we need to carefully decide if we want to break backwards compatibility in this fashion. We should put this off until 0.7, I think.

> Refactor methods accessing binary data
> --------------------------------------
>
>                 Key: THRIFT-1039
>                 URL: https://issues.apache.org/jira/browse/THRIFT-1039
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Java - Compiler
>         Environment: All
>            Reporter: Mathias Herberts
>            Priority: Minor
>             Fix For: 0.7
>
>         Attachments: THRIFT-1039.patch
>
>
> Since THRIFT-830, binary fields are implemented using ByteBuffer.
> struct A {
>   1: binary bin_field,
> }
> will generate a method 'byte[] getBin_field()' which will optionally resize the underlying ByteBuffer so it has a capacity of 'remaining()' and return the backing array of this new ByteBuffer.
> This has several implications.
> First the ByteBuffer in 'bin_field' might be modified, thus misleading users into thinking that the call to 'getBin_field()' had no effect on the underlying structure.
> Thus the following will fail:
> byte[] b = new byte[2];
> b[0] = 0x01;
> A a = new A();
> a.setBin_field(ByteBuffer.wrap(b, 0, 1));
> byte[] bb = a.getBin_field();
> b[0] = 0x02;
> Assert.assertEquals(0x02, bb[0]);
> Second it creates a singularity in the getters as all other getters involving containers of binary data will return collections of ByteBuffer.
> I suggest we refactor to something more in the line of what HADOOP-6298 has done, i.e. provide a copyBytes() method which copies the bytes in a ByteBuffer and returns a correctly sized array.
> Such a method could be generated for each structure, with the following signature:
> public byte[] copyBytes(ByteBuffer b);
> This method would basically do what the current 'getBin_field' type methods do (allocate a byte array and fill it with the ByteBuffer's data), except it would not modify the structure's internal ByteBuffers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.