Posted to dev@thrift.apache.org by "David Mollitor (Jira)" <ji...@apache.org> on 2020/10/02 19:14:00 UTC
[jira] [Updated] (THRIFT-5288) Better Support for ByteBuffer in Compact Protocol
[ https://issues.apache.org/jira/browse/THRIFT-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Mollitor updated THRIFT-5288:
-----------------------------------
Description:
{code:java|title=TCompactProtocol.java}
/**
 * Write a byte array, using a varint for the size.
 */
public void writeBinary(ByteBuffer bin) throws TException {
  int length = bin.limit() - bin.position();
  writeBinary(bin.array(), bin.position() + bin.arrayOffset(), length);
}
{code}
I was working on something with Parquet and this code was causing some issues:
{code}
java.lang.Exception: java.nio.ReadOnlyBufferException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.nio.ReadOnlyBufferException
    at java.nio.ByteBuffer.array(ByteBuffer.java:996)
    at shaded.parquet.org.apache.thrift.protocol.TCompactProtocol.writeBinary(TCompactProtocol.java:375)
    at org.apache.parquet.format.InterningProtocol.writeBinary(InterningProtocol.java:135)
    at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:945)
    at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:820)
    at org.apache.parquet.format.ColumnIndex.write(ColumnIndex.java:728)
    at org.apache.parquet.format.Util.write(Util.java:372)
    at org.apache.parquet.format.Util.writeColumnIndex(Util.java:69)
    at org.apache.parquet.hadoop.ParquetFileWriter.serializeColumnIndexes(ParquetFileWriter.java:1087)
    at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1050)
{code}
This happens because not all {{Buffer}}s allow direct access to their backing array; for example, a {{ByteBuffer}} mapped to a file has no array at all. Read-only (immutable) {{ByteBuffer}}s likewise disallow access to the backing array, since its contents could otherwise be modified.
There are two approaches here:
# Assert that the backing array is accessible, and throw an exception if it is not
# Deal directly with the {{ByteBuffer}}
I propose the latter. The initial, naive approach is to "deal directly" with the {{ByteBuffer}} by making a copy of its contents.
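The copy-based fallback could be sketched as follows (hypothetical helper name {{extractBytes}}; the real change would live in {{TCompactProtocol.writeBinary}}): keep the fast array path when {{ByteBuffer#hasArray()}} returns {{true}}, and otherwise copy through a {{duplicate()}} so the caller's position and limit are left untouched.

```java
import java.nio.ByteBuffer;

public class WriteBinarySketch {
    /**
     * Sketch of the proposed fallback (hypothetical helper, not the
     * actual Thrift patch): extract the readable bytes of a buffer
     * without assuming an accessible backing array.
     */
    static byte[] extractBytes(ByteBuffer bin) {
        if (bin.hasArray()) {
            // Fast path: the backing array is accessible, so slice it
            // directly. Note remaining() == limit() - position().
            byte[] out = new byte[bin.remaining()];
            System.arraycopy(bin.array(), bin.arrayOffset() + bin.position(),
                             out, 0, out.length);
            return out;
        }
        // Slow path: read-only or direct buffers. Copy through a
        // duplicate so the caller's position/limit are not disturbed.
        ByteBuffer copy = bin.duplicate();
        byte[] out = new byte[copy.remaining()];
        copy.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer rw = ByteBuffer.wrap("hello".getBytes());
        // asReadOnlyBuffer() yields a buffer with hasArray() == false,
        // which is exactly the case that triggers ReadOnlyBufferException
        // in the current writeBinary implementation.
        ByteBuffer ro = rw.asReadOnlyBuffer();
        System.out.println(new String(extractBytes(rw)));
        System.out.println(new String(extractBytes(ro)));
    }
}
```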
was:
{code:java|title=TCompactProtocol.java}
/**
 * Write a byte array, using a varint for the size.
 */
public void writeBinary(ByteBuffer bin) throws TException {
  int length = bin.limit() - bin.position();
  writeBinary(bin.array(), bin.position() + bin.arrayOffset(), length);
}
{code}
I was working on something with Parquet and this code was causing some issues:
{code}
java.lang.Exception: java.nio.ReadOnlyBufferException
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.nio.ReadOnlyBufferException
    at java.nio.ByteBuffer.array(ByteBuffer.java:996)
    at shaded.parquet.org.apache.thrift.protocol.TCompactProtocol.writeBinary(TCompactProtocol.java:375)
    at org.apache.parquet.format.InterningProtocol.writeBinary(InterningProtocol.java:135)
    at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:945)
    at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:820)
    at org.apache.parquet.format.ColumnIndex.write(ColumnIndex.java:728)
    at org.apache.parquet.format.Util.write(Util.java:372)
    at org.apache.parquet.format.Util.writeColumnIndex(Util.java:69)
    at org.apache.parquet.hadoop.ParquetFileWriter.serializeColumnIndexes(ParquetFileWriter.java:1087)
    at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1050)
{code}
This happens because not all {{Buffer}}s allow direct access to their backing array; for example, a {{ByteBuffer}} mapped to a file has no array at all. Read-only (immutable) {{ByteBuffer}}s likewise disallow access to the backing array, since its contents could otherwise be modified.
There are two approaches here:
# Assert that the backing array is accessible, and throw an exception if it is not
# Deal directly with the {{ByteBuffer}}
I propose the latter. The initial, naive approach is to "deal directly" with the {{ByteBuffer}} by making a copy of its contents.
> Better Support for ByteBuffer in Compact Protocol
> -------------------------------------------------
>
> Key: THRIFT-5288
> URL: https://issues.apache.org/jira/browse/THRIFT-5288
> Project: Thrift
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {code:java|title=TCompactProtocol.java}
> /**
>  * Write a byte array, using a varint for the size.
>  */
> public void writeBinary(ByteBuffer bin) throws TException {
>   int length = bin.limit() - bin.position();
>   writeBinary(bin.array(), bin.position() + bin.arrayOffset(), length);
> }
> {code}
> I was working on something with Parquet and this code was causing some issues:
> {code}
> java.lang.Exception: java.nio.ReadOnlyBufferException
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: java.nio.ReadOnlyBufferException
>     at java.nio.ByteBuffer.array(ByteBuffer.java:996)
>     at shaded.parquet.org.apache.thrift.protocol.TCompactProtocol.writeBinary(TCompactProtocol.java:375)
>     at org.apache.parquet.format.InterningProtocol.writeBinary(InterningProtocol.java:135)
>     at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:945)
>     at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:820)
>     at org.apache.parquet.format.ColumnIndex.write(ColumnIndex.java:728)
>     at org.apache.parquet.format.Util.write(Util.java:372)
>     at org.apache.parquet.format.Util.writeColumnIndex(Util.java:69)
>     at org.apache.parquet.hadoop.ParquetFileWriter.serializeColumnIndexes(ParquetFileWriter.java:1087)
>     at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1050)
> {code}
> This happens because not all {{Buffer}}s allow direct access to their backing array; for example, a {{ByteBuffer}} mapped to a file has no array at all. Read-only (immutable) {{ByteBuffer}}s likewise disallow access to the backing array, since its contents could otherwise be modified.
> There are two approaches here:
> # Assert that the backing array is accessible, and throw an exception if it is not
> # Deal directly with the {{ByteBuffer}}
> I propose the latter. The initial, naive approach is to "deal directly" with the {{ByteBuffer}} by making a copy of its contents.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)