Posted to issues@hive.apache.org by "Marta Kuczora (Jira)" <ji...@apache.org> on 2020/01/10 09:45:00 UTC

[jira] [Updated] (HIVE-22716) Reading to ByteBuffer is broken in ParquetFooterInputFromCache

     [ https://issues.apache.org/jira/browse/HIVE-22716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marta Kuczora updated HIVE-22716:
---------------------------------
    Description: 
ParquetFooterInputFromCache.read(ByteBuffer bb) calls readInternal with the local variable 'result' passed as the 'len' argument. At that point 'result' is always -1, so readInternal never reads anything.
{noformat}
  public int read(ByteBuffer bb) throws IOException {
    // Simple implementation for now - currently Parquet uses heap buffers.
    int result = -1;
    if (bb.hasArray()) {
      result = readInternal(bb.array(), bb.arrayOffset(), result);  // readInternal is called with len = -1
      if (result > 0) {
        bb.position(bb.position() + result);
      }
    } else {
      byte[] b = new byte[bb.remaining()];
      result = readInternal(b, 0, result);     // readInternal is called with len = -1
      bb.put(b, 0, result);
    }
    return result;
  }
{noformat}
{noformat}
  public int readInternal(byte[] b, int offset, int len) {
    if (position >= length) return -1;
    int argPos = offset, argEnd = offset + len;      // With len = -1, argEnd is offset - 1
    while (argPos < argEnd) {             // Never true, since argEnd < argPos
      if (bufferIx == cacheData.length) return (argPos - offset);
      ByteBuffer data = cacheData[bufferIx].getByteBufferDup();
      int toConsume = Math.min(argEnd - argPos, data.remaining() - bufferPos);
      data.position(data.position() + bufferPos);
      data.get(b, argPos, toConsume);
      if (data.remaining() == 0) {
        ++bufferIx;
        bufferPos = 0;
      } else {
        bufferPos += toConsume;
      }
      argPos += toConsume;
    }
    return len;
  }
{noformat}
The read(ByteBuffer bb) method wasn't called before, but Parquet 1.11.0 introduced some optimizations (PARQUET-1542|https://issues.apache.org/jira/browse/PARQUET-1542) that now call it. As a result, the TestMiniLlapCliDriver and TestMiniLlapLocalCliDriver q tests fail with the new Parquet version.
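
One possible direction for a fix, shown only as a self-contained sketch and not as the committed patch: pass bb.remaining() as 'len' instead of 'result', and have readInternal return the number of bytes actually copied. The class name FooterInputSketch, the use of plain ByteBuffer objects in place of the LLAP cache buffers, and the 'position' bookkeeping are assumptions made so the example runs on its own.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for ParquetFooterInputFromCache; not the Hive class. */
final class FooterInputSketch {
  // Assumption: plain ByteBuffers replace the LLAP cache buffers.
  private final ByteBuffer[] cacheData;
  private final int length;
  private int position = 0, bufferIx = 0, bufferPos = 0;

  FooterInputSketch(ByteBuffer... cacheData) {
    this.cacheData = cacheData;
    int total = 0;
    for (ByteBuffer buf : cacheData) total += buf.remaining();
    this.length = total;
  }

  int read(ByteBuffer bb) {
    int len = bb.remaining();  // ask for as many bytes as the target can hold
    if (len == 0) return 0;
    int result;
    if (bb.hasArray()) {
      // Start writing at the buffer's current position inside its backing array.
      result = readInternal(bb.array(), bb.arrayOffset() + bb.position(), len);
      if (result > 0) bb.position(bb.position() + result);
    } else {
      byte[] b = new byte[len];
      result = readInternal(b, 0, len);
      if (result > 0) bb.put(b, 0, result);  // skip put() on end of data (-1)
    }
    return result;
  }

  int readInternal(byte[] b, int offset, int len) {
    if (position >= length) return -1;
    int argPos = offset, argEnd = offset + len;
    while (argPos < argEnd) {
      if (bufferIx == cacheData.length) break;  // cached data exhausted
      ByteBuffer data = cacheData[bufferIx].duplicate();
      data.position(data.position() + bufferPos);
      int toConsume = Math.min(argEnd - argPos, data.remaining());
      data.get(b, argPos, toConsume);
      if (data.remaining() == 0) {
        ++bufferIx;
        bufferPos = 0;
      } else {
        bufferPos += toConsume;
      }
      argPos += toConsume;
    }
    int read = argPos - offset;
    position += read;  // assumption: position counts bytes consumed so far
    return read;       // bytes actually copied, which may be less than len
  }
}
```

With this shape, a heap buffer has its position advanced by the number of bytes read, a direct buffer is filled through a temporary array, and a read after the cached data is exhausted returns -1 without touching the target buffer.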


> Reading to ByteBuffer is broken in ParquetFooterInputFromCache
> --------------------------------------------------------------
>
>                 Key: HIVE-22716
>                 URL: https://issues.apache.org/jira/browse/HIVE-22716
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>            Priority: Major
>             Fix For: 4.0.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)