Posted to issues@nifi.apache.org by "Bilal (Jira)" <ji...@apache.org> on 2021/11/16 11:07:00 UTC
[jira] [Created] (NIFI-9380) PutParquet - Compression Type: SNAPPY (Not Working)
Bilal created NIFI-9380:
---------------------------
Summary: PutParquet - Compression Type: SNAPPY (Not Working)
Key: NIFI-9380
URL: https://issues.apache.org/jira/browse/NIFI-9380
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 1.15.0, 1.14.0
Environment: CentOS 7.4, RedHat 7.9
Reporter: Bilal
I have tested the different compression types supported by the _PutParquet_ and _ConvertAvroToParquet_ processors on several NiFi versions.
Summary information:
* The compression types (UNCOMPRESSED, GZIP, {*}SNAPPY{*}) of the _PutParquet_ processor work correctly on NiFi 1.12.1 and 1.13.2.
* The compression types UNCOMPRESSED and GZIP of the _PutParquet_ processor work correctly on NiFi 1.14.0 and 1.15.0; *SNAPPY* gives an error.
* The compression types (UNCOMPRESSED, GZIP, {*}SNAPPY{*}) of the _ConvertAvroToParquet_ processor work correctly on NiFi 1.12.1, 1.13.2, 1.14.0 and 1.15.0.
_PutParquet_ – Properties:
* Hadoop Configuration Resources: File locations
* Kerberos Credentials Service: Keytab service
* Record Reader: AvroReader Service (Embedded Avro Schema)
* Overwrite Files: True
* Compression Type: SNAPPY
* Other Properties: Default
To keep the tests lean, the default configuration was generally used:
* nifi-env.sh file has the default configuration.
* bootstrap.conf file has the default configuration.
* nifi.properties file has the default configuration except security configuration.
* _PutParquet_ Processor has the default configuration. (But SNAPPY compression does not work.)
* _ConvertAvroToParquet_ Processor has the default configuration. (SNAPPY compression works correctly.)
* There is no custom processor in our NiFi environment.
* There is no custom lib location in NiFi properties.
Error Log (nifi-app.log):
{noformat}
ERROR [Timer-Driven Process Thread-12] o.a.nifi.processors.parquet.PutParquet PutParquet[id=6caab337-68e8-3834-b64a-1d2cbd93aba8] Failed to write due to java.lang.IncompatibleClassChangeError: Class org.xerial.snappy.SnappyNative does not implement the requested interface org.xerial.snappy.SnappyApi: java.lang.IncompatibleClassChangeError: Class org.xerial.snappy.SnappyNative does not implement the requested interface org.xerial.snappy.SnappyApi
java.lang.IncompatibleClassChangeError: Class org.xerial.snappy.SnappyNative does not implement the requested interface org.xerial.snappy.SnappyApi
at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:380)
at org.apache.parquet.hadoop.codec.SnappyCompressor.compress(SnappyCompressor.java:67)
at org.apache.hadoop.io.compress.CompressorStream.compress(CompressorStream.java:81)
at org.apache.hadoop.io.compress.CompressorStream.finish(CompressorStream.java:92)
at org.apache.parquet.hadoop.CodecFactory$HeapBytesCompressor.compress(CodecFactory.java:167)
at org.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writePage(ColumnChunkPageWriteStore.java:168)
at org.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:59)
at org.apache.parquet.column.impl.ColumnWriterBase.writePage(ColumnWriterBase.java:387)
at org.apache.parquet.column.impl.ColumnWriteStoreBase.flush(ColumnWriteStoreBase.java:186)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.flush(ColumnWriteStoreV1.java:29)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:185)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.close(InternalParquetRecordWriter.java:124)
at org.apache.parquet.hadoop.ParquetWriter.close(ParquetWriter.java:319)
at org.apache.nifi.parquet.hadoop.AvroParquetHDFSRecordWriter.close(AvroParquetHDFSRecordWriter.java:49)
at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:534)
at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:466)
at org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.lambda$null$0(AbstractPutHDFSRecord.java:326)
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2466)
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2434)
at org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.lambda$onTrigger$1(AbstractPutHDFSRecord.java:303)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1822)
at org.apache.nifi.processors.hadoop.AbstractPutHDFSRecord.onTrigger(AbstractPutHDFSRecord.java:271)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1202)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:103)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
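An {{IncompatibleClassChangeError}} between {{SnappyNative}} and {{SnappyApi}} typically means two incompatible versions of the snappy-java library end up on the same classpath. As a hedged diagnostic (not a confirmed root cause), one can scan the NiFi install for all bundled snappy-java jars to see whether multiple versions are present; the install path below is an assumption and should be adjusted for your environment.

```shell
# Diagnostic sketch: list every snappy-java jar under a NiFi install tree.
# NIFI_HOME=/opt/nifi is an assumed path, not taken from this report.
# Multiple differing versions in the output suggest a classpath conflict.
list_snappy_jars() {
  local root="$1"
  # Suppress permission errors; sort for stable, de-duplicated output.
  find "$root" -name 'snappy-java-*.jar' 2>/dev/null | sort -u
}

list_snappy_jars "${NIFI_HOME:-/opt/nifi}"
```

If the scan shows more than one snappy-java version (for example one under {{lib/}} and another inside an unpacked NAR's working directory), that mismatch would be consistent with the stack trace above.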
--
This message was sent by Atlassian Jira
(v8.20.1#820001)