Posted to issues@nifi.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/11/05 16:32:00 UTC

[jira] [Commented] (NIFI-5757) AvroRecordSetWriter synchronize every access to compiledAvroSchemaCache

    [ https://issues.apache.org/jira/browse/NIFI-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675407#comment-16675407 ] 

ASF GitHub Bot commented on NIFI-5757:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/3111
  
    @arkadius thanks for compiling that list. Sorry it took so long to reply! Looking through the list, I do think you're right - these all appear to be the same pattern. I certainly didn't realize that we were making such prolific use of this pattern. Reading through the Caffeine docs, it probably does make sense to update these as well.
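The bounded-cache approach mentioned above could look roughly like the following sketch, assuming the Caffeine library (`com.github.ben-manes.caffeine:caffeine`) is on the classpath. The class name `CaffeineSchemaCache` and the string "compilation" are illustrative stand-ins, not NiFi's actual code:

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

class CaffeineSchemaCache {
    // maximumSize preserves a bounded cache (the old hard limit was 20),
    // while lookups and loads avoid a global lock on every access.
    private final LoadingCache<String, String> cache = Caffeine.newBuilder()
            .maximumSize(20)
            // stand-in loader; the real code would parse the Avro schema text
            .build(text -> "compiled:" + text);

    String compileAvroSchema(String schemaText) {
        return cache.get(schemaText); // loads once per key, thread-safely
    }
}
```

Caffeine guarantees the loader runs at most once per key under concurrent access, which is the same contract the synchronized code provided, minus the contention.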


> AvroRecordSetWriter synchronize every access to compiledAvroSchemaCache
> -----------------------------------------------------------------------
>
>                 Key: NIFI-5757
>                 URL: https://issues.apache.org/jira/browse/NIFI-5757
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 1.7.1
>            Reporter: Arek Burdach
>            Priority: Major
>
> Avro record serialization is a quite expensive operation.
> This stack trace I very often see in thread dumps:
> {noformat}
> Thread 48583: (state = BLOCKED)
>  - org.apache.nifi.avro.AvroRecordSetWriter.compileAvroSchema(java.lang.String) @bci=9, line=124 (Compiled frame)
>  - org.apache.nifi.avro.AvroRecordSetWriter.createWriter(org.apache.nifi.logging.ComponentLog, org.apache.nifi.serialization.record.RecordSchema, java.io.OutputStream) @bci=96, line=92 (Compiled frame)
>  - sun.reflect.GeneratedMethodAccessor183.invoke(java.lang.Object, java.lang.Object[]) @bci=56 (Compiled frame)
>  - sun.reflect.DelegatingMethodAccessorImpl.invoke(java.lang.Object, java.lang.Object[]) @bci=6, line=43 (Compiled frame)
>  - java.lang.reflect.Method.invoke(java.lang.Object, java.lang.Object[]) @bci=56, line=498 (Compiled frame)
>  - org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(java.lang.Object, java.lang.reflect.Method, java.lang.Object[]) @bci=309, line=89 (Compiled frame)
>  - com.sun.proxy.$Proxy100.createWriter(org.apache.nifi.logging.ComponentLog, org.apache.nifi.serialization.record.RecordSchema, java.io.OutputStream) @bci=24 (Compiled frame)
>  - org.apache.nifi.processors.kafka.pubsub.PublisherLease.publish(org.apache.nifi.flowfile.FlowFile, org.apache.nifi.serialization.record.RecordSet, org.apache.nifi.serialization.RecordSetWriterFactory, org.apache.nifi.serialization.record.RecordSchema, java.lang.String, java.lang.String) @bci=71, line=169 (Compiled frame)
>  - org.apache.nifi.processors.kafka.pubsub.PublishKafkaRecord_1_0$1.process(java.io.InputStream) @bci=94, line=412 (Compiled frame)
> {noformat}
> This happens because {{AvroRecordSetWriter}} synchronizes every access to its cache of compiled schemas.
> I've prepared a PR that fixes this issue by using a {{ConcurrentHashMap}} instead: https://github.com/apache/nifi/pull/3111
> It is not a perfect fix because it removes the cache size limitation, which by the way was hardcoded to {{20}}. Services can be reused by many flows, so such a hard limit is not a good choice.
> What do you think about such an improvement?
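The contention described in the report and the proposed fix can be sketched in plain Java. The class and field names here are illustrative stand-ins for NiFi's internals, and the "compiled schema" is mocked as a string:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the writer's schema cache; the real code
// caches parsed org.apache.avro.Schema instances keyed by schema text.
class CompiledSchemaCache {
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger compilations = new AtomicInteger();

    // computeIfAbsent compiles at most once per key and only blocks other
    // threads touching the same key, unlike a synchronized method, where
    // every caller queues on one lock (the BLOCKED state in the thread dump).
    String compileAvroSchema(String schemaText) {
        return cache.computeIfAbsent(schemaText, text -> {
            compilations.incrementAndGet();
            return "compiled:" + text; // stand-in for actual schema parsing
        });
    }
}
```

Note the trade-off raised above: a bare {{ConcurrentHashMap}} grows without bound, which is why the follow-up discussion turns to a bounded concurrent cache such as Caffeine.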



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)