Posted to dev@avro.apache.org by "Yu-Wu Chu (Jira)" <ji...@apache.org> on 2022/05/24 23:06:00 UTC
[jira] [Created] (AVRO-3524) Memory leak when not reusing avro schema instance
Yu-Wu Chu created AVRO-3524:
-------------------------------
Summary: Memory leak when not reusing avro schema instance
Key: AVRO-3524
URL: https://issues.apache.org/jira/browse/AVRO-3524
Project: Apache Avro
Issue Type: Bug
Components: java
Affects Versions: 1.10.2, 1.9.2
Environment: * openJdk 8
* tested in Avro 1.9.2 and 1.10.2
Reporter: Yu-Wu Chu
When deserializing Avro records, if we do not reuse a shared schema instance, memory usage keeps growing as the number of deserializations grows.
Code with shared schema:
{code:java}
public void myTest() throws Exception {
    Schema schema = new Schema.Parser().parse(schemaString);
    final AvroEntity avroEntity = buildAvroEntity();
    final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    final BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(outputStream, null);
    final DatumWriter<AvroEntity> writer = new SpecificDatumWriter<>(schema);
    writer.write(avroEntity, encoder);
    encoder.flush();
    final byte[] data = outputStream.toByteArray();
    DatumReader<AvroEntity> reader = new SpecificDatumReader<>(schema);
    int count = 0;
    while (count < 100000) {
        final Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);
        //final Schema mySchema = new Schema.Parser().parse(schemaString);
        reader.setSchema(schema);
        reader.read(null, decoder);
        count++;
        if (count % 1000 == 0) {
            System.gc();
            System.out.println("test" + count);
        }
    }
    System.out.println("test" + count);
}{code}
Code without shared schema:
{code:java}
public void myTest() throws Exception {
    Schema schema = new Schema.Parser().parse(schemaString);
    final AvroEntity avroEntity = buildAvroEntity();
    final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    final BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(outputStream, null);
    final DatumWriter<AvroEntity> writer = new SpecificDatumWriter<>(schema);
    writer.write(avroEntity, encoder);
    encoder.flush();
    final byte[] data = outputStream.toByteArray();
    DatumReader<AvroEntity> reader = new SpecificDatumReader<>(schema);
    int count = 0;
    while (count < 100000) {
        final Decoder decoder = DecoderFactory.get().binaryDecoder(data, null);
        // Parse a fresh Schema instance on every iteration instead of reusing the shared one.
        final Schema mySchema = new Schema.Parser().parse(schemaString);
        reader.setSchema(mySchema);
        reader.read(null, decoder);
        count++;
        if (count % 1000 == 0) {
            System.gc();
            System.out.println("test" + count);
        }
    }
    System.out.println("test" + count);
}{code}
The number of ConcurrentHashMap$Node instances is 5,000 with the shared schema vs. 1,500,000 without a shared schema.
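As a possible caller-side workaround (not a fix for the growth itself), code that receives the same schema string repeatedly could cache the parsed instance and reuse it. The sketch below is only an illustration: ParsedSchemaCache is a hypothetical helper, not part of Avro, and it is generic over the parsed type so that it compiles with the JDK alone. It assumes the set of distinct schema strings is bounded, since the cache never evicts entries.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Avro): keeps one parsed instance per
 * distinct schema string so callers reuse it instead of re-parsing.
 */
final class ParsedSchemaCache<S> {
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> parser;

    ParsedSchemaCache(Function<String, S> parser) {
        this.parser = parser;
    }

    /** Returns the cached instance for this schema string, parsing it only on first use. */
    S get(String schemaJson) {
        return cache.computeIfAbsent(schemaJson, parser);
    }

    /** Number of distinct schema strings parsed so far. */
    int size() {
        return cache.size();
    }
}
```

With Avro on the classpath, the cache could be created as new ParsedSchemaCache<Schema>(json -> new Schema.Parser().parse(json)), and the loop in the second test would call reader.setSchema(cache.get(schemaString)) so that every iteration sees the same Schema instance.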
--
This message was sent by Atlassian Jira
(v8.20.7#820007)