Posted to dev@avro.apache.org by "Lucas Heimberg (Jira)" <ji...@apache.org> on 2020/12/18 16:47:00 UTC
[jira] [Created] (AVRO-3005) Deserialization of string with > 256 characters fails
Lucas Heimberg created AVRO-3005:
------------------------------------
Summary: Deserialization of string with > 256 characters fails
Key: AVRO-3005
URL: https://issues.apache.org/jira/browse/AVRO-3005
Project: Apache Avro
Issue Type: Bug
Components: csharp
Affects Versions: 1.10.1
Reporter: Lucas Heimberg
Avro.IO.BinaryDecoder.ReadString() fails for strings with length > 256, i.e. when the StackallocThreshold is exceeded.
This can be seen when serializing and subsequently deserializing a GenericRecord of schema
{code:java}
{
  "type": "record",
  "name": "Foo",
  "fields": [
    { "name": "x", "type": "string" }
  ]
}{code}
with a field x containing a string of length > 256, as done in the test case:
{code:java}
public void Test()
{
    var schema = (RecordSchema) Schema.Parse("{ \"type\":\"record\", \"name\":\"Foo\",\"fields\":[{\"name\":\"x\",\"type\":\"string\"}]}");
    var datum = new GenericRecord(schema);
    datum.Add("x", new String('x', 257));
    byte[] serialized;
    using (var ms = new MemoryStream())
    {
        var enc = new BinaryEncoder(ms);
        var writer = new GenericDatumWriter<GenericRecord>(schema);
        writer.Write(datum, enc);
        serialized = ms.ToArray();
    }
    using (var ms = new MemoryStream(serialized))
    {
        var dec = new BinaryDecoder(ms);
        var deserialized = new GenericRecord(schema);
        var reader = new GenericDatumReader<GenericRecord>(schema, schema);
        reader.Read(deserialized, dec);
        Assert.Equal(datum, deserialized);
    }
}{code}
which yields the following exception
{code:java}
Avro.AvroException
End of stream reached
at Avro.IO.BinaryDecoder.Read(Span`1 buffer)
at Avro.IO.BinaryDecoder.ReadString()
at Avro.Generic.PreresolvingDatumReader`1.<>c.<ResolveReader>b__21_1(Decoder d)
at Avro.Generic.PreresolvingDatumReader`1.<>c__DisplayClass37_0.<Read>b__0(Object r, Decoder d)
at Avro.Generic.PreresolvingDatumReader`1.<>c__DisplayClass23_1.<ResolveRecord>b__2(Object rec, Decoder d)
at Avro.Generic.PreresolvingDatumReader`1.ReadRecord(Object reuse, Decoder decoder, RecordAccess recordAccess, IEnumerable`1 readSteps)
at Avro.Generic.PreresolvingDatumReader`1.<>c__DisplayClass23_0.<ResolveRecord>b__0(Object r, Decoder d)
at Avro.Generic.PreresolvingDatumReader`1.Read(T reuse, Decoder decoder)
at AvroTests.AvroTests.Test(Int32 n) in C:\Users\l.heimberg\Source\Repos\AvroTests\AvroTests\AvroTests.cs:line 41
{code}
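It seems that Avro.IO.BinaryDecoder.Read(Span<byte> buffer) reads past the end of the input stream when it is passed a span over the array returned by ArrayPool<byte>.Shared.Rent(length) (where length is the length of the string). Rent may return an array longer than the requested length, so filling the whole array would request more bytes than the string occupies.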
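To illustrate the suspected cause, here is a minimal sketch of reading exactly length bytes into a rented buffer. It is not Avro's actual code: the class RentedBufferSketch and the method ReadUtf8 are hypothetical names, and it assumes a runtime where Stream.Read(Span<byte>) is available (.NET Core 2.1 or later). The key step is slicing the rented array down to the requested length before reading.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Text;

public static class RentedBufferSketch
{
    // Hypothetical helper (not part of Avro): reads exactly `length` bytes
    // from `stream` and decodes them as UTF-8.
    public static string ReadUtf8(Stream stream, int length)
    {
        // Rent guarantees an array of AT LEAST `length` bytes; it may be longer.
        byte[] rented = ArrayPool<byte>.Shared.Rent(length);
        try
        {
            // Slice to the requested length so that no more than `length`
            // bytes are ever requested from the stream.
            Span<byte> exact = rented.AsSpan(0, length);

            int offset = 0;
            while (offset < length)
            {
                int n = stream.Read(exact.Slice(offset));
                if (n <= 0)
                {
                    throw new EndOfStreamException("End of stream reached");
                }
                offset += n;
            }

            return Encoding.UTF8.GetString(exact);
        }
        finally
        {
            // Always hand the array back to the pool.
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

If the unsliced array (rented.AsSpan()) were filled instead, the loop would ask for rented.Length bytes, which can exceed the bytes remaining in the stream and would produce the kind of "End of stream reached" failure shown above.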
Possibly related: [https://github.com/confluentinc/confluent-kafka-dotnet/issues/1398#issuecomment-748171083]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)