Posted to dev@avro.apache.org by "Thiruvalluvan M. G. (JIRA)" <ji...@apache.org> on 2018/12/23 16:52:00 UTC

[jira] [Commented] (AVRO-2280) Calling DataFileWriter::flush() when there is no data to write can subsequently cause an exception when the file is read

    [ https://issues.apache.org/jira/browse/AVRO-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16728001#comment-16728001 ] 

Thiruvalluvan M. G. commented on AVRO-2280:
-------------------------------------------

Thank you [~bwalshe] for identifying the problem and providing an example where it breaks.

Even though your suggested solution fixes the example, it does not address the more general issue: data file blocks that contain zero objects (call them empty blocks). The [Avro specification|https://avro.apache.org/docs/1.8.1/spec.html] allows empty blocks. So even if the C++ implementation never generates them, other language bindings may, and the C++ reader should be able to handle them.
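
To illustrate what "handling empty blocks" means on the reader side, here is a minimal sketch. It is a toy in-memory model, not the Avro C++ DataFileReader: the names Block and ToyReader are hypothetical, and the objects are plain integers rather than decoded Avro data. The only point is that read() loops past any number of empty blocks instead of assuming every block holds at least one object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one data file block. An empty vector
// models a block whose object count is zero (an "empty block").
struct Block {
    std::vector<int64_t> objects;
};

// Toy reader over in-memory blocks (an assumption for illustration only).
class ToyReader {
public:
    explicit ToyReader(std::vector<Block> blocks) : blocks_(std::move(blocks)) {}

    // Stores the next object in datum and returns true,
    // or returns false once every block has been consumed.
    bool read(int64_t& datum) {
        // Advance past exhausted blocks AND empty blocks. A single "if"
        // here would break as soon as an empty block is encountered.
        while (block_ < blocks_.size() &&
               pos_ >= blocks_[block_].objects.size()) {
            ++block_;
            pos_ = 0;
        }
        if (block_ == blocks_.size()) {
            return false;  // clean end of file, even after a trailing empty block
        }
        assert(pos_ < blocks_[block_].objects.size());
        datum = blocks_[block_].objects[pos_++];
        return true;
    }

private:
    std::vector<Block> blocks_;
    std::size_t block_ = 0;  // index of the current block
    std::size_t pos_ = 0;    // index of the next object within that block
};
```

With blocks {1, 2}, {}, {3}, {} this reader yields 1, 2, 3 and then reports end of file, regardless of where the empty blocks sit.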

I've made a [pull request|https://github.com/apache/avro/pull/414] to address the reader's problem.

> Calling DataFileWriter::flush() when there is no data to write can subsequently cause an exception when the file is read
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AVRO-2280
>                 URL: https://issues.apache.org/jira/browse/AVRO-2280
>             Project: Apache Avro
>          Issue Type: Bug
>          Components: c++
>            Reporter: Brian Walshe
>            Priority: Major
>              Labels: newbie, pull-request-available
>
> If you call flush() on a DataFileWriter object that has no data waiting to be written, it produces an empty block at the end of the file, which causes an exception on the last call to DataFileReader::read(T& datum).
> h2. Example
> For example, adding the following to the Data File unit tests will cause them to break: [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/test/DataFileTests.cc#L192]
> h2. Possible Solution
> Altering DataFileWriter::sync() to check if there are objects to be written before proceeding will get the code to pass the unit tests, e.g.: [https://github.com/bwalshe/avro/blob/7c6a229b2fcbb0b88368e1503a58daef9f43ee64/lang/c%2B%2B/impl/DataFile.cc#L141]
>  
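
For comparison, the writer-side guard described under "Possible Solution" can be sketched as follows. This is again a toy model under stated assumptions, not the code in DataFile.cc: ToyWriter and its members are hypothetical, and blocks are kept in memory instead of being encoded to a file. It shows only the idea of returning early from sync() when nothing is buffered, so that flush() never emits an empty block.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy writer that groups written objects into blocks
// (hypothetical, for illustration only).
class ToyWriter {
public:
    // Buffers one object for the next block.
    void write(int64_t datum) { pending_.push_back(datum); }

    // Emits the buffered objects as one block.
    void sync() {
        if (pending_.empty()) {
            return;  // the guard: nothing buffered, so do not emit an empty block
        }
        blocks_.push_back(pending_);
        pending_.clear();
    }

    // In this model, flushing simply forces a sync.
    void flush() { sync(); }

    // Blocks emitted so far, in order.
    const std::vector<std::vector<int64_t> >& blocks() const { return blocks_; }

private:
    std::vector<int64_t> pending_;               // objects not yet in a block
    std::vector<std::vector<int64_t> > blocks_;  // blocks already emitted
};
```

A guard like this keeps one writer from producing empty blocks, but it does nothing for files produced by other writers, which is why the reader still needs to tolerate them.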


