You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@avro.apache.org by Niraj Tolia <nt...@gmail.com> on 2010/03/17 20:21:38 UTC

C Implementation, missing avro_flush()

I was going through the avro.h header file (from the 1.3.0 release)
and noticed that the avro_flush() call is defined but has no
implementation. If someone does try to use it, compilation will fail
with an undefined reference error. I am not sure if this was an
accidental oversight but figured I should let someone know.

Also, would I be correct in assuming that the C API doesn't allow me
to actually call fsync() on a file writer? Digging through the code
didn't turn up anything obvious.

Cheers,
Niraj

-- 
http://www.tolia.org/

Re: C Implementation, missing avro_flush()

Posted by Matt Massie <ma...@cloudera.com>.
Thanks for the report Niraj and thanks for the patch, Bruce.  I just
committed AVRO-480 to trunk.

I agree that we need to clean up the I/O API and remove the buffered I/O.
 That work would be best to address with 1.4 since it will in all likelihood
break the existing API in (hopefully small) ways.

-Matt



On Wed, Mar 17, 2010 at 3:19 PM, Bruce Mitchener
<br...@gmail.com>wrote:

> This is AVRO-480 in JIRA. Patch is attached for dealing with flush.
>
>  - Bruce
>
>
> On Wed, Mar 17, 2010 at 3:59 PM, Bruce Mitchener <
> bruce.mitchener@gmail.com> wrote:
>
>> Niraj,
>>
>> So, I've been looking at this.
>>
>> I have a patch ready to roll for the first thing (dealing with
>> avro_flush()).  I'll go ahead and open a bug about that and attach the patch
>> for massie to commit.
>>
>> However, for fsync(), I think we'd like to wait until 1.4 to address this
>> as it will lead a confusing API at the moment that we'll just have to remove
>> in short order.
>>
>> Right now, the container files are implemented with buffered I/O, so you'd
>> have to call 3 functions to ensure that it hit disk.  I think we can get
>> away without buffered I/O and that would make it a more reasonable 2
>> functions.
>>
>> Do you have an overwhelming need for fsync functionality on container
>> files (datafile.c stuff) in the next month or two?
>>
>>  - Bruce
>>
>>
>> On Wed, Mar 17, 2010 at 2:47 PM, Bruce Mitchener <
>> bruce.mitchener@gmail.com> wrote:
>>
>>> Niraj,
>>>
>>> The header says avro_flush(), but the implementation says
>>> avro_writer_flush().  We'll get this addressed shortly.
>>>
>>> There isn't currently a way to call fsync() directly ... but since you
>>> pass the FILE* to the file writer, you could call fsync(fileno(FILE*)) on
>>> your own, unless you're using the container file.
>>>
>>> If you want to open a bug on each of these, I'll work up the patches and
>>> work with massie to get them into SVN.
>>>
>>> Cheers,
>>>
>>>  - Bruce
>>>
>>>
>>> On Wed, Mar 17, 2010 at 1:21 PM, Niraj Tolia <nt...@gmail.com> wrote:
>>>
>>>> I was going through the avro.h header file (from the 1.3.0 release)
>>>> and noticed that the avro_flush() call is defined but has no
>>>> implementation. If someone does try to use it, compilation will fail
>>>> with an undefined reference error. I am not sure if this was an
>>>> accidental oversight but figured I should let someone know.
>>>>
>>>> Also, would I be correct in assuming that the C API doesn't allow me
>>>> to actually call fsync() on a file writer? Digging through the code
>>>> didn't turn up anything obvious.
>>>>
>>>> Cheers,
>>>> Niraj
>>>>
>>>> --
>>>> http://www.tolia.org/
>>>>
>>>
>>>
>>
>

Re: C Implementation, missing avro_flush()

Posted by Bruce Mitchener <br...@gmail.com>.
This is AVRO-480 in JIRA. Patch is attached for dealing with flush.

 - Bruce

On Wed, Mar 17, 2010 at 3:59 PM, Bruce Mitchener
<br...@gmail.com>wrote:

> Niraj,
>
> So, I've been looking at this.
>
> I have a patch ready to roll for the first thing (dealing with
> avro_flush()).  I'll go ahead and open a bug about that and attach the patch
> for massie to commit.
>
> However, for fsync(), I think we'd like to wait until 1.4 to address this
> as it will lead a confusing API at the moment that we'll just have to remove
> in short order.
>
> Right now, the container files are implemented with buffered I/O, so you'd
> have to call 3 functions to ensure that it hit disk.  I think we can get
> away without buffered I/O and that would make it a more reasonable 2
> functions.
>
> Do you have an overwhelming need for fsync functionality on container files
> (datafile.c stuff) in the next month or two?
>
>  - Bruce
>
>
> On Wed, Mar 17, 2010 at 2:47 PM, Bruce Mitchener <
> bruce.mitchener@gmail.com> wrote:
>
>> Niraj,
>>
>> The header says avro_flush(), but the implementation says
>> avro_writer_flush().  We'll get this addressed shortly.
>>
>> There isn't currently a way to call fsync() directly ... but since you
>> pass the FILE* to the file writer, you could call fsync(fileno(FILE*)) on
>> your own, unless you're using the container file.
>>
>> If you want to open a bug on each of these, I'll work up the patches and
>> work with massie to get them into SVN.
>>
>> Cheers,
>>
>>  - Bruce
>>
>>
>> On Wed, Mar 17, 2010 at 1:21 PM, Niraj Tolia <nt...@gmail.com> wrote:
>>
>>> I was going through the avro.h header file (from the 1.3.0 release)
>>> and noticed that the avro_flush() call is defined but has no
>>> implementation. If someone does try to use it, compilation will fail
>>> with an undefined reference error. I am not sure if this was an
>>> accidental oversight but figured I should let someone know.
>>>
>>> Also, would I be correct in assuming that the C API doesn't allow me
>>> to actually call fsync() on a file writer? Digging through the code
>>> didn't turn up anything obvious.
>>>
>>> Cheers,
>>> Niraj
>>>
>>> --
>>> http://www.tolia.org/
>>>
>>
>>
>

Re: C Implementation, missing avro_flush()

Posted by Niraj Tolia <nt...@gmail.com>.
On Wed, Mar 17, 2010 at 2:59 PM, Bruce Mitchener
<br...@gmail.com> wrote:
> Niraj,
> So, I've been looking at this.
> I have a patch ready to roll for the first thing (dealing with
> avro_flush()).  I'll go ahead and open a bug about that and attach the patch
> for massie to commit.

Hi Bruce,

Thanks for the quick response. I was going to create the JIRAs but I
see you beat me to it. The patch with AVRO-480 should work for what I
am doing.

> However, for fsync(), I think we'd like to wait until 1.4 to address this as
> it will lead a confusing API at the moment that we'll just have to remove in
> short order.
> Right now, the container files are implemented with buffered I/O, so you'd
> have to call 3 functions to ensure that it hit disk.  I think we can get
> away without buffered I/O and that would make it a more reasonable 2
> functions.
> Do you have an overwhelming need for fsync functionality on container files
> (datafile.c stuff) in the next month or two?

I can probably work around it for now but I do think this would be
necessary to have in the long-term.

Thanks again for the help.

Cheers,
Niraj

>  - Bruce
>
> On Wed, Mar 17, 2010 at 2:47 PM, Bruce Mitchener <br...@gmail.com>
> wrote:
>>
>> Niraj,
>> The header says avro_flush(), but the implementation says
>> avro_writer_flush().  We'll get this addressed shortly.
>> There isn't currently a way to call fsync() directly ... but since you
>> pass the FILE* to the file writer, you could call fsync(fileno(FILE*)) on
>> your own, unless you're using the container file.
>> If you want to open a bug on each of these, I'll work up the patches and
>> work with massie to get them into SVN.
>> Cheers,
>>  - Bruce
>>
>> On Wed, Mar 17, 2010 at 1:21 PM, Niraj Tolia <nt...@gmail.com> wrote:
>>>
>>> I was going through the avro.h header file (from the 1.3.0 release)
>>> and noticed that the avro_flush() call is defined but has no
>>> implementation. If someone does try to use it, compilation will fail
>>> with an undefined reference error. I am not sure if this was an
>>> accidental oversight but figured I should let someone know.
>>>
>>> Also, would I be correct in assuming that the C API doesn't allow me
>>> to actually call fsync() on a file writer? Digging through the code
>>> didn't turn up anything obvious.
>>>
>>> Cheers,
>>> Niraj
>>>
>>> --
>>> http://www.tolia.org/
>>
>
>

Re: C Implementation, missing avro_flush()

Posted by Bruce Mitchener <br...@gmail.com>.
Niraj,

So, I've been looking at this.

I have a patch ready to roll for the first thing (dealing with
avro_flush()).  I'll go ahead and open a bug about that and attach the patch
for massie to commit.

However, for fsync(), I think we'd like to wait until 1.4 to address this as
it will lead a confusing API at the moment that we'll just have to remove in
short order.

Right now, the container files are implemented with buffered I/O, so you'd
have to call 3 functions to ensure that it hit disk.  I think we can get
away without buffered I/O and that would make it a more reasonable 2
functions.

Do you have an overwhelming need for fsync functionality on container files
(datafile.c stuff) in the next month or two?

 - Bruce

On Wed, Mar 17, 2010 at 2:47 PM, Bruce Mitchener
<br...@gmail.com>wrote:

> Niraj,
>
> The header says avro_flush(), but the implementation says
> avro_writer_flush().  We'll get this addressed shortly.
>
> There isn't currently a way to call fsync() directly ... but since you pass
> the FILE* to the file writer, you could call fsync(fileno(FILE*)) on your
> own, unless you're using the container file.
>
> If you want to open a bug on each of these, I'll work up the patches and
> work with massie to get them into SVN.
>
> Cheers,
>
>  - Bruce
>
>
> On Wed, Mar 17, 2010 at 1:21 PM, Niraj Tolia <nt...@gmail.com> wrote:
>
>> I was going through the avro.h header file (from the 1.3.0 release)
>> and noticed that the avro_flush() call is defined but has no
>> implementation. If someone does try to use it, compilation will fail
>> with an undefined reference error. I am not sure if this was an
>> accidental oversight but figured I should let someone know.
>>
>> Also, would I be correct in assuming that the C API doesn't allow me
>> to actually call fsync() on a file writer? Digging through the code
>> didn't turn up anything obvious.
>>
>> Cheers,
>> Niraj
>>
>> --
>> http://www.tolia.org/
>>
>
>

Re: C Implementation, missing avro_flush()

Posted by Bruce Mitchener <br...@gmail.com>.
Niraj,

The header says avro_flush(), but the implementation says
avro_writer_flush().  We'll get this addressed shortly.

There isn't currently a way to call fsync() directly ... but since you pass
the FILE* to the file writer, you could call fsync(fileno(FILE*)) on your
own, unless you're using the container file.

If you want to open a bug on each of these, I'll work up the patches and
work with massie to get them into SVN.

Cheers,

 - Bruce

On Wed, Mar 17, 2010 at 1:21 PM, Niraj Tolia <nt...@gmail.com> wrote:

> I was going through the avro.h header file (from the 1.3.0 release)
> and noticed that the avro_flush() call is defined but has no
> implementation. If someone does try to use it, compilation will fail
> with an undefined reference error. I am not sure if this was an
> accidental oversight but figured I should let someone know.
>
> Also, would I be correct in assuming that the C API doesn't allow me
> to actually call fsync() on a file writer? Digging through the code
> didn't turn up anything obvious.
>
> Cheers,
> Niraj
>
> --
> http://www.tolia.org/
>