Posted to user@avro.apache.org by Terry Healy <th...@bnl.gov> on 2013/01/15 18:59:47 UTC

How to "repair" .avro files with "Invalid Sync"

I have an .avro file that I'm trying to use within a Map/Reduce job. I
believe it was corrupted when I appended one file to another by mistake.

Are there any tools to repair this?




Re: How to "repair" .avro files with "Invalid Sync"

Posted by Terry Healy <th...@bnl.gov>.
Thanks, Alan.

I ran a caveman version, which I'm testing now: I dumped the file to
JSON with avro-tools tojson, then rebuilt it with avro-tools fromjson.

-Terry

On 01/15/2013 01:39 PM, Alan Miller wrote:
> Just an idea but...
> I thought there were some low-level methods available that you could use to get the sync markers. Maybe then you could step through the original file sequentially and try to write each record to a new file.
> 
> Alan. 
> 
> 
> Sent from my iPhone
> 
> On Jan 15, 2013, at 18:59, Terry Healy <th...@bnl.gov> wrote:
> 
>> I have an .avro file that I'm trying to use within a Map/Reduce job. I
>> believe it was corrupted when I appended one file to another by mistake.
>>
>> Are there any tools to repair this?
>>
>>
>>

Re: How to "repair" .avro files with "Invalid Sync"

Posted by Terry Healy <th...@bnl.gov>.
I will have to build some lower-level way to validate and repair
corrupted .avro files and/or append them correctly, since this is
killing my M/R jobs. It also takes a lot of digging to find the
offending file (it would be nice if the 'Invalid Sync!' exception named
it).

I'll let you know if I come up with anything useful. Too many other
things to do now....
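
As a starting point for that kind of validation, here is a minimal sketch in Python that uses only the standard library. It relies on the Avro object container layout (a 4-byte magic, a metadata map, a 16-byte sync marker, then blocks that each end with the same marker) and reports the byte offset of the first block whose trailing marker does not match. The function names are hypothetical, it assumes the files are readable as local files, and it does not decode or check the records themselves.

```python
import io

MAGIC = b"Obj\x01"   # first four bytes of every Avro container file
SYNC_SIZE = 16       # length of the per-file sync marker


def read_long(f, eof_ok=False):
    """Decode one Avro long (zigzag varint). Return None at a clean EOF if eof_ok."""
    acc, shift = 0, 0
    while True:
        byte = f.read(1)
        if not byte:
            if eof_ok and shift == 0:
                return None
            raise EOFError("file ends inside a varint")
        acc |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return (acc >> 1) ^ -(acc & 1)
        shift += 7


def read_header(f):
    """Skip the magic and the metadata map; return the file's sync marker."""
    if f.read(4) != MAGIC:
        raise ValueError("not an Avro container file")
    while True:
        count = read_long(f)
        if count == 0:
            break
        if count < 0:              # negative count is followed by a byte size
            read_long(f)
            count = -count
        for _ in range(count):
            f.read(read_long(f))   # key (string)
            f.read(read_long(f))   # value (bytes)
    sync = f.read(SYNC_SIZE)
    if len(sync) != SYNC_SIZE:
        raise ValueError("file ends inside the header")
    return sync


def first_bad_sync(f):
    """Return the offset of the first block with a wrong sync marker, or None."""
    sync = read_header(f)
    while True:
        block_start = f.tell()
        try:
            count = read_long(f, eof_ok=True)
            if count is None:      # clean end of file, every block matched
                return None
            size = read_long(f)
        except EOFError:
            return block_start
        if size < 0:
            return block_start
        f.seek(size, io.SEEK_CUR)  # skip the (possibly compressed) block data
        if f.read(SYNC_SIZE) != sync:
            return block_start


def find_bad_files(paths):
    """Map each damaged file's path to the offset of its first bad block."""
    bad = {}
    for path in paths:
        with open(path, "rb") as f:
            offset = first_bad_sync(f)
        if offset is not None:
            bad[path] = offset
    return bad
```

Running `find_bad_files` over the job's input files before submitting the job would name the offending file directly, instead of leaving it to be dug out of a failed task.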




Re: How to "repair" .avro files with "Invalid Sync"

Posted by Alan Miller <al...@gmail.com>.
Just an idea but...
I thought there were some low-level methods available that you could use to get the sync markers. Maybe then you could step through the original file sequentially and try to write each record to a new file.

Alan. 
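
For the specific case described in this thread (one complete .avro file appended byte-for-byte to another), the stepping-through idea can be sketched without decoding any records. The Python below uses only the standard library and works at block level rather than record level: it follows the sync markers of the first file and, when a block boundary starts with the Avro magic bytes instead of a block, treats that point as the header of the appended file and starts a new part. The function names are hypothetical, the whole input is assumed to fit in memory, and any other kind of damage simply raises an error.

```python
import io

MAGIC = b"Obj\x01"   # first four bytes of every Avro container file
SYNC_SIZE = 16       # length of the per-file sync marker


def read_long(f):
    """Decode one Avro long (zigzag varint)."""
    acc, shift = 0, 0
    while True:
        byte = f.read(1)
        if not byte:
            raise EOFError("file ends inside a varint")
        acc |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return (acc >> 1) ^ -(acc & 1)
        shift += 7


def read_header(f):
    """Skip the magic and the metadata map; return the file's sync marker."""
    if f.read(4) != MAGIC:
        raise ValueError("not an Avro container file")
    while True:
        count = read_long(f)
        if count == 0:
            break
        if count < 0:              # negative count is followed by a byte size
            read_long(f)
            count = -count
        for _ in range(count):
            f.read(read_long(f))   # key (string)
            f.read(read_long(f))   # value (bytes)
    sync = f.read(SYNC_SIZE)
    if len(sync) != SYNC_SIZE:
        raise ValueError("file ends inside the header")
    return sync


def split_concatenated(data):
    """Split bytes made of whole Avro files glued together into those files."""
    f = io.BytesIO(data)
    parts, start = [], 0
    sync = read_header(f)
    while True:
        pos = f.tell()
        if pos == len(data):                 # clean end of the last part
            parts.append(data[start:pos])
            return parts
        if data[pos:pos + 4] == MAGIC:       # an embedded header starts here
            parts.append(data[start:pos])
            start = pos
            sync = read_header(f)            # later blocks use the new marker
            continue
        read_long(f)                         # object count, not needed here
        size = read_long(f)
        if size < 0:
            raise ValueError(f"unrecoverable block at byte {pos}")
        f.seek(size, io.SEEK_CUR)            # skip the block data untouched
        if f.read(SYNC_SIZE) != sync:
            raise ValueError(f"unrecoverable block at byte {pos}")
```

Each returned part is a complete container file with its own header, so writing the parts to separate files should give inputs that ordinary Avro readers accept. The magic bytes cannot be mistaken for the start of a real block here, because read as a block they would decode to a negative object count.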


Sent from my iPhone

On Jan 15, 2013, at 18:59, Terry Healy <th...@bnl.gov> wrote:

> I have an .avro file that I'm trying to use within a Map/Reduce job. I
> believe it was corrupted when I appended one file to another by mistake.
> 
> Are there any tools to repair this?
> 
> 
>