Posted to user@flume.apache.org by Jyotsna G <jy...@gmail.com> on 2018/06/21 12:54:37 UTC

Issue similar to Flume-2052

Hi All,
I am currently using Flume 1.7.0.

While processing a set of files, I am encountering the exception below, and
Flume hangs unpredictably.

18/06/21 05:40:40 ERROR source.SpoolDirectorySource: FATAL: Spool Directory source src-tpa_idoru: { spoolDir: /tmp/tpa-idoru/ }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:283)
    at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:132)
    at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:70)
    at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:89)
    at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readDeserializerEvents(ReliableSpoolingFileEventReader.java:343)
    at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:331)
    at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:250)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


This issue looks similar to https://jira.apache.org/jira/browse/FLUME-2052

Wouldn't this fix already be included in 1.7? How do I enable Flume to skip
these errors and proceed further?

Thanks,
Jyotsna

handling of modified files in spooldir - similar to Flume-2052

Posted by Marina <pp...@yahoo.com>.
Hi,

I know that Flume-2052 fixed the issue with encountering invalid characters in the input and ignoring them if so configured. I was wondering if the same could be done for the infamous "java.lang.IllegalStateException: File has changed size since being read:" issue.

Currently, Flume just barfs when this happens: it kills the processing thread but leaves the process itself running. This makes it hard to detect the failure with the usual process monitoring (the process is up) while the actual processing is dead. Also, when this happens, Flume does not just ignore the offending file in the spooldir, but becomes a "zombie" forever, until you somehow detect this and restart it.

Are there any plans to make the handling of this situation configurable, just as was done for the Flume-2052 issue? I think it would be much more robust and user-friendly behavior if Flume could report the offense, but ignore it and continue processing other good files if so configured.

Thank you!
Marina




Re: Issue similar to Flume-2052

Posted by Jyotsna G <jy...@gmail.com>.
Thank you so much, Peter. I set 'decodeErrorPolicy' to IGNORE and it's
processing fine now.

Regards,
Jyotsna
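
For readers landing on this thread, the change described above might look roughly like the fragment below. This is a sketch, not the poster's actual configuration: the agent name `a1` is an assumption, while the source name and spoolDir are taken from the log message earlier in the thread. Channel and sink settings are omitted.

```properties
# Hypothetical agent name "a1"; source name and path come from the log above.
a1.sources = src-tpa_idoru
a1.sources.src-tpa_idoru.type = spooldir
a1.sources.src-tpa_idoru.spoolDir = /tmp/tpa-idoru/
# FAIL is the default; IGNORE drops undecodable input, REPLACE substitutes
# a replacement character for it.
a1.sources.src-tpa_idoru.decodeErrorPolicy = IGNORE
```

If the files are simply in a different encoding than the source expects, setting the source's inputCharset to match may be preferable to discarding characters.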

On Thu, Jun 21, 2018 at 7:07 PM, Peter Turcsanyi <tu...@cloudera.com>
wrote:

> Hi Jyotsna,
>
> What is your 'decodeErrorPolicy' setting on the spool dir source?
> By default it is FAIL, but it must be IGNORE or REPLACE in order to
> proceed further when a decoding error occurs.
>
> Regards,
> Peter Turcsanyi

Re: Issue similar to Flume-2052

Posted by Peter Turcsanyi <tu...@cloudera.com>.
Hi Jyotsna,

What is your 'decodeErrorPolicy' setting on the spool dir source?
By default it is FAIL, but it must be IGNORE or REPLACE in order to proceed
further when a decoding error occurs.

Regards,
Peter Turcsanyi
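
To illustrate what the three policies mean, here is a small standalone Java sketch using the JDK's CharsetDecoder. It does not use Flume's code. It assumes that FAIL, IGNORE and REPLACE behave like the JDK's CodingErrorAction.REPORT, IGNORE and REPLACE respectively. The class name DecodePolicyDemo and the sample bytes are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

public class DecodePolicyDemo {

    // Decode bytes as UTF-8, applying the given action to bad input.
    static String decode(byte[] bytes, CodingErrorAction action)
            throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // 0xFF can never appear in valid UTF-8, so this input is malformed.
        byte[] bytes = {'a', (byte) 0xFF, 'b'};

        try {
            decode(bytes, CodingErrorAction.REPORT);
        } catch (MalformedInputException e) {
            // Same exception type and message as in the stack trace above.
            System.out.println("REPORT: " + e.getMessage());
        }

        // IGNORE silently drops the bad byte.
        System.out.println("IGNORE: " + decode(bytes, CodingErrorAction.IGNORE));

        // REPLACE substitutes U+FFFD for the bad byte.
        System.out.println("REPLACE: " + decode(bytes, CodingErrorAction.REPLACE));
    }
}
```

With REPORT, the decoder throws the same MalformedInputException ("Input length = 1") seen in the original report. The other two actions let decoding continue, at the cost of losing or altering the offending bytes.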
