You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Xikui Wang (JIRA)" <ji...@apache.org> on 2018/03/10 04:14:00 UTC

[jira] [Updated] (ASTERIXDB-2323) LogFlusher failed after repeating insert operation a lot of times

     [ https://issues.apache.org/jira/browse/ASTERIXDB-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xikui Wang updated ASTERIXDB-2323:
----------------------------------
    Description: 
I'm running an experiment which inserts records into dataset. Each insert contains a frame of data and records have a unique increasing primary key. After a lot of inserts, the LogFlluser fails which causes the system to hang. Here is the log snippet (JobID roughly equals the number of inserts that I tried). The full log is attached too.

{code}

15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: StartTasks
15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.nc.work.StartTasksWork - Initializing TAID:TID:ANID:ODID:3:0:0:0 -> [FeedMeta [org.apache.asterix.runtime.operators.LSMPrimaryUpsertOperatorDescriptor@6ba3bb2f ], AlgebricksMeta [stream-project [2], commit]] for JID:0.27223
15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.nc.work.StartTasksWork - input: 0: CDID:2
15:40:41.820 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.821 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
15:40:41.821 [Worker:local2] INFO org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: JID:0.27214
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.nc.Joblet - Freeing leaked 163840 bytes
15:40:41.882 [Worker:local2] WARN org.apache.hyracks.control.common.work.WorkQueue - Work CleanupJoblet waited 1 times (~0ms), blocked 1 times (~0ms)
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: JID:0.27215
15:40:41.883 [Executor-2:local2] ERROR org.apache.asterix.transaction.management.service.logging.LogFlusher - LogFlusher is terminating abnormally. System is in unusable state.
java.lang.IllegalStateException: java.lang.IllegalStateException
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.internalFlush(LogBuffer.java:204) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.flush(LogBuffer.java:174) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogFlusher.call(LogManager.java:711) [asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogFlusher.call(LogManager.java:666) [asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_151]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
Caused by: java.lang.IllegalStateException
 at org.apache.asterix.transaction.management.service.logging.LogBufferTailReader.next(LogBufferTailReader.java:49) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.batchUnlock(LogBuffer.java:245) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.internalFlush(LogBuffer.java:201) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 ... 7 more
15:41:32.392 [Checkpoint Thread (31)] INFO org.apache.asterix.transaction.management.service.recovery.CheckpointManager - Attemping soft checkpoint...

{code}


  was:
I'm running an experiment which inserts records into dataset. Each insert contains a frame of data and records have a unique increasing primary key. After a lot of inserts, the LogFlluser fails which causes the system to hang. Here is the log snippet (JobID roughly equals the number of inserts that I tried):

{code}

15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: StartTasks
15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.nc.work.StartTasksWork - Initializing TAID:TID:ANID:ODID:3:0:0:0 -> [FeedMeta [org.apache.asterix.runtime.operators.LSMPrimaryUpsertOperatorDescriptor@6ba3bb2f ], AlgebricksMeta [stream-project [2], commit]] for JID:0.27223
15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.nc.work.StartTasksWork - input: 0: CDID:2
15:40:41.820 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.821 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
15:40:41.821 [Worker:local2] INFO org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: JID:0.27214
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.nc.Joblet - Freeing leaked 163840 bytes
15:40:41.882 [Worker:local2] WARN org.apache.hyracks.control.common.work.WorkQueue - Work CleanupJoblet waited 1 times (~0ms), blocked 1 times (~0ms)
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: JID:0.27215
15:40:41.883 [Executor-2:local2] ERROR org.apache.asterix.transaction.management.service.logging.LogFlusher - LogFlusher is terminating abnormally. System is in unusable state.
java.lang.IllegalStateException: java.lang.IllegalStateException
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.internalFlush(LogBuffer.java:204) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.flush(LogBuffer.java:174) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogFlusher.call(LogManager.java:711) [asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogFlusher.call(LogManager.java:666) [asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_151]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
Caused by: java.lang.IllegalStateException
 at org.apache.asterix.transaction.management.service.logging.LogBufferTailReader.next(LogBufferTailReader.java:49) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.batchUnlock(LogBuffer.java:245) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 at org.apache.asterix.transaction.management.service.logging.LogBuffer.internalFlush(LogBuffer.java:201) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
 ... 7 more
15:41:32.392 [Checkpoint Thread (31)] INFO org.apache.asterix.transaction.management.service.recovery.CheckpointManager - Attemping soft checkpoint...

{code}



> LogFlusher failed after repeating insert operation a lot of times
> -----------------------------------------------------------------
>
>                 Key: ASTERIXDB-2323
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2323
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Xikui Wang
>            Assignee: Chen Luo
>            Priority: Major
>         Attachments: nc-local2.log
>
>
> I'm running an experiment which inserts records into dataset. Each insert contains a frame of data and records have a unique increasing primary key. After a lot of inserts, the LogFlluser fails which causes the system to hang. Here is the log snippet (JobID roughly equals the number of inserts that I tried). The full log is attached too.
> {code}
> 15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: StartTasks
> 15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.nc.work.StartTasksWork - Initializing TAID:TID:ANID:ODID:3:0:0:0 -> [FeedMeta [org.apache.asterix.runtime.operators.LSMPrimaryUpsertOperatorDescriptor@6ba3bb2f ], AlgebricksMeta [stream-project [2], commit]] for JID:0.27223
> 15:40:41.783 [Worker:local2] INFO org.apache.hyracks.control.nc.work.StartTasksWork - input: 0: CDID:2
> 15:40:41.820 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
> 15:40:41.821 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
> 15:40:41.821 [Worker:local2] INFO org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: JID:0.27214
> 15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.nc.Joblet - Freeing leaked 163840 bytes
> 15:40:41.882 [Worker:local2] WARN org.apache.hyracks.control.common.work.WorkQueue - Work CleanupJoblet waited 1 times (~0ms), blocked 1 times (~0ms)
> 15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
> 15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
> 15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: NotifyTaskCompleteWork:TAID:TID:ANID:ODID:3:0:0:0
> 15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.common.work.WorkQueue - Executing: CleanupJoblet
> 15:40:41.882 [Worker:local2] INFO org.apache.hyracks.control.nc.work.CleanupJobletWork - Cleaning up after job: JID:0.27215
> 15:40:41.883 [Executor-2:local2] ERROR org.apache.asterix.transaction.management.service.logging.LogFlusher - LogFlusher is terminating abnormally. System is in unusable state.
> java.lang.IllegalStateException: java.lang.IllegalStateException
>  at org.apache.asterix.transaction.management.service.logging.LogBuffer.internalFlush(LogBuffer.java:204) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  at org.apache.asterix.transaction.management.service.logging.LogBuffer.flush(LogBuffer.java:174) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  at org.apache.asterix.transaction.management.service.logging.LogFlusher.call(LogManager.java:711) [asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  at org.apache.asterix.transaction.management.service.logging.LogFlusher.call(LogManager.java:666) [asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_151]
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
>  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
> Caused by: java.lang.IllegalStateException
>  at org.apache.asterix.transaction.management.service.logging.LogBufferTailReader.next(LogBufferTailReader.java:49) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  at org.apache.asterix.transaction.management.service.logging.LogBuffer.batchUnlock(LogBuffer.java:245) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  at org.apache.asterix.transaction.management.service.logging.LogBuffer.internalFlush(LogBuffer.java:201) ~[asterix-transactions-0.9.4-SNAPSHOT.jar:0.9.4-SNAPSHOT]
>  ... 7 more
> 15:41:32.392 [Checkpoint Thread (31)] INFO org.apache.asterix.transaction.management.service.recovery.CheckpointManager - Attemping soft checkpoint...
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)