Posted to notifications@fluo.apache.org by GitBox <gi...@apache.org> on 2021/09/01 09:24:26 UTC
[GitHub] [fluo] jaredwinick commented on issue #660: Lock resolution failed
jaredwinick commented on issue #660:
URL: https://github.com/apache/fluo/issues/660#issuecomment-909401013
I am not sure if this is 100% the same thing, but on Fluo 2.0.0-SNAPSHOT we ran into something that looks similar. It occurred on a single test server that was likely very overloaded at the time of the failure. When trying to scan, we see this exception:
```
root@fluo-oracle:/# fluo scan -a crucible -p dataset_offsets:Alerts:
dataset_offsets:Alerts:0 offset 21179
dataset_offsets:Alerts:1 offset 21179
dataset_offsets:Alerts:2 offset 21179
dataset_offsets:Alerts:3 offset 21179
Exception in thread "main" java.lang.IllegalStateException: can not abort : record:Alerts:4:000021178 10 143265458 (UNKNOWN)
at org.apache.fluo.core.impl.LockResolver.resolveLocks(LockResolver.java:201)
at org.apache.fluo.core.impl.SnapshotScanner$SnapIter.resolveLock(SnapshotScanner.java:184)
at org.apache.fluo.core.impl.SnapshotScanner$SnapIter.getNext(SnapshotScanner.java:221)
at org.apache.fluo.core.impl.SnapshotScanner$SnapIter.hasNext(SnapshotScanner.java:131)
at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:42)
at org.apache.fluo.core.util.ScanUtil.scan(ScanUtil.java:124)
at org.apache.fluo.core.util.ScanUtil.scanFluo(ScanUtil.java:152)
at org.apache.fluo.command.FluoScan.execute(FluoScan.java:109)
at org.apache.fluo.command.FluoProgram.runFluoCommand(FluoProgram.java:69)
at org.apache.fluo.command.FluoProgram.main(FluoProgram.java:33)
```
Looking at the raw data, we see the lock pointing to the primary:
```
...
dataset_offsets:Alerts:4 :offset [] 113249162-WRITE 113249161
dataset_offsets:Alerts:4 :offset [] 113247916-WRITE 113247915
dataset_offsets:Alerts:4 :offset [] 113246900-WRITE 113246897
dataset_offsets:Alerts:4 :offset [] 85418754-WRITE 85418753
dataset_offsets:Alerts:4 :offset [] 143265458-LOCK record:Alerts:4:000021178 10 WRITE NOT_DELETE NOT_TRIGGER c
dataset_offsets:Alerts:4 :offset [] 143265458-DATA 21178
dataset_offsets:Alerts:4 :offset [] 143265003-DATA 21177
dataset_offsets:Alerts:4 :offset [] 114952314-DATA 21159
...
```
But maybe in this case the primary does exist?
```
record:Alerts:4:000021178 :10 [] 143265458-LOCK record:Alerts:4:000021178 10 WRITE NOT_DELETE NOT_TRIGGER c
```
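To make the comparison between the two LOCK entries above explicit, here is a rough sketch that parses the raw lines as printed and checks whether they name the same primary and start timestamp. This is not Fluo API: the class `RawLockLine` and its methods are hypothetical, and the field layout (row, column, visibility, `timestamp-LOCK`, primary row, primary column) is an assumption read off the output above. It also assumes none of the fields contain whitespace.

```java
/** Hypothetical helper: pulls a few fields out of a raw LOCK line as printed above. */
public final class RawLockLine {
    final String row;           // row the lock entry sits on
    final String column;        // column as printed, e.g. ":offset"
    final long startTs;         // number in front of "-LOCK"
    final String primaryRow;    // first token of the lock value (assumed: primary row)
    final String primaryColumn; // second token of the lock value (assumed: primary column)

    private RawLockLine(String row, String column, long startTs,
                        String primaryRow, String primaryColumn) {
        this.row = row;
        this.column = column;
        this.startTs = startTs;
        this.primaryRow = primaryRow;
        this.primaryColumn = primaryColumn;
    }

    /** Splits on whitespace; assumes the layout described in the lead-in. */
    static RawLockLine parse(String line) {
        String[] t = line.trim().split("\\s+");
        String suffix = "-LOCK";
        if (t.length < 6 || !t[3].endsWith(suffix)) {
            throw new IllegalArgumentException("not a LOCK line: " + line);
        }
        long ts = Long.parseLong(t[3].substring(0, t[3].length() - suffix.length()));
        return new RawLockLine(t[0], t[1], ts, t[4], t[5]);
    }

    /** True when the entry sits on the row its own lock value names as primary. */
    boolean isOnPrimaryRow() {
        return row.equals(primaryRow);
    }

    /** True when both entries carry the same start timestamp and primary pointer. */
    boolean sameTransactionAs(RawLockLine other) {
        return startTs == other.startTs
            && primaryRow.equals(other.primaryRow)
            && primaryColumn.equals(other.primaryColumn);
    }

    public static void main(String[] args) {
        RawLockLine secondary = parse("dataset_offsets:Alerts:4 :offset [] 143265458-LOCK "
            + "record:Alerts:4:000021178 10 WRITE NOT_DELETE NOT_TRIGGER c");
        RawLockLine primary = parse("record:Alerts:4:000021178 :10 [] 143265458-LOCK "
            + "record:Alerts:4:000021178 10 WRITE NOT_DELETE NOT_TRIGGER c");

        System.out.println("secondary on primary row: " + secondary.isOnPrimaryRow());
        System.out.println("primary on primary row:   " + primary.isOnPrimaryRow());
        System.out.println("same transaction:         " + primary.sameTransactionAs(secondary));
    }
}
```

This only reads the printed text and does not touch the table, so it is just a way to confirm that both locks belong to the same start timestamp (143265458) and that the primary lock entry is still present.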
Are there any recovery tools or processes for cleaning up a situation like this? Thanks for any advice anyone may have. cc @wjsl