You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@manifoldcf.apache.org by Julien <ju...@francelabs.com> on 2019/05/28 12:37:11 UTC
Long running queries on jobqueue
Hi,
I need some help to better understand what I am experiencing with a job using a JDBC connector. The job crawls a table containing approximately 9M of documents. I am currently unable to complete this job as it randomly fails with the Following error :
Error : Unexpected jobqueue status - record id 15588697113928, expecting active status, saw 0
The logs are showing a lot of long running queries, here is an example but there are a lot and on different query kind, but all of them concern the jobqueue table :
WARN 2019-05-27T01:16:56,048 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db|Found a long-running query (92568 ms): [SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount FROM jobqueue t0 WHERE t0.docpriority<? AND (t0.status=? OR t0.status=?) AND t0.checkaction=? AND t0.checktime<=? AND EXISTS(SELECT 'x' FROM jobs t1 WHERE (t1.status=? OR t1.status=?) AND t1.id=t0.jobid AND t1.priority=?) AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND (t2.status=? OR t2.status=? OR t2.status=? OR t2.status=? OR t2.status=? OR t2.status=?) AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name) ORDER BY t0.docpriority ASC LIMIT 40000]
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 0: '1.000000001E9'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 1: 'P'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 2: 'G'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 3: 'R'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 4: '1558919723463'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 5: 'A'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 6: 'a'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 7: '5'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 8: 'A'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 9: 'F'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 10: 'a'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 11: 'f'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 12: 'D'
WARN 2019-05-27T01:16:56,053 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Parameter 13: 'd'
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Limit (cost=89494.71..89495.11 rows=160 width=91) (actual time=15468.494..15481.412 rows=40000 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Sort (cost=89494.71..89495.11 rows=160 width=91) (actual time=15468.493..15477.328 rows=40000 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Sort Key: t0.docpriority
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Sort Method: external sort Disk: 102584kB
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Nested Loop Anti Join (cost=42423.35..89488.86 rows=160 width=91) (actual time=234.622..13792.727 rows=1164882 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash Anti Join (cost=42422.79..88190.34 rows=160 width=91) (actual time=234.593..1316.236 rows=1164882 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Hash Cond: (t0.id = t3.owner)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Nested Loop (cost=42395.08..88160.62 rows=160 width=91) (actual time=234.568..1002.111 rows=1164882 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Join Filter: (t0.jobid = t1.id)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq Scan on jobs t1 (cost=0.00..11.40 rows=1 width=8) (actual time=0.015..0.016 rows=1 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: ((priority = '5'::bigint) AND ((status = 'A'::bpchar) OR (status = 'a'::bpchar)))
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Rows Removed by Filter: 6
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Bitmap Heap Scan on jobqueue t0 (cost=42395.08..87989.60 rows=12769 width=91) (actual time=234.548..656.692 rows=1164882 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Recheck Cond: (((docpriority < '1000000001'::double precision) AND (status = 'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)) OR ((status = 'G'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint)))
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: (docpriority < '1000000001'::double precision)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Heap Blocks: exact=20094
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> BitmapOr (cost=42395.08..42395.08 rows=12769 width=0) (actual time=229.219..229.219 rows=0 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Bitmap Index Scan on i1556284697879 (cost=0.00..42384.25 rows=12769 width=0) (actual time=229.176..229.176 rows=1164882 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: ((docpriority < '1000000001'::double precision) AND (status = 'P'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint))
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Bitmap Index Scan on i1556284697880 (cost=0.00..4.45 rows=1 width=0) (actual time=0.041..0.041 rows=93 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: ((status = 'G'::bpchar) AND (checkaction = 'R'::bpchar) AND (checktime <= '1558919723463'::bigint))
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash (cost=26.08..26.08 rows=130 width=8) (actual time=0.007..0.007 rows=0 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Buckets: 1024 Batches: 1 Memory Usage: 8kB
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash Join (cost=12.93..26.08 rows=130 width=8) (actual time=0.006..0.006 rows=0 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Hash Cond: ((t3.eventname)::text = (t4.name)::text)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq Scan on prereqevents t3 (cost=0.00..11.40 rows=140 width=524) (actual time=0.005..0.005 rows=0 loops=1)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Hash (cost=11.30..11.30 rows=130 width=516) (never executed)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Seq Scan on events t4 (cost=0.00..11.30 rows=130 width=516) (never executed)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: -> Index Scan using i1556284697878 on jobqueue t2 (cost=0.56..8.11 rows=1 width=49) (actual time=0.010..0.010 rows=0 loops=1164882)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Index Cond: ((dochash)::text = (t0.dochash)::text)
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Filter: ((jobid <> t0.jobid) AND ((status = 'A'::bpchar) OR (status = 'F'::bpchar) OR (status = 'a'::bpchar) OR (status = 'f'::bpchar) OR (status = 'D'::bpchar) OR (status = 'd'::bpchar)))
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Rows Removed by Filter: 1
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Planning time: 1.455 ms
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Plan: Execution time: 15518.479 ms
WARN 2019-05-27T01:17:11,635 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db|
WARN 2019-05-27T01:17:11,749 (Stuffer thread) - MCF|MCF-agent|apache.manifoldcf.db| Stats: n_distinct=3.0 most_common_vals={C,P,A} most_common_freqs={0.779200017,0.220666662,0.000133333335}
The same happens if I split the job into several jobs of 1.5M of docs at the difference that if I restart the jobs, some may end but with no guarantee.
I don’t really understand what is causing those long running queries and the fact that the job/jobs can fail on an Error : Unexpected status…
I tried to add the following parameter to the properties.xml to have better logs but it changed nothing : <property name="org.apache.manifoldcf.diagnostics" value="DEBUG"/>
The only thing I did that can explain a lack of performances is that I raised the org.apache.manifoldcf.ui.maxstatuscount from 500k to 2M. But I was only expecting low performances on job status UI. Is it possible that it is responsible of the long running queries ?
Thanks for the help,
Julien
---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel antivirus Avast.
https://www.avast.com/antivirus