Posted to dev@manifoldcf.apache.org by "Karl Wright (Jira)" <ji...@apache.org> on 2023/05/12 18:39:00 UTC
[jira] [Assigned] (CONNECTORS-1746) Adding conditions to execute PostgreSQL's ANALYZE command to avoid crawling become extremely slow.
[ https://issues.apache.org/jira/browse/CONNECTORS-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright reassigned CONNECTORS-1746:
---------------------------------------
Assignee: Karl Wright
> Adding conditions to execute PostgreSQL's ANALYZE command to avoid crawling become extremely slow.
> --------------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-1746
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1746
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Web connector
> Environment: Using ManifoldCF 2.24 with PostgreSQL 12.14 as the database.
> Reporter: Mingchun Zhao
> Assignee: Karl Wright
> Priority: Major
> Attachments: DBInterfacePostgreSQL.java.patch
>
>
> Sometimes crawling stops processing documents for a while, and nothing is logged about long-running queries. Performance can be restored by running the 'ANALYZE' command manually, which suggests that a bad query plan causes the problem.
> Therefore, in addition to the existing configuration parameter 'org.apache.manifoldcf.db.postgres.analyze.<tablename>', 'ANALYZE' should also be executed in the following situations:
> 1. When the number of records in the table exceeds the number required for creating an execution plan after the job starts.
> 2. When the crawling performance slows down. For example, if the processing rate of documents drops below a specified threshold.
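
The two conditions above could be sketched roughly as follows. This is only an illustration and is not taken from the attached patch: the class name AnalyzeTrigger, its methods, and both thresholds are hypothetical, and the sketch only decides whether an ANALYZE should be issued. Actually running ANALYZE against the table is left to the caller.

```java
// Hypothetical sketch: none of these names come from ManifoldCF or the attached patch.
final class AnalyzeTrigger {
    private final long minRowsForPlan;     // assumed row-count threshold (condition 1)
    private final double minDocsPerSecond; // assumed processing-rate threshold (condition 2)
    private boolean analyzedSinceJobStart = false;

    AnalyzeTrigger(long minRowsForPlan, double minDocsPerSecond) {
        this.minRowsForPlan = minRowsForPlan;
        this.minDocsPerSecond = minDocsPerSecond;
    }

    /** Reset at job start so the row-count condition can fire again. */
    void onJobStart() {
        analyzedSinceJobStart = false;
    }

    /** Condition 1: the table has grown past the threshold and no ANALYZE was requested since job start. */
    boolean rowCountTriggers(long currentRows) {
        return !analyzedSinceJobStart && currentRows > minRowsForPlan;
    }

    /** Condition 2: the observed processing rate is below the threshold. */
    boolean rateTriggers(long docsProcessed, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return false; // no measurement window yet
        }
        double docsPerSecond = docsProcessed * 1000.0 / elapsedMillis;
        return docsPerSecond < minDocsPerSecond;
    }

    /** Returns true if either condition holds, and records that an ANALYZE was requested. */
    boolean shouldAnalyze(long currentRows, long docsProcessed, long elapsedMillis) {
        boolean fire = rowCountTriggers(currentRows)
                || rateTriggers(docsProcessed, elapsedMillis);
        if (fire) {
            analyzedSinceJobStart = true;
        }
        return fire;
    }
}
```

In a sketch like this the rate condition can fire repeatedly while crawling stays slow, so a real implementation would probably also need a minimum interval between ANALYZE runs.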
--
This message was sent by Atlassian Jira
(v8.20.10#820010)