Posted to issues@hawq.apache.org by "Chunling Wang (JIRA)" <ji...@apache.org> on 2017/09/14 06:37:02 UTC

[jira] [Updated] (HAWQ-1525) Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog

     [ https://issues.apache.org/jira/browse/HAWQ-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chunling Wang updated HAWQ-1525:
--------------------------------
    Description: 
When we use hcatalog to load data from Hive into HAWQ and the amount of data is large enough, automatic statistics collection is triggered, which calls vacuum analyze. If we reindex the database at that point, the system panics on the next auto analyze. Here is the call stack.

{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 IST,0,con1140,cmd6,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,,,,,0,,,,"1    0x96f57c postgres <symbol not found> + 0x96f57c
2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5    0x973048 postgres FunctionCall2 + 0x8e
6    0xabefab postgres <symbol not found> + 0xabefab
7    0xabfee4 postgres InMemHeap_GetNext + 0x408
8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9    0x4f7abc postgres systable_getnext + 0x50
10   0x953fb8 postgres SearchCatCache + 0x276
11   0x95ce10 postgres SearchSysCache + 0x93
12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13   0x5a07fc postgres caql_getoid_plus + 0x176
14   0x5c4888 postgres LookupNamespaceId + 0x129
15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
16   0x687897 postgres <symbol not found> + 0x687897
17   0x687574 postgres CreateSchemaCommand + 0x8f
18   0x8952d1 postgres ProcessUtility + 0x4ff
19   0x5c5728 postgres <symbol not found> + 0x5c5728
20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
23   0x6e1401 postgres <symbol not found> + 0x6e1401
24   0x6deb2d postgres ExecutorStart + 0xb01
25   0x738594 postgres <symbol not found> + 0x738594
26   0x73809f postgres <symbol not found> + 0x73809f
27   0x7351a9 postgres SPI_execute + 0x13c
28   0x6490f2 postgres spiExecuteWithCallback + 0x130
29   0x64956b postgres <symbol not found> + 0x64956b
30   0x648be0 postgres <symbol not found> + 0x648be0
31   0x647be0 postgres analyzeStmt + 0x91d
32   0x647247 postgres analyzeStatement + 0xb1
33   0x6ca11d postgres vacuum + 0xe5
34   0x827910 postgres autostats_issue_analyze + 0x160
35   0x827e10 postgres auto_stats + 0x19b
36   0x8906b5 postgres <symbol not found> + 0x8906b5
37   0x8930f5 postgres <symbol not found> + 0x8930f5
38   0x892619 postgres PortalRun + 0x3e6
39   0x8884f6 postgres <symbol not found> + 0x8884f6
{code}

This happens because the reindex command clears the relcache, while inmemscan->rs_rd->rel in InMemHeap_GetNext() still uses the address this heap relation had in the relcache, which is not the same as its address after the heap relation is reopened.
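
The failure mode described above, a scan that keeps a pointer into a cache which is later cleared, can be illustrated with a minimal C sketch. This is not HAWQ code and not the fix for this issue: RelCache, Scan, cache_open, cache_invalidate and the generation counter are all hypothetical names, and the generation check is only one common way to notice that a borrowed pointer has gone stale.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; none of these are HAWQ types. */
typedef unsigned int Oid;

typedef struct RelEntry {
    Oid relid;
    int natts;
} RelEntry;

typedef struct RelCache {
    RelEntry *entry;          /* single cached entry, for brevity */
    unsigned long generation; /* bumped on every invalidation */
} RelCache;

typedef struct Scan {
    Oid relid;                /* stable identifier of the scanned relation */
    RelEntry *rel;            /* borrowed pointer into the cache */
    unsigned long generation; /* cache generation when rel was fetched */
} Scan;

/* Drop the cached entry; any pointer handed out earlier is now dangling. */
void cache_invalidate(RelCache *cache)
{
    free(cache->entry);
    cache->entry = NULL;
    cache->generation++;
}

/* Return the cached entry for relid, rebuilding it if it is absent. */
RelEntry *cache_open(RelCache *cache, Oid relid)
{
    if (cache->entry != NULL && cache->entry->relid != relid)
        cache_invalidate(cache);
    if (cache->entry == NULL) {
        cache->entry = malloc(sizeof *cache->entry);
        if (cache->entry == NULL)
            abort();
        cache->entry->relid = relid;
        cache->entry->natts = 3; /* arbitrary placeholder value */
    }
    return cache->entry;
}

/* Start a scan and remember which cache generation the pointer belongs to. */
void scan_begin(Scan *scan, RelCache *cache, Oid relid)
{
    scan->relid = relid;
    scan->rel = cache_open(cache, relid);
    scan->generation = cache->generation;
}

/* Nonzero if the cache was cleared after the scan fetched its pointer. */
int scan_is_stale(const Scan *scan, const RelCache *cache)
{
    return scan->generation != cache->generation;
}

/* Reopen the relation by identifier instead of using a stale pointer. */
RelEntry *scan_relation(Scan *scan, RelCache *cache)
{
    if (scan_is_stale(scan, cache)) {
        scan->rel = cache_open(cache, scan->relid);
        scan->generation = cache->generation;
    }
    return scan->rel;
}
```

In this sketch, a caller that reaches the relation only through scan_relation() never dereferences the old address after cache_invalidate(), because the entry is looked up again by its identifier first.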

  was:
When we use hcatalog to load data from Hive into HAWQ and the amount of data is large enough, automatic statistics collection is triggered, which calls vacuum analyze. If we reindex the database at that point, the system panics on the next auto analyze. Here is the call stack.

{code}
2017-09-07 13:34:41.441970 IST,,,p34393,th0,,,2017-09-07 13:34:09 IST,0,con1140,cmd6,seg-1,,,,,"PANIC","XX000","Unexpected internal error: Master process received signal SIGSEGV",,,,,,,0,,,,"1    0x96f57c postgres <symbol not found> + 0x96f57c
2    0x96f785 postgres StandardHandlerForSigillSigsegvSigbus_OnMainThread + 0x2b
3    0x88b04f postgres CdbProgramErrorHandler + 0xf1
4    0x3a16a0f7e0 libpthread.so.0 <symbol not found> + 0x16a0f7e0
5    0x973048 postgres FunctionCall2 + 0x8e
6    0xabefab postgres <symbol not found> + 0xabefab
7    0xabfee4 postgres InMemHeap_GetNext + 0x408
8    0x4f7bc6 postgres <symbol not found> + 0x4f7bc6
9    0x4f7abc postgres systable_getnext + 0x50
10   0x953fb8 postgres SearchCatCache + 0x276
11   0x95ce10 postgres SearchSysCache + 0x93
12   0x95cecb postgres SearchSysCacheKeyArray + 0x9f
13   0x5a07fc postgres caql_getoid_plus + 0x176
14   0x5c4888 postgres LookupNamespaceId + 0x129
15   0x5c475d postgres LookupInternalNamespaceId + 0x1d
16   0x687897 postgres <symbol not found> + 0x687897
17   0x687574 postgres CreateSchemaCommand + 0x8f
18   0x8952d1 postgres ProcessUtility + 0x4ff
19   0x5c5728 postgres <symbol not found> + 0x5c5728
20   0x5c2fea postgres RangeVarGetCreationNamespace + 0x253
21   0x6e43f3 postgres <symbol not found> + 0x6e43f3
22   0x6e49c4 postgres <symbol not found> + 0x6e49c4
23   0x6e1401 postgres <symbol not found> + 0x6e1401
24   0x6deb2d postgres ExecutorStart + 0xb01
25   0x738594 postgres <symbol not found> + 0x738594
26   0x73809f postgres <symbol not found> + 0x73809f
27   0x7351a9 postgres SPI_execute + 0x13c
28   0x6490f2 postgres spiExecuteWithCallback + 0x130
29   0x64956b postgres <symbol not found> + 0x64956b
30   0x648be0 postgres <symbol not found> + 0x648be0
31   0x647be0 postgres analyzeStmt + 0x91d
32   0x647247 postgres analyzeStatement + 0xb1
33   0x6ca11d postgres vacuum + 0xe5
34   0x827910 postgres autostats_issue_analyze + 0x160
35   0x827e10 postgres auto_stats + 0x19b
36   0x8906b5 postgres <symbol not found> + 0x8906b5
37   0x8930f5 postgres <symbol not found> + 0x8930f5
38   0x892619 postgres PortalRun + 0x3e6
39   0x8884f6 postgres <symbol not found> + 0x8884f6
{code}

This happens because the reindex command clears the syscache, while inmemscan->rs_rd->rel in InMemHeap_GetNext() still uses the address this heap relation had in the syscache, which is not the same as its address after the heap relation is reopened.


> Segmentation fault occurs if reindex database when loading data from Hive to HAWQ using hcatalog
> ------------------------------------------------------------------------------------------------
>
>                 Key: HAWQ-1525
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1525
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: Query Execution
>            Reporter: Chunling Wang
>            Assignee: Chunling Wang



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)