Posted to dev@hawq.apache.org by Leon Zhang <le...@gmail.com> on 2015/12/05 03:15:06 UTC

Crashed while hawq init master

Hi, HAWQ dev:

   I recently rebuilt the latest HAWQ on CentOS 7, and it crashed during
"hawq init master". The output and stack dump look like this:

   The files belonging to this database system will be owned by user "xiaolin".
This user must also own the server process.

The database cluster will be initialized with locale en_US.utf8.

fixing permissions on existing directory /mnt/xiaolin/hawq-data-directory/masterdd ... ok
creating subdirectories ... ok
selecting default max_connections ... 1280
selecting default shared_buffers/max_fsm_pages ... 125MB/200000
creating configuration files ... ok
creating template1 database in /mnt/xiaolin/hawq-data-directory/masterdd/base/1 ... 2015-12-05 02:00:44.038224 GMT,,,p106570,th2038606336,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",9990,
ok
loading file-system persistent tables for template1 ... 2015-12-05 02:00:50.473854 GMT,,,p106586,th2067827200,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",9990,
ok
initializing pg_authid ... 2015-12-05 02:00:51.873844 GMT,,,p106590,th-1267525120,,,,0,,,seg-10000,,,,,"WARNING","01000","""fsync"": can not be set by the user and will be ignored.",,,,,,,,"set_config_option","guc.c",9990,
2015-12-05 10:00:52.434633 CST,,,p106590,th-1267525120,,,,0,,cmd1,seg-10000,,,x6,sx1,"FATAL","XX000","wrong number of index expressions (index.c:1186)",,,,,,"CREATE TRIGGER pg_sync_pg_database   AFTER INSERT OR UPDATE OR DELETE ON pg_database   FOR EACH STATEMENT EXECUTE PROCEDURE flatfile_update_trigger();
",,"FormIndexDatum","index.c",1186,1    0x8c2b28 postgres errstart (elog.c:473)
2    0x8c489b postgres elog_finish (elog.c:1421)
3    0x5735e5 postgres FormIndexDatum (index.c:1186)
4    0x575030 postgres CatalogIndexInsert (discriminator 2)
5    0x562f14 postgres caql_insert (caqlaccess.c:830)
6    0x63fb38 postgres CreateTrigger (trigger.c:427)
7    0x7ed0ec postgres ProcessUtility (utility.c:1578)
8    0x7e8d6e postgres <symbol not found> (pquery.c:1885)
9    0x7ea54e postgres <symbol not found> (pquery.c:1989)
10   0x7ec2a5 postgres PortalRun (pquery.c:1510)
11   0x7e41f1 postgres <symbol not found> (postgres.c:1732)
12   0x7e56b4 postgres PostgresMain (postgres.c:4697)
13   0x4a2982 postgres main (main.c:204)
14   0x7f1eb174faf5 libc.so.6 __libc_start_main (??:0)
15   0x4a2a1d postgres <symbol not found> (??:?)

child process exited with exit code 1
initdb: removing contents of data directory "/mnt/xiaolin/hawq-data-directory/masterdd"
Master postgres initdb failed
20151205:10:01:00:106236 hawq_init:dserver1:xiaolin-[INFO]:-Master postgres initdb failed
20151205:10:01:00:106236 hawq_init:dserver1:xiaolin-[ERROR]:-Master init failed, exit




I have no idea why it crashed; any help would be appreciated.

Thanks.
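
For anyone triaging similar output, the comma-separated log records above can be split with a standard CSV parser to pull out the severity, SQLSTATE, message, and source location. The following is only a minimal sketch: the field positions are an assumption read off the records in this message (they are not taken from HAWQ documentation), parse_record is a hypothetical helper name, and the sample record omits the trailing stack frames.

```python
import csv
import io

# Assumption: these column positions were counted from the sample records in
# this thread; they are not a documented HAWQ log layout.
SEVERITY, SQLSTATE, MESSAGE, FUNC, FILE, LINE = 16, 17, 18, 26, 27, 28


def parse_record(text):
    """Parse one CSV-style log record and return the fields of interest."""
    # csv.reader copes with quoted fields that contain commas, doubled
    # quotes, or embedded newlines (as in the SQL statement field).
    row = next(csv.reader(io.StringIO(text)))
    return {
        "severity": row[SEVERITY],
        "sqlstate": row[SQLSTATE],
        "message": row[MESSAGE],
        "function": row[FUNC],
        "file": row[FILE],
        "line": int(row[LINE]),
    }


# The FATAL record from the output above, without the stack frames.
SAMPLE = (
    '2015-12-05 10:00:52.434633 CST,,,p106590,th-1267525120,,,,0,,cmd1,'
    'seg-10000,,,x6,sx1,"FATAL","XX000",'
    '"wrong number of index expressions (index.c:1186)",,,,,,'
    '"CREATE TRIGGER pg_sync_pg_database   AFTER INSERT OR UPDATE OR DELETE '
    'ON pg_database   FOR EACH STATEMENT EXECUTE PROCEDURE '
    'flatfile_update_trigger();\n",,'
    '"FormIndexDatum","index.c",1186,'
)

record = parse_record(SAMPLE)
print(record["severity"], record["sqlstate"], record["message"])
print("raised in", record["function"], "at", record["file"], record["line"])
```

Filtering on the severity field makes it easy to skip the repeated "fsync" warnings and go straight to the FATAL record.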

Re: Crashed while hawq init master

Posted by Leon Zhang <le...@gmail.com>.
Hi, Zhanwei

   Glad you created an issue. We have investigated this problem and believe
we have found the root cause. We explained it on JIRA and attached our patch
to fix it. We hope you can review it; any comments are welcome.

   Thanks.

On Tue, Dec 8, 2015 at 4:35 PM, Zhanwei Wang <zw...@pivotal.io> wrote:

> Hi Leon
>
> HAWQ-233 has been created for this issue
>
> https://issues.apache.org/jira/browse/HAWQ-233
>
> On Tue, Dec 8, 2015 at 4:28 PM, Zhanwei Wang <zw...@pivotal.io> wrote:
>
> > Hi Leon
> >
> > I have reproduced the issue.  I will file a JIRA for it.
> >
> >
> >
> > --
> > Best Regards
> > ----------
> >
> > Zhanwei Wang
> >
> >
>
>
> --
> Best Regards
> ----------
>
> Zhanwei Wang
>

Re: Crashed while hawq init master

Posted by Zhanwei Wang <zw...@pivotal.io>.
Hi Leon

HAWQ-233 has been created for this issue

https://issues.apache.org/jira/browse/HAWQ-233

On Tue, Dec 8, 2015 at 4:28 PM, Zhanwei Wang <zw...@pivotal.io> wrote:

> Hi Leon
>
> I have reproduced the issue.  I will file a JIRA for it.
>
>
>
> --
> Best Regards
> ----------
>
> Zhanwei Wang
>
>


-- 
Best Regards
----------

Zhanwei Wang

Re: Crashed while hawq init master

Posted by Zhanwei Wang <zw...@pivotal.io>.
Hi Leon

I have reproduced the issue.  I will file a JIRA for it.






-- 
Best Regards
----------

Zhanwei Wang