Posted to user@hive.apache.org by Daniel Harper <Da...@bbc.co.uk> on 2015/04/15 17:41:52 UTC

[Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

Hi there,

We've been encountering the exception

Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 100

On a very small dataset (180 lines), using the following setup:

CREATE TABLE enriched_data (
enriched_json_data string
)
PARTITIONED BY (yyyy string, mm string, dd string, identifier string, sub_identifier string, unique_run_id string)
CLUSTERED BY (enriched_json_data) INTO 128 BUCKETS
LOCATION "${OUTDIR}";

INSERT OVERWRITE TABLE enriched_data PARTITION (yyyy, mm, dd, identifier, sub_identifier, unique_run_id)
SELECT …

We’ve not seen this issue before (normally our dataset is billions of lines), but here a very small amount of data is triggering it.

After looking at the code, it appears that this is the condition that fails: https://github.com/apache/hive/blob/branch-0.13/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java#L745
I downloaded and rebuilt the branch with a bit of debugging/stdout printing of the contents of the valToPaths map, and it fails because there are 101 entries in it.

All the entries look like this

yyyy=2015/mm=04/dd=09/identifier=1/sub-identifier=3/unique_run_id=df-345345/000047_0
yyyy=2015/mm=04/dd=09/identifier=1/sub-identifier=3/unique_run_id=df-345345/000048_0
yyyy=2015/mm=04/dd=09/identifier=1/sub-identifier=3/unique_run_id=df-345345/000049_0
yyyy=2015/mm=04/dd=09/identifier=1/sub-identifier=3/unique_run_id=df-345345/000051_0
….

We’re just confused as to why Hive considers the final part of the output path (e.g. 000047_0) to be a “dynamic partition”, as this is not in our PARTITIONED BY clause.

The only thing I can think of is that the CLUSTERED BY ... INTO 128 BUCKETS clause, combined with the dataset being really small (180 lines), is pushing everything into a single reducer task, while the hashing of each line distributes the rows fairly uniformly, so that one reducer ends up with more than 100 bucket files to write.
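
(As a sanity check that the partition columns themselves aren’t the problem, something like the following could be run against whatever table feeds the SELECT above; "source_table" below is just a stand-in name:

SELECT count(DISTINCT concat(yyyy, '/', mm, '/', dd, '/', identifier, '/', sub_identifier, '/', unique_run_id)) AS partition_combinations
FROM source_table;

Given that every entry in valToPaths above shares the same partition values and only the trailing bucket file number differs, this should come back as 1 here.)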

Any help will be greatly appreciated

With thanks,

Daniel Harper
Software Engineer, OTG ANT
BC5 A5

Re: [Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

Posted by Daniel Harper <Da...@bbc.co.uk>.
In our case we chose 128 buckets, but that’s just an arbitrary figure picked to get a good, even distribution.

To fix the issue we were having with the small file, we simply raised hive.exec.max.dynamic.partitions.pernode to 10000. That way, if we do run a tiny file (very rarely) that only allocates one reducer, we can be sure we don’t run into this issue again.
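
For reference, the session settings now look roughly like this (a sketch; we only actually changed the pernode limit, and bumping the global hive.exec.max.dynamic.partitions alongside it is an optional extra to keep the two limits consistent):

SET hive.exec.max.dynamic.partitions.pernode=10000;
SET hive.exec.max.dynamic.partitions=10000;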

With thanks,

Daniel Harper
Software Engineer, OTG ANT
BC5 A5


RE: [Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Hi Lefty,

 

I took a look at the documentation link and noticed that it can be improved. For example, the paragraph below:

 

“How does Hive distribute the rows across the buckets? In general, the bucket number is determined by the expression hash_function(bucketing_column) mod num_buckets. (There's a '0x7FFFFFFF in there too, but that's not that important). The hash_function depends on the type of the bucketing column. For an int, it's easy, hash_int(i) == i. For example, if user_id were an int, and there were 10 buckets, we would expect all user_id's that end in 0 to be in bucket 1, all user_id's that end in a 1 to be in bucket 2, etc. For other datatypes, it's a little tricky. In particular, the hash of a BIGINT is not the same as the BIGINT. And the hash of a string or a complex datatype will be some number that's derived from the value, but not anything humanly-recognizable. For example, if user_id were a STRING, then the user_id's in bucket 1 would probably not end in 0. In general, distributing rows based on the hash will give you a even distribution in the buckets.

So, what can go wrong? As long as you set hive.enforce.bucketing = true, and use the syntax above, the tables should be populated properly. Things can go wrong if the bucketing column type is different during the insert and on read, or if you manually cluster by a value that's different from the table definition.”
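
As a quick illustration of the hash_function(bucketing_column) mod num_buckets formula quoted above, one can compute the bucket number by hand with Hive’s built-in hash() and pmod() functions (a sketch; "some_table" is just a placeholder, and the internal masking with 0x7FFFFFFF may make the result differ for negative hashes):

SELECT user_id,
       pmod(hash(user_id), 10) AS bucket_number
FROM some_table;

For an INT user_id this is simply user_id mod 10, which is why the integer case lines up with the last digit; for a STRING user_id the hash is derived from the bytes, so the bucket number no longer looks like the value itself.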

 

So in a nutshell, num_buckets determines the granularity of hashing and the number of files. Eventually the table will have a total of number_partitions x num_buckets files. The example mentions (not shown above) 256 buckets, but that is just a number.
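
For example, a table loaded with 4 partition combinations and 16 buckets ends up with 4 x 16 = 64 files. In Daniel’s case the data apparently resolves to a single partition combination, but with 128 buckets that one combination can still produce up to 128 files from a single reducer, which is consistent with the 101 entries he saw against the default per-node limit of 100.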

 

It also states “For example, … and there were 10 buckets”. This is not standard. In a nutshell, bucketing is a method to get data “evenly distributed” over many files, so one should define num_buckets as a power of two (2^n, i.e. 2, 4, 8, 16, etc.) to achieve the best distribution and clustering.

 

I will try to find out whether there is an upper limit on the number of buckets within a partition and will get back on that.

 

HTH

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15", ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN 978-0-9759693-0-4

Publications due shortly:

Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume one out shortly

 


Re: [Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

Posted by Lefty Leverenz <le...@gmail.com>.
If the number of buckets in a partitioned table has a limit, we need to
document it in the wiki.  Currently the example
<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL+BucketedTables>
shows 256 buckets.

-- Lefty


Re: [Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

Posted by Daniel Harper <Da...@bbc.co.uk>.
As in, you can only have 32 buckets (rather than the 128 in our case)?
With thanks,

Daniel Harper
Software Engineer, OTG ANT
BC5 A5


RE: [Hive 0.13.1] - Explanation/confusion over "Fatal error occurred when node tried to create too many dynamic partitions" on small dataset with dynamic partitions

Posted by Mich Talebzadeh <mi...@peridale.co.uk>.
Hi,

 

I believe partitioning followed by hash clustering allows only up to 32 buckets within a single partition?

 

HTH,

 

Mich

 

 


 
