Posted to user@beam.apache.org by Lucas de Castro Magalhães <lu...@santodigital.com.br> on 2021/10/25 14:19:34 UTC

Refusing to split Position of last group processed was b'w\xd5\xdd\x82\x00\x01'." python sdk

Hi guys.

Could anyone help me with a problem in my pipeline? The pipeline gets stuck
and doesn't do anything.

The log entry that I receive is:

jsonPayload: {
job: "2021-10-20_21_00_01-5873629180911776129"
logger: "root:shuffle.py:try_split"
message: "Refusing to split
<dataflow_worker.shuffle.GroupedShuffleRangeTracker object at
0x7f267e6411d0> at b'w\xd5\xdd\x83\x00\x01': proposed split position is out
of range [b'qni\xad\x00\x01', b'w\xd5\xdd\x83\x00\x01'). Position of last
group processed was b'w\xd5\xdd\x82\x00\x01'."
thread: "63:139803304449792"
worker: "template-calendarapi-bigq-10202100-p8n3-harness-lr2k"


My pipeline is shown below; the step marked in red is where the job gets
stuck. Sometimes the job completes and sometimes it does not.

[image: image.png]




-- 

Lucas de Castro Magalhães

Innovation Manager


Re: Refusing to split Position of last group processed was b'w\xd5\xdd\x82\x00\x01'." python sdk

Posted by Luke Cwik <lc...@google.com>.
Split requests that fail at the end of processing are OK; they do happen
when processing is stuck on the last key.
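
To illustrate why the split in the log message is refused, here is a minimal
sketch of a half-open range check. It is a toy model written for this thread,
not the dataflow_worker implementation; the function name try_split and its
return values are assumptions. The byte positions are copied from the log
message in the original post.

```python
# Toy model of a half-open range check; NOT the dataflow_worker code.
def try_split(start, stop, last_group, proposed):
    """Return (accepted, reason) for a proposed split position.

    Positions are bytes objects, which Python compares lexicographically.
    The valid range is half-open: start is included, stop is excluded.
    """
    if not (start <= proposed < stop):
        return False, "proposed split position is out of range"
    if last_group is not None and proposed <= last_group:
        return False, "proposed split position was already processed"
    return True, "ok"


# Values copied from the log message in the original post.
start = b"qni\xad\x00\x01"
stop = b"w\xd5\xdd\x83\x00\x01"
last_group = b"w\xd5\xdd\x82\x00\x01"

# The proposed position equals the exclusive end of the range.
accepted, reason = try_split(start, stop, last_group, proposed=stop)
print(accepted, reason)
```

In this model the proposed position is the same as the exclusive end of the
range, so there is nothing left to hand off and the request is refused.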

Do you have a hot key that is being processed near the end which is taking
a long time and makes it look like the system is stuck?
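
One way to check for a hot key is to count elements per key on a sample of
the data that feeds the grouping step. The sketch below uses only the Python
standard library; the function name find_hot_keys, the threshold and the
sample data are made up for illustration. Inside a Beam pipeline,
beam.combiners.Count.PerKey() would give the same per-key counts.

```python
from collections import Counter


def find_hot_keys(pairs, threshold=0.5):
    """Return (key, count) for keys holding at least `threshold` of all elements."""
    counts = Counter(key for key, _ in pairs)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(key, n) for key, n in counts.most_common() if n / total >= threshold]


# Hypothetical sample: one key owns 8 of the 10 elements.
sample = [("user-a", 1)] * 8 + [("user-b", 1), ("user-c", 1)]
print(find_hot_keys(sample))
```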

Do thread dumps of the worker that is processing this last request show
anything of interest?
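
If you cannot get a thread dump from the worker directly, a rough substitute
is to log the Python stacks from inside your own code. This is a sketch using
only the standard library; the helper name dump_threads is an assumption, and
how you call it (for example from a timer or a logging statement in a DoFn)
is up to you.

```python
import sys
import threading
import traceback


def dump_threads():
    """Return a string with the current stack of every live Python thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for ident, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(ident, 'unknown')} ({ident}):")
        lines.extend(traceback.format_stack(frame))
    return "\n".join(lines)


print(dump_threads())
```

A thread that shows the same user function at the top of its stack across
several dumps is a good candidate for the slow step.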

This also looks like a Dataflow job, so I would suggest reaching out to GCP
support if none of the above leads you anywhere while debugging the job.
