Posted to server-dev@james.apache.org by Rene Cordier <rc...@apache.org> on 2023/12/14 09:52:38 UTC

Event dead letter group deserialization too strict?

Hello guys,

Today I noticed on one of our servers that the task redelivering all
event dead letters was failing.

The problem is that event dead letter group deserialization is too
strict. After a migration, I still had some groups registered in
group_table that dated from before the migration. Those groups no longer
existed, or had been refactored and renamed. But the API is so strict
that when we try to redeliver all events, we fetch all groups in the
group table and try to deserialize each of them into its own class. If
one of the groups has changed, the whole task just crashes!

I tried to use webadmin to delete the groups that were no longer in the
code:
https://james.apache.org/server/manage-webadmin.html#Deleting_all_events_of_a_group

Looking at the code, it deletes the events of a group, then the group.
Checking in Cassandra, there were no events left related to those
groups. But the task still failed... because we try to strictly
deserialize the group into a class again.

I understand that being strict is good in some cases, but it reached the
point where I had to delete the faulty group rows in Cassandra myself to
clean up and let the redeliver-all-events task do its job properly again
(which I find even riskier).

Can we be a bit more relaxed on that? Or at least make it possible to
delete those outdated groups via the webadmin API without a strict
deserialization first? For example, by refactoring the webadmin route I
posted above to accept a header like "I-KNOW-WHAT-I-AM-DOING" (as we
have for some other sensitive routes): if we get it, we accept not being
strict and just do a blind remove against Cassandra for that group?
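
A minimal sketch of that header-gated deletion, to make the proposal
concrete. Everything here is an assumption for illustration: the class
GroupDeletionSketch, its methods, and the in-memory set standing in for
the group table are hypothetical, and "a group resolves" is approximated
by a class-name lookup. It is not the existing webadmin route.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: an in-memory stand-in for the group table, not James code.
final class GroupDeletionSketch {
    // Header name taken from the proposal above; the real name is an open question.
    static final String OVERRIDE_HEADER = "I-KNOW-WHAT-I-AM-DOING";

    private final Set<String> groupTable = new HashSet<>();

    void register(String rawGroupName) {
        groupTable.add(rawGroupName);
    }

    boolean contains(String rawGroupName) {
        return groupTable.contains(rawGroupName);
    }

    /**
     * Deletes a group row. Without the override header the name must still
     * resolve to a class (strict mode); with it, the raw name is removed blindly.
     * Returns true if a row was actually removed.
     */
    boolean delete(String rawGroupName, Map<String, String> headers) {
        // A real route would compare header names case-insensitively.
        boolean blind = headers.containsKey(OVERRIDE_HEADER);
        if (!blind && !resolves(rawGroupName)) {
            throw new IllegalArgumentException("Unknown group: " + rawGroupName);
        }
        return groupTable.remove(rawGroupName);
    }

    // Strict check, approximated here by looking the name up on the classpath.
    private static boolean resolves(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

With this shape, a stale group is rejected by default and only removed
when the caller explicitly opts in, so the strict behaviour stays the
default for everyone else.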

It's not the first time I have encountered this, and it's frustrating each time.

Would be interested to know what the community thinks.

Best regards,

Rene.


---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org


Re: Event dead letter group deserialization too strict?

Posted by Benoit TELLIER <bt...@apache.org>.
> Can we be a bit more relaxed on that?

+1, redelivery should ignore unknown groups IMO.
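
As a rough sketch of what "ignore unknown groups" could look like,
assuming a stored group is identified by a class name: the names
LenientGroupResolver and Resolution are hypothetical and this is not the
current redelivery code. The point is only that a stale entry gets
collected and reported instead of aborting the whole run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative, not the James API.
final class LenientGroupResolver {

    /** Outcome of resolving stored group names: usable classes plus skipped names. */
    static final class Resolution {
        final List<Class<?>> known = new ArrayList<>();
        final List<String> unknown = new ArrayList<>();
    }

    /**
     * Resolves each stored group name to a class. Names that no longer map to a
     * class on the classpath are collected instead of failing the whole run.
     */
    static Resolution resolve(List<String> storedGroupNames) {
        Resolution resolution = new Resolution();
        for (String name : storedGroupNames) {
            try {
                resolution.known.add(Class.forName(name));
            } catch (ClassNotFoundException e) {
                // Stale entry, e.g. left over from before a migration: record it, keep going.
                resolution.unknown.add(name);
            }
        }
        return resolution;
    }

    public static void main(String[] args) {
        Resolution r = resolve(List.of("java.lang.String", "com.example.GroupRemovedByMigration"));
        System.out.println("redeliver for: " + r.known);
        System.out.println("skipped unknown groups: " + r.unknown);
    }
}
```

The skipped names could then be surfaced in the task's additional
information so that an admin knows what to clean up.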

The webadmin APIs for reading dead letters and deleting events / groups
should be more lenient (this eases diagnostics).

However, keeping strict group validation when appending events looks
beneficial.

Maybe a string wrapper called UnvalidatedGroup that can be validated?
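
A possible shape for such a wrapper, as a sketch only: UnvalidatedGroup
does not exist today, and the validate() step below assumes that
validating a group boils down to resolving its class name. Reads and
deletes could work on the raw string, while appends would go through
validate().

```java
import java.util.Optional;

// Hypothetical sketch of the "UnvalidatedGroup" idea; not an existing James type.
final class UnvalidatedGroup {
    private final String value;

    UnvalidatedGroup(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Group name must not be empty");
        }
        this.value = value;
    }

    /** Raw stored name: always available, e.g. for listing or deleting. */
    String asString() {
        return value;
    }

    /**
     * Strict step, performed only where it matters (e.g. when appending an event).
     * Returns empty when the name no longer resolves to a class.
     */
    Optional<Class<?>> validate() {
        try {
            return Optional.<Class<?>>of(Class.forName(value));
        } catch (ClassNotFoundException e) {
            return Optional.empty();
        }
    }
}
```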

I would support reworking the Java dead letter interface to reflect
this, as it is not the first time I have heard such complaints within
Linagora walls...

Regards,

Benoit

