Posted to user@thrift.apache.org by Antoine Pitrou <an...@python.org> on 2021/09/27 09:51:07 UTC

Analysis and guidelines concerning CVE-2020-13949

Hello,

(sorry, this is a rehash of a question asked on
https://issues.apache.org/jira/browse/THRIFT-5237, since I haven't
received any reply there)

In Apache Parquet, some of our users have encountered situations where
the Thrift 0.14 message size limitations prevent them from reading
legitimate real-world data (see
https://issues.apache.org/jira/browse/ARROW-13655 ).  I have been
trying to understand what kind of vulnerability the new limitations are
designed to address, but have failed to find any precise analysis of
the issue.

Therefore I have tried to go by the Thrift C++ library source code and
have come to the understanding that the vulnerability arises when using
one of the streaming transports where the encoded message size isn't
known in advance (such as the socket-based ones). However, in Parquet C++ we read
the full message in one block from the underlying random access file,
and therefore it seems that disabling the max message size is
legitimate in our case.
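
As a rough illustration of the reasoning above, here is a hypothetical
Python model (not Thrift's code or API; the function and exception names
are made up for this sketch). It assumes a 4-byte big-endian element
count in front of a list. When the whole message is already in memory,
a declared count can be checked against the bytes actually present,
which a streaming reader cannot do without some separate size limit.

```python
import struct


class SizeLimitError(Exception):
    """Raised when a declared size cannot fit in the available bytes."""


def read_list_header(buf: bytes, offset: int, min_elem_size: int = 1):
    """Read a 4-byte big-endian element count at `offset` and validate it.

    Hypothetical wire format for illustration only. Because `buf` holds
    the complete message, the number of remaining bytes is known, so a
    declared count that could not possibly fit is rejected before any
    allocation sized by that count takes place.
    """
    if len(buf) - offset < 4:
        raise SizeLimitError("truncated list header")
    (count,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    remaining = len(buf) - offset
    # Each element needs at least `min_elem_size` bytes on the wire.
    if count < 0 or count * min_elem_size > remaining:
        raise SizeLimitError(
            f"declared {count} elements but only {remaining} bytes remain"
        )
    return count, offset
```

With a streaming transport, `remaining` is unknown when the header is
read, so the same check is not available.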

Is my understanding correct? If not, can somebody shed a bit more light on
what the vulnerability consists of?

Regards

Antoine.

Re: Analysis and guidelines concerning CVE-2020-13949

Posted by Antoine Pitrou <an...@python.org>.

Ok, I am going to take the lack of an answer as an admission that our
analysis is correct.

Thank you

Antoine.

