Posted to common-issues@hadoop.apache.org by "Steve Loughran (Jira)" <ji...@apache.org> on 2022/12/19 11:12:00 UTC

[jira] [Resolved] (HADOOP-18521) ABFS ReadBufferManager buffer sharing across concurrent HTTP requests

     [ https://issues.apache.org/jira/browse/HADOOP-18521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-18521.
-------------------------------------
    Fix Version/s: 3.3.5
       Resolution: Fixed

This is fixed in HADOOP-18546; the followups were just tuning.

Closing as such. My larger piece of work has some advantages (better testability, iostats of use), but that makes it too complex to put into 3.3.5, and it would need more work before it could be merged as the fix.

When we do a rework of the read buffer manager, some aspects of it can be applied:
* iostats
* tryEvict prioritising eviction of completed fetches with buffers belonging to closed windows
* AbfsInputStream calls to go through an interface, with unit tests
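
A rough sketch of the eviction ordering idea, not the ReadBufferManager code. It assumes "closed windows" means completed buffers whose owning stream has already been closed; the class, field and method names (EvictionSketch, Completed, ownerClosed, completedAt, pickVictim) are all made up for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Illustrative only: none of these names exist in hadoop-azure. */
final class EvictionSketch {
  private EvictionSketch() {}

  /** A completed prefetch that still holds a buffer. */
  static final class Completed {
    final String id;
    final boolean ownerClosed;  // assumption: the owning stream was closed
    final long completedAt;     // completion time, any monotonic unit

    Completed(String id, boolean ownerClosed, long completedAt) {
      this.id = id;
      this.ownerClosed = ownerClosed;
      this.completedAt = completedAt;
    }
  }

  /**
   * Pick the buffer to evict: buffers of closed owners first,
   * then the oldest completed fetch. Empty if nothing is evictable.
   */
  static Optional<Completed> pickVictim(List<Completed> completed) {
    return completed.stream().min(
        Comparator.comparing((Completed c) -> !c.ownerClosed) // false sorts first
            .thenComparingLong(c -> c.completedAt));
  }
}
```

Buffers from closed owners can never be read again, so evicting them first costs nothing; only when none exist does an open stream lose its oldest prefetched block.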

It'd also be good to pass the split start/end and read policy from the stream to the manager:
* don't prefetch past end of split (or at most, one block)
* on random IO, use optimised policy (no prefetch? one block max)
* on vectored IO: no prefetching
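
The three rules above could be sketched as a single planning function. This is only an illustration under assumptions: ReadPolicy, PrefetchPlanner and blocksToPrefetch are hypothetical names, not the ABFS API, and for random IO it picks the "one block max" option rather than "no prefetch".

```java
/** Hypothetical read policies; names are illustrative, not the ABFS API. */
enum ReadPolicy { SEQUENTIAL, RANDOM, VECTORED }

/** Illustrative only: decides how many blocks to queue for prefetch. */
final class PrefetchPlanner {
  private PrefetchPlanner() {}

  /**
   * @param policy     read policy declared by the stream
   * @param pos        position where the next prefetch would start
   * @param splitEnd   end offset of the split being read
   * @param blockSize  prefetch block size in bytes, must be positive
   * @param queueDepth configured readahead queue depth
   * @return number of blocks to prefetch; the last one may be partial
   */
  static int blocksToPrefetch(ReadPolicy policy, long pos, long splitEnd,
      int blockSize, int queueDepth) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    // vectored IO: no prefetching; depth 0 disables prefetching entirely
    if (policy == ReadPolicy.VECTORED || queueDepth <= 0) {
      return 0;
    }
    long remaining = splitEnd - pos;
    if (remaining <= 0) {
      return 0; // already at or past the end of the split
    }
    // blocks needed to reach the end of the split, rounded up
    long blocksLeft = (remaining + blockSize - 1) / blockSize;
    // random IO: assumed cap of one block
    int limit = policy == ReadPolicy.RANDOM ? 1 : queueDepth;
    return (int) Math.min(limit, blocksLeft);
  }
}
```

With a block size of 4 and a split ending at offset 10, a sequential read at position 0 with queue depth 8 would queue 3 blocks, not 8, since nothing beyond the block containing the split end is requested.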

> ABFS ReadBufferManager buffer sharing across concurrent HTTP requests
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-18521
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18521
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/azure
>    Affects Versions: 3.3.2, 3.3.3, 3.3.4
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 3.3.5
>
>
> AbfsInputStream.close() can trigger the return of buffers used for active prefetch GET requests into the ReadBufferManager free buffer pool.
> A subsequent prefetch by a different stream in the same process may acquire this same buffer. This risks corrupting the second stream's prefetched data, which may then be returned to the thread reading from that stream.
> On releases without the fix for this (3.3.2+), the bug can be avoided by disabling all prefetching:
> {code}
> fs.azure.readaheadqueue.depth = 0
> {code}
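
The failure mode in the description can be reproduced in a toy model. This is a single-threaded sketch, not the ABFS code: SharedBufferHazard, Pool and simulate are made-up names, the pool is a plain LIFO stack, and the "GET responses" are just array fills executed in the order a late response would arrive.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Illustrative only: a toy free-buffer pool shared by two streams. */
final class SharedBufferHazard {
  private SharedBufferHazard() {}

  /** Minimal LIFO pool of fixed-size buffers. */
  static final class Pool {
    private final Deque<byte[]> free = new ArrayDeque<>();

    Pool(int buffers, int size) {
      for (int i = 0; i < buffers; i++) {
        free.push(new byte[size]);
      }
    }

    byte[] acquire() { return free.pop(); }

    void release(byte[] buffer) { free.push(buffer); }
  }

  /**
   * @param releaseOnClose true models the bug: close() returns a buffer
   *                       to the pool while its GET is still in flight
   * @return the data stream B ends up holding in its prefetch buffer
   */
  static String simulate(boolean releaseOnClose) {
    Pool pool = new Pool(2, 4);
    byte[] bufA = pool.acquire();      // stream A queues a prefetch into bufA
    if (releaseOnClose) {
      pool.release(bufA);              // A.close() frees the buffer too early
    }
    byte[] bufB = pool.acquire();      // stream B starts its own prefetch
    Arrays.fill(bufB, (byte) 'B');     // B's GET completes
    Arrays.fill(bufA, (byte) 'A');     // A's late GET response is written
    return new String(bufB, StandardCharsets.US_ASCII);
  }
}
```

In this model `simulate(true)` hands both streams the same array, so B's buffer ends up holding A's bytes; `simulate(false)` keeps the buffers distinct and B's data is intact.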



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org