Posted to server-dev@james.apache.org by "Benoit Tellier (Jira)" <se...@james.apache.org> on 2021/04/08 06:20:00 UTC

[jira] [Created] (JAMES-3555) Allow 'real' streaming upon BlobStore reads

Benoit Tellier created JAMES-3555:
-------------------------------------

             Summary: Allow 'real' streaming upon BlobStore reads
                 Key: JAMES-3555
                 URL: https://issues.apache.org/jira/browse/JAMES-3555
             Project: James Server
          Issue Type: Improvement
          Components: Blob
    Affects Versions: 3.6.0
            Reporter: Benoit Tellier


I investigated a build instability... https://github.com/apache/james-project/pull/370

One time out of 25, the uploaded content was altered (different SHA-256 in
the data store). Several Apache CI builds had been failing because of this.

Investigating this, I found that a piped input stream strategy did not
hit this pitfall (ReactorUtils::toInputStream). I thus suspect
some weird data races on the Reactor side...
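To illustrate the piped-stream idea, here is a minimal stdlib-only sketch (Reactor omitted; the class, the toInputStream helper and the chunk layout are simplifications for illustration, not the actual ReactorUtils code). A dedicated thread writes the produced chunks into a PipedOutputStream while the consumer reads them back through a plain InputStream:

{code:java}
import java.io.*;
import java.nio.charset.StandardCharsets;

public class PipedStreamSketch {
    // Expose asynchronously produced byte chunks as a single InputStream.
    // A dedicated writer thread pushes chunks into the pipe; the caller
    // reads from the other end without ever buffering the whole payload.
    static InputStream toInputStream(byte[][] chunks) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread writer = new Thread(() -> {
            try (out) {
                for (byte[] chunk : chunks) {
                    out.write(chunk);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();
        return in;
    }

    public static void main(String[] args) throws IOException {
        byte[][] chunks = {
            "hello ".getBytes(StandardCharsets.UTF_8),
            "world".getBytes(StandardCharsets.UTF_8)
        };
        String result = new String(toInputStream(chunks).readAllBytes(),
            StandardCharsets.UTF_8);
        System.out.println(result);
    }
}
{code}

The point of the design is that the writer side runs on its own thread, so a reactive producer never blocks the consumer's thread, at the cost of one dedicated thread per conversion.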

Other areas of the code are likely affected, as both the Cassandra and the
S3 blobStores rely on these conversions for streaming (the piped strategy
is only relied upon for attachment upload).

Changing the BlobStore API should likely be considered:

{code:java}
public interface BlobStore {
    Flux<ByteBuffer> read(BucketName bucketName, BlobId blobId);
}
{code}

Why:

 - Supported by both implementations (Cassandra and S3)
 - Avoids intermediate transformations requiring blocking
 operations/dedicated thread resources
 - Currently we convert Flux<ByteBuffer> => InputStream (to conform to the BlobStore API) => Flux<ByteBuffer> (reactor-netty layer)
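To illustrate the last point, here is a stdlib-only sketch of consuming chunked reads without the InputStream round-trip. It uses java.util.stream.Stream as a stand-in for Flux, and the read/concat names are hypothetical, not part of the real BlobStore API:

{code:java}
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ReactiveReadSketch {
    // Stand-in for the proposed read(BucketName, BlobId): the store hands
    // back the blob as a stream of buffers instead of an InputStream.
    static Stream<ByteBuffer> read() {
        return Stream.of(
            ByteBuffer.wrap("chunk1".getBytes(StandardCharsets.UTF_8)),
            ByteBuffer.wrap("chunk2".getBytes(StandardCharsets.UTF_8)));
    }

    // A downstream consumer (e.g. a netty layer) can forward or assemble
    // the buffers directly, with no blocking conversion in between.
    static byte[] concat(Stream<ByteBuffer> chunks) {
        List<ByteBuffer> buffers = chunks.collect(Collectors.toList());
        int size = buffers.stream().mapToInt(ByteBuffer::remaining).sum();
        ByteBuffer all = ByteBuffer.allocate(size);
        buffers.forEach(all::put);
        return all.array();
    }

    public static void main(String[] args) {
        System.out.println(new String(concat(read()), StandardCharsets.UTF_8));
    }
}
{code}

With a buffer-stream contract like this, the reactor-netty layer could subscribe to the chunks as-is, which is what makes the Flux<ByteBuffer> => InputStream => Flux<ByteBuffer> round-trip avoidable.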



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: server-dev-unsubscribe@james.apache.org
For additional commands, e-mail: server-dev-help@james.apache.org