
Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/01/24 04:33:39 UTC

[jira] [Resolved] (PHOENIX-1465) Provide a configuration option to disable spooling query results to disk

     [ https://issues.apache.org/jira/browse/PHOENIX-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor resolved PHOENIX-1465.
-----------------------------------
    Resolution: Duplicate

Duplicate of PHOENIX-1428.

> Provide a configuration option to disable spooling query results to disk
> ------------------------------------------------------------------------
>
>                 Key: PHOENIX-1465
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1465
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.2.0
>            Reporter: Jan Fernando
>              Labels: SFDC
>
> For compliance and disk space reasons, there are use cases where we need to provide a strong guarantee that Phoenix will not spool data to disk across a heterogeneous set of query patterns.
> Currently all scans run through the SpoolingResultIterator, and its constructor does the following as part of delegating to the underlying iterators that perform the scan:
> {code}
> DeferredFileOutputStream spoolTo = new DeferredFileOutputStream(size, tempFile) {
>     @Override
>     protected void thresholdReached() throws IOException {
>         super.thresholdReached();
>         chunk.close();
>     }
> };
> DataOutputStream out = new DataOutputStream(spoolTo);
> final long maxBytesAllowed = maxSpoolToDisk == -1
>         ? Long.MAX_VALUE : thresholdBytes + maxSpoolToDisk;
> long bytesWritten = 0L;
> int maxSize = 0;
> for (Tuple result = scanner.next(); result != null; result = scanner.next()) {
>     int length = TupleUtil.write(result, out);
>     bytesWritten += length;
>     if (bytesWritten > maxBytesAllowed) {
>         throw new SpoolTooBigToDiskException("result too big, max allowed(bytes): " + maxBytesAllowed);
>     }
>     maxSize = Math.max(length, maxSize);
> }
> {code}
> Every scan goes through the spooling iterator. Looking at the code, it appears that even if we configure the spool size to 0, the size check only happens after the data has been written to the DataOutputStream, which could result in a spool file being written.
> I think it would be much more straightforward if we:
> a) Had a simple boolean configuration option that would allow us to disable spooling
> b) Bypassed the spooling iterator and the above logic whenever that option disables spooling
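
For illustration only, here is a minimal sketch of the bypass proposed in (a) and (b). It is not Phoenix code: `SpoolBypassSketch`, `EagerBufferingIterator`, `wrap`, and the `spoolingEnabled` flag are all hypothetical names, rows are plain strings, and a list stands in for the spool stream. The sketch assumes the wrapper drains the scan in its constructor, as the snippet above does, so the only way to guarantee nothing is written is to never construct the wrapper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names exist in Phoenix.
final class SpoolBypassSketch {

    /** Stand-in for a spooling wrapper: drains the whole scan in its constructor. */
    static final class EagerBufferingIterator implements Iterator<String> {
        private final Iterator<String> buffered;

        EagerBufferingIterator(Iterator<String> scan, List<String> spoolLog) {
            List<String> copy = new ArrayList<>();
            while (scan.hasNext()) {
                String row = scan.next();
                spoolLog.add(row); // stands in for bytes reaching the spool stream
                copy.add(row);
            }
            this.buffered = copy.iterator();
        }

        @Override
        public boolean hasNext() {
            return buffered.hasNext();
        }

        @Override
        public String next() {
            return buffered.next();
        }
    }

    /** Wraps the scan only when spooling is enabled; otherwise returns it untouched. */
    static Iterator<String> wrap(Iterator<String> scan, boolean spoolingEnabled,
                                 List<String> spoolLog) {
        return spoolingEnabled ? new EagerBufferingIterator(scan, spoolLog) : scan;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3");

        List<String> logWhenDisabled = new ArrayList<>();
        wrap(rows.iterator(), false, logWhenDisabled);
        System.out.println("spooled rows with spooling disabled: " + logWhenDisabled.size());

        List<String> logWhenEnabled = new ArrayList<>();
        wrap(rows.iterator(), true, logWhenEnabled);
        System.out.println("spooled rows with spooling enabled: " + logWhenEnabled.size());
    }
}
```

The design point is that the flag is checked before the wrapper is built, rather than inside the write loop, so no byte can reach the spool stream when spooling is disabled.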



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)