Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2016/01/24 04:33:39 UTC
[jira] [Resolved] (PHOENIX-1465) Provide a configuration option to
disable spooling query results to disk
[ https://issues.apache.org/jira/browse/PHOENIX-1465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Taylor resolved PHOENIX-1465.
-----------------------------------
Resolution: Duplicate
Duplicate of PHOENIX-1428.
> Provide a configuration option to disable spooling query results to disk
> ------------------------------------------------------------------------
>
> Key: PHOENIX-1465
> URL: https://issues.apache.org/jira/browse/PHOENIX-1465
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.2.0
> Reporter: Jan Fernando
> Labels: SFDC
>
> For compliance and disk space reasons, there are use cases where users need a strong guarantee that Phoenix will not spool data to disk across a heterogeneous set of query patterns.
> Currently all scans run through the SpoolingResultIterator, whose constructor does the following as part of delegating to the underlying iterators that perform the scan:
> {code}
> DeferredFileOutputStream spoolTo = new DeferredFileOutputStream(size, tempFile) {
>     @Override
>     protected void thresholdReached() throws IOException {
>         super.thresholdReached();
>         chunk.close();
>     }
> };
> DataOutputStream out = new DataOutputStream(spoolTo);
> final long maxBytesAllowed = maxSpoolToDisk == -1 ?
>         Long.MAX_VALUE : thresholdBytes + maxSpoolToDisk;
> long bytesWritten = 0L;
> int maxSize = 0;
> for (Tuple result = scanner.next(); result != null; result = scanner.next()) {
>     int length = TupleUtil.write(result, out);
>     bytesWritten += length;
>     if (bytesWritten > maxBytesAllowed) {
>         throw new SpoolTooBigToDiskException("result too big, max allowed(bytes): " + maxBytesAllowed);
>     }
>     maxSize = Math.max(length, maxSize);
> }
> {code}
> We always go through the spooling iterator. From the code, it appears that even if we configure the spool size to 0, the size limit is only checked after the data has been written to the DataOutputStream, which could result in a spool file being written.
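The ordering concern above can be illustrated with a minimal sketch that uses only the JDK. `ThresholdStream` is a hypothetical, simplified stand-in for a threshold-based stream (it is not the Commons IO `DeferredFileOutputStream`), the byte arrays stand in for serialized tuples, and `IllegalStateException` stands in for `SpoolTooBigToDiskException`. The sketch assumes the stream opens its backing file as soon as a write would exceed the in-memory threshold.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SpoolCheckOrderDemo {

    /** Hypothetical stand-in: buffers in memory, switches to a file past the threshold. */
    static final class ThresholdStream extends OutputStream {
        private final int threshold;
        private final Path file;
        private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
        private OutputStream fileOut; // non-null once the threshold has been crossed
        private long written;

        ThresholdStream(int threshold, Path file) {
            this.threshold = threshold;
            this.file = file;
        }

        @Override
        public void write(int b) throws IOException {
            write(new byte[] {(byte) b}, 0, 1);
        }

        @Override
        public void write(byte[] buf, int off, int len) throws IOException {
            // Assumption: the file is opened before the write that crosses the threshold.
            if (fileOut == null && written + len > threshold) {
                fileOut = Files.newOutputStream(file);
                memory.writeTo(fileOut);
            }
            OutputStream target = (fileOut != null) ? fileOut : memory;
            target.write(buf, off, len);
            written += len;
        }

        @Override
        public void close() throws IOException {
            if (fileOut != null) {
                fileOut.close();
            }
        }
    }

    /** Runs a write-then-check loop and reports whether a spool file was created. */
    static boolean spoolFileCreated(int thresholdBytes, long maxSpoolToDisk, byte[][] rows) {
        try {
            Path dir = Files.createTempDirectory("spool-demo");
            Path file = dir.resolve("results.bin");
            long maxBytesAllowed = maxSpoolToDisk == -1
                    ? Long.MAX_VALUE : thresholdBytes + maxSpoolToDisk;
            long bytesWritten = 0L;
            try (ThresholdStream out = new ThresholdStream(thresholdBytes, file)) {
                for (byte[] row : rows) {
                    out.write(row);                        // the write happens first
                    bytesWritten += row.length;
                    if (bytesWritten > maxBytesAllowed) {  // the limit is checked afterwards
                        throw new IllegalStateException(
                                "result too big, max allowed(bytes): " + maxBytesAllowed);
                    }
                }
            } catch (IllegalStateException tooBig) {
                // The limit breach is detected, but only after the write took place.
            }
            boolean created = Files.exists(file);
            Files.deleteIfExists(file);
            Files.deleteIfExists(dir);
            return created;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[][] rows = {{1, 2, 3}};
        System.out.println("threshold 0, max 0: file created = " + spoolFileCreated(0, 0, rows));
        System.out.println("threshold 1024, max 0: file created = " + spoolFileCreated(1024, 0, rows));
    }
}
```

Under these assumptions, setting both limits to 0 still leaves a file on disk for a moment, which is the behaviour the report is worried about. Whether the real stream behaves identically depends on the Commons IO version in use.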
> I think it would be much more straightforward if we:
> a) Had a simple boolean configuration that would allow us to disable spooling
> b) Bypassed the spooling iterator and the above logic whenever this configuration disables spooling
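The two-part proposal above could look roughly like the following sketch. Everything here is hypothetical: the property name `example.query.spool.enabled` is invented for illustration and is not a confirmed Phoenix setting, and `BufferingIterator` is only a stand-in for the spooling iterator, not Phoenix code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Properties;

final class SpoolBypassSketch {

    // Hypothetical property name, invented for this sketch.
    static final String SPOOL_ENABLED_KEY = "example.query.spool.enabled";

    /** Stand-in for a spooling iterator: eagerly drains the scan into a buffer. */
    static final class BufferingIterator<T> implements Iterator<T> {
        private final Iterator<T> buffered;

        BufferingIterator(Iterator<T> scan) {
            List<T> copy = new ArrayList<>();
            scan.forEachRemaining(copy::add);
            this.buffered = copy.iterator();
        }

        @Override
        public boolean hasNext() {
            return buffered.hasNext();
        }

        @Override
        public T next() {
            return buffered.next();
        }
    }

    /** Returns the scan untouched when spooling is disabled, otherwise wraps it. */
    static <T> Iterator<T> wrap(Iterator<T> scan, Properties config) {
        // Spooling stays on by default so existing behaviour is unchanged.
        boolean spoolEnabled =
                Boolean.parseBoolean(config.getProperty(SPOOL_ENABLED_KEY, "true"));
        if (!spoolEnabled) {
            return scan; // bypass: no buffering stream is ever constructed
        }
        return new BufferingIterator<>(scan);
    }
}
```

The point of deciding before any wrapper is constructed is that the guarantee no longer depends on threshold arithmetic inside the write loop: with the flag off, the code path that could open a temp file is never reached.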
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)