Posted to dev@drill.apache.org by "Steve Jacobs (JIRA)" <ji...@apache.org> on 2017/10/12 20:14:00 UTC

[jira] [Created] (DRILL-5871) Large files fail to write to s3 datastore using hdfs s3a.

Steve Jacobs created DRILL-5871:
-----------------------------------

             Summary: Large files fail to write to s3 datastore using hdfs s3a.
                 Key: DRILL-5871
                 URL: https://issues.apache.org/jira/browse/DRILL-5871
             Project: Apache Drill
          Issue Type: Bug
          Components:  Server
    Affects Versions: 1.11.0
          Environment: CentOS 7.4, Oracle Java SE 1.8.0_131-b11, x86_64, VMware. ZooKeeper cluster, two drillbits, three ZooKeeper nodes.
            Reporter: Steve Jacobs


When writing CSV files to an s3a storage plugin using a CTAS, if the files are large enough to trigger the multipart upload path, the CTAS fails with the following stack trace:
Error: SYSTEM ERROR: UnsupportedOperationException

Fragment 0:0

[Error Id: dbb018ea-29eb-4e1a-bf97-4c2c9cfbdf3c on den-certdrill-1.ci.neoninternal.org:31010]

  (java.lang.UnsupportedOperationException) null
    java.util.Collections$UnmodifiableList.sort():1331
    java.util.Collections.sort():175
    com.amazonaws.services.s3.model.transform.RequestXmlFactory.convertToXmlByteArray():42
    com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload():2513
    org.apache.hadoop.fs.s3a.S3AFastOutputStream$MultiPartUpload.complete():384
    org.apache.hadoop.fs.s3a.S3AFastOutputStream.close():253
    org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close():72
    org.apache.hadoop.fs.FSDataOutputStream.close():106
    java.io.PrintStream.close():360
    org.apache.drill.exec.store.text.DrillTextRecordWriter.cleanup():170
    org.apache.drill.exec.physical.impl.WriterRecordBatch.closeWriter():184
    org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():128
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.record.AbstractRecordBatch.next():119
    org.apache.drill.exec.record.AbstractRecordBatch.next():109
    org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
    org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
    org.apache.drill.exec.record.AbstractRecordBatch.next():162
    org.apache.drill.exec.physical.impl.BaseRootExec.next():105
    org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
    org.apache.drill.exec.physical.impl.BaseRootExec.next():95
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
    org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
    java.security.AccessController.doPrivileged():-2
    javax.security.auth.Subject.doAs():422
    org.apache.hadoop.security.UserGroupInformation.doAs():1657
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():748 (state=,code=0)

This looks suspiciously like:
https://issues.apache.org/jira/browse/HADOOP-14204

So the fix may be as 'simple' as syncing to the upstream version once Hadoop 2.8.2 is released later this month, although I am not aware of the implications of upgrading hadoop-hdfs to that version.
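
For anyone reading the trace without digging into the SDK source: it looks like the AWS client sorts the list of part ETags in place while serializing the CompleteMultipartUpload request (the RequestXmlFactory.convertToXmlByteArray frame), and per HADOOP-14204 the list it gets handed there is unmodifiable, so the sort throws at the Collections$UnmodifiableList.sort() frame above. Below is a standalone, JDK-only sketch of that pattern (class and variable names are mine, not the actual hadoop-aws or aws-sdk code), plus the copy-before-sort direction HADOOP-14204 appears to take:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Standalone illustration of the failure mode in the trace above:
// Collections.sort() on an unmodifiable list throws
// UnsupportedOperationException, matching the
// Collections$UnmodifiableList.sort() frame at the top of the trace.
public class UnmodifiableSortRepro {
    public static void main(String[] args) {
        // Stand-in for the list of part ETags the SDK wants to sort by part number.
        List<Integer> parts = new ArrayList<>(Arrays.asList(3, 1, 2));
        List<Integer> readOnlyView = Collections.unmodifiableList(parts);

        try {
            Collections.sort(readOnlyView); // same call chain as in the stack trace
        } catch (UnsupportedOperationException e) {
            System.out.println("Reproduced: " + e);
        }

        // The fix direction (as I read HADOOP-14204): hand the SDK a mutable copy
        // so its in-place sort succeeds.
        List<Integer> mutableCopy = new ArrayList<>(readOnlyView);
        Collections.sort(mutableCopy);
        System.out.println("Sorted copy: " + mutableCopy);
    }
}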

We are able to store smaller files just fine.
Things I've tried:
Setting fs.s3a.multipart.threshold to a ridiculously large value like 10T (these files are just over 1 GB). Does not work.
Setting fs.s3a.fast.upload to false. Also does not change the behavior.

The s3a driver does not appear to have an option to disable multipart uploads altogether.
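
One more thing that might help narrow this down (a sketch only, I have not run it against our endpoint): writing a large file through s3a directly with the Hadoop FileSystem API, outside of Drill, with a deliberately small multipart threshold. If the same UnsupportedOperationException shows up on close(), the problem is in the bundled hadoop-aws/aws-sdk rather than in Drill's writer. Bucket name and key below are placeholders; credentials and endpoint would come from the usual fs.s3a.* properties:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Direct s3a write that forces the multipart path with small parts.
// close() is where completeMultipartUpload() runs, per the trace above.
public class S3AMultipartProbe {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.s3a.multipart.size", "5242880");      // 5 MB parts (bytes)
        conf.set("fs.s3a.multipart.threshold", "5242880"); // start multipart at 5 MB
        conf.set("fs.s3a.fast.upload", "true");            // same code path as the failing CTAS

        FileSystem fs = FileSystem.get(URI.create("s3a://my-test-bucket/"), conf);
        byte[] chunk = new byte[1024 * 1024];               // 1 MB of zeros
        try (FSDataOutputStream out = fs.create(new Path("s3a://my-test-bucket/probe/big.bin"))) {
            for (int i = 0; i < 32; i++) {                  // ~32 MB total, well past the threshold
                out.write(chunk);
            }
        } // if the bug is present, the exception fires here on close()
        System.out.println("Multipart upload completed cleanly");
    }
}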

For completeness' sake, here are my current s3a options for the driver:
    "fs.s3a.endpoint": "******",
    "fs.s3a.access.key": "*",
    "fs.s3a.secret.key": "*",
    "fs.s3a.connection.maximum": "200",
    "fs.s3a.paging.maximum": "1000",
    "fs.s3a.fast.upload": "true",
    "fs.s3a.multipart.purge": "true",
    "fs.s3a.fast.upload.buffer": "bytebuffer",
    "fs.s3a.fast.upload.active.blocks": "8",
    "fs.s3a.buffer.dir": "/opt/apache-airflow/buffer",
    "fs.s3a.multipart.size": "134217728",
    "fs.s3a.multipart.threshold": "671088640",
    "fs.s3a.experimental.input.fadvise": "sequential",
    "fs.s3a.acl.default": "PublicRead",
    "fs.s3a.multiobjectdelete.enable": "true"



