Posted to common-issues@hadoop.apache.org by GitBox <gi...@apache.org> on 2020/12/07 20:26:14 UTC

[GitHub] [hadoop] steveloughran opened a new pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes wri…

steveloughran opened a new pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530


   …tten collected by spark
   
   This is a PoC which, having implemented it, I don't think is viable.
   
   Yes, we can fix up getFileStatus so it reads the header. It even knows
   to always bypass S3Guard (no inconsistencies to worry about any more).
   
   But: the blast radius of the change is too big. I'm worried about
   distcp or any other code which goes

   ```java
   long len = fs.getFileStatus(path).getLen();
   fs.open(path).readFully(0, new byte[(int) len]);  // EOF if the object is shorter
   ```
   
   You'll get an EOF here. Find the file through a listing and you'll be OK,
   provided S3Guard isn't updated with that getFileStatus result, which I
   have seen happen.
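
   For illustration, a minimal sketch of the listing-based probe this alludes
   to; `lengthFromListing` is a name invented for the sketch, not anything in
   the patch:

   ```java
   import java.io.IOException;

   import org.apache.hadoop.fs.FileStatus;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   // A listing reports the real object size rather than any value derived
   // from the magic-marker header, so the read pattern above stays safe.
   public final class ListingLength {
     private ListingLength() {
     }

     public static long lengthFromListing(FileSystem fs, Path path)
         throws IOException {
       for (FileStatus st : fs.listStatus(path.getParent())) {
         if (st.getPath().equals(path)) {
           return st.getLen();
         }
       }
       throw new IOException("Not found in listing: " + path);
     }
   }
   ```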
   
   The ordering of probes in ITestMagicCommitProtocol.validateTaskAttemptPathAfterWrite
   needs to be: list before getFileStatus, so that the S3Guard table is
   updated from the list.
   
   overall: danger. Even without S3Guard there's risk.
   
   Anyway, this shows it can be done. And I think there's merit in a leaner patch
   which attaches the marker but doesn't do any fixup. This would let us add
   an API call "getObjectHeaders(path) -> Future<Map<String, String>>" and
   then use that to do the lookup. We can implement the probe for
   ABFS and S3, add a hasPathCapability for it, as well as an interface
   the FS can implement (which passthrough filesystems would need to do);
   see the sketch below.
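
   For illustration, a minimal sketch of what such an interface might look
   like; the names (`ObjectHeaderSource`, the capability string) are invented
   for this sketch, not part of the patch:

   ```java
   import java.io.IOException;
   import java.util.Map;
   import java.util.concurrent.CompletableFuture;

   import org.apache.hadoop.fs.Path;

   /**
    * Hypothetical interface for stores which can expose object headers.
    * Stores such as S3A and ABFS would implement it directly; passthrough
    * filesystems would delegate to their inner filesystem.
    */
   public interface ObjectHeaderSource {

     /** Capability string to probe via hasPathCapability(); invented name. */
     String CAPABILITY_OBJECT_HEADERS = "fs.capability.object.headers";

     /**
      * Initiate an asynchronous fetch of an object's headers.
      * @param path path to the object
      * @return a future map of header name to string value
      * @throws IOException failure to initiate the request
      */
     CompletableFuture<Map<String, String>> getObjectHeaders(Path path)
         throws IOException;
   }
   ```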
   
   Change-Id: If56213c0c5d8ab696d2d89b48ad52874960b0920
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and set the pull request title so that it starts with the
   corresponding JIRA issue number (e.g. HADOOP-XXXXX. Fix a typo in YYY).
   For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   



[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus removed a comment on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758710185


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 20s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 41s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 51s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 46s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 23s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 49s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 11s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 27s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  9s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  21m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 17s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 17s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 25s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 46s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m  0s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 27s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m 56s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/10/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 5b52bc6272c7 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ca7dd5fad33 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/10/testReport/ |
   | Max. process+thread count | 2244 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/10/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758556408


   Left the tests running overnight and clearly the laptop dropped off the internet for a while. This turned up an NPE lurking in test teardown when test setup fails. Will have to look at that:
   
   ```
   	... 36 more
   
   [ERROR] testRenameEventuallyConsistentFileRFCE[bogus-bogus-auth-true](org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged)  Time elapsed: 533.361 s  <<< ERROR!
   java.lang.NullPointerException
   	at org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged.teardown(ITestS3ARemoteFileChanged.java:275)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
   	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
   	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
   	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.lang.Thread.run(Thread.java:748)
   
   [ERROR] testSelectWithNoVersionMetadata[bogus-bogus-auth-true](org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged)  Time elapsed: 469.225 s  <<< ERROR!
   org.apache.hadoop.fs.s3a.AWSClientIOException: initTable on stevel-london: com.amazonaws.SdkClientException: Unable to execute HTTP request: dynamodb.eu-west-2.amazonaws.com: nodename nor servname provided, or not known: Unable to execute HTTP request: dynamodb.eu-west-2.amazonaws.com: nodename nor servname provided, or not known
   	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
   	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableManager.initTable(DynamoDBMetadataStoreTableManager.java:237)
   	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:438)
   	at org.apache.hadoop.fs.s3a.s3guard.S3Guard.getMetadataStore(S3Guard.java:125)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:489)
   	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3470)
   	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:538)
   	at org.apache.hadoop.fs.contract.AbstractBondedFSContract.init(AbstractBondedFSContract.java:72)
   	at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:187)
   	at org.apache.hadoop.fs.s3a.AbstractS3ATestBase.setup(AbstractS3ATestBase.java:77)
   	at org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged.setup(ITestS3ARemoteFileChanged.java:262)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
   	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
   	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
   	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
   	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: dynamodb.eu-west-2.amazonaws.com: nodename nor servname provided, or not known
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5413)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5380)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeDescribeTable(AmazonDynamoDBClient.java:2098)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:2063)
   	at com.amazonaws.services.dynamodbv2.document.Table.describe(Table.java:137)
   	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableManager.initTable(DynamoDBMetadataStoreTableManager.java:171)
   	... 23 more
   Caused by: java.net.UnknownHostException: dynamodb.eu-west-2.amazonaws.com: nodename nor servname provided, or not known
   	at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
   	at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
   	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
   	at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
   	at java.net.InetAddress.getAllByName(InetAddress.java:1193)
   	at java.net.InetAddress.getAllByName(InetAddress.java:1127)
   	at com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
   	at com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
   	at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
   	at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
   	at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
   	at com.amazonaws.http.conn.$Proxy22.connect(Unknown Source)
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
   	at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
   	... 36 more
   
   [ERROR] testSelectWithNoVersionMetadata[bogus-bogus-auth-true](org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged)  Time elapsed: 469.225 s  <<< ERROR!
   java.lang.NullPointerException
   	at org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged.teardown(ITestS3ARemoteFileChanged.java:275)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
   	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
   	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
   	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.lang.Thread.run(Thread.java:748)
   
   [ERROR] testRenameMissingFile[bogus-bogus-auth-true](org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged)  Time elapsed: 325.281 s  <<< ERROR!
   org.apache.hadoop.fs.s3a.AWSClientIOException: initTable on stevel-london: com.amazonaws.SdkClientException: Unable to execute HTTP request: dynamodb.eu-west-2.amazonaws.com: Unable to execute HTTP request: dynamodb.eu-west-2.amazonaws.com
   	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
   	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableManager.initTable(DynamoDBMetadataStoreTableManager.java:237)
   	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:438)
   	at org.apache.hadoop.fs.s3a.s3guard.S3Guard.getMetadataStore(S3Guard.java:125)
   	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:489)
   	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3470)
   	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:538)
   	at org.apache.hadoop.fs.contract.AbstractBondedFSContract.init(AbstractBondedFSContract.java:72)
   	at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.setup(AbstractFSContractTestBase.java:187)
   	at org.apache.hadoop.fs.s3a.AbstractS3ATestBase.setup(AbstractS3ATestBase.java:77)
   	at org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged.setup(ITestS3ARemoteFileChanged.java:262)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
   	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
   	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
   	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
   	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: dynamodb.eu-west-2.amazonaws.com
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
   	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5413)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5380)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeDescribeTable(AmazonDynamoDBClient.java:2098)
   	at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:2063)
   	at com.amazonaws.services.dynamodbv2.document.Table.describe(Table.java:137)
   	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableManager.initTable(DynamoDBMetadataStoreTableManager.java:171)
   	... 23 more
   Caused by: java.net.UnknownHostException: dynamodb.eu-west-2.amazonaws.com
   	at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
   	at java.net.InetAddress.getAllByName(InetAddress.java:1193)
   	at java.net.InetAddress.getAllByName(InetAddress.java:1127)
   	at com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
   	at com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
   	at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
   	at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
   	at sun.reflect.GeneratedMethodAccessor47.invoke(Unknown Source)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
   	at com.amazonaws.http.conn.$Proxy22.connect(Unknown Source)
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
   	at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
   	at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
   	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
   	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
   	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
   	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
   	... 36 more
   
   [ERROR] testRenameMissingFile[bogus-bogus-auth-true](org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged)  Time elapsed: 325.281 s  <<< ERROR!
   java.lang.NullPointerException
   	at org.apache.hadoop.fs.s3a.ITestS3ARemoteFileChanged.teardown(ITestS3ARemoteFileChanged.java:275)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
   	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
   	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
   	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
   	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
   	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   	at java.lang.Thread.run(Thread.java:748)
   
   ```



[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-742573246


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 20s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 7 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 23s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 32s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 44s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 10s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 44s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  2s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 11s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 28s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 52s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  20m 52s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m  8s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m  8s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 44s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/artifact/out/diff-checkstyle-root.txt) |  root: The patch generated 8 new + 23 unchanged - 0 fixed = 31 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 10s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/artifact/out/whitespace-eol.txt) |  The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m  8s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | -1 :x: |  javadoc  |   0m 43s | [/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  hadoop-aws in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -1 :x: |  findbugs  |   1m 36s | [/new-findbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/artifact/out/new-findbugs-hadoop-tools_hadoop-aws.html) |  hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  11m  4s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 33s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 52s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 198m  8s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | FindBugs | module:hadoop-tools/hadoop-aws |
   |  |  Boxing/unboxing to parse a primitive org.apache.hadoop.fs.s3a.impl.HeaderProcessing.extractXAttrLongValue(byte[])  At HeaderProcessing.java:org.apache.hadoop.fs.s3a.impl.HeaderProcessing.extractXAttrLongValue(byte[])  At HeaderProcessing.java:[line 219] |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux d129a40d1368 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / c2cecfc9b95 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/testReport/ |
   | Max. process+thread count | 1368 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/3/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-763103266


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 25s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 47s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 23s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 35s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 43s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 12s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 32s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 26s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  21m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 27s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 27s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 49s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  22m 29s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 47s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 50s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   4m 53s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  11m 50s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/13/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 48s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  3s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 208m 54s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.security.token.delegation.TestDelegationToken |
   |   | hadoop.metrics2.source.TestJvmMetrics |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/13/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 3a6b0d46073f 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / a326f226066 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/13/testReport/ |
   | Max. process+thread count | 1613 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/13/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] sunchao commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r554269175



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {

Review comment:
       nit: I wonder if some kind of caching would be useful here. We are calling `getObjectMetadata` for every `getXAttr` call.

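    For illustration only, a sketch of what such a cache might look like,
    using Guava's `Cache`; every name here is hypothetical and nothing below
    is code from this patch:

    ```java
    // Hypothetical illustration of the caching idea, not code from the PR.
    import java.io.IOException;
    import java.util.Map;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.TimeUnit;

    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;

    import org.apache.hadoop.fs.Path;

    class CachingHeaderReader {
      // Small, short-lived cache: stale headers are tolerable for seconds only.
      private final Cache<Path, Map<String, byte[]>> headerCache =
          CacheBuilder.newBuilder()
              .maximumSize(128)
              .expireAfterWrite(30, TimeUnit.SECONDS)
              .build();

      Map<String, byte[]> cachedHeaders(Path path, HeaderLoader loader)
          throws IOException {
        try {
          // Load through the cache; only misses hit the store.
          return headerCache.get(path, () -> loader.retrieveHeaders(path));
        } catch (ExecutionException e) {
          throw new IOException("failed to load headers for " + path,
              e.getCause());
        }
      }

      interface HeaderLoader {
        Map<String, byte[]> retrieveHeaders(Path path) throws IOException;
      }
    }
    ```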
##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -330,6 +331,11 @@
    */
   private DirectoryPolicy directoryPolicy;
 
+  /**
+   * Header processing for XAttr. Created on demand.

Review comment:
       nit: is "Created on demand" accurate? it is created in `initialize`.

##########
File path: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
##########
@@ -1873,11 +1873,9 @@
 
 <property>
   <name>fs.s3a.committer.magic.enabled</name>
-  <value>false</value>
+  <value>true</value>

Review comment:
       I also wonder if this (along with the documentation change) is required for this PR.

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =

Review comment:
       nit: to be consistent with the above, should we add comments for the following constants?

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return the encoded bytes, or an empty array if the object is null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes.
+   * If the bytes are null, return null; otherwise return the
+   * UTF-8 decoded string.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path)}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names)}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Public and static for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr;
+    xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);
+      }
+    }
+    // missing/empty header or parse failure.
+    return Optional.empty();
+  }
+
+  /**
+   * Creates a copy of the passed {@link ObjectMetadata}.
+   * Does so without using the {@link ObjectMetadata#clone()} method,
+   * to avoid copying unnecessary headers.
+   * This operation does not copy the {@code X_HEADER_MAGIC_MARKER}
+   * header to avoid confusion. If a marker file is renamed,
+   * it loses information about any remapped file.
+   * @param source the {@link ObjectMetadata} to copy
+   * @param ret the metadata to update; this is the return value.
+   * @return a copy of {@link ObjectMetadata} with only relevant attributes

Review comment:
       hmm, why do we need a return value if `ret` is already changed in place?





[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555080742



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {

Review comment:
       I would be surprised if this is called any more frequently than getFileStatus; we don't cache that either. Well, not now that S3Guard is off. And given the inconsistencies a cache would give rise to: not something I want to replicate.
   
   What I will do is add IOStatistics collection of the count and duration of the XAttr calls. That way we can get statistics on how common this is.
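   
   A rough sketch of that instrumentation, with plain counters standing in for the real IOStatistics machinery (field and method names here are illustrative, not the actual API):
   
   ```
   // Hypothetical wrapper: count invocations and accumulate wall-clock time.
   // Assumes java.util.concurrent.atomic.AtomicLong is imported.
   private final AtomicLong xattrInvocations = new AtomicLong();
   private final AtomicLong xattrNanos = new AtomicLong();
   
   private Map<String, byte[]> timedRetrieveHeaders(Path path) throws IOException {
     long start = System.nanoTime();
     try {
       return retrieveHeaders(path);
     } finally {
       xattrInvocations.incrementAndGet();
       xattrNanos.addAndGet(System.nanoTime() - start);
     }
   }
   ```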

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =

Review comment:
       will do






[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758262157


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 33s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 8 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 42s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 19s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 24s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  0s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 12s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 31s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 13s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  21m 13s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 31s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 31s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 51s |  |  root: The patch generated 0 new + 22 unchanged - 1 fixed = 22 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 16s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  1s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m  6s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  7s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 46s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m  3s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 29s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 198m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/9/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux c0c281a0f781 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d4fd675a95c |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/9/testReport/ |
   | Max. process+thread count | 1863 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/9/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] dongjoon-hyun commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-747797852


   Gentle ping~




[GitHub] [hadoop] liuml07 commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
liuml07 commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r554779575



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
##########
@@ -1048,4 +1048,10 @@ private Constants() {
   public static final String STORE_CAPABILITY_DIRECTORY_MARKER_ACTION_DELETE
       = "fs.s3a.capability.directory.marker.action.delete";
 
+  /**
+   * To comply with the XAttr rules, all headers of the object retrieved
+   * through the getXAttr APIs have the prefix: {@value}.
+   */
+  public static final String HEADER_PREFIX = "header.";

Review comment:
       nit: would `XA_HEADER_PREFIX` be a bit clearer as a name?
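   
   Whatever the constant ends up being called, the effect is that a raw object header surfaces through the XAttr API with the prefix attached; a hedged sketch, assuming `fs` is an S3A filesystem and `path` an existing object:
   
   ```
   // "header." + raw header name, e.g. the object's Content-Length.
   byte[] raw = fs.getXAttr(path, "header.Content-Length");
   String length = HeaderProcessing.decodeBytes(raw);
   ```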

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -4103,56 +4111,7 @@ private ObjectMetadata cloneObjectMetadata(ObjectMetadata source) {
     // in future there are new attributes added to ObjectMetadata
     // that we do not explicitly call to set here
     ObjectMetadata ret = newObjectMetadata(source.getContentLength());
-
-    // Possibly null attributes
-    // Allowing nulls to pass breaks it during later use
-    if (source.getCacheControl() != null) {
-      ret.setCacheControl(source.getCacheControl());
-    }
-    if (source.getContentDisposition() != null) {
-      ret.setContentDisposition(source.getContentDisposition());
-    }
-    if (source.getContentEncoding() != null) {
-      ret.setContentEncoding(source.getContentEncoding());
-    }
-    if (source.getContentMD5() != null) {
-      ret.setContentMD5(source.getContentMD5());
-    }
-    if (source.getContentType() != null) {
-      ret.setContentType(source.getContentType());
-    }
-    if (source.getExpirationTime() != null) {
-      ret.setExpirationTime(source.getExpirationTime());
-    }
-    if (source.getExpirationTimeRuleId() != null) {
-      ret.setExpirationTimeRuleId(source.getExpirationTimeRuleId());
-    }
-    if (source.getHttpExpiresDate() != null) {
-      ret.setHttpExpiresDate(source.getHttpExpiresDate());
-    }
-    if (source.getLastModified() != null) {
-      ret.setLastModified(source.getLastModified());
-    }
-    if (source.getOngoingRestore() != null) {
-      ret.setOngoingRestore(source.getOngoingRestore());
-    }
-    if (source.getRestoreExpirationTime() != null) {
-      ret.setRestoreExpirationTime(source.getRestoreExpirationTime());
-    }
-    if (source.getSSEAlgorithm() != null) {
-      ret.setSSEAlgorithm(source.getSSEAlgorithm());
-    }
-    if (source.getSSECustomerAlgorithm() != null) {
-      ret.setSSECustomerAlgorithm(source.getSSECustomerAlgorithm());
-    }
-    if (source.getSSECustomerKeyMd5() != null) {
-      ret.setSSECustomerKeyMd5(source.getSSECustomerKeyMd5());
-    }
-
-    for (Map.Entry<String, String> e : source.getUserMetadata().entrySet()) {
-      ret.addUserMetadata(e.getKey(), e.getValue());
-    }
-    return ret;
+    return getHeaderProcessing().cloneObjectMetadata(source, ret);

Review comment:
       nit: it seems better to move those comments above into the implementation method `HeaderProcessing#cloneObjectMetadata`:
   
   ```
   // This approach may be too brittle, especially if
   // in future there are new attributes added to ObjectMetadata
   // that we do not explicitly call to set here
   ```

##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committer_architecture.md
##########
@@ -1312,6 +1312,16 @@ On `close()`, summary data would be written to the file
 `/results/latest/__magic/job400_1/task_01_01/latest.orc.lzo.pending`.
 This would contain the upload ID and all the parts and etags of uploaded data.
 
+A marker file is also created, so that code which verifies that a newly created file
+exists does not fail.
+1. These marker files are zero bytes long.
+1. They declare the full length of the final file in the HTTP header
+   `x-hadoop-s3a-magic-marker`.

Review comment:
       Did you mean `x-hadoop-s3a-magic-data-length` here?
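   
   For illustration, with the corrected name the declared length could be read back off the marker like this (a sketch assuming the header surfaces as an XAttr with the `header.` prefix, and that `fs` and `markerPath` are in scope):
   
   ```
   // The marker object is zero bytes; the real length lives in the header.
   byte[] raw = fs.getXAttr(markerPath, "header.x-hadoop-s3a-magic-data-length");
   Optional<Long> declaredLength = HeaderProcessing.extractXAttrLongValue(raw);
   ```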

##########
File path: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
##########
@@ -1873,11 +1873,9 @@
 
 <property>
   <name>fs.s3a.committer.magic.enabled</name>
-  <value>false</value>
+  <value>true</value>

Review comment:
       I'm +1 on enabling this by default, as it no longer depends on S3Guard. I totally agree we should update the JIRA release notes or commit message to call out this default config change; ideally it would be a separate PR.
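   
   For anyone who needs the old behaviour, a minimal sketch of overriding the default per job rather than editing core-default.xml:
   
   ```
   // Opt back out of the magic committer in the job configuration.
   // Assumes org.apache.hadoop.conf.Configuration is imported.
   Configuration conf = new Configuration();
   conf.setBoolean("fs.s3a.committer.magic.enabled", false);
   ```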

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return encoded object or null

Review comment:
       will never return `null`, but could return an empty byte array?
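   
   A small illustration of the behaviour being flagged:
   
   ```
   // encodeBytes(null) yields an empty array, never null;
   // decodeBytes then turns that empty array into "" rather than null.
   byte[] encoded = HeaderProcessing.encodeBytes(null);    // new byte[0]
   String decoded = HeaderProcessing.decodeBytes(encoded); // ""
   ```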

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -4382,6 +4341,37 @@ public EtagChecksum getFileChecksum(Path f, final long length)
     }
   }
 
+  /**
+   * Get header processing support.
+   * @return the header processing of this instance.
+   */
+  private HeaderProcessing getHeaderProcessing() {
+    return headerProcessing;
+  }
+
+  @Override
+  public byte[] getXAttr(final Path path, final String name)
+      throws IOException {
+    return getHeaderProcessing().getXAttr(path, name);
+  }
+
+  @Override
+  public Map<String, byte[]> getXAttrs(final Path path) throws IOException {
+    return getHeaderProcessing().getXAttrs(path);
+  }
+
+  @Override
+  public Map<String, byte[]> getXAttrs(final Path path,
+      final List<String> names)
+      throws IOException {
+    return getHeaderProcessing().getXAttrs(path, names);
+  }
+
+  @Override
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return headerProcessing.listXAttrs(path);

Review comment:
       nit: replace `headerProcessing` with the private `getHeaderProcessing()`, as the `getXAttrs` methods do?

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return encoded object or null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes.
+   * if null : return null, otherwise the UTF-8 decoded
+   * bytes.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path)}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names)}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Public for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr;
+    xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);

Review comment:
       Is it useful to print `ex` as well? Perhaps with a line like `LOG.debug("Fail to parse {} to long with exception", xAttr, ex)`?
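   
   For reference, SLF4J treats a trailing `Throwable` argument specially, so the exception can be attached without adding a placeholder:
   
   ```
   // The final argument is logged as an exception, not a format parameter.
   LOG.warn("Not a number: {}", xAttr, ex);
   ```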

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return encoded object or null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes.
+   * if null : return null, otherwise the UTF-8 decoded
+   * bytes.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path)}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names)}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Public for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr;
+    xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);
+      }
+    }
+    // missing/empty header or parse failure.
+    return Optional.empty();
+  }
+
+  /**
+   * Creates a copy of the passed {@link ObjectMetadata}.
+   * Does so without using the {@link ObjectMetadata#clone()} method,
+   * to avoid copying unnecessary headers.
+   * This operation does not copy the {@code X_HEADER_MAGIC_MARKER}
+   * header to avoid confusion. If a marker file is renamed,
+   * it loses information about any remapped file.
+   * @param source the {@link ObjectMetadata} to copy
+   * @param ret the metadata to update; this is the return value.
+   * @return a copy of {@link ObjectMetadata} with only relevant attributes
+   */
+  public ObjectMetadata cloneObjectMetadata(ObjectMetadata source,
+      ObjectMetadata ret) {
+
+    // Possibly null attributes
+    // Allowing nulls to pass breaks it during later use
+    if (source.getCacheControl() != null) {
+      ret.setCacheControl(source.getCacheControl());
+    }
+    if (source.getContentDisposition() != null) {
+      ret.setContentDisposition(source.getContentDisposition());
+    }
+    if (source.getContentEncoding() != null) {
+      ret.setContentEncoding(source.getContentEncoding());
+    }
+    if (source.getContentMD5() != null) {
+      ret.setContentMD5(source.getContentMD5());
+    }
+    if (source.getContentType() != null) {
+      ret.setContentType(source.getContentType());
+    }
+    if (source.getExpirationTime() != null) {
+      ret.setExpirationTime(source.getExpirationTime());
+    }
+    if (source.getExpirationTimeRuleId() != null) {
+      ret.setExpirationTimeRuleId(source.getExpirationTimeRuleId());
+    }
+    if (source.getHttpExpiresDate() != null) {
+      ret.setHttpExpiresDate(source.getHttpExpiresDate());
+    }
+    if (source.getLastModified() != null) {
+      ret.setLastModified(source.getLastModified());
+    }
+    if (source.getOngoingRestore() != null) {
+      ret.setOngoingRestore(source.getOngoingRestore());
+    }
+    if (source.getRestoreExpirationTime() != null) {
+      ret.setRestoreExpirationTime(source.getRestoreExpirationTime());
+    }
+    if (source.getSSEAlgorithm() != null) {
+      ret.setSSEAlgorithm(source.getSSEAlgorithm());
+    }
+    if (source.getSSECustomerAlgorithm() != null) {
+      ret.setSSECustomerAlgorithm(source.getSSECustomerAlgorithm());
+    }
+    if (source.getSSECustomerKeyMd5() != null) {
+      ret.setSSECustomerKeyMd5(source.getSSECustomerKeyMd5());
+    }
+
+    // copy user metadata except the magic marker header.
+    for (Map.Entry<String, String> e : source.getUserMetadata().entrySet()) {

Review comment:
       nit: I know this is based on existing code... but could this be a Java 8 stream?
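   
   A possible stream-based form of that loop, keeping the magic-marker filter the surrounding comment describes (a sketch, not the patch's actual code; `X_HEADER_MAGIC_MARKER` is statically imported in this file):
   
   ```
   // Copy user metadata, skipping the magic marker header.
   source.getUserMetadata().entrySet().stream()
       .filter(e -> !X_HEADER_MAGIC_MARKER.equals(e.getKey()))
       .forEach(e -> ret.addUserMetadata(e.getKey(), e.getValue()));
   ```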








[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-765669907


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 39s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 58s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 40s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 34s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 52s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 12s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 35s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 24s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  24m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 31s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  21m 31s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 56s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 24s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 30s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  4s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 54s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  11m  5s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 50s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 207m 22s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/14/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 18d689554ee4 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d09e3c929f2 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/14/testReport/ |
   | Max. process+thread count | 1616 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/14/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-767032428


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 42s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 17s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 17s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m  6s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  17m 22s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 40s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 29s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 26s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 17s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 34s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 32s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  19m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 22s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  17m 22s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 38s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 24s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 27s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 21s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 49s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 22s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m  4s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m 27s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/17/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint |
   | uname | Linux fda0bb92688a 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 06a5d3437f6 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/17/testReport/ |
   | Max. process+thread count | 3158 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/17/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   






[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r563811325



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -1800,31 +1809,16 @@ public ObjectMetadata getObjectMetadata(Path path) throws IOException {
    * @return metadata
    * @throws IOException IO and object access problems.
    */
-  @VisibleForTesting
   @Retries.RetryTranslated
-  public ObjectMetadata getObjectMetadata(Path path,
+  private ObjectMetadata getObjectMetadata(Path path,
       ChangeTracker changeTracker, Invoker changeInvoker, String operation)
       throws IOException {
-    checkNotClosed();

Review comment:
       I think I had a good reason for doing this (it was a private method, checks further up, etc.). But looking at it now, I'm concluding that whatever that reason was, it's not enough. Restoring.






[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555247027



##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/committer_architecture.md
##########
@@ -1312,6 +1312,16 @@ On `close()`, summary data would be written to the file
 `/results/latest/__magic/job400_1/task_01_01/latest.orc.lzo.pending`.
 This would contain the upload ID and all the parts and etags of uploaded data.
 
+A marker file is also created, so that code which verifies that a newly created file
+exists does not fail.
+1. These marker files are zero bytes long.
+1. They declare the full length of the final file in the HTTP header
+   `x-hadoop-s3a-magic-marker`.

Review comment:
       nicely spotted. will fix
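   
   For illustration, a minimal sketch of how client code could read that declared length back through the XAttr mapping this PR adds; the `header.` prefix matches the `-getfattr` output shown later in this thread, and the class/method names are hypothetical:
   
   ```
   import java.io.IOException;
   import java.nio.charset.StandardCharsets;
   
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   
   /** Hypothetical helper: read the final length a magic marker declares. */
   public final class MagicMarkerLength {
   
     /** XAttr name: the "header." prefix plus the marker header itself. */
     public static final String XA_MAGIC_MARKER =
         "header.x-hadoop-s3a-magic-marker";
   
     private MagicMarkerLength() {
     }
   
     /** Return the declared length, or -1 if the header is absent. */
     public static long declaredLength(FileSystem fs, Path marker)
         throws IOException {
       byte[] raw = fs.getXAttr(marker, XA_MAGIC_MARKER);
       if (raw == null || raw.length == 0) {
         return -1;
       }
       return Long.parseLong(new String(raw, StandardCharsets.UTF_8));
     }
   }
   ```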






[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-766807576


   Reverted the switching of `fs.s3a.magic.enabled`; I've got a bigger patch which removes that switch entirely. It was only ever in there for two concerns:
   1. to make sure people were really, really confident their store was good to go
   1. in case it turned out that something was using `__magic` as a prefix and the committer went and broke everything
   
   Issue 1 is now resolved, and if issue 2 ever surfaced, well, nobody filed a bug report.




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-759567381


   It's in the commit log, but I'll add it as an explicit comment: I've added getXAttr() for directories too. The code first does a HEAD on the path and, if that returns 404, a HEAD on path + "/". I suppose we really need some contract tests for XAttr, but I'm not going to do them here.
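   
   As a sketch of that probe order: the `Head` abstraction below is hypothetical, standing in for the raw S3 HEAD request, and is assumed to throw `FileNotFoundException` on a 404:
   
   ```
   import java.io.FileNotFoundException;
   import java.io.IOException;
   
   import com.amazonaws.services.s3.model.ObjectMetadata;
   
   /** Hypothetical sketch of the getXAttr probe order for directories. */
   final class DirectoryXAttrProbe {
   
     /** Stand-in for the raw HEAD request; throws FileNotFoundException on 404. */
     interface Head {
       ObjectMetadata head(String key) throws IOException;
     }
   
     static ObjectMetadata retrieveMetadata(Head store, String key)
         throws IOException {
       try {
         // first probe: HEAD the path itself
         return store.head(key);
       } catch (FileNotFoundException e) {
         // 404: retry with a trailing "/", i.e. as a directory marker
         return store.head(key + "/");
       }
     }
   }
   ```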




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-743157050


   This is how the "-getfattr" command looks like against a non-magic-marker file
   
   ```
    bin/hadoop fs -getfattr -d s3a://stevel-london/cloud-integration/DELAY_LISTING_ME/S3ACommitBulkDataSuite/bulkdata/output/landsat/parquet/parted-1/year=2016/month=10/part-00000-8b00c5f2-cd1f-4741-9278-1a2f0af3a9a9.c000.snappy.parquet
   2020-12-11 12:06:19,361 [main] INFO  s3a.S3AFileSystem (S3Guard.java:logS3GuardDisabled(1135)) - S3Guard is disabled on this bucket: stevel-london
   2020-12-11 12:06:19,366 [main] INFO  impl.DirectoryPolicyImpl (DirectoryPolicyImpl.java:getDirectoryPolicy(189)) - Directory markers will be kept
   # file: s3a://stevel-london/cloud-integration/DELAY_LISTING_ME/S3ACommitBulkDataSuite/bulkdata/output/landsat/parquet/parted-1/year=2016/month=10/part-00000-8b00c5f2-cd1f-4741-9278-1a2f0af3a9a9.c000.snappy.parquet
   header.Content-Disposition
   header.Content-Encoding
   header.Content-Language
   header.Content-Length="92337"
   header.Content-MD5
   header.Content-Range
   header.Content-Type="application/octet-stream"
   header.ETag="b445e3e08b71650e906789a0f5a80c7b-1"
   header.Last-Modified="Thu Dec 10 15:10:12 GMT 2020"
   ```




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-759555115


   Build failed; looks spurious, but I'm going to rebase anyway.




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758775336


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 19s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for branch  |
   | -1 :x: |  mvninstall  |   0m 13s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-mvninstall-root.txt) |  root in trunk failed.  |
   | -1 :x: |  compile  |   0m 23s | [/branch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) |  root in trunk failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.  |
   | -1 :x: |  compile  |   0m 17s | [/branch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  root in trunk failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   4m 51s | [/buildtool-branch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/buildtool-branch-checkstyle-root.txt) |  The patch fails to run checkstyle in root  |
   | -1 :x: |  mvnsite  |   1m 56s | [/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in trunk failed.  |
   | -1 :x: |  mvnsite  |   2m 41s | [/branch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-mvnsite-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in trunk failed.  |
   | -1 :x: |  shadedclient  |   9m 50s |  |  branch has errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |  13m 49s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | -1 :x: |  findbugs  |   0m 13s | [/branch-findbugs-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common.txt) |  hadoop-common in trunk failed.  |
   | -1 :x: |  findbugs  |   0m 32s | [/branch-findbugs-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-findbugs-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in trunk failed.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   4m 36s |  |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   0m 13s | [/patch-mvninstall-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvninstall-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  mvninstall  |   0m 22s | [/patch-mvninstall-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   | -1 :x: |  compile  |   0m 25s | [/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) |  root in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.  |
   | -1 :x: |  javac  |   0m 25s | [/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) |  root in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.  |
   | -1 :x: |  compile  |   7m 20s | [/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  root in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -1 :x: |  javac  |   7m 20s | [/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  root in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   4m 45s | [/buildtool-patch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/buildtool-patch-checkstyle-root.txt) |  The patch fails to run checkstyle in root  |
   | -1 :x: |  mvnsite  |   0m 18s | [/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  mvnsite  |   0m 24s | [/patch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/whitespace-eol.txt) |  The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 33s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | -1 :x: |  findbugs  |   0m 13s | [/patch-findbugs-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-findbugs-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  findbugs  |   0m 22s | [/patch-findbugs-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |   0m 13s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  unit  |   0m 13s | [/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 31s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  59m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 2fcc4597f1e7 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 85b1c017eed |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/testReport/ |
   | Max. process+thread count | 621 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/console |
   | versions | git=2.17.1 maven=3.6.0 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-755437582


   @sunchao @liuml07 could either of you take a look @ this?




[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555245870



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store and return all the object's headers in a map.
+   * Each key carries the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is its string value, UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return the encoded bytes, or an empty array if the source is null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes.
+   * If the input is null, return null; otherwise return the
+   * UTF-8 decoding of the bytes.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Public and static for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr;
+    xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);
+      }
+    }
+    // missing/empty header or parse failure.
+    return Optional.empty();
+  }
+
+  /**
+   * Creates a copy of the passed {@link ObjectMetadata}.
+   * Does so without using the {@link ObjectMetadata#clone()} method,
+   * to avoid copying unnecessary headers.
+   * This operation does not copy the {@code X_HEADER_MAGIC_MARKER}
+   * header to avoid confusion. If a marker file is renamed,
+   * it loses information about any remapped file.
+   * @param source the {@link ObjectMetadata} to copy
+   * @param ret the metadata to update; this is the return value.
+   * @return a copy of {@link ObjectMetadata} with only relevant attributes
+   */
+  public ObjectMetadata cloneObjectMetadata(ObjectMetadata source,
+      ObjectMetadata ret) {
+
+    // Possibly null attributes
+    // Allowing nulls to pass breaks it during later use
+    if (source.getCacheControl() != null) {
+      ret.setCacheControl(source.getCacheControl());
+    }
+    if (source.getContentDisposition() != null) {
+      ret.setContentDisposition(source.getContentDisposition());
+    }
+    if (source.getContentEncoding() != null) {
+      ret.setContentEncoding(source.getContentEncoding());
+    }
+    if (source.getContentMD5() != null) {
+      ret.setContentMD5(source.getContentMD5());
+    }
+    if (source.getContentType() != null) {
+      ret.setContentType(source.getContentType());
+    }
+    if (source.getExpirationTime() != null) {
+      ret.setExpirationTime(source.getExpirationTime());
+    }
+    if (source.getExpirationTimeRuleId() != null) {
+      ret.setExpirationTimeRuleId(source.getExpirationTimeRuleId());
+    }
+    if (source.getHttpExpiresDate() != null) {
+      ret.setHttpExpiresDate(source.getHttpExpiresDate());
+    }
+    if (source.getLastModified() != null) {
+      ret.setLastModified(source.getLastModified());
+    }
+    if (source.getOngoingRestore() != null) {
+      ret.setOngoingRestore(source.getOngoingRestore());
+    }
+    if (source.getRestoreExpirationTime() != null) {
+      ret.setRestoreExpirationTime(source.getRestoreExpirationTime());
+    }
+    if (source.getSSEAlgorithm() != null) {
+      ret.setSSEAlgorithm(source.getSSEAlgorithm());
+    }
+    if (source.getSSECustomerAlgorithm() != null) {
+      ret.setSSECustomerAlgorithm(source.getSSECustomerAlgorithm());
+    }
+    if (source.getSSECustomerKeyMd5() != null) {
+      ret.setSSECustomerKeyMd5(source.getSSECustomerKeyMd5());
+    }
+
+    // copy user metadata except the magic marker header.
+    for (Map.Entry<String, String> e : source.getUserMetadata().entrySet()) {

Review comment:
       Yeah, it was just copy-and-paste. Moving to a Java 8 filtered stream.
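   
   For illustration, the filtered-stream form of that copy might look like this sketch; the wrapper class is hypothetical, while `getUserMetadata()` and `addUserMetadata()` are the standard `ObjectMetadata` accessors:
   
   ```
   import com.amazonaws.services.s3.model.ObjectMetadata;
   
   final class UserMetadataCopy {
   
     /** Copy user metadata from source to ret, skipping the marker header. */
     static void copyUserMetadata(ObjectMetadata source, ObjectMetadata ret,
         String magicMarkerHeader) {
       source.getUserMetadata().entrySet().stream()
           .filter(e -> !e.getKey().equalsIgnoreCase(magicMarkerHeader))
           .forEach(e -> ret.addUserMetadata(e.getKey(), e.getValue()));
     }
   }
   ```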






[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-763617693


   All test failures are in hadoop-common and therefore unrelated: they're JVM issues, one about thread space and one other which I'm going to view as transient.
   
   This PR is ready for review!




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-766939278


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  2s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 44s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  17m 50s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 24s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 24s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  2s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 16s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 37s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 27s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  21m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  24m 28s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  24m 28s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 41s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  15m 15s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  9s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 47s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 10s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 57s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 205m 37s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/15/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 7b58b4ed0b5d 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2fa73a23875 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/15/testReport/ |
   | Max. process+thread count | 1378 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/15/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758165380


   (testing in progress)




[GitHub] [hadoop] dongjoon-hyun commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-748335153


   For the reviewers~
   > for who?
   
   It's because it's still not an official patch until this lands on the release branches.




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-767029785


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   3m  9s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 21s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 10s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m  7s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  17m 25s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 41s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m  1s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 43s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 18s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 35s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 27s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  19m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 25s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  17m 25s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 35s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 30s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  15m 29s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 39s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 50s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 16s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   2m  8s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 198m 34s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/16/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint |
   | uname | Linux b8fe01beea7c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 06a5d3437f6 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/16/testReport/ |
   | Max. process+thread count | 2997 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/16/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-740865126


   The latest iteration is something viable to be merged in: all HTTP headers are returned as XAttrs.
   
   * New HeaderProcessing store operation to isolate this
   * ContextAccessors adds call to getObjectMetadata
   * the various FS getXattr* calls all delegate to HeaderProcessing
   * the returned header list contains all user headers
   * and the standard HTTP ones.
   
   Tests for all of this
   * unit tests for HeaderProcessing
   * etag propagation in ITestMiscOperations
   * And something to query/log that getXAttr(/) is sensible
   * commit tests verify the attribute is set
   
   This patch doesn't play games with getFileStatus and provides information
   (access to headers) which is more broadly useful.
   
   The setXAttr calls aren't supported: the headers are immutable.
   
   Testing: in progress.
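   
   A minimal usage sketch of that XAttr surface (bucket and key are placeholders):
   
   ```
   import java.nio.charset.StandardCharsets;
   import java.util.Map;
   
   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   
   public class ListObjectHeaders {
     public static void main(String[] args) throws Exception {
       Path path = new Path("s3a://example-bucket/some/object");
       FileSystem fs = path.getFileSystem(new Configuration());
       // one HEAD request; returns the user headers plus the standard
       // HTTP ones, all keyed with the "header." prefix
       Map<String, byte[]> attrs = fs.getXAttrs(path);
       attrs.forEach((name, value) ->
           System.out.println(name + " = "
               + new String(value, StandardCharsets.UTF_8)));
       fs.close();
     }
   }
   ```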




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-756223044


   Did a squash and rebase down to a single patch, then tested against S3 London with `-Dparallel-tests -DtestsThreadCount=6 -Dscale -Dmarkers=delete -Ds3guard -Ddynamo`.




[GitHub] [hadoop] liuml07 commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
liuml07 commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-755566048


   Will take a look this week. Thanks!




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-766943305


   Retested: S3 London, unguarded; all good.
   
   I think we are done here. Can I get a vote from anyone with commit rights, e.g. @liuml07 @sunchao @bgaborg?




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-766884586


   Having to rebase due to the site doc changes; sorry.




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758146786


   Next PR to come in:
   * tries to address all review comments
   * adds stats gathering
   * adds almost all the AWS headers (everything but some of the encryption stuff) as XAttrs to be listed. 
   
   Log of a test run on a newly created file on a bucket with SSE-S3 and versioning.
   ```
   2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Cache-Control has bytes[0] => ""
   2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Disposition has bytes[0] => ""
   2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Encoding has bytes[0] => ""
   2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Language has bytes[0] => ""
   2021-01-11 18:32:24,010 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Length has bytes[1] => "0"
   2021-01-11 18:32:24,010 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-MD5 has bytes[0] => ""
   2021-01-11 18:32:24,010 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Range has bytes[0] => ""
   2021-01-11 18:32:24,011 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Type has bytes[24] => "application/octet-stream"
   2021-01-11 18:32:24,011 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.ETag has bytes[32] => "d41d8cd98f00b204e9800998ecf8427e"
   2021-01-11 18:32:24,011 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Last-Modified has bytes[28] => "Mon Jan 11 18:32:24 GMT 2021"
   2021-01-11 18:32:24,012 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-archive-status has bytes[0] => ""
   2021-01-11 18:32:24,012 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-legal-hold has bytes[0] => ""
   2021-01-11 18:32:24,013 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-mode has bytes[0] => ""
   2021-01-11 18:32:24,014 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-retain-until-date has bytes[0] => ""
   2021-01-11 18:32:24,014 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-replication-status has bytes[0] => ""
   2021-01-11 18:32:24,014 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-server-side-encryption has bytes[6] => "AES256"
   2021-01-11 18:32:24,015 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-storage-class has bytes[0] => ""
   2021-01-11 18:32:24,015 [JUnit-testXAttrFileCost] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-version-id has bytes[32] => "XeajHuYbsD1rO.Bh.6UKqnqVMCZvkWg1"
   ```
   
   And for the curious, that of /
   ```
   2021-01-11 18:32:21,207 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Cache-Control has bytes[0] => ""
   2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Disposition has bytes[0] => ""
   2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Encoding has bytes[0] => ""
   2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Language has bytes[0] => ""
   2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Length has bytes[1] => "0"
   2021-01-11 18:32:21,209 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-MD5 has bytes[0] => ""
   2021-01-11 18:32:21,210 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Range has bytes[0] => ""
   2021-01-11 18:32:21,210 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Type has bytes[15] => "application/xml"
   2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.ETag has bytes[0] => ""
   2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Last-Modified has bytes[0] => ""
   2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-archive-status has bytes[0] => ""
   2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-legal-hold has bytes[0] => ""
   2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-mode has bytes[0] => ""
   2021-01-11 18:32:21,214 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-retain-until-date has bytes[0] => ""
   2021-01-11 18:32:21,214 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-replication-status has bytes[0] => ""
   2021-01-11 18:32:21,214 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-server-side-encryption has bytes[0] => ""
   2021-01-11 18:32:21,215 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-storage-class has bytes[0] => ""
   2021-01-11 18:32:21,215 [JUnit-testXAttrRoot] INFO  impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-version-id has bytes[0] => ""
   ```
   
   And the aggregate statistics of that test run:
   
   ```
   2021-01-11 18:32:26,484 [JUnit] INFO  s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:dumpFileSystemIOStatistics(99)) - Aggregate FileSystem Statistics counters=((action_executor_acquired=1)
   (action_http_head_request=11)
   (directories_created=2)
   (directories_deleted=1)
   (fake_directories_deleted=1)
   (files_created=1)
   (files_deleted=1)
   (object_bulk_delete_request=2)
   (object_delete_objects=3)
   (object_delete_request=1)
   (object_list_request=8)
   (object_metadata_request=11)
   (object_put_request=3)
   (object_put_request_completed=3)
   (op_create=1)
   (op_delete=2)
   (op_exists=2)
   (op_get_file_status=2)
   (op_list_files=1)
   (op_mkdirs=2)
   (op_xattr_get_map=2)
   (op_xattr_get_named=1)
   (op_xattr_list=2)
   (stream_write_block_uploads=1));
   
   gauges=();
   
   minimums=((action_executor_acquired.min=0)
   (action_http_head_request.min=16)
   (object_bulk_delete_request.min=46)
   (object_delete_request.min=37)
   (object_list_request.min=33)
   (op_xattr_get_map.min=17)
   (op_xattr_get_named.min=36)
   (op_xattr_list.min=18));
   
   maximums=((action_executor_acquired.max=0)
   (action_http_head_request.max=903)
   (object_bulk_delete_request.max=97)
   (object_delete_request.max=37)
   (object_list_request.max=934)
   (op_xattr_get_map.max=33)
   (op_xattr_get_named.max=36)
   (op_xattr_list.max=22));
   
   means=((action_executor_acquired.mean=(samples=1, sum=0, mean=0.0000))
   (action_http_head_request.mean=(samples=11, sum=1279, mean=116.2727))
   (object_bulk_delete_request.mean=(samples=2, sum=143, mean=71.5000))
   (object_delete_request.mean=(samples=1, sum=37, mean=37.0000))
   (object_list_request.mean=(samples=8, sum=5094, mean=636.7500))
   (op_xattr_get_map.mean=(samples=2, sum=50, mean=25.0000))
   (op_xattr_get_named.mean=(samples=1, sum=36, mean=36.0000))
   (op_xattr_list.mean=(samples=2, sum=40, mean=20.0000)));
   ```
   
   @sunchao XAttr calls seem to take ~25 milliseconds each over a long-haul link. If someone called getXAttr(name) one attribute at a time to enumerate them, it would be expensive; fetching several attributes in a single getXAttrs() call performs much better, since all of them are served by one HEAD request.
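
   To make the cost difference concrete, here is a minimal sketch of the two access patterns against the standard FileSystem XAttr API (the attribute names are just examples taken from the log above; this is illustrative, not code from the patch):

   ```
   import java.io.IOException;
   import java.util.Arrays;
   import java.util.List;
   import java.util.Map;

   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   public final class XAttrCostSketch {
     public static void demo(FileSystem fs, Path path) throws IOException {
       List<String> names = Arrays.asList(
           "header.Content-Length", "header.Content-Type", "header.ETag");

       // Slow pattern: each getXAttr() call triggers its own HEAD request
       // (~25ms per call over a long-haul link here).
       for (String name : names) {
         byte[] value = fs.getXAttr(path, name);
       }

       // Faster pattern: one HEAD request serves all the requested attributes.
       Map<String, byte[]> attrs = fs.getXAttrs(path, names);
     }
   }
   ```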



[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-742884929


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   2m 35s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  2s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 7 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 47s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 11s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m  9s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 51s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 47s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 11s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 25s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  1s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  21m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 17s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 17s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 47s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/4/artifact/out/diff-checkstyle-root.txt) |  root: The patch generated 8 new + 23 unchanged - 0 fixed = 31 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 11s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/4/artifact/out/whitespace-eol.txt) |  The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  3s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 18s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 55s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | -1 :x: |  javadoc  |   0m 35s | [/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/4/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  hadoop-aws in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | +1 :green_heart: |  findbugs  |   3m 45s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 54s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 25s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 48s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 203m 55s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux ff660d1017ab 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 3ec01b1bb30 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/4/testReport/ |
   | Max. process+thread count | 2295 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/4/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-742788785


   Matching spark PR is https://github.com/apache/spark/pull/30714
   
   I've now done downstream testing in Spark local mode, verifying from the log that the attribute was read.
   
   ```
   2020-12-10 14:56:16,760 [Executor task launch worker for task 3] INFO  magic.MagicCommitTracker (MagicCommitTracker.java:aboutToComplete(134)) - Uncommitted data pending to file s3a://stevel-london/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/committer-magic-parquet/parquet/__magic/job-f718ab6a-27df-40fe-8b25-a8d83cd25dca/tasks/attempt_202012101456156213593579150946569_0005_m_000000_3/__base/year=2017/month=1/part-00000-fe9c22b3-8639-451e-9843-7498f2ea5f41.c000.snappy.parquet; commit metadata for 1 parts in cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/committer-magic-parquet/parquet/__magic/job-f718ab6a-27df-40fe-8b25-a8d83cd25dca/tasks/attempt_202012101456156213593579150946569_0005_m_000000_3/__base/year=2017/month=1/part-00000-fe9c22b3-8639-451e-9843-7498f2ea5f41.c000.snappy.parquet.pending. size: 6286 byte(s)
   2020-12-10 14:56:17,513 [Executor task launch worker for task 3] INFO  s3a.S3ABlockOutputStream (S3ABlockOutputStream.java:close(388)) - File cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/committer-magic-parquet/parquet/year=2017/month=1/part-00000-fe9c22b3-8639-451e-9843-7498f2ea5f41.c000.snappy.parquet will be visible when the job is committed
   2020-12-10 14:56:17,705 [Executor task launch worker for task 3] DEBUG datasources.BasicWriteTaskStatsTracker (Logging.scala:logDebug(61)) - File Length statistics for s3a://stevel-london/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/committer-magic-parquet/parquet/__magic/job-f718ab6a-27df-40fe-8b25-a8d83cd25dca/tasks/attempt_202012101456156213593579150946569_0005_m_000000_3/__base/year=2017/month=1/part-00000-fe9c22b3-8639-451e-9843-7498f2ea5f41.c000.snappy.parquet retrieved from XAttr: 6286
   ```
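
   For reference, a hedged sketch of the read side: fetch the magic-marker XAttr and fall back to the file status length when it is absent. The XAttr name below is an assumption (the "header." prefix plus the marker header from this patch's HeaderProcessing); treat it as illustrative, not the exact constant.

   ```
   import java.io.IOException;
   import java.util.Optional;

   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.fs.s3a.impl.HeaderProcessing;

   public final class MagicMarkerLength {
     // Assumed name: HEADER_PREFIX + X_HEADER_MAGIC_MARKER.
     private static final String XA_MAGIC_MARKER =
         "header.x-hadoop-s3a-magic-data-length";

     /** Length recorded in the marker XAttr, else the file's own length. */
     public static long dataLength(FileSystem fs, Path path) throws IOException {
       // Missing attribute => null => Optional.empty() from the helper.
       byte[] raw = fs.getXAttr(path, XA_MAGIC_MARKER);
       Optional<Long> marked = HeaderProcessing.extractXAttrLongValue(raw);
       return marked.isPresent() ? marked.get() : fs.getFileStatus(path).getLen();
     }
   }
   ```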
   



[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-753983870


   rebased to trunk.
   
   _not yet retested/reviewed_



[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-762809195


   checkstyle warning:
   ```
   ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/ContextAccessors.java:95:  ObjectMetadata getObjectMetadata(final String key) throws IOException;:36: Redundant 'final' modifier. [RedundantModifier]
   ```
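
   The fix is mechanical: checkstyle's RedundantModifier rule flags `final` on an interface method parameter because the declaration has no body in which the parameter could be reassigned. A sketch of the corrected declaration, trimmed to the one flagged method:

   ```
   import java.io.IOException;

   import com.amazonaws.services.s3.model.ObjectMetadata;

   public interface ContextAccessors {
     // was: ObjectMetadata getObjectMetadata(final String key) throws IOException;
     ObjectMetadata getObjectMetadata(String key) throws IOException;
   }
   ```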



[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555253967



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is its string form,
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return the stringified object as UTF-8 bytes, or an empty array if null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes: null if the input
+   * is null, otherwise the UTF-8 decoded string.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path)}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names)}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Kept public and static for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr;
+    xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);
+      }
+    }
+    // missing/empty header or parse failure.
+    return Optional.empty();
+  }
+
+  /**
+   * Creates a copy of the passed {@link ObjectMetadata}.
+   * Does so without using the {@link ObjectMetadata#clone()} method,
+   * to avoid copying unnecessary headers.
+   * This operation does not copy the {@code X_HEADER_MAGIC_MARKER}
+   * header to avoid confusion. If a marker file is renamed,
+   * it loses information about any remapped file.
+   * @param source the {@link ObjectMetadata} to copy
+   * @param ret the metadata to update; this is the return value.
+   * @return a copy of {@link ObjectMetadata} with only relevant attributes

Review comment:
       It was originally like that; I assume I (or somebody else) wanted it that way. I'll change it to return void.
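
       For illustration, a rough sketch of the void-returning form under discussion. The body is an assumption, not the patch's final code: the set of copied attributes is abbreviated, and it relies on HeaderProcessing's existing imports (ObjectMetadata, X_HEADER_MAGIC_MARKER).

   ```
   /**
    * Copy the relevant attributes of {@code source} into {@code dest} in place.
    * Deliberately skips the X_HEADER_MAGIC_MARKER header.
    */
   public static void cloneObjectMetadata(ObjectMetadata source, ObjectMetadata dest) {
     dest.setContentType(source.getContentType());
     dest.setContentEncoding(source.getContentEncoding());
     dest.setContentDisposition(source.getContentDisposition());
     source.getUserMetadata().forEach((key, value) -> {
       if (!X_HEADER_MAGIC_MARKER.equals(key)) {
         dest.addUserMetadata(key, value);
       }
     });
   }
   ```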





[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555075093



##########
File path: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
##########
@@ -1873,11 +1873,9 @@
 
 <property>
   <name>fs.s3a.committer.magic.enabled</name>
-  <value>false</value>
+  <value>true</value>

Review comment:
       I will add this to the release notes. This *does not* enable the committer; all it does is declare "this store has the consistency needed for the committer".
   
   Now, we could do a bit more in a separate patch, as there are more settings I'd like to change. In what we ship in Cloudera, `fs.s3a.buffer.dir` is set to something like `${env.LOCAL_DIR:${hadoop.tmp.dir}}/s3a` so that incomplete buffered writes (staging-committer data and blocks for normal files/magic uploads) are always cleaned up. I could make the core-default settings for this committer different.
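
   For illustration only, a hedged sketch of setting that property programmatically; the path and bucket are placeholders, and a real deployment would wire this through core-site.xml rather than code:

   ```
   import java.net.URI;

   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.fs.FileSystem;

   public final class BufferDirSetup {
     public static FileSystem create() throws Exception {
       Configuration conf = new Configuration();
       // Buffer incomplete uploads under a job-local directory so that
       // leftovers are reclaimed with the job; the path policy is site-specific.
       conf.set("fs.s3a.buffer.dir", "/tmp/yarn-local/s3a");
       return FileSystem.get(new URI("s3a://example-bucket/"), conf);
     }
   }
   ```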





[GitHub] [hadoop] sunchao commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r554269175



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {

Review comment:
       nit: I wonder if some kind of caching would be useful here. We are calling `getObjectMetadata` for every `getXAttr` call.
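
       A hedged sketch of what such a cache could look like, assuming a Guava cache keyed by path with a short TTL. The class, field, and method names are hypothetical, not part of the patch; note the public XAttr methods would still need rerouting through it.

   ```
   import java.io.IOException;
   import java.util.concurrent.ExecutionException;
   import java.util.concurrent.TimeUnit;

   import com.amazonaws.services.s3.model.ObjectMetadata;
   import com.google.common.cache.Cache;
   import com.google.common.cache.CacheBuilder;

   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.fs.s3a.impl.HeaderProcessing;
   import org.apache.hadoop.fs.s3a.impl.StoreContext;

   // Hypothetical subclass: briefly cache HEAD results so getXAttr/getXAttrs/
   // listXAttrs calls on the same path can share one getObjectMetadata request.
   public class CachingHeaderProcessing extends HeaderProcessing {

     private final Cache<Path, ObjectMetadata> metadataCache =
         CacheBuilder.newBuilder()
             .maximumSize(100)
             .expireAfterWrite(30, TimeUnit.SECONDS)
             .build();

     public CachingHeaderProcessing(StoreContext storeContext) {
       super(storeContext);
     }

     protected ObjectMetadata cachedObjectMetadata(Path path) throws IOException {
       try {
         return metadataCache.get(path, () ->
             getStoreContext().getContextAccessors().getObjectMetadata(path));
       } catch (ExecutionException e) {
         Throwable cause = e.getCause();
         throw cause instanceof IOException
             ? (IOException) cause
             : new IOException(cause);
       }
     }
   }
   ```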

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -330,6 +331,11 @@
    */
   private DirectoryPolicy directoryPolicy;
 
+  /**
+   * Header processing for XAttr. Created on demand.

Review comment:
       nit: is "Created on demand" accurate? It is created in `initialize`.

##########
File path: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
##########
@@ -1873,11 +1873,9 @@
 
 <property>
   <name>fs.s3a.committer.magic.enabled</name>
-  <value>false</value>
+  <value>true</value>

Review comment:
       I also wonder if this (along with the documentation change) is required for this PR.

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =

Review comment:
       nit: to be consistent with the above, should we add comments for the following constants?

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is its string form,
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return the stringified object as UTF-8 bytes, or an empty array if null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes: null if the input
+   * is null, otherwise the UTF-8 decoded string.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path)}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names)}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Kept public and static for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr;
+    xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);
+      }
+    }
+    // missing/empty header or parse failure.
+    return Optional.empty();
+  }
+
+  /**
+   * Creates a copy of the passed {@link ObjectMetadata}.
+   * Does so without using the {@link ObjectMetadata#clone()} method,
+   * to avoid copying unnecessary headers.
+   * This operation does not copy the {@code X_HEADER_MAGIC_MARKER}
+   * header to avoid confusion. If a marker file is renamed,
+   * it loses information about any remapped file.
+   * @param source the {@link ObjectMetadata} to copy
+   * @param ret the metadata to update; this is the return value.
+   * @return a copy of {@link ObjectMetadata} with only relevant attributes

Review comment:
       hmm, why do we need a return value if `ret` is already changed in place?





[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-743392212


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 39s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 7 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 27s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 27s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m  7s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 38s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  2s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 10s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 25s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 48s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  20m 48s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m  8s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m  8s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 48s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/5/artifact/out/diff-checkstyle-root.txt) |  root: The patch generated 1 new + 23 unchanged - 0 fixed = 24 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 15s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/5/artifact/out/whitespace-eol.txt) |  The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m  6s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  0s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 44s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 55s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 23s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 47s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 195m 48s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux dc03c72e3769 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9bd3c9bc506 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/5/testReport/ |
   | Max. process+thread count | 2232 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/5/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-740249672


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 14s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 4 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 57s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 39s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 57s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 57s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 30s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 10s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 26s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 43s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  20m 43s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 17s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 17s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 47s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/1/artifact/out/diff-checkstyle-root.txt) |  root: The patch generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15)  |
   | +1 :green_heart: |  mvnsite  |   2m 12s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/1/artifact/out/whitespace-eol.txt) |  The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 18s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 44s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m  1s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 25s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m 32s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux c05b3b4b492a 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 32099e36dda |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/1/testReport/ |
   | Max. process+thread count | 2735 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/1/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-759711312


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 22s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  20m 59s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 50s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  23m 30s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 35s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 58s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 23s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   4m  5s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 48s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  26m 49s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  26m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 41s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  19m 41s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 53s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/artifact/out/diff-checkstyle-root.txt) |  root: The patch generated 1 new + 26 unchanged - 1 fixed = 27 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 16s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/artifact/out/whitespace-eol.txt) |  The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 55s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 55s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m 12s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 28s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 222m 26s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 87dd6fb9a2d9 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 836c6304304 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/testReport/ |
   | Max. process+thread count | 1905 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org


[GitHub] [hadoop] mukund-thakur commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
mukund-thakur commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r563717996



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -1800,31 +1809,16 @@ public ObjectMetadata getObjectMetadata(Path path) throws IOException {
    * @return metadata
    * @throws IOException IO and object access problems.
    */
-  @VisibleForTesting
   @Retries.RetryTranslated
-  public ObjectMetadata getObjectMetadata(Path path,
+  private ObjectMetadata getObjectMetadata(Path path,
       ChangeTracker changeTracker, Invoker changeInvoker, String operation)
       throws IOException {
-    checkNotClosed();

Review comment:
       I hope the removal of this is not a typo.






[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-740161845


   + @sunchao @dongjoon-hyun 
   
   This is not for merging, just for a run through Yetus and for discussion.
   
   Tested S3A against London (consistent!) with and without S3Guard
   
   ```
   mvit -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep  -Dfs.s3a.directory.marker.audit=true
   mvit -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=delete -Ds3guard -Ddynamo  -Dfs.s3a.directory.marker.audit=true
   ```




[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555204308



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
##########
@@ -1048,4 +1048,10 @@ private Constants() {
   public static final String STORE_CAPABILITY_DIRECTORY_MARKER_ACTION_DELETE
       = "fs.s3a.capability.directory.marker.action.delete";
 
+  /**
+   * To comply with the XAttr rules, all headers of the object retrieved
+   * through the getXAttr APIs have the prefix: {@value}.
+   */
+  public static final String HEADER_PREFIX = "header.";

Review comment:
       ok
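   
   For anyone skimming: every attribute surfaced through the XAttr API carries
   this "header." prefix. A minimal usage sketch (treat `fs`, an S3A FileSystem,
   and `path` as assumptions; "Content-Length" is the AWS SDK header name the
   XA_CONTENT_LENGTH constant is built from):
   
   ```java
   import java.nio.charset.StandardCharsets;
   
   // Sketch: read one standard object header through the XAttr API.
   byte[] raw = fs.getXAttr(path, "header.Content-Length");
   String contentLength = raw == null
       ? null
       : new String(raw, StandardCharsets.UTF_8);
   ```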






[GitHub] [hadoop] sunchao commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-755439979


   > @sunchao @liuml07 could either of you take a look @ this?
   
   Yes, I can help on this PR; sorry for the delay. I will spend some time on it this week.




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-767778010


   Big thanks to all reviewers - it's in trunk, the backport to 3.3 is in progress, and I'll get the Spark one in now as well




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-743129197


   Fixed the new javadocs. The whole AWS codebase has slowly been dropping in javadoc quality; we need to do a cleanup there




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-748063774


   > Gentle ping~
   
   for whom?
   
   I'm happy with this; the only open design issue is "should we lower-case all the classic headers?", primarily because it reduces the risk of someone getting the .equals comparison wrong in different locales
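   
   (Illustration of the locale risk, a sketch only: case conversion with the
   default locale is locale-sensitive - the classic Turkish dotless-i problem -
   so any such normalization would have to use Locale.ROOT:)
   
   ```java
   import java.util.Locale;
   
   String header = "CONTENT-LENGTH";
   
   // Locale-sensitive: under a Turkish locale, 'I' lower-cases to the
   // dotless 'ı', so this comparison can fail.
   boolean risky = header.toLowerCase().equals("content-length");
   
   // Locale-independent: always yields "content-length".
   boolean safe = header.toLowerCase(Locale.ROOT).equals("content-length");
   ```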




[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r562770735



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
##########
@@ -0,0 +1,300 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.impl;
+
+import java.io.IOException;
+import java.nio.charset.StandardCharsets;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Optional;
+import java.util.TreeMap;
+
+import com.amazonaws.services.s3.Headers;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.fs.Path;
+
+import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
+import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
+
+/**
+ * Part of the S3A FS where object headers are
+ * processed.
+ * Implements all the various XAttr read operations.
+ * Those APIs all expect byte arrays back.
+ * Metadata cloning is also implemented here, so as
+ * to stay in sync with custom header logic.
+ */
+public class HeaderProcessing extends AbstractStoreOperation {
+
+  private static final Logger LOG = LoggerFactory.getLogger(
+      HeaderProcessing.class);
+
+  private static final byte[] EMPTY = new byte[0];
+
+  /**
+   * Length XAttr.
+   */
+  public static final String XA_CONTENT_LENGTH =
+      HEADER_PREFIX + Headers.CONTENT_LENGTH;
+
+  /**
+   * last modified XAttr.
+   */
+  public static final String XA_LAST_MODIFIED =
+      HEADER_PREFIX + Headers.LAST_MODIFIED;
+
+  public static final String XA_CONTENT_DISPOSITION =
+      HEADER_PREFIX + Headers.CONTENT_DISPOSITION;
+
+  public static final String XA_CONTENT_ENCODING =
+      HEADER_PREFIX + Headers.CONTENT_ENCODING;
+
+  public static final String XA_CONTENT_LANGUAGE =
+      HEADER_PREFIX + Headers.CONTENT_LANGUAGE;
+
+  public static final String XA_CONTENT_MD5 =
+      HEADER_PREFIX + Headers.CONTENT_MD5;
+
+  public static final String XA_CONTENT_RANGE =
+      HEADER_PREFIX + Headers.CONTENT_RANGE;
+
+  public static final String XA_CONTENT_TYPE =
+      HEADER_PREFIX + Headers.CONTENT_TYPE;
+
+  public static final String XA_ETAG = HEADER_PREFIX + Headers.ETAG;
+
+  public HeaderProcessing(final StoreContext storeContext) {
+    super(storeContext);
+  }
+
+  /**
+   * Query the store, get all the headers into a map. Each Header
+   * has the "header." prefix.
+   * Caller must have read access.
+   * The value of each header is the string value of the object
+   * UTF-8 encoded.
+   * @param path path of object.
+   * @return the headers
+   * @throws IOException failure, including file not found.
+   */
+  private Map<String, byte[]> retrieveHeaders(Path path) throws IOException {
+    StoreContext context = getStoreContext();
+    ObjectMetadata md = context.getContextAccessors()
+        .getObjectMetadata(path);
+    Map<String, String> rawHeaders = md.getUserMetadata();
+    Map<String, byte[]> headers = new TreeMap<>();
+    rawHeaders.forEach((key, value) ->
+        headers.put(HEADER_PREFIX + key, encodeBytes(value)));
+    // and add the usual content length &c, if set
+    headers.put(XA_CONTENT_DISPOSITION,
+        encodeBytes(md.getContentDisposition()));
+    headers.put(XA_CONTENT_ENCODING,
+        encodeBytes(md.getContentEncoding()));
+    headers.put(XA_CONTENT_LANGUAGE,
+        encodeBytes(md.getContentLanguage()));
+    headers.put(XA_CONTENT_LENGTH,
+        encodeBytes(md.getContentLength()));
+    headers.put(
+        XA_CONTENT_MD5,
+        encodeBytes(md.getContentMD5()));
+    headers.put(XA_CONTENT_RANGE,
+        encodeBytes(md.getContentRange()));
+    headers.put(XA_CONTENT_TYPE,
+        encodeBytes(md.getContentType()));
+    headers.put(XA_ETAG,
+        encodeBytes(md.getETag()));
+    headers.put(XA_LAST_MODIFIED,
+        encodeBytes(md.getLastModified()));
+    return headers;
+  }
+
+  /**
+   * Stringify an object and return its bytes in UTF-8 encoding.
+   * @param s source
+   * @return the UTF-8 encoded object, or an empty array if the source is null
+   */
+  public static byte[] encodeBytes(Object s) {
+    return s == null
+        ? EMPTY
+        : s.toString().getBytes(StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get the string value from the bytes:
+   * if null, return null; otherwise return the
+   * UTF-8 decoded bytes.
+   * @param bytes source bytes
+   * @return decoded value
+   */
+  public static String decodeBytes(byte[] bytes) {
+    return bytes == null
+        ? null
+        : new String(bytes, StandardCharsets.UTF_8);
+  }
+
+  /**
+   * Get an XAttr name and value for a file or directory.
+   * @param path Path to get extended attribute
+   * @param name XAttr name.
+   * @return byte[] XAttr value or null
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public byte[] getXAttr(Path path, String name) throws IOException {
+    return retrieveHeaders(path).get(name);
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path}.
+   *
+   * @param path Path to get extended attributes
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   * @throws UnsupportedOperationException if the operation is unsupported
+   *         (default outcome).
+   */
+  public Map<String, byte[]> getXAttrs(Path path) throws IOException {
+    return retrieveHeaders(path);
+  }
+
+  /**
+   * See {@code FileSystem.listXAttrs(path)}.
+   * @param path Path to get extended attributes
+   * @return List of supported XAttrs
+   * @throws IOException IO failure
+   */
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return new ArrayList<>(retrieveHeaders(path).keySet());
+  }
+
+  /**
+   * See {@code FileSystem.getXAttrs(path, names}.
+   * @param path Path to get extended attributes
+   * @param names XAttr names.
+   * @return Map describing the XAttrs of the file or directory
+   * @throws IOException IO failure
+   */
+  public Map<String, byte[]> getXAttrs(Path path, List<String> names)
+      throws IOException {
+    Map<String, byte[]> headers = retrieveHeaders(path);
+    Map<String, byte[]> result = new TreeMap<>();
+    headers.entrySet().stream()
+        .filter(entry -> names.contains(entry.getKey()))
+        .forEach(entry -> result.put(entry.getKey(), entry.getValue()));
+    return result;
+  }
+
+  /**
+   * Convert an XAttr byte array to a long.
+   * Public and static for testability.
+   * @param data data to parse
+   * @return either a length or none
+   */
+  public static Optional<Long> extractXAttrLongValue(byte[] data) {
+    String xAttr = HeaderProcessing.decodeBytes(data);
+    if (StringUtils.isNotEmpty(xAttr)) {
+      try {
+        long l = Long.parseLong(xAttr);
+        if (l >= 0) {
+          return Optional.of(l);
+        }
+      } catch (NumberFormatException ex) {
+        LOG.warn("Not a number: {}", xAttr);

Review comment:
       done, even though it's not as useful as knowing the actual string.
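   
   For context, the end-to-end use this enables looks roughly like the sketch
   below: read the magic-marker header of a file written by the magic committer
   through the XAttr API, then recover the byte count. The constants and the
   extractXAttrLongValue helper are from the patch above; `fs` and `path` (an
   S3A filesystem and a magic-committer output path) are assumptions.
   
   ```java
   import java.util.Optional;
   
   import org.apache.hadoop.fs.s3a.impl.HeaderProcessing;
   
   import static org.apache.hadoop.fs.s3a.Constants.HEADER_PREFIX;
   import static org.apache.hadoop.fs.s3a.commit.CommitConstants.X_HEADER_MAGIC_MARKER;
   
   // Sketch: the marker header is exposed as an XAttr with the usual prefix.
   byte[] marker = fs.getXAttr(path, HEADER_PREFIX + X_HEADER_MAGIC_MARKER);
   Optional<Long> bytesWritten = HeaderProcessing.extractXAttrLongValue(marker);
   bytesWritten.ifPresent(len -> System.out.println("bytes written: " + len));
   ```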






[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r563800923



##########
File path: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
##########
@@ -1873,11 +1873,9 @@
 
 <property>
   <name>fs.s3a.committer.magic.enabled</name>
-  <value>false</value>
+  <value>true</value>

Review comment:
       
   gabor, the disabling option is now in HADOOP-17483; #2637 is a PR to remove the switch entirely. Now we have to decide: do we want that, or would we ever want to turn the S3A client into a dumb client which doesn't do "magic" when it starts writing under certain paths?
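   
   For anyone following along, the switch is an ordinary Hadoop configuration
   property; a minimal sketch of turning it back off programmatically for a
   client (the property name is exactly the one in the diff above):
   
   ```java
   import org.apache.hadoop.conf.Configuration;
   
   // Sketch: disable the magic committer support again for this client.
   Configuration conf = new Configuration();
   conf.setBoolean("fs.s3a.committer.magic.enabled", false);
   ```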

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -1800,31 +1809,16 @@ public ObjectMetadata getObjectMetadata(Path path) throws IOException {
    * @return metadata
    * @throws IOException IO and object access problems.
    */
-  @VisibleForTesting
   @Retries.RetryTranslated
-  public ObjectMetadata getObjectMetadata(Path path,
+  private ObjectMetadata getObjectMetadata(Path path,
       ChangeTracker changeTracker, Invoker changeInvoker, String operation)
       throws IOException {
-    checkNotClosed();

Review comment:
       I think I had a good reason for doing this (it was a private method, checks further up, etc). But looking at it now I'm concluding that whatever that reason was -it's not enough. Restoring.







[GitHub] [hadoop] steveloughran merged pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran merged pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530


   




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-754696434


   Rebased, full test run. The failures were those of HADOOP-17403 and HADOOP-17451; both unrelated




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-754100106


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 23s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 7 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 32s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  27m 45s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  20m 36s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 23s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  2s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 13s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 30s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 57s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  20m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 15s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 15s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 47s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 15s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 19s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  4s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 50s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 58s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 27s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 204m 52s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux 7869e92f3a79 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2825d060cf9 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/7/testReport/ |
   | Max. process+thread count | 1440 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/7/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus removed a comment on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758775336


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 19s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  |  Maven dependency ordering for branch  |
   | -1 :x: |  mvninstall  |   0m 13s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-mvninstall-root.txt) |  root in trunk failed.  |
   | -1 :x: |  compile  |   0m 23s | [/branch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) |  root in trunk failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.  |
   | -1 :x: |  compile  |   0m 17s | [/branch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  root in trunk failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   4m 51s | [/buildtool-branch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/buildtool-branch-checkstyle-root.txt) |  The patch fails to run checkstyle in root  |
   | -1 :x: |  mvnsite  |   1m 56s | [/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in trunk failed.  |
   | -1 :x: |  mvnsite  |   2m 41s | [/branch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-mvnsite-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in trunk failed.  |
   | -1 :x: |  shadedclient  |   9m 50s |  |  branch has errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 31s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |  13m 49s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | -1 :x: |  findbugs  |   0m 13s | [/branch-findbugs-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common.txt) |  hadoop-common in trunk failed.  |
   | -1 :x: |  findbugs  |   0m 32s | [/branch-findbugs-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/branch-findbugs-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in trunk failed.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   4m 36s |  |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   0m 13s | [/patch-mvninstall-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvninstall-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  mvninstall  |   0m 22s | [/patch-mvninstall-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   | -1 :x: |  compile  |   0m 25s | [/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) |  root in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.  |
   | -1 :x: |  javac  |   0m 25s | [/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) |  root in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.  |
   | -1 :x: |  compile  |   7m 20s | [/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  root in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -1 :x: |  javac  |   7m 20s | [/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  root in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | -0 :warning: |  checkstyle  |   4m 45s | [/buildtool-patch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/buildtool-patch-checkstyle-root.txt) |  The patch fails to run checkstyle in root  |
   | -1 :x: |  mvnsite  |   0m 18s | [/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  mvnsite  |   0m 24s | [/patch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/whitespace-eol.txt) |  The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 33s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | -1 :x: |  findbugs  |   0m 13s | [/patch-findbugs-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-findbugs-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  findbugs  |   0m 22s | [/patch-findbugs-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |   0m 13s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch failed.  |
   | -1 :x: |  unit  |   0m 13s | [/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt) |  hadoop-aws in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 31s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  59m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 2fcc4597f1e7 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 85b1c017eed |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/testReport/ |
   | Max. process+thread count | 621 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/11/console |
   | versions | git=2.17.1 maven=3.6.0 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-744563545


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 35s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 7 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 19s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  22m 43s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 14s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m  7s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 41s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m  9s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 25s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  6s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 15s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 34s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 38s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  20m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 14s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 14s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 13s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  15m 56s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 13s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 45s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 57s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 37s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 191m 24s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux fbaf00de0475 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 3234e5eaf36 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/6/testReport/ |
   | Max. process+thread count | 1805 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/6/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-741113543


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 7 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 53s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 56s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  19m 59s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  17m 14s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m  1s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 18s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 32s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 12s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  19m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 17s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  17m 17s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   2m 36s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/2/artifact/out/diff-checkstyle-root.txt) |  root: The patch generated 7 new + 23 unchanged - 0 fixed = 30 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 23s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  0s | [/whitespace-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/2/artifact/out/whitespace-eol.txt) |  The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  15m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | -1 :x: |  javadoc  |   0m 43s | [/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/2/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) |  hadoop-aws in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.  |
   | +1 :green_heart: |  findbugs  |   3m 48s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   9m 31s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 43s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 56s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 185m 52s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle markdownlint |
   | uname | Linux 4aa5c9d2c709 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4ffec79b7ca |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/2/testReport/ |
   | Max. process+thread count | 3249 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/2/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] bgaborg commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
bgaborg commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r553870733



##########
File path: hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
##########
@@ -1873,11 +1873,9 @@
 
 <property>
   <name>fs.s3a.committer.magic.enabled</name>
-  <value>false</value>
+  <value>true</value>

Review comment:
       With this change we will enable the magic committer for all components. Maybe point that out in the title of the issue.






   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/testReport/ |
   | Max. process+thread count | 1905 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/12/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-766897657


   * Pushed up the rebased and squashed PR.
   * Added changes to go with Mukund's review as a second patch.
   * Testing in progress.




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758139586


   > Looks good to me overall. Using XAttr is smart and extensible for solving problems like this - and it breaks nothing!
   
   Thanks. The biggest risk is actually the copy code, as that is where a regression could creep in; we've had problems there with copying encrypted data in the past.




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-767775075


   Ahh, great, thanks!




[GitHub] [hadoop] hadoop-yetus commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
hadoop-yetus commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758710185


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 20s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch appears to include 9 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 41s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 51s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 46s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 23s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   2m 53s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 18s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 49s |  |  branch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 11s |  |  Used deprecated FindBugs config; considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 27s |  |  trunk passed  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  9s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javac  |  21m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  18m 17s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  javac  |  18m 17s |  |  the patch passed  |
   | +1 :green_heart: |  checkstyle  |   2m 50s |  |  root: The patch generated 0 new + 26 unchanged - 1 fixed = 26 total (was 27)  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  shadedclient  |  17m 25s |  |  patch has no errors when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   2m 10s |  |  the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 46s |  |  the patch passed  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  10m  0s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  unit  |   1m 27s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m 56s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/10/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2530 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml markdownlint |
   | uname | Linux 5b52bc6272c7 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ca7dd5fad33 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/10/testReport/ |
   | Max. process+thread count | 2244 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2530/10/console |
   | versions | git=2.17.1 maven=3.6.0 findbugs=4.0.6 |
   | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




[GitHub] [hadoop] mukund-thakur commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
mukund-thakur commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r563717996



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -1800,31 +1809,16 @@ public ObjectMetadata getObjectMetadata(Path path) throws IOException {
    * @return metadata
    * @throws IOException IO and object access problems.
    */
-  @VisibleForTesting
   @Retries.RetryTranslated
-  public ObjectMetadata getObjectMetadata(Path path,
+  private ObjectMetadata getObjectMetadata(Path path,
       ChangeTracker changeTracker, Invoker changeInvoker, String operation)
       throws IOException {
-    checkNotClosed();

Review comment:
       I hope the removal of this is intentional and not a typo.
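
   For context, a hedged sketch of the usual pattern, assuming the guard was deliberately hoisted into the public getObjectMetadata(Path) entry point so the private overload no longer repeats it; everything below beyond the method names in the diff is illustrative.

   import java.io.IOException;
   import org.apache.hadoop.fs.Path;
   import com.amazonaws.services.s3.model.ObjectMetadata;

   // Illustrative guard pattern: check once at the public entry point,
   // let private internals assume the filesystem is still open.
   public class ClosedCheckSketch {
     private volatile boolean closed;

     /** Public API: the only place the guard runs. */
     public ObjectMetadata getObjectMetadata(Path path) throws IOException {
       checkNotClosed();
       return getObjectMetadataInternal(path);
     }

     /** Private helper: callers have already passed the guard. */
     private ObjectMetadata getObjectMetadataInternal(Path path)
         throws IOException {
       return new ObjectMetadata(); // stand-in for the real HEAD request
     }

     private void checkNotClosed() throws IOException {
       if (closed) {
         throw new IOException("FileSystem is closed");
       }
     }
   }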






[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-744455742


   I'm happy with this PR. Design issues to consider:
   * log headers at debug when we copy metadata
   * should all headers be mapped to lower case in the XAttr API? (see the sketch after this list)
   * is the new custom header name the right one?
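
   On the lower-casing question, a minimal sketch of what case normalisation could look like, assuming headers arrive as a plain String map; this is not taken from the actual HeaderProcessing class.

   import java.util.Locale;
   import java.util.Map;
   import java.util.TreeMap;

   // Hypothetical sketch: normalise header names to lower case before
   // exposing them as XAttrs, so probes become case-insensitive.
   public final class HeaderCaseSketch {
     private HeaderCaseSketch() {
     }

     /** Lower-case every key; on a collision the later entry wins. */
     public static Map<String, String> toLowerCaseKeys(
         Map<String, String> headers) {
       Map<String, String> result = new TreeMap<>();
       for (Map.Entry<String, String> e : headers.entrySet()) {
         result.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
       }
       return result;
     }
   }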




[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-763814617


   Note: I think I'm going to remove the change to always enable the magic committer from this PR. Instead I'm going to remove the switch entirely in the same PR that updates all the S3A docs to cover the fact that AWS S3 is now consistent. We don't need the switch any more, so ignore that bit of the patch. Thanks.




[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555242848



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -4103,56 +4111,7 @@ private ObjectMetadata cloneObjectMetadata(ObjectMetadata source) {
     // in future there are new attributes added to ObjectMetadata
     // that we do not explicitly call to set here
     ObjectMetadata ret = newObjectMetadata(source.getContentLength());
-
-    // Possibly null attributes
-    // Allowing nulls to pass breaks it during later use
-    if (source.getCacheControl() != null) {
-      ret.setCacheControl(source.getCacheControl());
-    }
-    if (source.getContentDisposition() != null) {
-      ret.setContentDisposition(source.getContentDisposition());
-    }
-    if (source.getContentEncoding() != null) {
-      ret.setContentEncoding(source.getContentEncoding());
-    }
-    if (source.getContentMD5() != null) {
-      ret.setContentMD5(source.getContentMD5());
-    }
-    if (source.getContentType() != null) {
-      ret.setContentType(source.getContentType());
-    }
-    if (source.getExpirationTime() != null) {
-      ret.setExpirationTime(source.getExpirationTime());
-    }
-    if (source.getExpirationTimeRuleId() != null) {
-      ret.setExpirationTimeRuleId(source.getExpirationTimeRuleId());
-    }
-    if (source.getHttpExpiresDate() != null) {
-      ret.setHttpExpiresDate(source.getHttpExpiresDate());
-    }
-    if (source.getLastModified() != null) {
-      ret.setLastModified(source.getLastModified());
-    }
-    if (source.getOngoingRestore() != null) {
-      ret.setOngoingRestore(source.getOngoingRestore());
-    }
-    if (source.getRestoreExpirationTime() != null) {
-      ret.setRestoreExpirationTime(source.getRestoreExpirationTime());
-    }
-    if (source.getSSEAlgorithm() != null) {
-      ret.setSSEAlgorithm(source.getSSEAlgorithm());
-    }
-    if (source.getSSECustomerAlgorithm() != null) {
-      ret.setSSECustomerAlgorithm(source.getSSECustomerAlgorithm());
-    }
-    if (source.getSSECustomerKeyMd5() != null) {
-      ret.setSSECustomerKeyMd5(source.getSSECustomerKeyMd5());
-    }
-
-    for (Map.Entry<String, String> e : source.getUserMetadata().entrySet()) {
-      ret.addUserMetadata(e.getKey(), e.getValue());
-    }
-    return ret;
+    return getHeaderProcessing().cloneObjectMetadata(source, ret);

Review comment:
       Will do, especially as it's a bit less brittle now.
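
   For illustration of the "less brittle" point, a sketch of a table-driven copy: one getter/setter pair per attribute instead of a hand-written null-checked block each. Only a few of the attributes from the diff are listed; the shape of the code, not the list, is what matters, and none of this is the actual HeaderProcessing implementation.

   import com.amazonaws.services.s3.model.ObjectMetadata;
   import java.util.Arrays;
   import java.util.List;
   import java.util.function.BiConsumer;
   import java.util.function.Function;

   // Hypothetical sketch: drive the clone from a table, so a new
   // attribute is one entry rather than a new four-line if block.
   public final class MetadataCopySketch {

     private static final class Attr<T> {
       private final Function<ObjectMetadata, T> getter;
       private final BiConsumer<ObjectMetadata, T> setter;

       Attr(Function<ObjectMetadata, T> g, BiConsumer<ObjectMetadata, T> s) {
         getter = g;
         setter = s;
       }

       void copy(ObjectMetadata from, ObjectMetadata to) {
         T v = getter.apply(from);
         if (v != null) {          // skip nulls, as the old code did
           setter.accept(to, v);
         }
       }
     }

     private static final List<Attr<?>> ATTRS = Arrays.<Attr<?>>asList(
         new Attr<>(ObjectMetadata::getCacheControl,
             ObjectMetadata::setCacheControl),
         new Attr<>(ObjectMetadata::getContentEncoding,
             ObjectMetadata::setContentEncoding),
         new Attr<>(ObjectMetadata::getContentType,
             ObjectMetadata::setContentType));

     private MetadataCopySketch() {
     }

     public static void copyAttributes(ObjectMetadata from,
         ObjectMetadata to) {
       ATTRS.forEach(a -> a.copy(from, to));
       from.getUserMetadata().forEach(to::addUserMetadata);
     }
   }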

##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -4382,6 +4341,37 @@ public EtagChecksum getFileChecksum(Path f, final long length)
     }
   }
 
+  /**
+   * Get header processing support.
+   * @return the header processing of this instance.
+   */
+  private HeaderProcessing getHeaderProcessing() {
+    return headerProcessing;
+  }
+
+  @Override
+  public byte[] getXAttr(final Path path, final String name)
+      throws IOException {
+    return getHeaderProcessing().getXAttr(path, name);
+  }
+
+  @Override
+  public Map<String, byte[]> getXAttrs(final Path path) throws IOException {
+    return getHeaderProcessing().getXAttrs(path);
+  }
+
+  @Override
+  public Map<String, byte[]> getXAttrs(final Path path,
+      final List<String> names)
+      throws IOException {
+    return getHeaderProcessing().getXAttrs(path, names);
+  }
+
+  @Override
+  public List<String> listXAttrs(final Path path) throws IOException {
+    return headerProcessing.listXAttrs(path);

Review comment:
       ok






[GitHub] [hadoop] steveloughran commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-740536282


   I have a more straightforward solution to this: have S3A implement the FileSystem.getXAttr API call to return the object headers. No risk to other applications; all Spark will need to do is check for the header before looking for the file length, swallow any exceptions raised in the API call, and fall back to getFileStatus. Less than 10 lines.
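
   A minimal sketch of that client-side probe, assuming the byte count is surfaced as an XAttr; the attribute name below is an assumption for the example, not necessarily the one the patch settles on.

   import java.io.IOException;
   import java.nio.charset.StandardCharsets;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   // Hypothetical probe: read the marker XAttr, fall back to the file
   // status length if the attribute is missing or the call fails.
   public final class MagicLengthProbe {
     // Assumed XAttr name; illustrative only.
     private static final String MAGIC_LEN_XATTR =
         "header.x-hadoop-s3a-magic-data-length";

     private MagicLengthProbe() {
     }

     public static long bytesWritten(FileSystem fs, Path path)
         throws IOException {
       try {
         byte[] raw = fs.getXAttr(path, MAGIC_LEN_XATTR);
         if (raw != null) {
           return Long.parseLong(new String(raw, StandardCharsets.UTF_8));
         }
       } catch (IOException | UnsupportedOperationException
           | NumberFormatException e) {
         // no XAttr support or no marker: fall back as described above
       }
       return fs.getFileStatus(path).getLen();
     }
   }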




[GitHub] [hadoop] steveloughran commented on a change in pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
steveloughran commented on a change in pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#discussion_r555252337



##########
File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -330,6 +331,11 @@
    */
   private DirectoryPolicy directoryPolicy;
 
+  /**
+   * Header processing for XAttr. Created on demand.

Review comment:
       You are right. I originally had it created on demand for XAttr calls, but once I pulled it into the copy operation I just made it something created earlier.






[GitHub] [hadoop] dongjoon-hyun commented on pull request #2530: HADOOP-17414. Magic committer files don't have the count of bytes written collected by spark

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #2530:
URL: https://github.com/apache/hadoop/pull/2530#issuecomment-740247128


   Thank you for pinging me. 😄 

