You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/08/25 15:08:34 UTC

[GitHub] [solr] HoustonPutman opened a new pull request #271: SOLR-15599: Upgrade aws sdk

HoustonPutman opened a new pull request #271:
URL: https://github.com/apache/solr/pull/271


   https://issues.apache.org/jira/browse/SOLR-15599
   
   So far the checklist:
   
   - [ ] The tests run and pass, with similar times (Had to change to use the http client in the tests, because https caused a large slowdown when uploading files to s3mock)
   - [ ] Documentation updates, such as the new proxy config param and system/env properties for auth.
   - [ ] I think the dependencies are good, but I'd love a check around that.
   
   One thing I couldn't find a similar setting for in the v2 client was: (not sure if this is actually an issue or not)
   
   ```java
   new ClientConfiguration().withProtocol(Protocol.HTTPS);
   ```
   
   It also looks like the multi-part-upload in v2 doesn't have an idea of a "last part", but the whole process is a little more controlled anyways, and the tests pass, so it should be fine?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695972996



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -253,17 +256,20 @@ void deleteDirectory(String path) throws S3Exception {
    * @return true if path exists, otherwise false?
    */
   boolean pathExists(String path) throws S3Exception {
-    path = sanitizedPath(path);
+    final String s3Path = sanitizedPath(path);
 
     // for root return true
-    if (path.isEmpty() || S3_FILE_PATH_DELIMITER.equals(path)) {
+    if (s3Path.isEmpty() || S3_FILE_PATH_DELIMITER.equals(s3Path)) {
       return true;
     }
 
     try {
-      return s3Client.doesObjectExist(bucketName, path);
-    } catch (AmazonClientException ase) {
-      throw handleAmazonException(ase);
+      s3Client.headObject(builder -> builder.bucket(bucketName).key(s3Path));

Review comment:
       I remember seeing this in the API last I looked -- no first class support for object existence :(




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695996885



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -73,48 +80,51 @@
   // Error messages returned by S3 for a key not found.
   private static final Set<String> NOT_FOUND_CODES = Set.of("NoSuchKey", "404 Not Found");
 
-  private final AmazonS3 s3Client;
+  private final S3Client s3Client;
 
   /** The S3 bucket where we read/write all data. */
   private final String bucketName;
 
   S3StorageClient(
-      String bucketName, String region, String proxyHost, int proxyPort, String endpoint) {
-    this(createInternalClient(region, proxyHost, proxyPort, endpoint), bucketName);
+      String bucketName,
+      String region,
+      String proxyUrl,
+      boolean proxyUseSystemSettings,
+      String endpoint) {
+    this(createInternalClient(region, proxyUrl, proxyUseSystemSettings, endpoint), bucketName);
   }
 
   @VisibleForTesting
-  S3StorageClient(AmazonS3 s3Client, String bucketName) {
+  S3StorageClient(S3Client s3Client, String bucketName) {
     this.s3Client = s3Client;
     this.bucketName = bucketName;
   }
 
-  private static AmazonS3 createInternalClient(
-      String region, String proxyHost, int proxyPort, String endpoint) {
-    ClientConfiguration clientConfig = new ClientConfiguration().withProtocol(Protocol.HTTPS);
-
+  private static S3Client createInternalClient(
+      String region, String proxyUrl, boolean proxyUseSystemSettings, String endpoint) {
+    ApacheHttpClient.Builder sdkHttpClientBuilder = ApacheHttpClient.builder();

Review comment:
       But I'm also fine if you want to stick with the sync Apache client. Currently, there's not too much to gain for adding async calls in our codebase (maybe the S3OutputStream could utilize it in its multipart uploads? could probably batch deletes faster?)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695989204



##########
File path: versions.props
##########
@@ -129,5 +129,6 @@ org.tallison:jmatio=1.5
 org.threeten:threetenbp=1.3.3
 org.tukaani:xz=1.8
 org.xerial.snappy:snappy-java=1.1.7.6
+software.amazon.awssdk:*=2.16.93

Review comment:
       Yeah I was also wondering if we can now exclude the v1 API when importing s3mock...we shouldn't be using any codepath in it that touches a v1 AWS SDK jar, so it could possibly work? But would make future upgrades potentially ugly since we're hacking up their jar.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on pull request #271:
URL: https://github.com/apache/solr/pull/271#issuecomment-907330441


   @athrog Do you see any other issues here or should we be good to commit and backport?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on pull request #271:
URL: https://github.com/apache/solr/pull/271#issuecomment-905591597


   @athrog Please critique to your hearts content.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on pull request #271:
URL: https://github.com/apache/solr/pull/271#issuecomment-905759932


   Regarding
   
   > One thing I couldn't find a similar setting for in the v2 client was: (not sure if this is actually an issue or not)
   > new ClientConfiguration().withProtocol(Protocol.HTTPS);
   
   IIRC the v2 API uses HTTPS by default, so no need to explicitly set it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695980775



##########
File path: versions.props
##########
@@ -129,5 +129,6 @@ org.tallison:jmatio=1.5
 org.threeten:threetenbp=1.3.3
 org.tukaani:xz=1.8
 org.xerial.snappy:snappy-java=1.1.7.6
+software.amazon.awssdk:*=2.16.93

Review comment:
       We should also remove `com.amazonaws:*=1.12.42`, right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on pull request #271:
URL: https://github.com/apache/solr/pull/271#issuecomment-907338957


   It looks good to me! I assume you've tested it end to end with s3mock and real s3?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695984569



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -73,48 +80,51 @@
   // Error messages returned by S3 for a key not found.
   private static final Set<String> NOT_FOUND_CODES = Set.of("NoSuchKey", "404 Not Found");
 
-  private final AmazonS3 s3Client;
+  private final S3Client s3Client;
 
   /** The S3 bucket where we read/write all data. */
   private final String bucketName;
 
   S3StorageClient(
-      String bucketName, String region, String proxyHost, int proxyPort, String endpoint) {
-    this(createInternalClient(region, proxyHost, proxyPort, endpoint), bucketName);
+      String bucketName,
+      String region,
+      String proxyUrl,
+      boolean proxyUseSystemSettings,
+      String endpoint) {
+    this(createInternalClient(region, proxyUrl, proxyUseSystemSettings, endpoint), bucketName);
   }
 
   @VisibleForTesting
-  S3StorageClient(AmazonS3 s3Client, String bucketName) {
+  S3StorageClient(S3Client s3Client, String bucketName) {
     this.s3Client = s3Client;
     this.bucketName = bucketName;
   }
 
-  private static AmazonS3 createInternalClient(
-      String region, String proxyHost, int proxyPort, String endpoint) {
-    ClientConfiguration clientConfig = new ClientConfiguration().withProtocol(Protocol.HTTPS);
-
+  private static S3Client createInternalClient(
+      String region, String proxyUrl, boolean proxyUseSystemSettings, String endpoint) {
+    ApacheHttpClient.Builder sdkHttpClientBuilder = ApacheHttpClient.builder();

Review comment:
       The Netty nio client provides a proxy config: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/http/nio/netty/NettyNioAsyncHttpClient.Builder.html#proxyConfiguration-software.amazon.awssdk.http.nio.netty.ProxyConfiguration-




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman merged pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman merged pull request #271:
URL: https://github.com/apache/solr/pull/271


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695978425



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -253,17 +256,20 @@ void deleteDirectory(String path) throws S3Exception {
    * @return true if path exists, otherwise false?
    */
   boolean pathExists(String path) throws S3Exception {
-    path = sanitizedPath(path);
+    final String s3Path = sanitizedPath(path);
 
     // for root return true
-    if (path.isEmpty() || S3_FILE_PATH_DELIMITER.equals(path)) {
+    if (s3Path.isEmpty() || S3_FILE_PATH_DELIMITER.equals(s3Path)) {
       return true;
     }
 
     try {
-      return s3Client.doesObjectExist(bucketName, path);
-    } catch (AmazonClientException ase) {
-      throw handleAmazonException(ase);
+      s3Client.headObject(builder -> builder.bucket(bucketName).key(s3Path));

Review comment:
       Yeah it's unfortunate 😞 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on pull request #271:
URL: https://github.com/apache/solr/pull/271#issuecomment-907340886


   Doing my end-to-end tests as we type!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r696012394



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -73,48 +80,51 @@
   // Error messages returned by S3 for a key not found.
   private static final Set<String> NOT_FOUND_CODES = Set.of("NoSuchKey", "404 Not Found");
 
-  private final AmazonS3 s3Client;
+  private final S3Client s3Client;
 
   /** The S3 bucket where we read/write all data. */
   private final String bucketName;
 
   S3StorageClient(
-      String bucketName, String region, String proxyHost, int proxyPort, String endpoint) {
-    this(createInternalClient(region, proxyHost, proxyPort, endpoint), bucketName);
+      String bucketName,
+      String region,
+      String proxyUrl,
+      boolean proxyUseSystemSettings,
+      String endpoint) {
+    this(createInternalClient(region, proxyUrl, proxyUseSystemSettings, endpoint), bucketName);
   }
 
   @VisibleForTesting
-  S3StorageClient(AmazonS3 s3Client, String bucketName) {
+  S3StorageClient(S3Client s3Client, String bucketName) {
     this.s3Client = s3Client;
     this.bucketName = bucketName;
   }
 
-  private static AmazonS3 createInternalClient(
-      String region, String proxyHost, int proxyPort, String endpoint) {
-    ClientConfiguration clientConfig = new ClientConfiguration().withProtocol(Protocol.HTTPS);
-
+  private static S3Client createInternalClient(
+      String region, String proxyUrl, boolean proxyUseSystemSettings, String endpoint) {
+    ApacheHttpClient.Builder sdkHttpClientBuilder = ApacheHttpClient.builder();

Review comment:
       Yeah let's stick with sync for now. I'm fine switching to async in the future, but I'd rather we limit the scope of this ticket.
   
   The only non-supported option between apache and netty is that we are using the `useSystemPropertyValues` option provided by apache. If we move to netty in the future, we would just need to make sure to check the system properties ourselves and set the appropriate proxy options manually for the user.
   
   Otherwise we can convert `proxyUrl` to `host`, `port` and `scheme` ourselves very easily. (another difference between apache and netty)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695990891



##########
File path: versions.props
##########
@@ -129,5 +129,6 @@ org.tallison:jmatio=1.5
 org.threeten:threetenbp=1.3.3
 org.tukaani:xz=1.8
 org.xerial.snappy:snappy-java=1.1.7.6
+software.amazon.awssdk:*=2.16.93

Review comment:
       I tried, but the `S3MockStarter` class, which we use through `S3MockRule`, imports both sets of APIs. So if we remove the dependency, then we get a class loading issue when starting the tests that use `S3MockRule`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695969485



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -129,17 +139,17 @@ void createDirectory(String path) throws S3Exception {
       //            throw new S3Exception("Parent directory doesn't exist, path=" + path);
     }
 
-    ObjectMetadata objectMetadata = new ObjectMetadata();
-    objectMetadata.setContentType(S3_DIR_CONTENT_TYPE);
-    objectMetadata.setContentLength(0);
-
-    // Create empty object with header
-    final InputStream im = ClosedInputStream.CLOSED_INPUT_STREAM;
-
     try {
-      PutObjectRequest putRequest = new PutObjectRequest(bucketName, path, im, objectMetadata);
-      s3Client.putObject(putRequest);
-    } catch (AmazonClientException ase) {
+      // Create empty object with content type header
+      PutObjectRequest putRequest =
+          PutObjectRequest.builder()
+              .bucket(bucketName)
+              .contentType(S3_DIR_CONTENT_TYPE)
+              .key(path)
+              .build();
+      s3Client.putObject(
+          putRequest, RequestBody.fromInputStream(ClosedInputStream.CLOSED_INPUT_STREAM, 0L));

Review comment:
       Oh good call




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman merged pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman merged pull request #271:
URL: https://github.com/apache/solr/pull/271


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695965067



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -129,17 +139,17 @@ void createDirectory(String path) throws S3Exception {
       //            throw new S3Exception("Parent directory doesn't exist, path=" + path);
     }
 
-    ObjectMetadata objectMetadata = new ObjectMetadata();
-    objectMetadata.setContentType(S3_DIR_CONTENT_TYPE);
-    objectMetadata.setContentLength(0);
-
-    // Create empty object with header
-    final InputStream im = ClosedInputStream.CLOSED_INPUT_STREAM;
-
     try {
-      PutObjectRequest putRequest = new PutObjectRequest(bucketName, path, im, objectMetadata);
-      s3Client.putObject(putRequest);
-    } catch (AmazonClientException ase) {
+      // Create empty object with content type header
+      PutObjectRequest putRequest =
+          PutObjectRequest.builder()
+              .bucket(bucketName)
+              .contentType(S3_DIR_CONTENT_TYPE)
+              .key(path)
+              .build();
+      s3Client.putObject(
+          putRequest, RequestBody.fromInputStream(ClosedInputStream.CLOSED_INPUT_STREAM, 0L));

Review comment:
       Is this CLOSED_INPUT_STREAM still necessary? or can we use `RequestBody.empty()` instead to create an empty object?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695984443



##########
File path: versions.props
##########
@@ -129,5 +129,6 @@ org.tallison:jmatio=1.5
 org.threeten:threetenbp=1.3.3
 org.tukaani:xz=1.8
 org.xerial.snappy:snappy-java=1.1.7.6
+software.amazon.awssdk:*=2.16.93

Review comment:
       We can probably do that, but unfortunately we still have the dependency since it is used by s3mock (v1 and v2 are supported).
   
   But no particular reason we need to explicitly state the v1 version in `version.props`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on pull request #271:
URL: https://github.com/apache/solr/pull/271#issuecomment-907469533


   Having some issues with the tests and URI parsing. Will make a fix on monday, at which point everything should be good to go.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695851226



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -73,48 +80,51 @@
   // Error messages returned by S3 for a key not found.
   private static final Set<String> NOT_FOUND_CODES = Set.of("NoSuchKey", "404 Not Found");
 
-  private final AmazonS3 s3Client;
+  private final S3Client s3Client;
 
   /** The S3 bucket where we read/write all data. */
   private final String bucketName;
 
   S3StorageClient(
-      String bucketName, String region, String proxyHost, int proxyPort, String endpoint) {
-    this(createInternalClient(region, proxyHost, proxyPort, endpoint), bucketName);
+      String bucketName,
+      String region,
+      String proxyUrl,
+      boolean proxyUseSystemSettings,
+      String endpoint) {
+    this(createInternalClient(region, proxyUrl, proxyUseSystemSettings, endpoint), bucketName);
   }
 
   @VisibleForTesting
-  S3StorageClient(AmazonS3 s3Client, String bucketName) {
+  S3StorageClient(S3Client s3Client, String bucketName) {
     this.s3Client = s3Client;
     this.bucketName = bucketName;
   }
 
-  private static AmazonS3 createInternalClient(
-      String region, String proxyHost, int proxyPort, String endpoint) {
-    ClientConfiguration clientConfig = new ClientConfiguration().withProtocol(Protocol.HTTPS);
-
+  private static S3Client createInternalClient(
+      String region, String proxyUrl, boolean proxyUseSystemSettings, String endpoint) {
+    ApacheHttpClient.Builder sdkHttpClientBuilder = ApacheHttpClient.builder();
     // If configured, add proxy
-    if (!StringUtils.isEmpty(proxyHost)) {
-      clientConfig.setProxyHost(proxyHost);
-      if (proxyPort > 0) {
-        clientConfig.setProxyPort(proxyPort);
-      }
+    ProxyConfiguration.Builder proxyConfigurationBuilder = ProxyConfiguration.builder();
+    if (!StringUtils.isEmpty(proxyUrl)) {
+      proxyConfigurationBuilder.endpoint(URI.create(proxyUrl));
+    } else {
+      proxyConfigurationBuilder.useSystemPropertyValues(proxyUseSystemSettings);
     }
+    sdkHttpClientBuilder.proxyConfiguration(proxyConfigurationBuilder.build());
+    sdkHttpClientBuilder.useIdleConnectionReaper(false);

Review comment:
       I had to disable the idleConnectionReaper, because it caused "leaked" threads after the unit tests.

##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -73,48 +80,51 @@
   // Error messages returned by S3 for a key not found.
   private static final Set<String> NOT_FOUND_CODES = Set.of("NoSuchKey", "404 Not Found");
 
-  private final AmazonS3 s3Client;
+  private final S3Client s3Client;
 
   /** The S3 bucket where we read/write all data. */
   private final String bucketName;
 
   S3StorageClient(
-      String bucketName, String region, String proxyHost, int proxyPort, String endpoint) {
-    this(createInternalClient(region, proxyHost, proxyPort, endpoint), bucketName);
+      String bucketName,
+      String region,
+      String proxyUrl,
+      boolean proxyUseSystemSettings,
+      String endpoint) {
+    this(createInternalClient(region, proxyUrl, proxyUseSystemSettings, endpoint), bucketName);
   }
 
   @VisibleForTesting
-  S3StorageClient(AmazonS3 s3Client, String bucketName) {
+  S3StorageClient(S3Client s3Client, String bucketName) {
     this.s3Client = s3Client;
     this.bucketName = bucketName;
   }
 
-  private static AmazonS3 createInternalClient(
-      String region, String proxyHost, int proxyPort, String endpoint) {
-    ClientConfiguration clientConfig = new ClientConfiguration().withProtocol(Protocol.HTTPS);
-
+  private static S3Client createInternalClient(
+      String region, String proxyUrl, boolean proxyUseSystemSettings, String endpoint) {
+    ApacheHttpClient.Builder sdkHttpClientBuilder = ApacheHttpClient.builder();

Review comment:
       I think I had to switch to using the ApacheHttpClient for the S3Client, otherwise we wouldn't be able to use a proxy.
   
   The new ApacheHttpClient also lets us use system proxy settings now, so I've enabled that by default. But users can turn it off if they desire in the solr.xml..




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] athrog commented on a change in pull request #271: SOLR-15599: Upgrade aws sdk

Posted by GitBox <gi...@apache.org>.
athrog commented on a change in pull request #271:
URL: https://github.com/apache/solr/pull/271#discussion_r695971805



##########
File path: solr/contrib/s3-repository/src/java/org/apache/solr/s3/S3StorageClient.java
##########
@@ -194,55 +204,48 @@ void deleteDirectory(String path) throws S3Exception {
   String[] listDir(String path) throws S3Exception {
     path = sanitizedDirPath(path);
 
-    String prefix = path;
-    ListObjectsRequest listRequest =
-        new ListObjectsRequest()
-            .withBucketName(bucketName)
-            .withPrefix(prefix)
-            .withDelimiter(S3_FILE_PATH_DELIMITER);
+    final String prefix = path;
 
     List<String> entries = new ArrayList<>();
     try {
-      ObjectListing objectListing = s3Client.listObjects(listRequest);
-
-      while (true) {
-        List<String> files =
-            objectListing.getObjectSummaries().stream()
-                .map(S3ObjectSummary::getKey)
-                .collect(Collectors.toList());
-        files.addAll(objectListing.getCommonPrefixes());
-        // This filtering is needed only for S3mock. Real S3 does not ignore the trailing '/' in the
-        // prefix.
-        files =
-            files.stream()
-                .filter(s -> s.startsWith(prefix))
-                .map(s -> s.substring(prefix.length()))
-                .filter(s -> !s.isEmpty())
-                .filter(
-                    s -> {
-                      int slashIndex = s.indexOf(S3_FILE_PATH_DELIMITER);
-                      return slashIndex == -1 || slashIndex == s.length() - 1;
-                    })
-                .map(
-                    s -> {
-                      if (s.endsWith(S3_FILE_PATH_DELIMITER)) {
-                        return s.substring(0, s.length() - 1);
-                      }
-                      return s;
-                    })
-                .collect(Collectors.toList());
-
-        entries.addAll(files);
-
-        if (objectListing.isTruncated()) {
-          objectListing = s3Client.listNextBatchOfObjects(objectListing);
-        } else {
-          break;
-        }
-      }
+      ListObjectsV2Iterable objectListing =
+          s3Client.listObjectsV2Paginator(
+              builder ->
+                  builder
+                      .bucket(bucketName)
+                      .prefix(prefix)
+                      .delimiter(S3_FILE_PATH_DELIMITER)
+                      .build());
+
+      List<String> files =
+          objectListing.contents().stream().map(S3Object::key).collect(Collectors.toList());
+      objectListing.commonPrefixes().stream().map(CommonPrefix::prefix).forEach(files::add);
+      // This filtering is needed only for S3mock. Real S3 does not ignore the trailing '/' in the
+      // prefix.
+      files =

Review comment:
       Now that the `while` loop is gone and we can get everything in one API call, this streaming/filtering between `files`, `objectListing`, and `entries` (and even the returned `String[]`) could be simplified.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org