You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@pekko.apache.org by "mdedetrich (via GitHub)" <gi...@apache.org> on 2023/04/23 08:19:58 UTC

[GitHub] [incubator-pekko-connectors] mdedetrich opened a new pull request, #84: Add S3 put plus get bucketWithVersioning API

mdedetrich opened a new pull request, #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84

   So the context behind this PR is that I am trying to get integration tests working against the real AWS S3 (see https://github.com/apache/incubator-pekko-connectors/issues/67) and as part of that integration I am implementing the ability of the test suite to dynamically create/delete buckets (to prevent flaky test issues as a result of concurrency/left over state). In that work I ended up realizing that in order to get some of the tests to work I have to create buckets with versioning hence why I am making this PR.
   
   Slightly annoyingly, the API of https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketVersioning.html / https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetBucketVersioning.html has some unique peculiarities which made it harder to model. For starters the API differentiates between the versioning status of a bucket never being set versus it being set to false, alongside that there is an MFA field which is also part of the request hence why I modelled it. There are integration tests that verify this behaviour (tested on my companies AWS account) however since its highly impractical to test MFA I just ended up writing a unit test to make sure the headers are being correctly set (see `"add x-amz-mfa headers for a putBucketVersioning request"`).
   
   There are some important design decisions which should be considered that I am definitely open to changing, i.e. both the `status` and `mfaDelete` fields are being modelled with an ADT rather than a boolean. This is because the `status` field doesn't appear to be a true boolean (its `Enabled`/`Suspended` rather than just `Enabled`/`Disabled` which means there is a chance that more fields can be added) and also because of what was mentioned before wrt to the API explicitly modelling the case where a field is never set vs it being set to false you would end up with `Option[Boolean]` which isn't the nicest. There is however an exception to this which is `BucketVersioningResult.mfaField` which does contain `Option[Boolean]`. This is because when setting the status field via put you have to provide actual MFA details where as when you retrieve the status via a get result its just telling whether its never being set/enabled/disabled. If I wanted to model this I would have to create an e
 ntirely separate ADT just for that boolean like value (since `MFAStatus` is already being used for the put request).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] mdedetrich merged pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "mdedetrich (via GitHub)" <gi...@apache.org>.
mdedetrich merged PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] He-Pin commented on pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "He-Pin (via GitHub)" <gi...@apache.org>.
He-Pin commented on PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#issuecomment-1527386862

   Sorry I never used s3 😓😓😓


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] mdedetrich commented on a diff in pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "mdedetrich (via GitHub)" <gi...@apache.org>.
mdedetrich commented on code in PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#discussion_r1174534616


##########
s3/src/main/scala/org/apache/pekko/stream/connectors/s3/impl/Marshalling.scala:
##########
@@ -32,6 +32,19 @@ import scala.xml.NodeSeq
 @InternalApi private[impl] object Marshalling {
   import ScalaXmlSupport._
 
+  implicit val bucketVersioningUnmarshaller: FromEntityUnmarshaller[BucketVersioningResult] = {
+    nodeSeqUnmarshaller(MediaTypes.`application/xml`, ContentTypes.`application/octet-stream`).map {
+      case NodeSeq.Empty => throw Unmarshaller.NoContentException
+      case x =>
+        val status = (x \ "Status").headOption.map(_.text match {
+          case "Enabled"   => BucketVersioningStatus.Enabled
+          case "Suspended" => BucketVersioningStatus.Suspended

Review Comment:
   So I am slightly torn on what to do here. As explained previously I have an impression that there could be a chance that a new value gets added to this field (since its `Suspended` rather than just being `Disabled`). If this happens then you would get a match runtime error here, there are alternatives however any of these alternatives would be a completely new case for `Marshalling.scala`, i.e.
   
   * If we get a value other than `"Enabled"`/`"Suspended"` we can just return "Suspended" but then log an error saying that we discovered a new field. This means we won't break end users but it might also cause unintended behaviour
   * Throw a proper exception rather than just a match error, no such exception actually exists currently since this would be a first



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] mdedetrich commented on pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "mdedetrich (via GitHub)" <gi...@apache.org>.
mdedetrich commented on PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#issuecomment-1527219331

   @He-Pin @pjfanning @nvollmar Would it be possible to look into this? This ticket is blocking adding in integration tests for AWS S3 which I would prefer to get done before release because I want to verify that there aren't any more problems with the S3 client.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] mdedetrich commented on a diff in pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "mdedetrich (via GitHub)" <gi...@apache.org>.
mdedetrich commented on code in PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#discussion_r1174534616


##########
s3/src/main/scala/org/apache/pekko/stream/connectors/s3/impl/Marshalling.scala:
##########
@@ -32,6 +32,19 @@ import scala.xml.NodeSeq
 @InternalApi private[impl] object Marshalling {
   import ScalaXmlSupport._
 
+  implicit val bucketVersioningUnmarshaller: FromEntityUnmarshaller[BucketVersioningResult] = {
+    nodeSeqUnmarshaller(MediaTypes.`application/xml`, ContentTypes.`application/octet-stream`).map {
+      case NodeSeq.Empty => throw Unmarshaller.NoContentException
+      case x =>
+        val status = (x \ "Status").headOption.map(_.text match {
+          case "Enabled"   => BucketVersioningStatus.Enabled
+          case "Suspended" => BucketVersioningStatus.Suspended

Review Comment:
   So I am slightly torn on what to do here. As explained previously I have an impression that there could be a chance that a new value gets added to this field (since its `Suspended` rather than just being `Disabled`). If this happens then you would get a runtime error here, there are alternatives however any of these alternatives would be a completely new case for `Marshalling.scala`, i.e.
   
   * If we get a value other than `"Enabled"`/`"Suspended"` we can just return "Suspended" but then log an error saying that we discovered a new field. This means we won't break end users but it might also cause unintended behaviour
   * Throw a proper exception rather than just a match error, no such exception actually exists currently since this would be a first



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] pjfanning commented on a diff in pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "pjfanning (via GitHub)" <gi...@apache.org>.
pjfanning commented on code in PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#discussion_r1180205629


##########
s3/src/main/scala/org/apache/pekko/stream/connectors/s3/impl/HttpRequests.scala:
##########
@@ -24,13 +24,21 @@ import pekko.http.scaladsl.model.Uri.{ Authority, Query }
 import pekko.http.scaladsl.model.headers.{ `Raw-Request-URI`, Host, RawHeader }
 import pekko.http.scaladsl.model.{ RequestEntity, _ }
 import pekko.stream.connectors.s3.AccessStyle.{ PathAccessStyle, VirtualHostAccessStyle }
-import pekko.stream.connectors.s3.{ ApiVersion, MultipartUpload, S3Settings }
+import pekko.stream.connectors.s3.{
+  ApiVersion,
+  BucketVersioning,
+  BucketVersioningStatus,
+  MFAStatus,
+  MultipartUpload,
+  S3Settings
+}
 import pekko.stream.scaladsl.Source
 import pekko.util.ByteString
 import software.amazon.awssdk.regions.Region
 
 import scala.collection.immutable.Seq
 import scala.concurrent.{ ExecutionContext, Future }
+import scala.xml.NodeSeq

Review Comment:
   one minor quibble - scala.xml is a bit unloved, the use here is small but it might be better just to use Java's in-built XML support (JAXP API)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] mdedetrich commented on a diff in pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "mdedetrich (via GitHub)" <gi...@apache.org>.
mdedetrich commented on code in PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#discussion_r1180214984


##########
s3/src/main/scala/org/apache/pekko/stream/connectors/s3/impl/HttpRequests.scala:
##########
@@ -24,13 +24,21 @@ import pekko.http.scaladsl.model.Uri.{ Authority, Query }
 import pekko.http.scaladsl.model.headers.{ `Raw-Request-URI`, Host, RawHeader }
 import pekko.http.scaladsl.model.{ RequestEntity, _ }
 import pekko.stream.connectors.s3.AccessStyle.{ PathAccessStyle, VirtualHostAccessStyle }
-import pekko.stream.connectors.s3.{ ApiVersion, MultipartUpload, S3Settings }
+import pekko.stream.connectors.s3.{
+  ApiVersion,
+  BucketVersioning,
+  BucketVersioningStatus,
+  MFAStatus,
+  MultipartUpload,
+  S3Settings
+}
 import pekko.stream.scaladsl.Source
 import pekko.util.ByteString
 import software.amazon.awssdk.regions.Region
 
 import scala.collection.immutable.Seq
 import scala.concurrent.{ ExecutionContext, Future }
+import scala.xml.NodeSeq

Review Comment:
   For now I will keep using Scala xml because that is what's used throughout the codebase, can always file an issue about it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org


[GitHub] [incubator-pekko-connectors] mdedetrich commented on pull request #84: Add S3 put plus get bucketWithVersioning API

Posted by "mdedetrich (via GitHub)" <gi...@apache.org>.
mdedetrich commented on PR #84:
URL: https://github.com/apache/incubator-pekko-connectors/pull/84#issuecomment-1527326132

   > lgtm - I don't think we need to worry as much about mainintaining legacy alpakka behaviour in the specific connectors, if the legacy code is causing trouble
   
   We are actually using this so there is selfish motivation here but aside from that I am working with INFRA to get an S3 account for integration tests and they are waiting for a reason.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@pekko.apache.org
For additional commands, e-mail: notifications-help@pekko.apache.org