You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2020/05/15 01:29:53 UTC

[GitHub] [samza] cameronlee314 commented on a change in pull request #1358: SAMZA-2522: Add metadata generator for azure storage blob

cameronlee314 commented on a change in pull request #1358:
URL: https://github.com/apache/samza/pull/1358#discussion_r425509989



##########
File path: samza-azure/src/main/java/org/apache/samza/system/azureblob/AzureBlobConfig.java
##########
@@ -94,6 +95,18 @@
   public static final String SYSTEM_SUFFIX_RANDOM_STRING_TO_BLOB_NAME = SYSTEM_AZUREBLOB_PREFIX + "suffixRandomStringToBlobName";
   private static final boolean SYSTEM_SUFFIX_RANDOM_STRING_TO_BLOB_NAME_DEFAULT = true;
 
+  // full class name of an implementation of org.apache.samza.system.azureblob.utils.BlobMetadataGeneratorFactory
+  // this factory should return an implementation of org.apache.samza.system.azureblob.utils.BlobMetadataGenerator
+  // this generator will be invoked when a blob is committed to add metadata properties to it
+  public static final String SYSTEM_BLOB_METADATA_PROPERTIES_GENERATOR_FACTORY = SYSTEM_AZUREBLOB_PREFIX + "metadataPropertiesGeneratorFactory";
+  private static final String SYSTEM_BLOB_METADATA_PROPERTIES_GENERATOR_FACTORY_DEFAULT =
+      "org.apache.samza.system.azureblob.utils.NullBlobMetadataGeneratorFactory";
+
+  // Additional configs for the metadata generator should be prefixed with this string which is passed to the generator.
+  // for example, to pass a "key":"value" pair to the metadata generator, add config like
+  // systems.<system-name>.azureblob.metadataGeneratorConfig.<key> with value <value>
+  public static final String SYSTEM_BLOB_METADATA_GENERATOR_CONFIG_PREFIX = SYSTEM_AZUREBLOB_PREFIX + "metadataGeneratorConfig";

Review comment:
       Does this need to have an extra `.` at the end so it gets stripped off in `getSystemMetadataGeneratorConfigs`?

##########
File path: samza-azure/src/main/java/org/apache/samza/system/azureblob/AzureBlobConfig.java
##########
@@ -185,4 +198,13 @@ public long getMaxBlobSize(String systemName) {
   public long getMaxMessagesPerBlob(String systemName) {
     return getLong(String.format(SYSTEM_MAX_MESSAGES_PER_BLOB, systemName), SYSTEM_MAX_MESSAGES_PER_BLOB_DEFAULT);
   }
+
+  public String getSystemMetadataPropertiesGeneratorFactory(String systemName) {
+    return get(String.format(SYSTEM_BLOB_METADATA_PROPERTIES_GENERATOR_FACTORY, systemName),
+        SYSTEM_BLOB_METADATA_PROPERTIES_GENERATOR_FACTORY_DEFAULT);
+  }
+
+  public Config getSystemMetadataGeneratorConfigs(String systemName) {
+    return subset(String.format(SYSTEM_BLOB_METADATA_GENERATOR_CONFIG_PREFIX, systemName));

Review comment:
       Minor: For consistency, consider naming these `getSystemBlobMetadata...`

##########
File path: samza-azure/src/main/java/org/apache/samza/system/azureblob/avro/AzureBlobOutputStream.java
##########
@@ -172,8 +193,8 @@ public synchronized void close() {
       future.get((long) flushTimeoutMs, TimeUnit.MILLISECONDS);
       LOG.info("For blob: {} committing blockList size:{}", blobAsyncClient.getBlobUrl().toString(), blockList.size());
       metrics.updateAzureCommitMetrics();
-      Map<String, String> blobMetadata = Collections.singletonMap(BLOB_RAW_SIZE_BYTES_METADATA, Long.toString(totalUploadedBlockSize));

Review comment:
       If an existing case needs this BLOB_RAW_SIZE_BYTES_METADATA, then will they need to implement their own blob metadata generator?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org