Posted to commits@samza.apache.org by GitBox <gi...@apache.org> on 2020/03/20 19:30:18 UTC

[GitHub] [samza] lakshmi-manasa-g opened a new pull request #1323: Add docs for configs of Azure Blob SystemProducer

lakshmi-manasa-g opened a new pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
lakshmi-manasa-g commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r397601770
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
 
 Review comment:
   Actually, I just realized that this is an old Azure config (from before Azure v12), so I am removing it completely.


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396828747
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
 
 Review comment:
   Minor: The `**__system-name__**` part looks a little inconsistent with the other sections (which use `systems.*.samza.factory`).


[GitHub] [samza] cameronlee314 merged pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 merged pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323
 
 
   


[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
lakshmi-manasa-g commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r397600691
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
 
 Review comment:
   I did it on purpose to be able to highlight that system-name is the Azure container name. Seemed simpler to do this.
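
   To make the point above concrete, here is a minimal sketch of how the config keys quoted in the diff depend on the container name. The function name `azure_blob_system_configs` and the example values are hypothetical and only for illustration; the key names and the factory class are copied from the diff as written there.

```python
# Hypothetical helper, not part of Samza: builds the required configs from
# the diff, using the Azure container name as the Samza system name.
def azure_blob_system_configs(container_name: str,
                              account_name: str,
                              account_key: str) -> dict:
    prefix = f"systems.{container_name}"
    return {
        # Factory class as quoted in the section under review.
        f"{prefix}.samza.factory":
            "org.apache.samza.system.azureblob.AzureBlobSystemFactory",
        # Both credentials use the "sensitive." prefix shown in the table.
        f"sensitive.{prefix}.azureblob.account.name": account_name,
        f"sensitive.{prefix}.azureblob.account.key": account_key,
    }


# Placeholder values; a real job would supply its own account details.
configs = azure_blob_system_configs("my-container", "my-account", "<key>")
for key, value in configs.items():
    print(f"{key}={value}")
```

   Every key carries the container name, which is the coupling the doc wording is trying to highlight.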


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396831076
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
+|systems.**_system-name_**.azureblob.writer.factory.class|`org.apache.samza.system.`<br>`azureblob.avro.`<br>`AzureBlobAvroWriterFactory`|Fully qualified class name of the `org.apache.samza.system.azureblob.producer.AzureBlobWriter` impl for the system producer.<br><br>The default writer creates blobs that are of type AVRO and require the messages sent to a blob to be AVRO records. The blobs created by the default writer are of type [Block Blobs](https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs).<br>All the following configs are relevant to this default writer.|
+|systems.**_system-name_**.azureblob.compression.type|"none"|type of compression to be used before uploading blocks. Can be "none" or "gzip".|
+|systems.**_system-name_**.azureblob.maxFlushThresholdSize|10485760 (10 MB)|max size of the uncompressed block to be uploaded in bytes. Maximum size allowed by Azure is 100MB.|
+|systems.**_system-name_**.azureblob.maxBlobSize|Long.MAX_VALUE (unlimited)|max size of the uncompressed blob in bytes.<br>If default value then size is unlimited capped only by Azure BlockBlob size of  4.75 TB (100 MB per block X 50,000 blocks).|
+|systems.**_system-name_**.azureblob.maxMessagesPerBlob|Long.MAX_VALUE (unlimited)|max number of messages per blob.|
+|systems.**_system-name_**.azureblob.threadPoolCount|2|number of threads for the asynchronous uploading of blocks.|
+|systems.**_system-name_**.azureblob.blockingQueueSize|Thread Pool Count * 2|size of the queue to hold blocks ready to be uploaded by asynchronous threads.<br>If all threads are busy uploading then blocks are queued and if queue is full then main thread will start uploading which will block processing of incoming messages.|
+|systems.**_system-name_**.azureblob.flushTimeoutMs|3 mins|timeout to finish uploading all blocks before committing a blob.|
+|systems.**_system-name_**.azureblob.closeTimeoutMs|5 mins|timeout to finish committing all the blobs currently being written to. This does not include the flush timeout per blob.|
 
 Review comment:
   Minor: same as above regarding using the actual milliseconds value


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396830826
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
+|systems.**_system-name_**.azureblob.writer.factory.class|`org.apache.samza.system.`<br>`azureblob.avro.`<br>`AzureBlobAvroWriterFactory`|Fully qualified class name of the `org.apache.samza.system.azureblob.producer.AzureBlobWriter` impl for the system producer.<br><br>The default writer creates blobs that are of type AVRO and require the messages sent to a blob to be AVRO records. The blobs created by the default writer are of type [Block Blobs](https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs).<br>All the following configs are relevant to this default writer.|
+|systems.**_system-name_**.azureblob.compression.type|"none"|type of compression to be used before uploading blocks. Can be "none" or "gzip".|
+|systems.**_system-name_**.azureblob.maxFlushThresholdSize|10485760 (10 MB)|max size of the uncompressed block to be uploaded in bytes. Maximum size allowed by Azure is 100MB.|
+|systems.**_system-name_**.azureblob.maxBlobSize|Long.MAX_VALUE (unlimited)|max size of the uncompressed blob in bytes.<br>If default value then size is unlimited capped only by Azure BlockBlob size of  4.75 TB (100 MB per block X 50,000 blocks).|
 
 Review comment:
   Minor: extra space before `4.75TB`
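
   For reference, the arithmetic behind the quoted cap can be checked with a short sketch. It assumes the sizes are binary units (MiB and TiB), which is an assumption on my part; the diff only says "MB" and "TB". The variable names are illustrative.

```python
# Sketch of the block-blob size cap quoted in the diff:
# 100 MB per block x 50,000 blocks. Binary units are an assumption.
MIB = 1024 ** 2
TIB = 1024 ** 4

MAX_BLOCK_BYTES = 100 * MIB     # per-block limit, per the diff
MAX_BLOCKS_PER_BLOB = 50_000    # block-count limit, per the diff

max_blob_bytes = MAX_BLOCK_BYTES * MAX_BLOCKS_PER_BLOB
max_blob_tib = max_blob_bytes / TIB

print(f"{max_blob_bytes} bytes is about {max_blob_tib:.2f} TiB")
```

   Under that assumption the product is a little above 4.75 TiB, so the figure in the table is a rounded value. With decimal units the same product would be exactly 5 TB.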


[GitHub] [samza] lakshmi-manasa-g commented on issue #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
lakshmi-manasa-g commented on issue #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#issuecomment-603633414
 
 
   Actually, IntelliJ has an optional side bar which shows what my changes look like.


[GitHub] [samza] lakshmi-manasa-g commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
lakshmi-manasa-g commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r397602908
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
+|systems.**_system-name_**.azureblob.writer.factory.class|`org.apache.samza.system.`<br>`azureblob.avro.`<br>`AzureBlobAvroWriterFactory`|Fully qualified class name of the `org.apache.samza.system.azureblob.producer.AzureBlobWriter` impl for the system producer.<br><br>The default writer creates blobs that are of type AVRO and require the messages sent to a blob to be AVRO records. The blobs created by the default writer are of type [Block Blobs](https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs).<br>All the following configs are relevant to this default writer.|
 
 Review comment:
   Yes, you are right. If a new non-default writer were to be wired in, some of these configs would apply to it, but not all. For example, the new writer might choose to create append blobs rather than block blobs, in which case the flush threshold size, the flush timeout, and maybe even the thread pool count don't make sense, since these all concern uploading the blocks of a blob. But I understand your concern and am removing that sentence.


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396830013
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
 
 Review comment:
   Minor: For consistency, maybe put the actual milliseconds number. You can put `30s` in parentheses or as a note in the Description part.
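
   As a rough illustration of this suggestion, the human-readable defaults in the diff convert to milliseconds as sketched here. The `to_millis` helper and the `DEFAULTS` mapping are hypothetical names, not part of Samza, and the durations are simply the ones shown in the table under review (not verified against the code).

```python
# Hypothetical helper: turns the human-readable defaults from the diff
# into the millisecond values a "*Ms" config would hold.
from datetime import timedelta


def to_millis(duration: timedelta) -> int:
    """Return the duration as a whole number of milliseconds."""
    return duration // timedelta(milliseconds=1)


# Defaults as written in the table under review.
DEFAULTS = {
    "samza.azureblob.log.slowRequestMs": timedelta(seconds=30),
    "systems.<system-name>.azureblob.flushTimeoutMs": timedelta(minutes=3),
    "systems.<system-name>.azureblob.closeTimeoutMs": timedelta(minutes=5),
}

for key, value in DEFAULTS.items():
    print(f"{key} = {to_millis(value)}")
```

   The Default column could then carry the millisecond number, with the friendlier form (for example `30s`) in parentheses or in the Description.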


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396829354
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
 
 Review comment:
   Minor: It looks like other parts of this documentation use `false` instead of `"false"`.


[GitHub] [samza] cameronlee314 commented on issue #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on issue #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#issuecomment-602931582
 
 
   FYI, in case you didn't know, you can test what your changes look like by following `docs/README.md`.


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396833814
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
+|systems.**_system-name_**.azureblob.writer.factory.class|`org.apache.samza.system.`<br>`azureblob.avro.`<br>`AzureBlobAvroWriterFactory`|Fully qualified class name of the `org.apache.samza.system.azureblob.producer.AzureBlobWriter` impl for the system producer.<br><br>The default writer creates blobs that are of type AVRO and require the messages sent to a blob to be AVRO records. The blobs created by the default writer are of type [Block Blobs](https://docs.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs#about-block-blobs).<br>All the following configs are relevant to this default writer.|
 
 Review comment:
   Regarding "All the following configs are relevant to this default writer.": The following configs apply to other writers too, right? The wording kind of makes it sound like the following configs won't apply to a non-default writer. Can you please clarify that a little bit (or maybe you can just remove that sentence)?


[GitHub] [samza] cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer

Posted by GitBox <gi...@apache.org>.
cameronlee314 commented on a change in pull request #1323: Add docs for configs of Azure Blob SystemProducer 
URL: https://github.com/apache/samza/pull/1323#discussion_r396832610
 
 

 ##########
 File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
 ##########
 @@ -245,6 +246,34 @@ Configs for producing to [ElasticSearch](https://www.elastic.co/products/elastic
 |systems.**_system-name_**.<br>bulk.flush.max.size.mb|5|The maximum aggregate size of messages in the buffered before flushing.|
 |systems.**_system-name_**.<br>bulk.flush.interval.ms|never|How often buffered messages should be flushed.|
 
+#### <a name="azure-blob-storage"></a>[3.7 Azure Blob Storage](#azure-blob-storage)
+Configs for producing to [Azure Blob Storage](https://azure.microsoft.com/en-us/services/storage/blobs/). This section applies if you have set systems.**__system-name__**.samza.factory = `org.apache.samza.system.azureblob.AzureBlobSystemFactory`.<br>
+**_system-name_** is the Azure container name you want to produce blobs to. If such a container does not exist then it is created.<br> 
+
+|Name|Default|Description|
+|--- |--- |--- |
+|sensitive.systems.**_system-name_**.azureblob.account.name| |__Required:__ The Azure account name to which the Azure container belongs to. |
+|sensitive.systems.**_system-name_**.azureblob.account.key| |__Required:__ Key for the Azure account specified above.|
+
+#### <a name="advanced-azure-blob-storage"></a>[Advanced Azure Blob Storage Configurations](#advanced-azure-blob-storage)
+|Name|Default|Description|
+|--- |--- |--- |
+|systems.**_system-name_**.azureblob.proxy.use |"false"|if true, proxy will be used to connect to Azure.|
+|systems.**_system-name_**.azureblob.proxy.hostname| |if proxy.use is true then host name of proxy.|
+|systems.**_system-name_**.azureblob.proxy.port| |if proxy.use is true then port of proxy.|
+|samza.azureblob.log.slowRequestMs|30 secs|The duration after which an Azure request will be logged as a warning.|
 
 Review comment:
   Can you please clarify the description? I think the usage of the term "duration" might be overloaded. Do you mean that if the Azure request takes 30s to complete, then it will be logged?
