Posted to reviews@spark.apache.org by "nchammas (via GitHub)" <gi...@apache.org> on 2024/01/19 16:50:12 UTC

[PR] [SPARK-46775][SS] Fix formatting of Kinesis docs [spark]

nchammas opened a new pull request, #44802:
URL: https://github.com/apache/spark/pull/44802

   ### What changes were proposed in this pull request?
   
   - Convert the mixed indentation styles to spaces only.
   - Add syntax highlighting to the code blocks (see the sketch after this list).
   - Fix a couple of broken links to API docs.
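
   For example (an illustrative sketch of the resulting pattern, not the exact hunk), a code sample in the doc source is now a fenced, language-tagged block inside its language tab, which the site's syntax highlighter can pick up:

   ````markdown
   <div data-lang="scala" markdown="1">
   ```scala
   import org.apache.spark.storage.StorageLevel
   ```
   </div>
   ````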
   
   ### Why are the changes needed?
   
   This makes the docs a bit easier to read and edit.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, it changes the formatting of this documentation.
   
   ### How was this patch tested?
   
   I built the docs and manually reviewed them.
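
   For anyone reproducing this locally, a typical invocation is sketched below (assuming the Ruby/Jekyll setup described in `docs/README.md`; `SKIP_API=1` skips the slower API-doc generation):

   ```sh
   # Build the documentation site from a Spark checkout.
   cd docs
   SKIP_API=1 bundle exec jekyll build
   ```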
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46775][DOCS] Fix formatting of Kinesis docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #44802:
URL: https://github.com/apache/spark/pull/44802#issuecomment-1902836146

   Merged to master.



Re: [PR] [SPARK-46775][SS] Fix formatting of Kinesis docs [spark]

Posted by "nchammas (via GitHub)" <gi...@apache.org>.
nchammas commented on code in PR #44802:
URL: https://github.com/apache/spark/pull/44802#discussion_r1459338158


##########
docs/streaming-kinesis-integration.md:
##########
@@ -32,201 +32,216 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
 
 1. **Linking:** For Scala/Java applications using SBT/Maven project definitions, link your streaming application against the following artifact (see [Linking section](streaming-programming-guide.html#linking) in the main programming guide for further information).
 
-		groupId = org.apache.spark
-		artifactId = spark-streaming-kinesis-asl_{{site.SCALA_BINARY_VERSION}}
-		version = {{site.SPARK_VERSION_SHORT}}
+        groupId = org.apache.spark
+        artifactId = spark-streaming-kinesis-asl_{{site.SCALA_BINARY_VERSION}}
+        version = {{site.SPARK_VERSION_SHORT}}
 
-	For Python applications, you will have to add this above library and its dependencies when deploying your application. See the *Deploying* subsection below.
-	**Note that by linking to this library, you will include [ASL](https://aws.amazon.com/asl/)-licensed code in your application.**
+    For Python applications, you will have to add this above library and its dependencies when deploying your application. See the *Deploying* subsection below.
+    **Note that by linking to this library, you will include [ASL](https://aws.amazon.com/asl/)-licensed code in your application.**
 
 2. **Programming:** In the streaming application code, import `KinesisInputDStream` and create the input DStream of byte array as follows:
 
-	<div class="codetabs">
+    <div class="codetabs">
 
     <div data-lang="python" markdown="1">
-            from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream
-
-            kinesisStream = KinesisUtils.createStream(
-                streamingContext, [Kinesis app name], [Kinesis stream name], [endpoint URL],
-                [region name], [initial position], [checkpoint interval], [metricsLevel.DETAILED], StorageLevel.MEMORY_AND_DISK_2)
-
-	See the [API docs](api/python/reference/pyspark.streaming.html#kinesis)
-	and the [example]({{site.SPARK_GITHUB_URL}}/tree/master/connector/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py). Refer to the [Running the Example](#running-the-example) subsection for instructions to run the example.
-
-	- CloudWatch metrics level and dimensions. See [the AWS documentation about monitoring KCL](https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kcl.html) for details. Default is MetricsLevel.DETAILED
-
-	</div>
-
-	<div data-lang="scala" markdown="1">
-            import org.apache.spark.storage.StorageLevel
-            import org.apache.spark.streaming.kinesis.KinesisInputDStream
-            import org.apache.spark.streaming.{Seconds, StreamingContext}
-            import org.apache.spark.streaming.kinesis.KinesisInitialPositions
-
-            val kinesisStream = KinesisInputDStream.builder
-                .streamingContext(streamingContext)
-                .endpointUrl([endpoint URL])
-                .regionName([region name])
-                .streamName([streamName])
-                .initialPosition([initial position])
-                .checkpointAppName([Kinesis app name])
-                .checkpointInterval([checkpoint interval])
-                .metricsLevel([metricsLevel.DETAILED])
-                .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
-                .build()
-
-	See the [API docs](api/scala/org/apache/spark/streaming/kinesis/KinesisInputDStream.html)
-	and the [example]({{site.SPARK_GITHUB_URL}}/tree/master/connector/kinesis-asl/src/main/scala/org/apache/spark/examples/streaming/KinesisWordCountASL.scala). Refer to the [Running the Example](#running-the-example) subsection for instructions on how to run the example.
-
-	</div>
-
-	<div data-lang="java" markdown="1">
-            import org.apache.spark.storage.StorageLevel;
-            import org.apache.spark.streaming.kinesis.KinesisInputDStream;
-            import org.apache.spark.streaming.Seconds;
-            import org.apache.spark.streaming.StreamingContext;
-            import org.apache.spark.streaming.kinesis.KinesisInitialPositions;
-
-            KinesisInputDStream<byte[]> kinesisStream = KinesisInputDStream.builder()
-                .streamingContext(streamingContext)
-                .endpointUrl([endpoint URL])
-                .regionName([region name])
-                .streamName([streamName])
-                .initialPosition([initial position])
-                .checkpointAppName([Kinesis app name])
-                .checkpointInterval([checkpoint interval])
-                .metricsLevel([metricsLevel.DETAILED])
-                .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
-                .build();
-
-	See the [API docs](api/java/index.html?org/apache/spark/streaming/kinesis/KinesisInputDStream.html)
-	and the [example]({{site.SPARK_GITHUB_URL}}/tree/master/connector/kinesis-asl/src/main/java/org/apache/spark/examples/streaming/JavaKinesisWordCountASL.java). Refer to the [Running the Example](#running-the-example) subsection for instructions to run the example.
-
-	</div>
-
-	</div>
-
-	You may also provide the following settings. This is currently only supported in Scala and Java.
-
-	- A "message handler function" that takes a Kinesis `Record` and returns a generic object `T`, in case you would like to use other data included in a `Record` such as partition key.
-
-	<div class="codetabs">
-	<div data-lang="scala" markdown="1">
-                import collection.JavaConverters._
-                import org.apache.spark.storage.StorageLevel
-                import org.apache.spark.streaming.kinesis.KinesisInputDStream
-                import org.apache.spark.streaming.{Seconds, StreamingContext}
-                import org.apache.spark.streaming.kinesis.KinesisInitialPositions
-                import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration
-                import com.amazonaws.services.kinesis.metrics.interfaces.MetricsLevel
-
-                val kinesisStream = KinesisInputDStream.builder
-                    .streamingContext(streamingContext)
-                    .endpointUrl([endpoint URL])
-                    .regionName([region name])
-                    .streamName([streamName])
-                    .initialPosition([initial position])
-                    .checkpointAppName([Kinesis app name])
-                    .checkpointInterval([checkpoint interval])
-                    .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
-                    .metricsLevel(MetricsLevel.DETAILED)
-                    .metricsEnabledDimensions(KinesisClientLibConfiguration.DEFAULT_METRICS_ENABLED_DIMENSIONS.asScala.toSet)
-                    .buildWithMessageHandler([message handler])
-
-	</div>
-	<div data-lang="java" markdown="1">
-                import org.apache.spark.storage.StorageLevel;
-                import org.apache.spark.streaming.kinesis.KinesisInputDStream;
-                import org.apache.spark.streaming.Seconds;
-                import org.apache.spark.streaming.StreamingContext;
-                import org.apache.spark.streaming.kinesis.KinesisInitialPositions;
-                import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration;
-                import com.amazonaws.services.kinesis.metrics.interfaces.MetricsLevel;
-                import scala.collection.JavaConverters;
-
-                KinesisInputDStream<byte[]> kinesisStream = KinesisInputDStream.builder()
-                    .streamingContext(streamingContext)
-                    .endpointUrl([endpoint URL])
-                    .regionName([region name])
-                    .streamName([streamName])
-                    .initialPosition([initial position])
-                    .checkpointAppName([Kinesis app name])
-                    .checkpointInterval([checkpoint interval])
-                    .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
-                    .metricsLevel(MetricsLevel.DETAILED)
-                    .metricsEnabledDimensions(JavaConverters.asScalaSetConverter(KinesisClientLibConfiguration.DEFAULT_METRICS_ENABLED_DIMENSIONS).asScala().toSet())
-                    .buildWithMessageHandler([message handler]);
-
-	</div>
-	</div>
-
-	- `streamingContext`: StreamingContext containing an application name used by Kinesis to tie this Kinesis application to the Kinesis stream
-
-	- `[Kinesis app name]`: The application name that will be used to checkpoint the Kinesis
-		sequence numbers in DynamoDB table.
-		- The application name must be unique for a given account and region.
-		- If the table exists but has incorrect checkpoint information (for a different stream, or
-			old expired sequenced numbers), then there may be temporary errors.
-
-	- `[Kinesis stream name]`: The Kinesis stream that this streaming application will pull data from.
-
-	- `[endpoint URL]`: Valid Kinesis endpoints URL can be found [here](http://docs.aws.amazon.com/general/latest/gr/rande.html#ak_region).
-
-	- `[region name]`: Valid Kinesis region names can be found [here](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html).
-
-	- `[checkpoint interval]`: The interval (e.g., Duration(2000) = 2 seconds) at which the Kinesis Client Library saves its position in the stream.  For starters, set it to the same as the batch interval of the streaming application.
-
-	- `[initial position]`: Can be either `KinesisInitialPositions.TrimHorizon` or `KinesisInitialPositions.Latest` or `KinesisInitialPositions.AtTimestamp` (see [`Kinesis Checkpointing`](#kinesis-checkpointing) section and [`Amazon Kinesis API documentation`](http://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-sdk.html) for more details).
-
-	- `[message handler]`: A function that takes a Kinesis `Record` and outputs generic `T`.
-
-	In other versions of the API, you can also specify the AWS access key and secret key directly.
+    ```python
+    from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream
+
+    kinesisStream = KinesisUtils.createStream(
+        streamingContext, [Kinesis app name], [Kinesis stream name], [endpoint URL],
+        [region name], [initial position], [checkpoint interval], [metricsLevel.DETAILED],
+        StorageLevel.MEMORY_AND_DISK_2)
+    ```
+
+    See the [API docs](api/python/reference/pyspark.streaming.html#kinesis)
+    and the [example]({{site.SPARK_GITHUB_URL}}/tree/master/connector/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py). Refer to the [Running the Example](#running-the-example) subsection for instructions to run the example.
+
+    - CloudWatch metrics level and dimensions. See [the AWS documentation about monitoring KCL](https://docs.aws.amazon.com/streams/latest/dev/monitoring-with-kcl.html) for details. Default is `MetricsLevel.DETAILED`.
+
+    </div>
+
+    <div data-lang="scala" markdown="1">
+    ```scala
+    import org.apache.spark.storage.StorageLevel
+    import org.apache.spark.streaming.kinesis.KinesisInputDStream
+    import org.apache.spark.streaming.{Seconds, StreamingContext}
+    import org.apache.spark.streaming.kinesis.KinesisInitialPositions
+
+    val kinesisStream = KinesisInputDStream.builder
+        .streamingContext(streamingContext)
+        .endpointUrl([endpoint URL])
+        .regionName([region name])
+        .streamName([streamName])
+        .initialPosition([initial position])
+        .checkpointAppName([Kinesis app name])
+        .checkpointInterval([checkpoint interval])
+        .metricsLevel([metricsLevel.DETAILED])
+        .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
+        .build()
+    ```
+
+    See the [API docs](api/scala/org/apache/spark/streaming/kinesis/KinesisInputDStream$.html)

Review Comment:
   The old link was also broken here. It just needed the `$` at the end: Scaladoc publishes a companion object on its own `ClassName$.html` page, and `builder` lives on the `KinesisInputDStream` companion object, so the `$` form is the page that actually exists.
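
   (Aside, since this hunk also reflows the message-handler variant: below is a minimal sketch of such a handler, assuming the builder API shown above. The `Record` accessors are from the AWS Kinesis SDK, and the tuple result type is just an illustrative choice.)

   ```scala
   import java.nio.charset.StandardCharsets
   import com.amazonaws.services.kinesis.model.Record

   // Illustrative handler: keep the partition key alongside the decoded payload,
   // instead of the default byte-array-only stream.
   val handler = (record: Record) =>
     (record.getPartitionKey, StandardCharsets.UTF_8.decode(record.getData).toString)

   // Passed in place of [message handler]:
   //   .buildWithMessageHandler(handler)
   ```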



##########
docs/streaming-kinesis-integration.md:
##########
@@ -32,201 +32,216 @@ A Kinesis stream can be set up at one of the valid Kinesis endpoints with 1 or m
 
+    <div data-lang="java" markdown="1">
+    ```java
+    import org.apache.spark.storage.StorageLevel;
+    import org.apache.spark.streaming.kinesis.KinesisInputDStream;
+    import org.apache.spark.streaming.Seconds;
+    import org.apache.spark.streaming.StreamingContext;
+    import org.apache.spark.streaming.kinesis.KinesisInitialPositions;
+
+    KinesisInputDStream<byte[]> kinesisStream = KinesisInputDStream.builder()
+        .streamingContext(streamingContext)
+        .endpointUrl([endpoint URL])
+        .regionName([region name])
+        .streamName([streamName])
+        .initialPosition([initial position])
+        .checkpointAppName([Kinesis app name])
+        .checkpointInterval([checkpoint interval])
+        .metricsLevel([metricsLevel.DETAILED])
+        .storageLevel(StorageLevel.MEMORY_AND_DISK_2)
+        .build();
+    ```
+
+    See the [API docs](api/java/org/apache/spark/streaming/kinesis/package-summary.html)

Review Comment:
   The old link was broken, but I cannot find a Java API doc for `KinesisInputDStream`, so I just linked to the package summary.
   
   Maybe @HeartSaVioR knows what's going on here?





Re: [PR] [SPARK-46775][DOCS] Fix formatting of Kinesis docs [spark]

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #44802: [SPARK-46775][DOCS] Fix formatting of Kinesis docs
URL: https://github.com/apache/spark/pull/44802

