Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/09/03 09:20:55 UTC

[GitHub] [flink] gyfora commented on a change in pull request #9530: [FLINK-13842][docs] Improve Documentation of the StreamingFileSink

gyfora commented on a change in pull request #9530: [FLINK-13842][docs] Improve Documentation of the StreamingFileSink
URL: https://github.com/apache/flink/pull/9530#discussion_r320172726
 
 

 ##########
 File path: docs/dev/connectors/streamfile_sink.md
 ##########
 @@ -23,30 +23,83 @@ specific language governing permissions and limitations
 under the License.
 -->
 
+* This will be replaced by the TOC
+{:toc}
+
 This connector provides a Sink that writes partitioned files to filesystems
 supported by the [Flink `FileSystem` abstraction]({{ site.baseurl}}/ops/filesystems/index.html).
 
-Since in streaming the input is potentially infinite, the streaming file sink writes data
-into buckets. The bucketing behaviour is configurable but a useful default is time-based
-bucketing where we start writing a new bucket every hour and thus get
-individual files that each contain a part of the infinite output stream.
+In order to handle unbounded data streams, the streaming file sink writes incoming data
+into buckets. The bucketing behaviour is fully configurable with a default time-based
+bucketing where we start writing a new bucket every hour and thus get files that correspond to
+records received during certain time intervals from the stream.
+
+The bucket directories themselves contain several part files with the actual output data, with at least
+one for each parallel subtask of the sink. Additional part files will be created according to the configurable
 
 Review comment:
   You are right, fixing this.
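
For context, the hourly bucketing and per-subtask part files described in the quoted hunk can be sketched roughly as below. This is plain Java and does not use the Flink API. The class `HourlyBuckets`, its method names, the `yyyy-MM-dd--HH` directory pattern, the UTC zone, and the `part-<subtask>-<counter>` file naming are all assumptions made for illustration, not something taken from this pull request.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class HourlyBuckets {
    // Assumed pattern for illustration: one bucket directory per hour, in UTC.
    private static final DateTimeFormatter HOUR_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    // Maps a record timestamp (epoch milliseconds) to the id of its hourly bucket.
    static String bucketId(long epochMillis) {
        return HOUR_FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }

    // Builds a hypothetical part file path: every parallel subtask writes its
    // own part files inside the bucket, numbered by a per-subtask counter.
    static String partFilePath(long epochMillis, int subtaskIndex, int partCounter) {
        return bucketId(epochMillis) + "/part-" + subtaskIndex + "-" + partCounter;
    }

    public static void main(String[] args) {
        long firstHour = 0L;                  // 1970-01-01T00:00:00Z
        long secondHour = 60L * 60L * 1000L;  // one hour later

        // Two subtasks writing in the same hour share a bucket but not a file.
        System.out.println(partFilePath(firstHour, 0, 0));
        System.out.println(partFilePath(firstHour, 1, 0));
        // A record from the next hour lands in a new bucket.
        System.out.println(partFilePath(secondHour, 0, 0));
    }
}
```

The point of the sketch is only that the bucket is derived from time while the part file is derived from the writing subtask, which is why each bucket holds at least one part file per parallel subtask that wrote to it.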
