Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2018/09/27 07:51:31 UTC

[GitHub] aljoscha closed pull request #6753: [FLINK-1960] [Documentation] Add docs for withForwardedFields and related operators in Scala API

URL: https://github.com/apache/flink/pull/6753

This PR was merged from a forked repository. Because GitHub hides the
original diff of such pull requests on merge, it is reproduced below
for the sake of provenance:

diff --git a/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala b/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala
index 71037c3ae1d..c3a46c79e6a 100644
--- a/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala
+++ b/flink-scala/src/main/scala/org/apache/flink/api/scala/DataSet.scala
@@ -282,6 +282,49 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
     this
   }
 
+  /**
+    * Adds semantic information about forwarded fields of the user-defined function.
+    * The forwarded fields information declares fields which are never modified by the function and
+    * which are forwarded to the same position in the output or copied unchanged to another position
+    * in the output.
+    *
+    * <p>Fields that are forwarded to the same position are specified just by their position.
+    * The specified position must be valid for the input and output data type and have
+    * the same type.
+    * For example <code>withForwardedFields("_3")</code> declares that the third field of
+    * an input tuple is copied to the third field of an output tuple.
+    *
+    * <p>Fields which are copied to another position in the output unchanged are declared by
+    * specifying the source field reference in the input and the target field reference
+    * in the output.
+    * {@code withForwardedFields("_1->_3")} denotes that the first field of the input tuple is
+    * copied to the third field of the output tuple unchanged. When using a wildcard ("*") ensure
+    * that the number of declared fields and their types in input and output type match.
+    *
+    * <p>Multiple forwarded fields can be annotated in one
+    * ({@code withForwardedFields("_2; _3->_1; _4")})
+    * or separate Strings ({@code withForwardedFields("_2", "_3->_1", "_4")}).
+    * Please refer to the JavaDoc of {@link org.apache.flink.api.common.functions.Function}
+    * or Flink's documentation for details on field references such as nested fields and wildcards.
+    *
+    * <p>It is not possible to override existing semantic information about forwarded fields
+    * which was for example added by a
+    * {@link org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields} class
+    * annotation.
+    *
+    * <p><b>NOTE: Adding semantic information for functions is optional!
+    * If used correctly, semantic information can help the Flink optimizer to generate more
+    * efficient execution plans.
+    * However, incorrect semantic information can cause the optimizer to generate incorrect
+    * execution plans which compute wrong results!
+    * So be careful when adding semantic information.
+    * </b>
+    *
+    * @param forwardedFields A list of field forward expressions.
+    * @return This operator with annotated forwarded field information.
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFields
+    */
   def withForwardedFields(forwardedFields: String*) = {
     javaSet match {
       case op: SingleInputUdfOperator[_, _, _] => op.withForwardedFields(forwardedFields: _*)
@@ -292,6 +335,52 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
     this
   }
 
+  /**
+    * Adds semantic information about forwarded fields of the first input
+    * of the user-defined function.
+    * The forwarded fields information declares fields which are never modified by the function and
+    * which are forwarded to the same position in the output or copied unchanged
+    * to another position in the output.
+    *
+    * <p>Fields that are forwarded to the same position are specified just by their position.
+    * The specified position must be valid for the input and output data type
+    * and have the same type.
+    * For example <code>withForwardedFieldsFirst("_3")</code> declares that the third field
+    * of an input tuple from the first input is copied to the third field of an output tuple.
+    *
+    * <p>Fields which are copied from the first input to another position in the output unchanged
+    * are declared by specifying the source field reference in the first input and the target field
+    * reference in the output. {@code withForwardedFieldsFirst("_1->_3")} denotes that the first
+    * field of the first input tuple is copied to the third field of the output tuple unchanged.
+    * When using a wildcard ("*") ensure that the number of declared fields and their types
+    * in the first input and output type match.
+    *
+    * <p>Multiple forwarded fields can be annotated in one
+    * ({@code withForwardedFieldsFirst("_2; _3->_1; _4")})
+    * or separate Strings ({@code withForwardedFieldsFirst("_2", "_3->_1", "_4")}).
+    * Please refer to the JavaDoc of
+    * {@link org.apache.flink.api.common.functions.Function} or Flink's documentation for
+    * details on field references such as nested fields and wildcards.
+    *
+    * <p>It is not possible to override existing semantic information about forwarded fields
+    * of the first input which was for example added by a
+    * {@link org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsFirst}
+    * class annotation.
+    *
+    * <p><b>NOTE: Adding semantic information for functions is optional!
+    * If used correctly, semantic information can help the Flink optimizer to generate more
+    * efficient execution plans.
+    * However, incorrect semantic information can cause the optimizer to generate incorrect
+    * execution plans which compute wrong results!
+    * So be careful when adding semantic information.
+    * </b>
+    *
+    * @param forwardedFields A list of forwarded field expressions for the first input
+    *                        of the function.
+    * @return This operator with annotated forwarded field information.
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsFirst
+    */
   def withForwardedFieldsFirst(forwardedFields: String*) = {
     javaSet match {
       case op: TwoInputUdfOperator[_, _, _, _] => op.withForwardedFieldsFirst(forwardedFields: _*)
@@ -302,6 +391,53 @@ class DataSet[T: ClassTag](set: JavaDataSet[T]) {
     this
   }
 
+  /**
+    * Adds semantic information about forwarded fields of the second input
+    * of the user-defined function.
+    * The forwarded fields information declares fields which are never modified by the function and
+    * which are forwarded to the same position in the output or copied unchanged
+    * to another position in the output.
+    *
+    * <p>Fields that are forwarded to the same position are specified just by their position.
+    * The specified position must be valid for the input and output data type
+    * and have the same type.
+    * For example <code>withForwardedFieldsSecond("_3")</code> declares that the third field
+    * of an input tuple from the second input is copied to the third field of an output tuple.
+    *
+    * <p>Fields which are copied from the second input to another position
+    * in the output unchanged are declared by specifying the source field reference
+    * in the second input and the target field reference in the output.
+    * {@code withForwardedFieldsSecond("_1->_3")} denotes that the first field of the second input
+    * tuple is copied to the third field of the output tuple unchanged. When using a wildcard ("*")
+    * ensure that the number of declared fields and their types in the second input and
+    * output type match.
+    *
+    * <p>Multiple forwarded fields can be annotated in one
+    * ({@code withForwardedFieldsSecond("_2; _3->_1; _4")})
+    * or separate Strings ({@code withForwardedFieldsSecond("_2", "_3->_1", "_4")}).
+    * Please refer to the JavaDoc of
+    * {@link org.apache.flink.api.common.functions.Function} or Flink's documentation for
+    * details on field references such as nested fields and wildcards.
+    *
+    * <p>It is not possible to override existing semantic information about forwarded fields
+    * of the second input which was for example added by a
+    * {@link org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsSecond}
+    * class annotation.
+    *
+    * <p><b>NOTE: Adding semantic information for functions is optional!
+    * If used correctly, semantic information can help the Flink optimizer to generate more
+    * efficient execution plans.
+    * However, incorrect semantic information can cause the optimizer to generate incorrect
+    * execution plans which compute wrong results!
+    * So be careful when adding semantic information.
+    * </b>
+    *
+    * @param forwardedFields A list of forwarded field expressions for the second input
+    *                        of the function.
+    * @return This operator with annotated forwarded field information.
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation
+    * @see org.apache.flink.api.java.functions.FunctionAnnotation.ForwardedFieldsSecond
+    */
   def withForwardedFieldsSecond(forwardedFields: String*) = {
     javaSet match {
       case op: TwoInputUdfOperator[_, _, _, _] => op.withForwardedFieldsSecond(forwardedFields: _*)
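
For readers of this archive, the following is a minimal sketch of how the
three documented methods might be used from the Scala DataSet API. It is not
part of the pull request. The object name, the helper functions `reorder` and
`combine`, and the sample data are hypothetical, and the sketch assumes the
flink-scala dependency is on the classpath.

```scala
import org.apache.flink.api.scala._

// Illustrative only: names and data below are assumptions, not PR content.
object ForwardedFieldsSketch {

  // Keeps field 1 in place, computes a new field 2,
  // and copies input field 2 unchanged to output field 3.
  def reorder(t: (Int, String, Double)): (Int, Double, String) =
    (t._1, t._3 * 2, t._2)

  // Copies both fields of the left input to the same positions and
  // copies field 2 of the right input to output field 3.
  def combine(l: (Int, String), r: (Int, Double)): (Int, String, Double) =
    (l._1, l._2, r._2)

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val input: DataSet[(Int, String, Double)] =
      env.fromElements((1, "a", 1.0), (2, "b", 2.5))

    // Single-input operator: "_1" stays at its position, "_2" moves to "_3".
    val mapped = input
      .map(t => reorder(t))
      .withForwardedFields("_1", "_2->_3")

    val names: DataSet[(Int, String)] = env.fromElements((1, "a"), (2, "b"))
    val scores: DataSet[(Int, Double)] = env.fromElements((1, 0.5), (2, 1.5))

    // Two-input operator: declare forwarded fields separately per input.
    val joined = names
      .join(scores)
      .where(0)
      .equalTo(0) { (l, r) => combine(l, r) }
      .withForwardedFieldsFirst("_1", "_2")
      .withForwardedFieldsSecond("_2->_3")

    mapped.print()
    joined.print()
  }
}
```

Each declaration has to match what the function actually does. As the NOTE
in the Scaladoc warns, declaring a field as forwarded when the function
modifies it can lead the optimizer to produce plans that compute wrong
results.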


 
