You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by vasia <gi...@git.apache.org> on 2015/10/01 15:14:05 UTC

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

GitHub user vasia opened a pull request:

    https://github.com/apache/flink/pull/1205

    [FLINK-2785] [gelly] implements fromCsvReader for gelly-scala

    This is the last method missing from the Scala Graph implementation.
    In order to be as close as possible to the Java implementation, I implemented one method where different options are given with optional parameters. This resulted into the method exceeding the allowed max number of parameters, as defined in the scala checkstyle. I decided to disable the checkstyle for this particular method, because most of the parameters are optional and are there mainly for completeness. In the common case, a user would only give the path and delimiters. If you think there's a better way to do this without having to disable the checkstyle, let me know!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vasia/flink gelly-readCsv

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1205
    
----
commit efe84acd8180f1cc4ece197c9b5193f8a926a98f
Author: vasia <va...@apache.org>
Date:   2015-09-30T21:53:06Z

    [FLINK-2785] [gelly] implement fromCsvReader for gelly-scala; add tests and docs

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/1205


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41287065
  
    --- Diff: flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala ---
    @@ -126,6 +126,122 @@ object Graph {
         wrapGraph(jg.Graph.fromTupleDataSet[K, VV, EV](javaTupleEdges, mapper, env.getJavaEnv))
       }
     
    +  /**
    +  * Creates a Graph with from a CSV file of vertices and a CSV file of edges
    +  * 
    +  * @param pathVertices The file path containing the vertices.
    +  * @param vertexValue Defines whether the vertices have associated values.
    +  * If set to false, the vertex input is ignored and vertices are created from the edges file.
    +  * True by default.
    +  * @param lineDelimiterVertices The string that separates lines in the vertices file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterVertices The string that separates vertex Ids from vertex values
    +  * in the vertices file.
    +  * @param quoteCharacterVertices The character to use for quoted String parsing
    +  * in the vertices file. Disabled by default.
    +  * @param ignoreFirstLineVertices Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsVertices Lines that start with the given String in the vertices file
    +  * are ignored, disabled by default.
    +  * @param lenientVertices Whether the parser should silently ignore malformed lines in the
    +  * vertices file.
    +  * @param includedFieldsVertices The fields in the vertices file that should be read.
    +  * By default all fields are read.
    +  * @param pathEdges The file path containing the edges.
    +  * @param edgeValue Defines whether the edges have associated values. True by default.
    +  * @param lineDelimiterEdges The string that separates lines in the edges file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterEdges The string that separates fileds in the edges file.
    +  * @param quoteCharacterEdges The character to use for quoted String parsing
    +  * in the edges file. Disabled by default.
    +  * @param ignoreFirstLineEdges Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsEdges Lines that start with the given String in the edges file
    +  * are ignored, disabled by default.
    +  * @param lenientEdges Whether the parser should silently ignore malformed lines in the
    +  * edges file.
    +  * @param includedFieldsEdges The fields in the edges file that should be read.
    +  * By default all fields are read.
    +  * 
    +  */
    +  // scalastyle:off
    +  // This method exceeds the max allowed number of parameters -->  
    +  def fromCsvReader[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag,
    +    EV: TypeInformation : ClassTag](
    +      pathVertices: String = null,
    +      vertexValue: Boolean = true,
    +      lineDelimiterVertices: String = "\n",
    +      fieldDelimiterVertices: String = ",",
    +      quoteCharacterVertices: Character = null,
    +      ignoreFirstLineVertices: Boolean = false,
    +      ignoreCommentsVertices: String = null,
    +      lenientVertices: Boolean = false,
    +      includedFieldsVertices: Array[Int] = null,
    +      pathEdges: String,
    +      edgeValue: Boolean = true,
    +      lineDelimiterEdges: String = "\n",
    +      fieldDelimiterEdges: String = ",",
    +      quoteCharacterEdges: Character = null,
    +      ignoreFirstLineEdges: Boolean = false,
    +      ignoreCommentsEdges: String = null,
    +      lenientEdges: Boolean = false,
    +      includedFieldsEdges: Array[Int] = null,
    +      mapper: MapFunction[K, VV] = null,
    +      env: ExecutionEnvironment) = {
    +
    +    // with vertex and edge values
    +    if (vertexValue && edgeValue) {
    +      val vertices = env.readCsvFile[(K, VV)](pathVertices, lineDelimiterVertices,
    +        fieldDelimiterVertices, quoteCharacterVertices, ignoreFirstLineVertices,
    +        ignoreCommentsVertices, lenientVertices, includedFieldsVertices)
    +
    +      val edges = env.readCsvFile[(K, K, EV)](pathEdges, lineDelimiterEdges, fieldDelimiterEdges,
    +        quoteCharacterEdges, ignoreFirstLineEdges, ignoreCommentsEdges, lenientEdges,
    +        includedFieldsEdges)
    +     
    +      fromTupleDataSet[K, VV, EV](vertices, edges, env)
    +    }
    +    // with vertex value and no edge value
    +    else if (vertexValue && (!edgeValue)) {
    +      val vertices = env.readCsvFile[(K, VV)](pathVertices, lineDelimiterVertices,
    --- End diff --
    
    check if `pathVertices` is correctly set.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-145970361
  
    I'm worrying about the name of method `fromCsvReader`. It seems that the method should receive `CsvReader` object and read graph data from the reader object. Why we don't use other name such as `fromCsvFile`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41377863
  
    --- Diff: docs/libs/gelly_guide.md ---
    @@ -194,6 +194,29 @@ val edgeTuples = env.readCsvFile[String, String, Double]("path/to/edge/input")
     
     val graph = Graph.fromTupleDataSet(vertexTuples, edgeTuples, env)
     {% endhighlight %}
    +
    +* from a CSV file of Edge data and an optional CSV file of Vertex data.
    +In this case, Gelly will convert each row from the Edge CSV file to an `Edge`, where the first field will be the source ID, the second field will be the target ID and the third field (if present) will be the edge value. The parameter `readVertices` defines whether vertex data are provided. If `readVertices` is set to `true`, then `pathVertices` must be specified. In this case, each row from the Vertex CSV file will be converted to a `Vertex`, where the first field will be the vertex ID and the second field will be the vertex value. If `readVertices` is set to false, then Vertex data will be ignored and vertices will be automatically created from the edges input.
    +If the edges have no associated value, set the `hasEdgeValues` parameter to `false`.
    --- End diff --
    
    Move this sentence after "... (if present) will be the edge value."?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41284498
  
    --- Diff: docs/libs/gelly_guide.md ---
    @@ -194,6 +194,27 @@ val edgeTuples = env.readCsvFile[String, String, Double]("path/to/edge/input")
     
     val graph = Graph.fromTupleDataSet(vertexTuples, edgeTuples, env)
     {% endhighlight %}
    +
    +* from a CSV file of Edge data and an optional CSV file of Vertex data. In this case, Gelly will convert each row from the Edge CSV file to an `Edge`, where the first field will be the source ID, the second field will be the target ID and the third field (if present) will be the edge value. Equivalently, each row from the optional Vertex CSV file will be converted to a `Vertex`, where the first field will be the vertex ID and the second field (if present) will be the vertex value.
    +If the vertices have no associated value, set the `vertexValue` parameter to `false`. If the edges have no associated value, set the `edgeValue` parameter to `false`.
    +
    +{% highlight scala %}
    +val env = ExecutionEnvironment.getExecutionEnvironment
    +
    +// create a Graph with String Vertex IDs, Long Vertex values and Double Edge values
    +val graph = Graph.fromCsvReader[String, Long, Double](
    +		pathVertices = "path/to/vertex/input",
    +		pathEdges = "path/to/edge/input",
    +		env = env)
    +
    +
    +// create a Graph with neither Vertex nor Edge values
    +val simpleGraph = Graph.fromCsvReader[Long, NullValue, NullVale](
    +		vertexValue = false,
    --- End diff --
    
    Is it necessary to set `vertexValue` to false if no vertex path is specified?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41284820
  
    --- Diff: flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala ---
    @@ -126,6 +126,122 @@ object Graph {
         wrapGraph(jg.Graph.fromTupleDataSet[K, VV, EV](javaTupleEdges, mapper, env.getJavaEnv))
       }
     
    +  /**
    +  * Creates a Graph with from a CSV file of vertices and a CSV file of edges
    +  * 
    +  * @param pathVertices The file path containing the vertices.
    +  * @param vertexValue Defines whether the vertices have associated values.
    --- End diff --
    
    rename to `hasVertexValue`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-146030951
  
    Okay, I understand it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41287202
  
    --- Diff: docs/libs/gelly_guide.md ---
    @@ -194,6 +194,27 @@ val edgeTuples = env.readCsvFile[String, String, Double]("path/to/edge/input")
     
     val graph = Graph.fromTupleDataSet(vertexTuples, edgeTuples, env)
     {% endhighlight %}
    +
    +* from a CSV file of Edge data and an optional CSV file of Vertex data. In this case, Gelly will convert each row from the Edge CSV file to an `Edge`, where the first field will be the source ID, the second field will be the target ID and the third field (if present) will be the edge value. Equivalently, each row from the optional Vertex CSV file will be converted to a `Vertex`, where the first field will be the vertex ID and the second field (if present) will be the vertex value.
    +If the vertices have no associated value, set the `vertexValue` parameter to `false`. If the edges have no associated value, set the `edgeValue` parameter to `false`.
    --- End diff --
    
    The role of `vertexValues` in creating the vertex set from the edges is not discussed here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41284938
  
    --- Diff: flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala ---
    @@ -126,6 +126,122 @@ object Graph {
         wrapGraph(jg.Graph.fromTupleDataSet[K, VV, EV](javaTupleEdges, mapper, env.getJavaEnv))
       }
     
    +  /**
    +  * Creates a Graph with from a CSV file of vertices and a CSV file of edges
    +  * 
    +  * @param pathVertices The file path containing the vertices.
    +  * @param vertexValue Defines whether the vertices have associated values.
    +  * If set to false, the vertex input is ignored and vertices are created from the edges file.
    +  * True by default.
    +  * @param lineDelimiterVertices The string that separates lines in the vertices file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterVertices The string that separates vertex Ids from vertex values
    +  * in the vertices file.
    +  * @param quoteCharacterVertices The character to use for quoted String parsing
    +  * in the vertices file. Disabled by default.
    +  * @param ignoreFirstLineVertices Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsVertices Lines that start with the given String in the vertices file
    +  * are ignored, disabled by default.
    +  * @param lenientVertices Whether the parser should silently ignore malformed lines in the
    +  * vertices file.
    +  * @param includedFieldsVertices The fields in the vertices file that should be read.
    +  * By default all fields are read.
    +  * @param pathEdges The file path containing the edges.
    +  * @param edgeValue Defines whether the edges have associated values. True by default.
    --- End diff --
    
    `hasEdgeValues`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41287032
  
    --- Diff: flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala ---
    @@ -126,6 +126,122 @@ object Graph {
         wrapGraph(jg.Graph.fromTupleDataSet[K, VV, EV](javaTupleEdges, mapper, env.getJavaEnv))
       }
     
    +  /**
    +  * Creates a Graph with from a CSV file of vertices and a CSV file of edges
    +  * 
    +  * @param pathVertices The file path containing the vertices.
    +  * @param vertexValue Defines whether the vertices have associated values.
    +  * If set to false, the vertex input is ignored and vertices are created from the edges file.
    +  * True by default.
    +  * @param lineDelimiterVertices The string that separates lines in the vertices file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterVertices The string that separates vertex Ids from vertex values
    +  * in the vertices file.
    +  * @param quoteCharacterVertices The character to use for quoted String parsing
    +  * in the vertices file. Disabled by default.
    +  * @param ignoreFirstLineVertices Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsVertices Lines that start with the given String in the vertices file
    +  * are ignored, disabled by default.
    +  * @param lenientVertices Whether the parser should silently ignore malformed lines in the
    +  * vertices file.
    +  * @param includedFieldsVertices The fields in the vertices file that should be read.
    +  * By default all fields are read.
    +  * @param pathEdges The file path containing the edges.
    +  * @param edgeValue Defines whether the edges have associated values. True by default.
    +  * @param lineDelimiterEdges The string that separates lines in the edges file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterEdges The string that separates fileds in the edges file.
    +  * @param quoteCharacterEdges The character to use for quoted String parsing
    +  * in the edges file. Disabled by default.
    +  * @param ignoreFirstLineEdges Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsEdges Lines that start with the given String in the edges file
    +  * are ignored, disabled by default.
    +  * @param lenientEdges Whether the parser should silently ignore malformed lines in the
    +  * edges file.
    +  * @param includedFieldsEdges The fields in the edges file that should be read.
    +  * By default all fields are read.
    +  * 
    +  */
    +  // scalastyle:off
    +  // This method exceeds the max allowed number of parameters -->  
    +  def fromCsvReader[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag,
    +    EV: TypeInformation : ClassTag](
    +      pathVertices: String = null,
    +      vertexValue: Boolean = true,
    +      lineDelimiterVertices: String = "\n",
    +      fieldDelimiterVertices: String = ",",
    +      quoteCharacterVertices: Character = null,
    +      ignoreFirstLineVertices: Boolean = false,
    +      ignoreCommentsVertices: String = null,
    +      lenientVertices: Boolean = false,
    +      includedFieldsVertices: Array[Int] = null,
    +      pathEdges: String,
    +      edgeValue: Boolean = true,
    +      lineDelimiterEdges: String = "\n",
    +      fieldDelimiterEdges: String = ",",
    +      quoteCharacterEdges: Character = null,
    +      ignoreFirstLineEdges: Boolean = false,
    +      ignoreCommentsEdges: String = null,
    +      lenientEdges: Boolean = false,
    +      includedFieldsEdges: Array[Int] = null,
    +      mapper: MapFunction[K, VV] = null,
    +      env: ExecutionEnvironment) = {
    +
    +    // with vertex and edge values
    +    if (vertexValue && edgeValue) {
    +      val vertices = env.readCsvFile[(K, VV)](pathVertices, lineDelimiterVertices,
    --- End diff --
    
    Check if `pathVertices` is correctly set to return a better error message than a NPE from `env.readCsvFile()`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-145921072
  
    Hi, you might have noticed, I got a bit confused about the `vertexValue` parameter and how it affects the graph creation. I made a proposal inline to rename that parameter and making it mandatory. Let me know what you think. The documentation should discuss the `vertexValue` in more detail.
    
    Otherwise the PR looks good. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-146158372
  
    Thanks for the fast update! Good to merge.
    I added a comment about the parameter order that could be addressed if you agree.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-146225895
  
    Great, thanks! I will re-order the parameters and merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-146288828
  
    There was an error in `RandomSamplerTest ` in one of the builds. I'll open a JIRA and merge this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1205#discussion_r41377804
  
    --- Diff: flink-staging/flink-gelly-scala/src/main/scala/org/apache/flink/graph/scala/Graph.scala ---
    @@ -126,6 +126,131 @@ object Graph {
         wrapGraph(jg.Graph.fromTupleDataSet[K, VV, EV](javaTupleEdges, mapper, env.getJavaEnv))
       }
     
    +  /**
    +  * Creates a Graph with from a CSV file of vertices and a CSV file of edges
    +  * 
    +  * @param pathVertices The file path containing the vertices.
    +  * @param readVertices Defines whether the vertices have associated values.
    +  * If set to false, the vertex input is ignored and vertices are created from the edges file.
    +  * @param lineDelimiterVertices The string that separates lines in the vertices file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterVertices The string that separates vertex Ids from vertex values
    +  * in the vertices file.
    +  * @param quoteCharacterVertices The character to use for quoted String parsing
    +  * in the vertices file. Disabled by default.
    +  * @param ignoreFirstLineVertices Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsVertices Lines that start with the given String in the vertices file
    +  * are ignored, disabled by default.
    +  * @param lenientVertices Whether the parser should silently ignore malformed lines in the
    +  * vertices file.
    +  * @param includedFieldsVertices The fields in the vertices file that should be read.
    +  * By default all fields are read.
    +  * @param pathEdges The file path containing the edges.
    +  * @param hasEdgeValues Defines whether the edges have associated values. True by default.
    +  * @param lineDelimiterEdges The string that separates lines in the edges file.
    +  * It defaults to newline.
    +  * @param fieldDelimiterEdges The string that separates fileds in the edges file.
    +  * @param quoteCharacterEdges The character to use for quoted String parsing
    +  * in the edges file. Disabled by default.
    +  * @param ignoreFirstLineEdges Whether the first line in the vertices file should be ignored.
    +  * @param ignoreCommentsEdges Lines that start with the given String in the edges file
    +  * are ignored, disabled by default.
    +  * @param lenientEdges Whether the parser should silently ignore malformed lines in the
    +  * edges file.
    +  * @param includedFieldsEdges The fields in the edges file that should be read.
    +  * By default all fields are read.
    +  * 
    +  */
    +  // scalastyle:off
    +  // This method exceeds the max allowed number of parameters -->  
    +  def fromCsvReader[K: TypeInformation : ClassTag, VV: TypeInformation : ClassTag,
    +    EV: TypeInformation : ClassTag](
    +      pathVertices: String = null,
    --- End diff --
    
    Should we change the order of the parameters and move the most common once first? This will allow to specify some parameters without names. How about:
    - pathEdges (mandatory)
    - readVertices
    - pathVertices
    - hasEdgeValues
    - ... (the others in the current order)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-145979805
  
    Hey @chiwanpark,
    this is how the Java method works, i.e. creates a `CsvReader ` object and returns the graph when calling the appropriate types methods. I only kept this name in Scala to be consistent with the Java API.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2785] [gelly] implements fromCsvReader ...

Posted by vasia <gi...@git.apache.org>.
Github user vasia commented on the pull request:

    https://github.com/apache/flink/pull/1205#issuecomment-145939652
  
    Thanks so much for the review @fhueske!
    I agree with your comments. I will rename the parameter to `readVertices`, make it mandatory, and better explain it on the docs. I also like the suggestion for `edgeValue` -> `hasEdgeValues`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---