Posted to issues@flink.apache.org by chiwanpark <gi...@git.apache.org> on 2015/08/25 11:15:58 UTC

[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

GitHub user chiwanpark opened a pull request:

    https://github.com/apache/flink/pull/1053

    [FLINK-2569] [core] Add CsvReader support for Value types

    This PR contains the following changes (a short usage sketch follows the list):
    
    * Extend `TupleTypeInfo` to support `Value` types as basic (primitive) types.
    * Rename `CsvReaderWithPOJOITCase` to `CsvReaderITCase`.
    * Add a unit test for type extraction from `CsvReader`.
    * Add an integration test for `CsvReader` with `Value` types.
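
    As a rough sketch of the usage this enables (based on the integration test added in this PR; the input path and field types below are illustrative, not part of the change):

        import org.apache.flink.api.java.DataSet;
        import org.apache.flink.api.java.ExecutionEnvironment;
        import org.apache.flink.api.java.tuple.Tuple3;
        import org.apache.flink.types.DoubleValue;
        import org.apache.flink.types.IntValue;
        import org.apache.flink.types.StringValue;

        public class ValueCsvExample {
            public static void main(String[] args) throws Exception {
                ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

                // With this change, the basic Value classes can be passed to types()
                // just like the basic Java types (String, Integer, Double, ...).
                DataSet<Tuple3<StringValue, IntValue, DoubleValue>> data = env
                        .readCsvFile("/path/to/input.csv")  // illustrative path
                        .types(StringValue.class, IntValue.class, DoubleValue.class);

                data.print();
            }
        }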

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chiwanpark/flink FLINK-2569

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1053.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1053
    
----
commit bf322291370225a36ba4ff9ed13fecfd2f604ea5
Author: Chiwan Park <ch...@apache.org>
Date:   2015-08-25T08:53:43Z

    [FLINK-2569] [core] Add CsvReader support for Value types

----



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1053#issuecomment-135366569
  
    Merging this...



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1053#issuecomment-134621653
  
    Looks good to merge :+1: 



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1053#issuecomment-134988681
  
    Good to merge



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1053#discussion_r37886531
  
    --- Diff: flink-tests/src/test/java/org/apache/flink/test/io/CsvReaderITCase.java ---
    @@ -123,6 +132,21 @@ public void testPOJOTypeWithFieldsOrderAndFieldsSelection() throws Exception {
     		expected = "ABC,3,0.00\nDEF,5,0.00\nDEF,1,0.00\nGHI,10,0.00";
     	}
     
    +	@Test
    +	public void testValueTypes() throws Exception {
    +		final String inputData = "ABC,true,1,2,3,4,5.0,6.0\nBCD,false,1,2,3,4,5.0,6.0";
    +		final String dataPath = createInputData(inputData);
    +		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    +
    +		DataSet<Tuple8<StringValue, BooleanValue, ByteValue, ShortValue, IntValue, LongValue, FloatValue, DoubleValue>> data =
    +				env.readCsvFile(dataPath).types(StringValue.class, BooleanValue.class, ByteValue.class, ShortValue.class, IntValue.class, LongValue.class, FloatValue.class, DoubleValue.class);
    +		data.writeAsText(resultPath);
    --- End diff --
    
    That is true, but it might ease the problem a little if newly added tests use `collect`. And I doubt that we'll find somebody to take care of this anytime soon.



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1053#discussion_r37947758
  
    --- Diff: flink-tests/src/test/java/org/apache/flink/test/io/CsvReaderITCase.java ---
    @@ -123,6 +132,21 @@ public void testPOJOTypeWithFieldsOrderAndFieldsSelection() throws Exception {
     		expected = "ABC,3,0.00\nDEF,5,0.00\nDEF,1,0.00\nGHI,10,0.00";
     	}
     
    +	@Test
    +	public void testValueTypes() throws Exception {
    +		final String inputData = "ABC,true,1,2,3,4,5.0,6.0\nBCD,false,1,2,3,4,5.0,6.0";
    +		final String dataPath = createInputData(inputData);
    +		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    +
    +		DataSet<Tuple8<StringValue, BooleanValue, ByteValue, ShortValue, IntValue, LongValue, FloatValue, DoubleValue>> data =
    +				env.readCsvFile(dataPath).types(StringValue.class, BooleanValue.class, ByteValue.class, ShortValue.class, IntValue.class, LongValue.class, FloatValue.class, DoubleValue.class);
    +		data.writeAsText(resultPath);
    --- End diff --
    
    Okay, I'll update `CsvReaderITCase` to use `collect` instead of writing data to disk.



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1053#issuecomment-134584979
  
    @fhueske Thanks for the review. :) I addressed your comments (a usage sketch follows below):
    
    * Added a `getBasicAndBasicValueTupleTypeInfo` method to `TupleTypeInfo`.
    * Added an `isBasicValueType` method to `ValueTypeInfo` to check whether a type is one of the basic value types.
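
    A rough sketch of how these additions might be used (the signatures below are assumptions based on the method names above, not necessarily the final API):

        import org.apache.flink.api.java.tuple.Tuple2;
        import org.apache.flink.api.java.typeutils.TupleTypeInfo;
        import org.apache.flink.api.java.typeutils.ValueTypeInfo;
        import org.apache.flink.types.StringValue;

        public class TypeInfoSketch {
            public static void main(String[] args) {
                // Assumed signature: a factory that accepts a mix of basic types
                // and the basic Value classes shipped with Flink.
                TupleTypeInfo<Tuple2<StringValue, Integer>> info =
                        TupleTypeInfo.getBasicAndBasicValueTupleTypeInfo(StringValue.class, Integer.class);

                // Assumed check: only Flink's own Value implementations count as
                // basic value types; arbitrary user classes implementing Value do not.
                boolean isBasic = new ValueTypeInfo<>(StringValue.class).isBasicValueType();

                System.out.println(info + " / basic value type: " + isBasic);
            }
        }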



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1053#issuecomment-134853875
  
    I fixed the test to use the `collect` method.



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1053#discussion_r37884384
  
    --- Diff: flink-tests/src/test/java/org/apache/flink/test/io/CsvReaderITCase.java ---
    @@ -123,6 +132,21 @@ public void testPOJOTypeWithFieldsOrderAndFieldsSelection() throws Exception {
     		expected = "ABC,3,0.00\nDEF,5,0.00\nDEF,1,0.00\nGHI,10,0.00";
     	}
     
    +	@Test
    +	public void testValueTypes() throws Exception {
    +		final String inputData = "ABC,true,1,2,3,4,5.0,6.0\nBCD,false,1,2,3,4,5.0,6.0";
    +		final String dataPath = createInputData(inputData);
    +		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    +
    +		DataSet<Tuple8<StringValue, BooleanValue, ByteValue, ShortValue, IntValue, LongValue, FloatValue, DoubleValue>> data =
    +				env.readCsvFile(dataPath).types(StringValue.class, BooleanValue.class, ByteValue.class, ShortValue.class, IntValue.class, LongValue.class, FloatValue.class, DoubleValue.class);
    +		data.writeAsText(resultPath);
    --- End diff --
    
    @tillrohrmann Yeah, I know that. But to use `collect` instead of writing to disk, we need to change all test methods in `CsvReaderITCase`. Maybe we can cover this in another issue (FLINK-2032).



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1053#discussion_r37847832
  
    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/TupleTypeInfo.java ---
    @@ -174,13 +177,15 @@ public String toString() {
     			
     			TypeInformation<?> info = BasicTypeInfo.getInfoFor(type);
     			if (info == null) {
    -				throw new IllegalArgumentException("Type at position " + i + " is not a basic type.");
    +				try {
    +					info = ValueTypeInfo.getValueTypeInfo((Class<Value>) type);
    --- End diff --
    
    Any class can implement the `Value` interface, so we need to check that the class is one of the "basic value" classes that Flink ships.
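
    For illustration, such a restriction could look roughly like the following (a hypothetical sketch, not the actual Flink code):

        import java.util.Arrays;
        import java.util.HashSet;
        import java.util.Set;

        import org.apache.flink.types.BooleanValue;
        import org.apache.flink.types.ByteValue;
        import org.apache.flink.types.CharValue;
        import org.apache.flink.types.DoubleValue;
        import org.apache.flink.types.FloatValue;
        import org.apache.flink.types.IntValue;
        import org.apache.flink.types.LongValue;
        import org.apache.flink.types.ShortValue;
        import org.apache.flink.types.StringValue;

        public class BasicValueCheck {
            // Hypothetical whitelist of the Value implementations that ship with Flink.
            private static final Set<Class<?>> BASIC_VALUE_TYPES = new HashSet<Class<?>>(
                    Arrays.<Class<?>>asList(
                            StringValue.class, BooleanValue.class, ByteValue.class, ShortValue.class,
                            IntValue.class, LongValue.class, FloatValue.class, DoubleValue.class,
                            CharValue.class));

            // Accept only the shipped basic Value classes, not arbitrary Value implementations.
            public static boolean isBasicValueType(Class<?> type) {
                return BASIC_VALUE_TYPES.contains(type);
            }
        }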



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/1053



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1053#discussion_r37880533
  
    --- Diff: flink-tests/src/test/java/org/apache/flink/test/io/CsvReaderITCase.java ---
    @@ -123,6 +132,21 @@ public void testPOJOTypeWithFieldsOrderAndFieldsSelection() throws Exception {
     		expected = "ABC,3,0.00\nDEF,5,0.00\nDEF,1,0.00\nGHI,10,0.00";
     	}
     
    +	@Test
    +	public void testValueTypes() throws Exception {
    +		final String inputData = "ABC,true,1,2,3,4,5.0,6.0\nBCD,false,1,2,3,4,5.0,6.0";
    +		final String dataPath = createInputData(inputData);
    +		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    +
    +		DataSet<Tuple8<StringValue, BooleanValue, ByteValue, ShortValue, IntValue, LongValue, FloatValue, DoubleValue>> data =
    +				env.readCsvFile(dataPath).types(StringValue.class, BooleanValue.class, ByteValue.class, ShortValue.class, IntValue.class, LongValue.class, FloatValue.class, DoubleValue.class);
    +		data.writeAsText(resultPath);
    --- End diff --
    
    If I'm not mistaken, we wanted to avoid writing data to disk because this sometimes fails on Travis. Instead, we should use `collect` to keep the data in memory.
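
    For illustration, a minimal sketch of the `collect`-based variant (this assumes the `data` DataSet from the diff above and Flink's `compareResultAsText` test helper; the expected string is derived from the test input, not taken from the merged code):

        // Collect the result on the client instead of writing it to disk.
        List<Tuple8<StringValue, BooleanValue, ByteValue, ShortValue,
                IntValue, LongValue, FloatValue, DoubleValue>> result = data.collect();

        // Tuple8.toString() renders each record as "(f0,f1,...,f7)".
        String expected = "(ABC,true,1,2,3,4,5.0,6.0)\n(BCD,false,1,2,3,4,5.0,6.0)";
        compareResultAsText(result, expected);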



[GitHub] flink pull request: [FLINK-2569] [core] Add CsvReader support for ...

Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1053#discussion_r37847775
  
    --- Diff: flink-java/src/main/java/org/apache/flink/api/java/typeutils/TupleTypeInfo.java ---
    @@ -159,7 +161,8 @@ public String toString() {
     	}
     
     	// --------------------------------------------------------------------------------------------
    -	
    +
    +	@SuppressWarnings("unchecked")
     	public static <X extends Tuple> TupleTypeInfo<X> getBasicTupleTypeInfo(Class<?>... basicTypes) {
    --- End diff --
    
    I would keep the original method and add a `getBasicAndBasicValueTupleTypeInfo()` method.

