Posted to user@gobblin.apache.org by ap...@gmail.com on 2019/12/19 06:35:29 UTC

Apache Gobblin Gitter messages at 2019/12/18 22:35:27

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T17:36:55.041Z]** @pritamsarkar86: looking specifically at your
error in the converter, it seems like you might be importing two versions of
the parquet GroupType class, one from org.apache.parquet.schema and one from
parquet.schema? Fixing the imports might fix your error
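
One way to confirm the mix-up described above is to check which of the two `GroupType` classes can actually be loaded at runtime. The following is a minimal sketch, not part of Gobblin or parquet-mr: the class name `ParquetNamespaceCheck` and the method `presentClasses` are hypothetical, and only the two fully qualified class names come from the message above.

```java
import java.util.ArrayList;
import java.util.List;

public class ParquetNamespaceCheck {
    // The two GroupType locations named in the thread.
    static final String LEGACY = "parquet.schema.GroupType";
    static final String APACHE = "org.apache.parquet.schema.GroupType";

    /** Returns the subset of the given class names that can be loaded from the classpath. */
    static List<String> presentClasses(String... classNames) {
        List<String> found = new ArrayList<>();
        ClassLoader loader = ParquetNamespaceCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only check that the class is loadable, do not run static init
                Class.forName(name, false, loader);
                found.add(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not loadable from this classpath; skip it.
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> found = presentClasses(LEGACY, APACHE);
        System.out.println("GroupType classes on classpath: " + found);
        if (found.size() > 1) {
            System.out.println("Both namespaces are present; check imports for a mix-up.");
        }
    }
}
```

Run with the same classpath as the failing job. If both names are reported, two parquet namespaces are present, and an import taken from the wrong one would be consistent with the error discussed here.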

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-18T17:49:33.549Z]** Hi @shirshanka, that is right. Here is the
problem: ParquetDataWriterBuilder, which ships in gobblin-parquet-0.14.0.jar,
depends on `MessageType` and `Group`, which are referenced from the
parquet.example.data package. To convert a protobuf schema into a parquet
schema, the helper classes are part of parquet-mr, as you mentioned above, and
they reference `MessageType` and `Group` from org.apache.parquet.example.data

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T17:50:34.563Z]** so maybe bumping up the version of parquet and
fixing the imports on the gobblin side is the right answer?

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-18T17:50:54.139Z]** yes

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T17:51:04.341Z]** what version of parquet should we be using

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-18T17:55:56.327Z]** If you are asking about the jar dependencies, I
have not seen any issues from 1.7.0 all the way to 1.10.1. For the parquet
version, I think we will have to support both v1 and v2

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-18T17:56:06.490Z]** Maybe I misunderstood the question

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T17:56:17.596Z]** yeah asking about jar dependencies

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T17:56:31.225Z]** if we upgrade the parquet version, would like
to upgrade to something the community is using

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T17:56:46.470Z]** since we don't use parquet at LinkedIn, I don't
know the appropriate version to use

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-18T17:58:52.907Z]** I see 1.10.1 having good usage

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T18:02:14.793Z]** ok

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T18:02:32.341Z]** we'll need a couple of days to make the changes
at the least

* * *

####  ![](https://avatars1.githubusercontent.com/u/1339772?v=4&s=60) Tamas
Nemeth (treff7es)

**[2019-12-18T18:11:44.508Z]** No, we don't. :(

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-18T18:13:21.866Z]** @shirshanka that should be fine. Thank you.

* * *

####  ![](https://avatars1.githubusercontent.com/u/1339772?v=4&s=60) Tamas
Nemeth (treff7es)

**[2019-12-18T18:22:18.101Z]** @shirshanka We never really got around to even
the ORC format, but nowadays I keep an eye on Iceberg, Hudi, and Delta Lake as
storage formats; it would be cool to have some integration in Gobblin.
Unfortunately our priorities are elsewhere, and fortunately Gobblin/ingestion
is working fine. :)

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T18:23:13.450Z]** interesting, we recently rolled out native ORC
in Gobblin (ditching the hive serde)

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T18:23:26.658Z]** and we are working on Iceberg integration
currently (first for metadata)

* * *

####  ![](https://avatars1.githubusercontent.com/u/1339772?v=4&s=60) Tamas
Nemeth (treff7es)

**[2019-12-18T18:24:42.899Z]** wow, I need to catch up on what is happening
there, I wish I had more time. :)

* * *

####  ![](https://avatars2.githubusercontent.com/u/906477?v=4&s=60) Shirshanka
Das (shirshanka)

**[2019-12-18T18:25:55.076Z]** We're hoping to write a blog soon on recent
developments

* * *

####  ![](https://avatars2.githubusercontent.com/u/1669073?v=4&s=60) kchando
(kchando)

**[2019-12-18T21:46:55.066Z]** @shirshanka: Curious to know about the native
ORC in Gobblin. In the current Gobblin 0.14, is HiveSerdeConverter still the
only way to convert to ORC? When you said "rolled out native ORC in Gobblin",
is there a new release version for that, or would it be part of the next
release, 0.15?

* * *