Posted to user@gobblin.apache.org by ap...@gmail.com on 2019/12/18 03:21:23 UTC

Apache Gobblin Gitter messages at 2019/12/17 19:21:22

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-17T18:11:47.771Z]** Hi, I'm new to Gobblin. I'm trying to convert
protobuf messages to Parquet and write them to HDFS. My understanding is that
the converter needs to convert both the schema and the record. I have a handle
to the protobuf message and schema in the converter, and I am converting the
schema with `new ProtoSchemaConverter().convert(Event.class)`. For
`convertRecord` we need to return an `Iterable<Group>`, since
`ParquetDataWriterBuilder` takes a `MessageType` and a `Group`. How do we get
from a protobuf `Message` to a `Group`? I need help.

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-17T19:04:06.340Z]**
<https://stackoverflow.com/questions/59366492/gobblin-mapreduce-convert-from-protobuf-to-parquet>

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-17T19:22:35.078Z]**

    public Iterable<ParquetGroup> convertRecord(MessageType outputSchema, byte[] inputRecord, WorkUnitState workUnit) {
      try {
        LOGGER.info("Converter Input Record: {}", new String(inputRecord));
        Event event = Event.parseFrom(inputRecord);
        ParquetGroup parquetGroup = new ParquetGroup(outputSchema.asGroupType());
        for (final Map.Entry<Descriptors.FieldDescriptor, Object> entry : event.getAllFields().entrySet()) {
          parquetGroup.add(entry.getKey().toString(), entry.getValue());
        }
        return new SingleRecordIterable<>(parquetGroup);
      } catch (Exception exp) {
        LOGGER.error("ConvertRecord Error: {}", exp.getMessage());
        return new EmptyIterable<>();
      }
    }

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-17T19:23:19.198Z]** The code above fails to compile with this error:
`EventConverterV2.java:55: error: incompatible types:
org.apache.parquet.schema.GroupType cannot be converted to
parquet.schema.GroupType ParquetGroup parquetGroup = new
ParquetGroup(outputSchema.asGroupType()); ^`
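
The error message names two different packages, `org.apache.parquet.schema` and `parquet.schema`. Java treats classes in different packages as unrelated types even when their simple names match, so a `GroupType` from one cannot be passed where the other is expected. The sketch below reproduces that situation with stand-in classes only. `LegacySchema`, `ApacheSchema`, and `toLegacy` are hypothetical names invented for illustration and are not part of Parquet or Gobblin; the real types carry far more than a name and a field list.

```java
import java.util.ArrayList;
import java.util.List;

public class PackageMismatchDemo {

  // Hypothetical stand-in for the older `parquet.schema` package.
  static final class LegacySchema {
    static final class GroupType {
      final String name;
      final List<String> fields;

      GroupType(String name, List<String> fields) {
        this.name = name;
        this.fields = new ArrayList<>(fields); // defensive copy
      }
    }
  }

  // Hypothetical stand-in for the newer `org.apache.parquet.schema` package.
  static final class ApacheSchema {
    static final class GroupType {
      final String name;
      final List<String> fields;

      GroupType(String name, List<String> fields) {
        this.name = name;
        this.fields = new ArrayList<>(fields); // defensive copy
      }
    }
  }

  // Explicit bridge: the only way to cross between the two unrelated types
  // is to rebuild one from the data held by the other.
  static LegacySchema.GroupType toLegacy(ApacheSchema.GroupType source) {
    return new LegacySchema.GroupType(source.name, source.fields);
  }

  public static void main(String[] args) {
    List<String> fieldNames = new ArrayList<>();
    fieldNames.add("id");
    fieldNames.add("name");
    ApacheSchema.GroupType apacheGroup = new ApacheSchema.GroupType("Event", fieldNames);

    // LegacySchema.GroupType direct = apacheGroup;
    // The line above would not compile: incompatible types, same simple name.

    LegacySchema.GroupType legacyGroup = toLegacy(apacheGroup);
    System.out.println(legacyGroup.name + " " + legacyGroup.fields);
  }
}
```

In a real build the likely options are to make the schema converter and the writer use the same Parquet package, or to bridge between the two explicitly. Which one applies depends on the Gobblin and Parquet versions in use, which the messages here do not state.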

* * *

####  ![](https://avatars0.githubusercontent.com/u/291820?v=4&s=60) Pritam
Sarkar (pritamsarkar86)

**[2019-12-17T21:31:16.886Z]** @tilakpatidar It looks like you have been
involved on the Parquet side. I need help with the above. ^

* * *