Posted to dev@parquet.apache.org by Bhavesh K Shah <Bh...@bitwiseglobal.com> on 2015/08/10 17:25:26 UTC

How to write date in parquet-avro?

Hi,
I have a use case where I want to write my data into Hive using the Parquet file format, and I want to do all of this with the Cascading framework. I came across the Cascading Parquet-Avro scheme (to convert data to Parquet format) and HiveTap (to write the data into a Hive table), which together do this job. But the problem I am facing while doing this is that I was not able to write the Date datatype in Avro.

So I tried the same thing by converting the date into long values (as Avro doesn't provide a Date type in its schema), but it didn't work for me because my Hive table has a field with the Date datatype. It gave me the exception "LongWritable cannot be cast to DateWritable".
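
For reference, Avro 1.8+ defines a "date" logical type that annotates an int holding the number of days since the Unix epoch (1970-01-01), rather than epoch milliseconds in a long. A minimal sketch of that conversion in plain Java (assuming Java 8's java.time is available; whether this setup's parquet-avro version honors the logical type would need checking):

```java
import java.time.LocalDate;

public class DateToAvroDays {
    public static void main(String[] args) {
        // Avro's "date" logical type stores days since 1970-01-01 as an int,
        // not milliseconds since the epoch as a long.
        LocalDate date = LocalDate.parse("2015-08-10");
        int daysSinceEpoch = (int) date.toEpochDay();
        System.out.println(daysSinceEpoch); // prints 16657
    }
}
```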

So, is there any way I can write a date into a Hive table through Parquet-Avro?

Below is my code:

import java.io.IOException;
import java.lang.reflect.Type;
import java.util.Properties;

import org.apache.avro.Schema;

import cascading.flow.FlowDef;
import cascading.flow.hadoop2.Hadoop2MR1FlowConnector;
import cascading.pipe.Pipe;
import cascading.property.AppProps;
import cascading.scheme.hadoop.TextDelimited;
import cascading.tap.Tap;
import cascading.tap.hadoop.Hfs;
import cascading.tap.hive.HiveTableDescriptor;
import cascading.tap.hive.HiveTap;
import cascading.tuple.Fields;
import cascading.tuple.type.CoercibleType;
import cascading.tuple.type.DateType;
// plus the ParquetAvroScheme import from the Parquet/Cascading integration

public class Test {

    public static void main(String[] args) throws IOException {

        String tableName = "datetest";
        CoercibleType dateType = new DateType("yyyy-MM-dd");

        // Source fields: f1 and f2 as strings, f3 coerced through the date type
        Fields field = new Fields("f1", "f2", "f3").applyTypes(new Type[] {
                String.class, String.class, dateType });

        String[] columnNames = { "f1", "f2", "f3" };
        String[] columnTypes = { "string", "string", "date" };

        // Read comma-delimited input
        Tap source = new Hfs(new TextDelimited(field, false, ","),
                "data/file2.txt");

        HiveTableDescriptor tableDesc = new HiveTableDescriptor("default",
                tableName, columnNames, columnTypes, new String[] {}, ",");

        // Sink: write to the Hive table through the Parquet-Avro scheme,
        // parsing the Avro schema below (avro2.avsc) from the classpath
        HiveTap hiveTap = new HiveTap(tableDesc, new ParquetAvroScheme(
                new Schema.Parser().parse(Test.class.getClassLoader()
                        .getResourceAsStream("avro2.avsc"))));

        Pipe p = new Pipe("pipe");

        FlowDef flowDef = FlowDef.flowDef().addSource(p, source)
                .addTailSink(p, hiveTap);

        Properties properties = new Properties();
        AppProps.setApplicationName(properties,
                "cascading hive integration demo");

        new Hadoop2MR1FlowConnector(properties).connect(flowDef).complete();
    }
}

avro2.avsc (Avro schema):
{
  "type": "record",
  "name": "avro",
  "fields": [
    { "name": "f1", "type": "string" },
    { "name": "f2", "type": "string" },
    { "name": "f3", "type": "long" }
  ]
}
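
If the Avro library in use is 1.8.0 or newer, the schema could instead declare f3 with the "date" logical type, which stores days since the Unix epoch as an int. This is only a sketch; whether the Parquet-Avro scheme and the Hive Parquet serde in this particular setup honor the logical type would need to be verified:

```json
{
  "type": "record",
  "name": "avro",
  "fields": [
    { "name": "f1", "type": "string" },
    { "name": "f2", "type": "string" },
    { "name": "f3", "type": { "type": "int", "logicalType": "date" } }
  ]
}
```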



Thanks,
Bhavesh