Posted to user@hive.apache.org by Marcello de Sales <ma...@gmail.com> on 2010/11/16 21:11:16 UTC

Best approach for data mashup... Import external Data from MySQL to a Hive table or use Custom Map using JDBC?

Hello all,

I'm using Hive to do mashups of data that is currently on HDFS and data
that is in a MySQL database. I'm considering this scenario instead of
bringing the data from MySQL to HDFS because it is a matter of verifying the
data returned by the query against the related data in the external data
store (MySQL). This external data has a fixed size, measured in megabytes,
and provides indexes for the calculations.

What's the best approach? Should I import the external MySQL tables into
Hive (probably using Sqoop) and provide the needed joins/passes, or should I
just access the needed data over JDBC from a custom Map?
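
To make the second option concrete, here is a rough sketch of the lookup pattern it implies: read the small external table once, keep it in memory, and check each record against it in the map step. It is only an illustration under assumptions. Python's built-in sqlite3 stands in for MySQL over JDBC so the sketch runs on its own, and the table name reference_data, its columns, and the sample records are all hypothetical.

```python
import sqlite3


def load_reference(conn):
    """Read the whole (small) reference table once into a dict keyed by id."""
    rows = conn.execute("SELECT id, label FROM reference_data")
    return {key: label for key, label in rows}


def map_records(records, reference):
    """Map step: emit (id, value, label) for records whose id is in the reference data."""
    for key, value in records:
        label = reference.get(key)
        if label is not None:
            yield key, value, label


def demo():
    # Stand-in for the external MySQL database (hypothetical schema and rows).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reference_data (id INTEGER PRIMARY KEY, label TEXT)")
    conn.executemany(
        "INSERT INTO reference_data VALUES (?, ?)",
        [(1, "alpha"), (2, "beta")],
    )
    # One query per task instead of one query per record.
    reference = load_reference(conn)
    conn.close()

    # Stand-in for the records that would come from HDFS.
    hdfs_records = [(1, 10.0), (3, 7.5), (2, 4.2)]
    return list(map_records(hdfs_records, reference))


if __name__ == "__main__":
    for row in demo():
        print(row)
```

The point of loading the table up front is that the database sees one query per map task rather than one per record, which seems workable only because the external data is a few megabytes.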

Thanks
Marcello