You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@lens.apache.org by Tao Yan <ty...@linkedin.com> on 2016/08/26 22:12:25 UTC

Adding MySQL as a Data Source

Hi Lens Developers,

I am trying to add mysql as a data source, so, I created the driver as
follows:

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>lens.driver.jdbc.driver.class</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>lens.driver.jdbc.db.uri</name>

<value>jdbc:mysql://****mysql**hostname****/lens;user=****;passowrd=****</value>
  </property>
  <property>
    <name>lens.driver.jdbc.db.user</name>
    <value>****</value>
  </property>
  <property>
    <name>lens.cube.query.driver.supported.storages</name>
    <value>mysql</value>
    <final>true</final>
  </property>
  <property>
    <name>lens.driver.jdbc.query.rewriter</name>
    <value>org.apache.lens.driver.jdbc.ColumnarSQLRewriter</value>
  </property>
  <property>
    <name>lens.driver.jdbc.explain.keyword</name>
    <value>explain plan for </value>
  </property>
</configuration>

And created a test table with data in mysql:

create table dimension1_subset (
primary_key BIGINT(8) not null primary key,
attr1 BIGINT(8),
attr5 BIGINT(8)
);


​
And created the storage file:

<?xml version="1.0" encoding="UTF-8"?>

<x_storage classname="org.apache.lens.storage.db.DBStorage" name="mysql"
xmlns="uri:lens:cube:0.1"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
  <properties>
    *<property name="lens.storage.db.url"
value="jdbc:mysql://lva1-db34.corp.linkedin.com/lens
<http://lva1-db34.corp.linkedin.com/lens>"/>*
  </properties>
</x_storage>

I am confused when set the value for *lens.storage.db.url *because* it is
neither local filesystem nor HDFS.*

And I don't know what should the storage table look like:

<x_dimension_table dimension_name="dimension1"
table_name="dimension1_table3" weight="5.0" xmlns="uri:lens:cube:0.1"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
  <columns>
    <column name="primary_key" _type="BIGINT"/>
    <column name="attr1" _type="BIGINT"/>
    <column name="attr5" _type="BIGINT"/>
  </columns>
  <properties>
    <property name="dimension1.prop" value="t3"/>
  </properties>
  <storage_tables>


*    <storage_table>      ...    </storage_table>*
  </storage_tables>
</x_dimension_table>

What should I use for the table_location, and what about part_cols? How do
I add data to this table?

Thanks,
-- 

*Tao Yan*
Software Engineer
Data Analytics Infrastructure Tools and Services



206.250.5345
tyan@linkedin.com
https://www.linkedin.com/in/taousc

Re: Adding MySQL as a Data Source

Posted by Tao Yan <ty...@linkedin.com>.
Thanks! That worked.

On Tue, Aug 30, 2016 at 1:33 AM, Puneet Gupta <pu...@inmobi.com>
wrote:

> Adding example for dimension table without partitions as mentioned by Rajat
>
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <x_dimension_table xmlns="uri:lens:cube:0.1" dimension_name="account" table_name="account_mysql" weight="0.1">
>     <columns>
>         <column name="id" _type="int" comment=""/>
>         <column name="account_name" _type="string" comment=""/>
>         <column name="status" _type="string" comment=""/>
>         <column name="enabled" _type="boolean" comment=""/>
>     </columns>
>     <storage_tables>
>         <storage_table>
>             <storage_name>mysql_storage</storage_name>
>             <table_desc external="true"  input_format="org.apache.hadoop.mapred.TextInputFormat" output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" serde_class_name="org.apache.lens.storage.db.DBSerde" storage_handler_name="org.apache.lens.storage.db.DBStorageHandler"  >
>                 <table_parameters>
>                     <property name="lens.metastore.native.table.name" value="dim_account"/>
>                     <property name="lens.metastore.native.db.name" value="lensdb"/>
>                 </table_parameters>
>             </table_desc>
>         </storage_table>
>     </storage_tables>
> </x_dimension_table>
>
>
> The actual table in mysql is lensdb.dim_account
>
>
> Thanks,
> Puneet Gupta
>
> On Tue, Aug 30, 2016 at 12:33 PM, Rajat Khandelwal <rajatgupta59@gmail.com
> > wrote:
>
>> The idea of partitions in dimtables is based on snapshots. A metadata is
>> either snapshotted (at some interval the entire metadata is added in a new
>> partition), or it's fixed (like you mentioned). In that case, I believe
>> removing the partition column declaration from the dimtable xml file should
>> work. We have the same use case and are using it in production without any
>> issue. Hive metadata is snpathotted hourly/daily and db metadata is fixed.
>>
>>
>> On Tue, Aug 30, 2016 at 2:02 AM Tao Yan <ty...@linkedin.com> wrote:
>>
>>> Hi Rajat,
>>>
>>> I created the storage and dimtable using the configurations you
>>> provided, and the query failed because no partition is added to the table,
>>> so, I added the following partition:
>>>
>>> *<x_partition fact_or_dimension_table_name="dimension1_subset"
>>> update_period="HOURLY"*
>>> *  xmlns="uri:lens:cube:0.1"
>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance
>>> <http://www.w3.org/2001/XMLSchema-instance>"*
>>> *  xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">*
>>> *  <time_partition_spec>*
>>> *    <part_spec_element key="dt" value="2016-08-25T00:00:00"/>*
>>> *  </time_partition_spec>*
>>> *</x_partition>*
>>>
>>> And when I run the query again, Lens translate the Lens query to the
>>> following SQL:
>>>
>>>
>>>
>>> *lens-shell>select primary_key, attr1, attr5 from dimension129 Aug 2016
>>> 18:10:06 [Spring Shell] INFO  cliLogger - Query handle:
>>> ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b029 Aug 2016 18:10:06 [Spring Shell]
>>> INFO  cliLogger - User query: 'select primary_key, attr1, attr5 from
>>> dimension1' was submitted to jdbc/mysql29 Aug 2016 18:10:06 [Spring Shell]
>>> INFO  cliLogger -  Driver query: 'SELECT ( dimension1 . primary_key ), (
>>> dimension1 . attr1 ), ( dimension1 . attr5 ) FROM mysql_dimension1_subset
>>> dimension1 WHERE ((((dimension1.dt = 'latest'))))' and Driver handle: null**And
>>> it failed because of 'unknown column' error:*
>>>
>>> *29 Aug 2016 17:59:24 [Spring Shell] INFO  cliLogger - Query Status:
>>> Status : FAILED Message : Query execution failed! Progress : 1.0 Error :
>>> Unknown column 'dimension1_subset__lens_dimension1_subset_dimension1.dt' in
>>> 'where clause'*
>>>
>>> *This is expected because the column dt is not part of mysql table **dimension1_subset.
>>> I added the column to the table and also set the value to 'latest', then,
>>> the query returned the result.*
>>>
>>>
>>>
>>> *​-----------------*
>>>
>>> *primary_key    attr1    attr5Results of query stored at :
>>> hdfs://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv
>>> <http://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv>
>>>  4 rows processed in (0) seconds.*
>>>
>>>
>>> *I think this is not the way Lens should work with MySQL because
>>> sometimes MySQL table cannot be changed, and it should not always be
>>> changed just because Lens is going to query it. In Hive, when a newer
>>> partition is added, Lens will automatically created a 'latest' partition, I
>>> wonder how that work in MySQL or other databases. *
>>>
>>> Is it possible to add a MySQL table to a partitioned dimtable without
>>> specifying the partition column, and assume the added MySQL table is the
>>> latest partition?
>>>
>>> Thanks,
>>>
>>> On Mon, Aug 29, 2016 at 12:46 AM, Rajat Khandelwal <
>>> rajatgupta59@gmail.com> wrote:
>>>
>>>> Use something like
>>>>
>>>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>>>> <x_storage xmlns="uri:lens:cube:0.1" name="mysql" classname="org.apache.lens.storage.db.DBStorage">
>>>> </x_storage>
>>>>
>>>>
>>>> <storage_table>
>>>>   <update_periods>
>>>>     <update_period>HOURLY</update_period>
>>>>   </update_periods>
>>>>   <storage_name>db</storage_name>
>>>>   <table_desc external="true"  input_format="org.apache.hadoop.mapred.TextInputFormat" output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" serde_class_name="org.apache.lens.storage.db.DBSerde" storage_handler_name="org.apache.lens.storage.db.DBStorageHandler" num_buckets="0" compressed="false">
>>>>     <part_cols>
>>>>       <column name="pt" _type="string" comment="date partition"/>
>>>>       <column name="dt" _type="string" comment="date partition"/>
>>>>       <column name="ot" _type="string" comment="date partition"/>
>>>>     </part_cols>
>>>>     <table_parameters>
>>>>       <property name="lens.metastore.native.table.name" value="table_name_in_mysql_db"/>
>>>>       <property name="lens.metastore.native.db.name" value="mysql_db_name"/>
>>>>       <property name="lens.metastore.native.table.column.mapping"
>>>>                 value="pt=process_time,field1=field1_actual_name,field2=field2_actual_name"/>
>>>>       <property name="cube.storagetable.start.times" value="now - 4 days"/>
>>>>     </table_parameters>
>>>>     <time_part_cols>pt</time_part_cols>
>>>>     <time_part_cols>dt</time_part_cols>
>>>>     <time_part_cols>ot</time_part_cols>
>>>>   </table_desc>
>>>> </storage_table>
>>>>
>>>>
>>>> On Sat, Aug 27, 2016 at 3:42 AM Tao Yan <ty...@linkedin.com> wrote:
>>>>
>>>>> Hi Lens Developers,
>>>>>
>>>>> I am trying to add mysql as a data source, so, I created the driver as
>>>>> follows:
>>>>>
>>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>>
>>>>> <configuration>
>>>>>   <property>
>>>>>     <name>lens.driver.jdbc.driver.class</name>
>>>>>     <value>com.mysql.jdbc.Driver</value>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>lens.driver.jdbc.db.uri</name>
>>>>>     <value>jdbc:mysql://****mysql**hostname****/lens;user=****;p
>>>>> assowrd=****</value>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>lens.driver.jdbc.db.user</name>
>>>>>     <value>****</value>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>lens.cube.query.driver.supported.storages</name>
>>>>>     <value>mysql</value>
>>>>>     <final>true</final>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>lens.driver.jdbc.query.rewriter</name>
>>>>>     <value>org.apache.lens.driver.jdbc.ColumnarSQLRewriter</value>
>>>>>   </property>
>>>>>   <property>
>>>>>     <name>lens.driver.jdbc.explain.keyword</name>
>>>>>     <value>explain plan for </value>
>>>>>   </property>
>>>>> </configuration>
>>>>>
>>>>> And created a test table with data in mysql:
>>>>>
>>>>> create table dimension1_subset (
>>>>> primary_key BIGINT(8) not null primary key,
>>>>> attr1 BIGINT(8),
>>>>> attr5 BIGINT(8)
>>>>> );
>>>>>
>>>>> [image: Screen Shot 2016-08-26 at 2.44.24 PM.png]
>>>>> ​
>>>>> And created the storage file:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>>
>>>>> <x_storage classname="org.apache.lens.storage.db.DBStorage"
>>>>> name="mysql" xmlns="uri:lens:cube:0.1"
>>>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>>>>   <properties>
>>>>>     *<property name="lens.storage.db.url"
>>>>> value="jdbc:mysql://lva1-db34.corp.linkedin.com/lens
>>>>> <http://lva1-db34.corp.linkedin.com/lens>"/>*
>>>>>   </properties>
>>>>> </x_storage>
>>>>>
>>>>> I am confused when set the value for *lens.storage.db.url *because* it
>>>>> is neither local filesystem nor HDFS.*
>>>>>
>>>>> And I don't know what should the storage table look like:
>>>>>
>>>>> <x_dimension_table dimension_name="dimension1"
>>>>> table_name="dimension1_table3" weight="5.0" xmlns="uri:lens:cube:0.1"
>>>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>>>>   <columns>
>>>>>     <column name="primary_key" _type="BIGINT"/>
>>>>>     <column name="attr1" _type="BIGINT"/>
>>>>>     <column name="attr5" _type="BIGINT"/>
>>>>>   </columns>
>>>>>   <properties>
>>>>>     <property name="dimension1.prop" value="t3"/>
>>>>>   </properties>
>>>>>   <storage_tables>
>>>>>
>>>>>
>>>>> *    <storage_table>      ...    </storage_table>*
>>>>>   </storage_tables>
>>>>> </x_dimension_table>
>>>>>
>>>>> What should I use for the table_location, and what about part_cols?
>>>>> How do I add data to this table?
>>>>>
>>>>> Thanks,
>>>>> --
>>>>>
>>>>> *Tao Yan*
>>>>> Software Engineer
>>>>> Data Analytics Infrastructure Tools and Services
>>>>>
>>>>>
>>>>>
>>>>> 206.250.5345
>>>>> tyan@linkedin.com
>>>>> https://www.linkedin.com/in/taousc
>>>>>
>>>>
>>>
>>>
>>> --
>>>
>>> *Tao Yan*
>>> Software Engineer
>>> Data Analytics Infrastructure Tools and Services
>>>
>>>
>>>
>>> 206.250.5345
>>> tyan@linkedin.com
>>> https://www.linkedin.com/in/taousc
>>>
>>
>
> _____________________________________________________________
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.




-- 

*Tao Yan*
Software Engineer
Data Analytics Infrastructure Tools and Services



206.250.5345
tyan@linkedin.com
https://www.linkedin.com/in/taousc

Re: Adding MySQL as a Data Source

Posted by Puneet Gupta <pu...@inmobi.com>.
Adding example for dimension table without partitions as mentioned by Rajat


<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<x_dimension_table xmlns="uri:lens:cube:0.1" dimension_name="account"
table_name="account_mysql" weight="0.1">
    <columns>
        <column name="id" _type="int" comment=""/>
        <column name="account_name" _type="string" comment=""/>
        <column name="status" _type="string" comment=""/>
        <column name="enabled" _type="boolean" comment=""/>
    </columns>
    <storage_tables>
        <storage_table>
            <storage_name>mysql_storage</storage_name>
            <table_desc external="true"
input_format="org.apache.hadoop.mapred.TextInputFormat"
output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
serde_class_name="org.apache.lens.storage.db.DBSerde"
storage_handler_name="org.apache.lens.storage.db.DBStorageHandler"  >
                <table_parameters>
                    <property name="lens.metastore.native.table.name"
value="dim_account"/>
                    <property name="lens.metastore.native.db.name"
value="lensdb"/>
                </table_parameters>
            </table_desc>
        </storage_table>
    </storage_tables>
</x_dimension_table>


The actual table in mysql is lensdb.dim_account


Thanks,
Puneet Gupta

On Tue, Aug 30, 2016 at 12:33 PM, Rajat Khandelwal <ra...@gmail.com>
wrote:

> The idea of partitions in dimtables is based on snapshots. A metadata is
> either snapshotted (at some interval the entire metadata is added in a new
> partition), or it's fixed (like you mentioned). In that case, I believe
> removing the partition column declaration from the dimtable xml file should
> work. We have the same use case and are using it in production without any
> issue. Hive metadata is snpathotted hourly/daily and db metadata is fixed.
>
>
> On Tue, Aug 30, 2016 at 2:02 AM Tao Yan <ty...@linkedin.com> wrote:
>
>> Hi Rajat,
>>
>> I created the storage and dimtable using the configurations you provided,
>> and the query failed because no partition is added to the table, so, I
>> added the following partition:
>>
>> *<x_partition fact_or_dimension_table_name="dimension1_subset"
>> update_period="HOURLY"*
>> *  xmlns="uri:lens:cube:0.1"
>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance
>> <http://www.w3.org/2001/XMLSchema-instance>"*
>> *  xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">*
>> *  <time_partition_spec>*
>> *    <part_spec_element key="dt" value="2016-08-25T00:00:00"/>*
>> *  </time_partition_spec>*
>> *</x_partition>*
>>
>> And when I run the query again, Lens translate the Lens query to the
>> following SQL:
>>
>>
>>
>> *lens-shell>select primary_key, attr1, attr5 from dimension129 Aug 2016
>> 18:10:06 [Spring Shell] INFO  cliLogger - Query handle:
>> ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b029 Aug 2016 18:10:06 [Spring Shell]
>> INFO  cliLogger - User query: 'select primary_key, attr1, attr5 from
>> dimension1' was submitted to jdbc/mysql29 Aug 2016 18:10:06 [Spring Shell]
>> INFO  cliLogger -  Driver query: 'SELECT ( dimension1 . primary_key ), (
>> dimension1 . attr1 ), ( dimension1 . attr5 ) FROM mysql_dimension1_subset
>> dimension1 WHERE ((((dimension1.dt = 'latest'))))' and Driver handle: null**And
>> it failed because of 'unknown column' error:*
>>
>> *29 Aug 2016 17:59:24 [Spring Shell] INFO  cliLogger - Query Status:
>> Status : FAILED Message : Query execution failed! Progress : 1.0 Error :
>> Unknown column 'dimension1_subset__lens_dimension1_subset_dimension1.dt' in
>> 'where clause'*
>>
>> *This is expected because the column dt is not part of mysql table **dimension1_subset.
>> I added the column to the table and also set the value to 'latest', then,
>> the query returned the result.*
>>
>>
>>
>> *​-----------------*
>>
>> *primary_key    attr1    attr5Results of query stored at :
>> hdfs://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv
>> <http://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv>
>>  4 rows processed in (0) seconds.*
>>
>>
>> *I think this is not the way Lens should work with MySQL because
>> sometimes MySQL table cannot be changed, and it should not always be
>> changed just because Lens is going to query it. In Hive, when a newer
>> partition is added, Lens will automatically created a 'latest' partition, I
>> wonder how that work in MySQL or other databases. *
>>
>> Is it possible to add a MySQL table to a partitioned dimtable without
>> specifying the partition column, and assume the added MySQL table is the
>> latest partition?
>>
>> Thanks,
>>
>> On Mon, Aug 29, 2016 at 12:46 AM, Rajat Khandelwal <
>> rajatgupta59@gmail.com> wrote:
>>
>>> Use something like
>>>
>>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>>> <x_storage xmlns="uri:lens:cube:0.1" name="mysql" classname="org.apache.lens.storage.db.DBStorage">
>>> </x_storage>
>>>
>>>
>>> <storage_table>
>>>   <update_periods>
>>>     <update_period>HOURLY</update_period>
>>>   </update_periods>
>>>   <storage_name>db</storage_name>
>>>   <table_desc external="true"  input_format="org.apache.hadoop.mapred.TextInputFormat" output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" serde_class_name="org.apache.lens.storage.db.DBSerde" storage_handler_name="org.apache.lens.storage.db.DBStorageHandler" num_buckets="0" compressed="false">
>>>     <part_cols>
>>>       <column name="pt" _type="string" comment="date partition"/>
>>>       <column name="dt" _type="string" comment="date partition"/>
>>>       <column name="ot" _type="string" comment="date partition"/>
>>>     </part_cols>
>>>     <table_parameters>
>>>       <property name="lens.metastore.native.table.name" value="table_name_in_mysql_db"/>
>>>       <property name="lens.metastore.native.db.name" value="mysql_db_name"/>
>>>       <property name="lens.metastore.native.table.column.mapping"
>>>                 value="pt=process_time,field1=field1_actual_name,field2=field2_actual_name"/>
>>>       <property name="cube.storagetable.start.times" value="now - 4 days"/>
>>>     </table_parameters>
>>>     <time_part_cols>pt</time_part_cols>
>>>     <time_part_cols>dt</time_part_cols>
>>>     <time_part_cols>ot</time_part_cols>
>>>   </table_desc>
>>> </storage_table>
>>>
>>>
>>> On Sat, Aug 27, 2016 at 3:42 AM Tao Yan <ty...@linkedin.com> wrote:
>>>
>>>> Hi Lens Developers,
>>>>
>>>> I am trying to add mysql as a data source, so, I created the driver as
>>>> follows:
>>>>
>>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>
>>>> <configuration>
>>>>   <property>
>>>>     <name>lens.driver.jdbc.driver.class</name>
>>>>     <value>com.mysql.jdbc.Driver</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>lens.driver.jdbc.db.uri</name>
>>>>     <value>jdbc:mysql://****mysql**hostname****/lens;user=****;
>>>> passowrd=****</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>lens.driver.jdbc.db.user</name>
>>>>     <value>****</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>lens.cube.query.driver.supported.storages</name>
>>>>     <value>mysql</value>
>>>>     <final>true</final>
>>>>   </property>
>>>>   <property>
>>>>     <name>lens.driver.jdbc.query.rewriter</name>
>>>>     <value>org.apache.lens.driver.jdbc.ColumnarSQLRewriter</value>
>>>>   </property>
>>>>   <property>
>>>>     <name>lens.driver.jdbc.explain.keyword</name>
>>>>     <value>explain plan for </value>
>>>>   </property>
>>>> </configuration>
>>>>
>>>> And created a test table with data in mysql:
>>>>
>>>> create table dimension1_subset (
>>>> primary_key BIGINT(8) not null primary key,
>>>> attr1 BIGINT(8),
>>>> attr5 BIGINT(8)
>>>> );
>>>>
>>>> [image: Screen Shot 2016-08-26 at 2.44.24 PM.png]
>>>> ​
>>>> And created the storage file:
>>>>
>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>
>>>> <x_storage classname="org.apache.lens.storage.db.DBStorage"
>>>> name="mysql" xmlns="uri:lens:cube:0.1"
>>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>>>   <properties>
>>>>     *<property name="lens.storage.db.url"
>>>> value="jdbc:mysql://lva1-db34.corp.linkedin.com/lens
>>>> <http://lva1-db34.corp.linkedin.com/lens>"/>*
>>>>   </properties>
>>>> </x_storage>
>>>>
>>>> I am confused when set the value for *lens.storage.db.url *because* it
>>>> is neither local filesystem nor HDFS.*
>>>>
>>>> And I don't know what should the storage table look like:
>>>>
>>>> <x_dimension_table dimension_name="dimension1"
>>>> table_name="dimension1_table3" weight="5.0" xmlns="uri:lens:cube:0.1"
>>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>>>   <columns>
>>>>     <column name="primary_key" _type="BIGINT"/>
>>>>     <column name="attr1" _type="BIGINT"/>
>>>>     <column name="attr5" _type="BIGINT"/>
>>>>   </columns>
>>>>   <properties>
>>>>     <property name="dimension1.prop" value="t3"/>
>>>>   </properties>
>>>>   <storage_tables>
>>>>
>>>>
>>>> *    <storage_table>      ...    </storage_table>*
>>>>   </storage_tables>
>>>> </x_dimension_table>
>>>>
>>>> What should I use for the table_location, and what about part_cols? How
>>>> do I add data to this table?
>>>>
>>>> Thanks,
>>>> --
>>>>
>>>> *Tao Yan*
>>>> Software Engineer
>>>> Data Analytics Infrastructure Tools and Services
>>>>
>>>>
>>>>
>>>> 206.250.5345
>>>> tyan@linkedin.com
>>>> https://www.linkedin.com/in/taousc
>>>>
>>>
>>
>>
>> --
>>
>> *Tao Yan*
>> Software Engineer
>> Data Analytics Infrastructure Tools and Services
>>
>>
>>
>> 206.250.5345
>> tyan@linkedin.com
>> https://www.linkedin.com/in/taousc
>>
>

-- 
_____________________________________________________________
The information contained in this communication is intended solely for the 
use of the individual or entity to whom it is addressed and others 
authorized to receive it. It may contain confidential or legally privileged 
information. If you are not the intended recipient you are hereby notified 
that any disclosure, copying, distribution or taking any action in reliance 
on the contents of this information is strictly prohibited and may be 
unlawful. If you have received this communication in error, please notify 
us immediately by responding to this email and then delete it from your 
system. The firm is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt.

Re: Adding MySQL as a Data Source

Posted by Rajat Khandelwal <ra...@gmail.com>.
The idea of partitions in dimtables is based on snapshots. A metadata is
either snapshotted (at some interval the entire metadata is added in a new
partition), or it's fixed (like you mentioned). In that case, I believe
removing the partition column declaration from the dimtable xml file should
work. We have the same use case and are using it in production without any
issue. Hive metadata is snpathotted hourly/daily and db metadata is fixed.


On Tue, Aug 30, 2016 at 2:02 AM Tao Yan <ty...@linkedin.com> wrote:

> Hi Rajat,
>
> I created the storage and dimtable using the configurations you provided,
> and the query failed because no partition is added to the table, so, I
> added the following partition:
>
> *<x_partition fact_or_dimension_table_name="dimension1_subset"
> update_period="HOURLY"*
> *  xmlns="uri:lens:cube:0.1"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance
> <http://www.w3.org/2001/XMLSchema-instance>"*
> *  xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">*
> *  <time_partition_spec>*
> *    <part_spec_element key="dt" value="2016-08-25T00:00:00"/>*
> *  </time_partition_spec>*
> *</x_partition>*
>
> And when I run the query again, Lens translate the Lens query to the
> following SQL:
>
>
>
> *lens-shell>select primary_key, attr1, attr5 from dimension129 Aug 2016
> 18:10:06 [Spring Shell] INFO  cliLogger - Query handle:
> ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b029 Aug 2016 18:10:06 [Spring Shell]
> INFO  cliLogger - User query: 'select primary_key, attr1, attr5 from
> dimension1' was submitted to jdbc/mysql29 Aug 2016 18:10:06 [Spring Shell]
> INFO  cliLogger -  Driver query: 'SELECT ( dimension1 . primary_key ), (
> dimension1 . attr1 ), ( dimension1 . attr5 ) FROM mysql_dimension1_subset
> dimension1 WHERE ((((dimension1.dt = 'latest'))))' and Driver handle: null**And
> it failed because of 'unknown column' error:*
>
> *29 Aug 2016 17:59:24 [Spring Shell] INFO  cliLogger - Query Status:
> Status : FAILED Message : Query execution failed! Progress : 1.0 Error :
> Unknown column 'dimension1_subset__lens_dimension1_subset_dimension1.dt' in
> 'where clause'*
>
> *This is expected because the column dt is not part of mysql table **dimension1_subset.
> I added the column to the table and also set the value to 'latest', then,
> the query returned the result.*
>
>
>
> *​-----------------*
>
> *primary_key    attr1    attr5Results of query stored at :
> hdfs://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv
> <http://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv>
>  4 rows processed in (0) seconds.*
>
>
> *I think this is not the way Lens should work with MySQL because sometimes
> MySQL table cannot be changed, and it should not always be changed just
> because Lens is going to query it. In Hive, when a newer partition is
> added, Lens will automatically created a 'latest' partition, I wonder how
> that work in MySQL or other databases. *
>
> Is it possible to add a MySQL table to a partitioned dimtable without
> specifying the partition column, and assume the added MySQL table is the
> latest partition?
>
> Thanks,
>
> On Mon, Aug 29, 2016 at 12:46 AM, Rajat Khandelwal <rajatgupta59@gmail.com
> > wrote:
>
>> Use something like
>>
>> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>> <x_storage xmlns="uri:lens:cube:0.1" name="mysql" classname="org.apache.lens.storage.db.DBStorage">
>> </x_storage>
>>
>>
>> <storage_table>
>>   <update_periods>
>>     <update_period>HOURLY</update_period>
>>   </update_periods>
>>   <storage_name>db</storage_name>
>>   <table_desc external="true"  input_format="org.apache.hadoop.mapred.TextInputFormat" output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" serde_class_name="org.apache.lens.storage.db.DBSerde" storage_handler_name="org.apache.lens.storage.db.DBStorageHandler" num_buckets="0" compressed="false">
>>     <part_cols>
>>       <column name="pt" _type="string" comment="date partition"/>
>>       <column name="dt" _type="string" comment="date partition"/>
>>       <column name="ot" _type="string" comment="date partition"/>
>>     </part_cols>
>>     <table_parameters>
>>       <property name="lens.metastore.native.table.name" value="table_name_in_mysql_db"/>
>>       <property name="lens.metastore.native.db.name" value="mysql_db_name"/>
>>       <property name="lens.metastore.native.table.column.mapping"
>>                 value="pt=process_time,field1=field1_actual_name,field2=field2_actual_name"/>
>>       <property name="cube.storagetable.start.times" value="now - 4 days"/>
>>     </table_parameters>
>>     <time_part_cols>pt</time_part_cols>
>>     <time_part_cols>dt</time_part_cols>
>>     <time_part_cols>ot</time_part_cols>
>>   </table_desc>
>> </storage_table>
>>
>>
>> On Sat, Aug 27, 2016 at 3:42 AM Tao Yan <ty...@linkedin.com> wrote:
>>
>>> Hi Lens Developers,
>>>
>>> I am trying to add mysql as a data source, so, I created the driver as
>>> follows:
>>>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>> <configuration>
>>>   <property>
>>>     <name>lens.driver.jdbc.driver.class</name>
>>>     <value>com.mysql.jdbc.Driver</value>
>>>   </property>
>>>   <property>
>>>     <name>lens.driver.jdbc.db.uri</name>
>>>
>>> <value>jdbc:mysql://****mysql**hostname****/lens;user=****;passowrd=****</value>
>>>   </property>
>>>   <property>
>>>     <name>lens.driver.jdbc.db.user</name>
>>>     <value>****</value>
>>>   </property>
>>>   <property>
>>>     <name>lens.cube.query.driver.supported.storages</name>
>>>     <value>mysql</value>
>>>     <final>true</final>
>>>   </property>
>>>   <property>
>>>     <name>lens.driver.jdbc.query.rewriter</name>
>>>     <value>org.apache.lens.driver.jdbc.ColumnarSQLRewriter</value>
>>>   </property>
>>>   <property>
>>>     <name>lens.driver.jdbc.explain.keyword</name>
>>>     <value>explain plan for </value>
>>>   </property>
>>> </configuration>
>>>
>>> And created a test table with data in mysql:
>>>
>>> create table dimension1_subset (
>>> primary_key BIGINT(8) not null primary key,
>>> attr1 BIGINT(8),
>>> attr5 BIGINT(8)
>>> );
>>>
>>> [image: Screen Shot 2016-08-26 at 2.44.24 PM.png]
>>> ​
>>> And created the storage file:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>>
>>> <x_storage classname="org.apache.lens.storage.db.DBStorage" name="mysql"
>>> xmlns="uri:lens:cube:0.1"
>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>>   <properties>
>>>     *<property name="lens.storage.db.url"
>>> value="jdbc:mysql://lva1-db34.corp.linkedin.com/lens
>>> <http://lva1-db34.corp.linkedin.com/lens>"/>*
>>>   </properties>
>>> </x_storage>
>>>
>>> I am confused when set the value for *lens.storage.db.url *because* it
>>> is neither local filesystem nor HDFS.*
>>>
>>> And I don't know what should the storage table look like:
>>>
>>> <x_dimension_table dimension_name="dimension1"
>>> table_name="dimension1_table3" weight="5.0" xmlns="uri:lens:cube:0.1"
>>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>>   <columns>
>>>     <column name="primary_key" _type="BIGINT"/>
>>>     <column name="attr1" _type="BIGINT"/>
>>>     <column name="attr5" _type="BIGINT"/>
>>>   </columns>
>>>   <properties>
>>>     <property name="dimension1.prop" value="t3"/>
>>>   </properties>
>>>   <storage_tables>
>>>
>>>
>>> *    <storage_table>      ...    </storage_table>*
>>>   </storage_tables>
>>> </x_dimension_table>
>>>
>>> What should I use for the table_location, and what about part_cols? How
>>> do I add data to this table?
>>>
>>> Thanks,
>>> --
>>>
>>> *Tao Yan*
>>> Software Engineer
>>> Data Analytics Infrastructure Tools and Services
>>>
>>>
>>>
>>> 206.250.5345
>>> tyan@linkedin.com
>>> https://www.linkedin.com/in/taousc
>>>
>>
>
>
> --
>
> *Tao Yan*
> Software Engineer
> Data Analytics Infrastructure Tools and Services
>
>
>
> 206.250.5345
> tyan@linkedin.com
> https://www.linkedin.com/in/taousc
>

Re: Adding MySQL as a Data Source

Posted by Tao Yan <ty...@linkedin.com>.
Hi Rajat,

I created the storage and dimtable using the configurations you provided,
and the query failed because no partition is added to the table, so, I
added the following partition:

*<x_partition fact_or_dimension_table_name="dimension1_subset"
update_period="HOURLY"*
*  xmlns="uri:lens:cube:0.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance
<http://www.w3.org/2001/XMLSchema-instance>"*
*  xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">*
*  <time_partition_spec>*
*    <part_spec_element key="dt" value="2016-08-25T00:00:00"/>*
*  </time_partition_spec>*
*</x_partition>*

And when I run the query again, Lens translate the Lens query to the
following SQL:



*lens-shell>select primary_key, attr1, attr5 from dimension129 Aug 2016
18:10:06 [Spring Shell] INFO  cliLogger - Query handle:
ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b029 Aug 2016 18:10:06 [Spring Shell]
INFO  cliLogger - User query: 'select primary_key, attr1, attr5 from
dimension1' was submitted to jdbc/mysql29 Aug 2016 18:10:06 [Spring Shell]
INFO  cliLogger -  Driver query: 'SELECT ( dimension1 . primary_key ), (
dimension1 . attr1 ), ( dimension1 . attr5 ) FROM mysql_dimension1_subset
dimension1 WHERE ((((dimension1.dt = 'latest'))))' and Driver handle: null**And
it failed because of 'unknown column' error:*

*29 Aug 2016 17:59:24 [Spring Shell] INFO  cliLogger - Query Status: Status
: FAILED Message : Query execution failed! Progress : 1.0 Error : Unknown
column 'dimension1_subset__lens_dimension1_subset_dimension1.dt' in 'where
clause'*

*This is expected because the column dt is not part of mysql table
**dimension1_subset.
I added the column to the table and also set the value to 'latest', then,
the query returned the result.*



*​-----------------*

*primary_key    attr1    attr5Results of query stored at :
hdfs://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv
<http://ltx1-holdemnn01.grid.linkedin.com:9000/user/dev_svc/lensreports/ccb3b99a-22d6-4fbb-9bb8-9f06f633c4b0.csv>
 4 rows processed in (0) seconds.*


*I think this is not the way Lens should work with MySQL because sometimes
MySQL table cannot be changed, and it should not always be changed just
because Lens is going to query it. In Hive, when a newer partition is
added, Lens will automatically created a 'latest' partition, I wonder how
that work in MySQL or other databases. *

Is it possible to add a MySQL table to a partitioned dimtable without
specifying the partition column, and assume the added MySQL table is the
latest partition?

Thanks,

On Mon, Aug 29, 2016 at 12:46 AM, Rajat Khandelwal <ra...@gmail.com>
wrote:

> Use something like
>
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <x_storage xmlns="uri:lens:cube:0.1" name="mysql" classname="org.apache.lens.storage.db.DBStorage">
> </x_storage>
>
>
> <storage_table>
>   <update_periods>
>     <update_period>HOURLY</update_period>
>   </update_periods>
>   <storage_name>db</storage_name>
>   <table_desc external="true"  input_format="org.apache.hadoop.mapred.TextInputFormat" output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat" serde_class_name="org.apache.lens.storage.db.DBSerde" storage_handler_name="org.apache.lens.storage.db.DBStorageHandler" num_buckets="0" compressed="false">
>     <part_cols>
>       <column name="pt" _type="string" comment="date partition"/>
>       <column name="dt" _type="string" comment="date partition"/>
>       <column name="ot" _type="string" comment="date partition"/>
>     </part_cols>
>     <table_parameters>
>       <property name="lens.metastore.native.table.name" value="table_name_in_mysql_db"/>
>       <property name="lens.metastore.native.db.name" value="mysql_db_name"/>
>       <property name="lens.metastore.native.table.column.mapping"
>                 value="pt=process_time,field1=field1_actual_name,field2=field2_actual_name"/>
>       <property name="cube.storagetable.start.times" value="now - 4 days"/>
>     </table_parameters>
>     <time_part_cols>pt</time_part_cols>
>     <time_part_cols>dt</time_part_cols>
>     <time_part_cols>ot</time_part_cols>
>   </table_desc>
> </storage_table>
>
>
> On Sat, Aug 27, 2016 at 3:42 AM Tao Yan <ty...@linkedin.com> wrote:
>
>> Hi Lens Developers,
>>
>> I am trying to add mysql as a data source, so, I created the driver as
>> follows:
>>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>   <property>
>>     <name>lens.driver.jdbc.driver.class</name>
>>     <value>com.mysql.jdbc.Driver</value>
>>   </property>
>>   <property>
>>     <name>lens.driver.jdbc.db.uri</name>
>>     <value>jdbc:mysql://****mysql**hostname****/lens;user=****;
>> passowrd=****</value>
>>   </property>
>>   <property>
>>     <name>lens.driver.jdbc.db.user</name>
>>     <value>****</value>
>>   </property>
>>   <property>
>>     <name>lens.cube.query.driver.supported.storages</name>
>>     <value>mysql</value>
>>     <final>true</final>
>>   </property>
>>   <property>
>>     <name>lens.driver.jdbc.query.rewriter</name>
>>     <value>org.apache.lens.driver.jdbc.ColumnarSQLRewriter</value>
>>   </property>
>>   <property>
>>     <name>lens.driver.jdbc.explain.keyword</name>
>>     <value>explain plan for </value>
>>   </property>
>> </configuration>
>>
>> And created a test table with data in mysql:
>>
>> create table dimension1_subset (
>> primary_key BIGINT(8) not null primary key,
>> attr1 BIGINT(8),
>> attr5 BIGINT(8)
>> );
>>
>> [image: Screen Shot 2016-08-26 at 2.44.24 PM.png]
>> ​
>> And created the storage file:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <x_storage classname="org.apache.lens.storage.db.DBStorage" name="mysql"
>> xmlns="uri:lens:cube:0.1"
>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>   <properties>
>>     *<property name="lens.storage.db.url"
>> value="jdbc:mysql://lva1-db34.corp.linkedin.com/lens
>> <http://lva1-db34.corp.linkedin.com/lens>"/>*
>>   </properties>
>> </x_storage>
>>
>> I am confused when set the value for *lens.storage.db.url *because* it
>> is neither local filesystem nor HDFS.*
>>
>> And I don't know what should the storage table look like:
>>
>> <x_dimension_table dimension_name="dimension1"
>> table_name="dimension1_table3" weight="5.0" xmlns="uri:lens:cube:0.1"
>>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>>   <columns>
>>     <column name="primary_key" _type="BIGINT"/>
>>     <column name="attr1" _type="BIGINT"/>
>>     <column name="attr5" _type="BIGINT"/>
>>   </columns>
>>   <properties>
>>     <property name="dimension1.prop" value="t3"/>
>>   </properties>
>>   <storage_tables>
>>
>>
>> *    <storage_table>      ...    </storage_table>*
>>   </storage_tables>
>> </x_dimension_table>
>>
>> What should I use for the table_location, and what about part_cols? How
>> do I add data to this table?
>>
>> Thanks,
>> --
>>
>> *Tao Yan*
>> Software Engineer
>> Data Analytics Infrastructure Tools and Services
>>
>>
>>
>> 206.250.5345
>> tyan@linkedin.com
>> https://www.linkedin.com/in/taousc
>>
>


-- 

*Tao Yan*
Software Engineer
Data Analytics Infrastructure Tools and Services



206.250.5345
tyan@linkedin.com
https://www.linkedin.com/in/taousc

Re: Adding MySQL as a Data Source

Posted by Rajat Khandelwal <ra...@gmail.com>.
Use something like

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<x_storage xmlns="uri:lens:cube:0.1" name="mysql"
classname="org.apache.lens.storage.db.DBStorage">
</x_storage>


<storage_table>
  <update_periods>
    <update_period>HOURLY</update_period>
  </update_periods>
  <storage_name>db</storage_name>
  <table_desc external="true"
input_format="org.apache.hadoop.mapred.TextInputFormat"
output_format="org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
serde_class_name="org.apache.lens.storage.db.DBSerde"
storage_handler_name="org.apache.lens.storage.db.DBStorageHandler"
num_buckets="0" compressed="false">
    <part_cols>
      <column name="pt" _type="string" comment="date partition"/>
      <column name="dt" _type="string" comment="date partition"/>
      <column name="ot" _type="string" comment="date partition"/>
    </part_cols>
    <table_parameters>
      <property name="lens.metastore.native.table.name"
value="table_name_in_mysql_db"/>
      <property name="lens.metastore.native.db.name" value="mysql_db_name"/>
      <property name="lens.metastore.native.table.column.mapping"

value="pt=process_time,field1=field1_actual_name,field2=field2_actual_name"/>
      <property name="cube.storagetable.start.times" value="now - 4 days"/>
    </table_parameters>
    <time_part_cols>pt</time_part_cols>
    <time_part_cols>dt</time_part_cols>
    <time_part_cols>ot</time_part_cols>
  </table_desc>
</storage_table>


On Sat, Aug 27, 2016 at 3:42 AM Tao Yan <ty...@linkedin.com> wrote:

> Hi Lens Developers,
>
> I am trying to add mysql as a data source, so, I created the driver as
> follows:
>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>   <property>
>     <name>lens.driver.jdbc.driver.class</name>
>     <value>com.mysql.jdbc.Driver</value>
>   </property>
>   <property>
>     <name>lens.driver.jdbc.db.uri</name>
>
> <value>jdbc:mysql://****mysql**hostname****/lens;user=****;passowrd=****</value>
>   </property>
>   <property>
>     <name>lens.driver.jdbc.db.user</name>
>     <value>****</value>
>   </property>
>   <property>
>     <name>lens.cube.query.driver.supported.storages</name>
>     <value>mysql</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>lens.driver.jdbc.query.rewriter</name>
>     <value>org.apache.lens.driver.jdbc.ColumnarSQLRewriter</value>
>   </property>
>   <property>
>     <name>lens.driver.jdbc.explain.keyword</name>
>     <value>explain plan for </value>
>   </property>
> </configuration>
>
> And created a test table with data in mysql:
>
> create table dimension1_subset (
> primary_key BIGINT(8) not null primary key,
> attr1 BIGINT(8),
> attr5 BIGINT(8)
> );
>
> [image: Screen Shot 2016-08-26 at 2.44.24 PM.png]
> ​
> And created the storage file:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <x_storage classname="org.apache.lens.storage.db.DBStorage" name="mysql"
> xmlns="uri:lens:cube:0.1"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>   <properties>
>     *<property name="lens.storage.db.url"
> value="jdbc:mysql://lva1-db34.corp.linkedin.com/lens
> <http://lva1-db34.corp.linkedin.com/lens>"/>*
>   </properties>
> </x_storage>
>
> I am confused when set the value for *lens.storage.db.url *because* it is
> neither local filesystem nor HDFS.*
>
> And I don't know what should the storage table look like:
>
> <x_dimension_table dimension_name="dimension1"
> table_name="dimension1_table3" weight="5.0" xmlns="uri:lens:cube:0.1"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="uri:lens:cube:0.1 cube-0.1.xsd ">
>   <columns>
>     <column name="primary_key" _type="BIGINT"/>
>     <column name="attr1" _type="BIGINT"/>
>     <column name="attr5" _type="BIGINT"/>
>   </columns>
>   <properties>
>     <property name="dimension1.prop" value="t3"/>
>   </properties>
>   <storage_tables>
>
>
> *    <storage_table>      ...    </storage_table>*
>   </storage_tables>
> </x_dimension_table>
>
> What should I use for the table_location, and what about part_cols? How do
> I add data to this table?
>
> Thanks,
> --
>
> *Tao Yan*
> Software Engineer
> Data Analytics Infrastructure Tools and Services
>
>
>
> 206.250.5345
> tyan@linkedin.com
> https://www.linkedin.com/in/taousc
>