Posted to issues@hive.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2016/04/13 00:29:25 UTC

[jira] [Commented] (HIVE-13496) Create initial test data once across multiple runs

    [ https://issues.apache.org/jira/browse/HIVE-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238128#comment-15238128 ] 

Siddharth Seth commented on HIVE-13496:
---------------------------------------

A couple of options:
1. Check in the generated Derby file. (This adds an extra update step whenever anyone changes the generation scripts. That may not be a problem, given that q_test_init was last modified in November 2014.)
2. [~ashutoshc] was mentioning some other, cheaper way to load Derby.
3. Eventually, automate this: check whether the data already exists, and create it only if it does not.
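
Option 3 could look roughly like the following. This is only a minimal sketch, not what the qtest setup does today: the class name TestDataCache, the Generator callback, and the _READY marker file are all hypothetical names made up for illustration. The marker is written last, so a directory left half-populated by a failed run is not mistaken for a complete data set.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: names and layout are assumptions, not existing Hive code.
public class TestDataCache {

    /** Hypothetical callback that performs the expensive data generation. */
    public interface Generator {
        void generate(Path dataDir) throws IOException;
    }

    /**
     * Generates the data set only if a completion marker is absent.
     * Returns true if the data was generated, false if it was reused.
     */
    public static boolean ensureTestData(Path dataDir, Generator generator) throws IOException {
        Path marker = dataDir.resolve("_READY"); // assumed marker file name
        if (Files.exists(marker)) {
            return false; // an earlier run already finished generating the data
        }
        Files.createDirectories(dataDir);
        generator.generate(dataDir);
        Files.createFile(marker); // written last, only after generation succeeded
        return true;
    }
}
```

This does not deal with several runs starting at the same time on one box, or with invalidating the data when the generation scripts change; both would need something extra (a lock, and a hash of the scripts, for example).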

> Create initial test data once across multiple runs
> --------------------------------------------------
>
>                 Key: HIVE-13496
>                 URL: https://issues.apache.org/jira/browse/HIVE-13496
>             Project: Hive
>          Issue Type: Improvement
>          Components: Test
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>
> All TestCliDriver, TezMiniTezCliDriver, etc. tests create a standard data set when they start up. On a box with SSDs, this step takes over a minute.
> Running a single qtest cannot be faster than this. On the ptest framework, all batches end up doing this, which is a lot of waste.
> Instead, this data generation should be shared across runs.


