Posted to issues@hive.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2016/04/13 00:29:25 UTC
[jira] [Commented] (HIVE-13496) Create initial test data once across multiple runs
[ https://issues.apache.org/jira/browse/HIVE-13496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238128#comment-15238128 ]
Siddharth Seth commented on HIVE-13496:
---------------------------------------
A couple of options:
1. Check in the generated Derby file. (This would add an update step whenever anyone changes the generation scripts. That may not be a problem, given that q_test_init was last modified in November 2014.)
2. [~ashutoshc] was mentioning some other way to load Derby which is cheaper.
3. Eventually, automate this, i.e. look for the existence of the data and create it only if it does not exist.
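
Option 3 could look roughly like the sketch below. This is only an illustration under assumptions, not the existing Hive test code: the class name TestDataCache, the Generator callback, and the _READY marker file are all hypothetical. The marker is written last, so an interrupted generation is not mistaken for a complete data set. The generator would need to tolerate leftover partial files, and the check is not safe against concurrent processes without additional locking.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: generate the shared test data set only if it is missing.
public class TestDataCache {

    /** Hypothetical callback that populates a directory with the test data set. */
    public interface Generator {
        void generate(Path targetDir) throws IOException;
    }

    /**
     * Runs the generator only if the marker file is absent.
     * Returns true if this call generated the data, false if existing data was reused.
     */
    public static boolean ensureData(Path dataDir, Generator generator) throws IOException {
        // Marker name is an assumption; any file written after generation would do.
        Path marker = dataDir.resolve("_READY");
        if (Files.exists(marker)) {
            return false; // data set already present, skip the expensive step
        }
        Files.createDirectories(dataDir);
        generator.generate(dataDir);
        // Written last, so a crash during generation leaves no marker behind.
        Files.createFile(marker);
        return true;
    }
}
```

A test driver would call ensureData once at startup and pass the existing data-loading step as the generator, so only the first run against a given directory pays the cost.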
> Create initial test data once across multiple runs
> --------------------------------------------------
>
> Key: HIVE-13496
> URL: https://issues.apache.org/jira/browse/HIVE-13496
> Project: Hive
> Issue Type: Improvement
> Components: Test
> Reporter: Siddharth Seth
> Assignee: Siddharth Seth
>
> All TestCliDriver, TezMiniTezCliDriver, etc. tests create a standard data set when they start up. When running on a box with SSDs, this step takes over a minute.
> Running a single qtest cannot be faster than this. On the ptest framework, every batch repeats this step, which wastes a lot of time.
> Instead, this data generation should be shared across runs.