Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2016/07/21 04:30:20 UTC
[jira] [Issue Comment Deleted] (TEZ-3358) Group ATSLogs for multiple DAGs into one file.
[ https://issues.apache.org/jira/browse/TEZ-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hitesh Shah updated TEZ-3358:
-----------------------------
Comment: was deleted
(was: Some comments - mostly minor:
- Let's use a slightly different naming convention for TEZ_HISTORY_LOGGING_USED_NUM_DAGS_PER_GROUP - it is a bit too similar to TEZ_HISTORY_LOGGING_NUM_DAGS_PER_GROUP and can potentially cause confusion.
- TEZ_HISTORY_LOGGING_NUM_DAGS_PER_GROUP - add javadoc describing the impact on HDFS in terms of the number of files per DAG vs. per group
- s/ATS/YARN Timeline/
- Also, let's mark all of these new configs as Private and Unstable.
- "getGroupId(int numDagsPerGroup)" - should this throw an invalid arg for groupCnt == 1?
{code}
Set<TimelineEntityGroupId> groupId = convertToTimelineEntityGroupIds(entityType, entityId);
+ if (groupId != null && !groupId.isEmpty()) {
+ groupIds.addAll(groupId);
+ appIdSet.add(groupId.iterator().next().getApplicationId());
}
{code}
- could this code be moved into createTimelineEntityGroupIds() or does it need to be replicated in the various places in use currently?
- In TestTimelineCachePluginImpl, minor nit: use "new Configuration(false)" instead of "new Configuration()"
)
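For illustration only, the grouping and the argument check raised in the comment above could be sketched roughly as follows. This is not the code from the attached patches: the class, the method names, the group-name format, and the assumption that DAG sequence numbers start at 1 are all hypothetical, and rejecting a group size of 1 is just one possible answer to the question asked above.

```java
// Hypothetical sketch; none of these names come from the TEZ-3358 patches.
class DagGroupSketch {

    // Maps a DAG sequence number (assumed to start at 1) to a zero-based group index.
    static int groupIndex(int dagSeq, int numDagsPerGroup) {
        if (numDagsPerGroup <= 1) {
            // Assumption: with one DAG per group there is nothing to group,
            // so callers should use the per-DAG path and never reach this method.
            throw new IllegalArgumentException(
                "numDagsPerGroup must be > 1, got " + numDagsPerGroup);
        }
        if (dagSeq < 1) {
            throw new IllegalArgumentException("dagSeq must be >= 1, got " + dagSeq);
        }
        return (dagSeq - 1) / numDagsPerGroup;
    }

    // Builds an illustrative group name; the format is made up for this sketch.
    static String groupName(String appId, int dagSeq, int numDagsPerGroup) {
        return appId + "_group_" + groupIndex(dagSeq, numDagsPerGroup);
    }

    public static void main(String[] args) {
        String appId = "application_0000000000000_0001"; // placeholder value
        for (int dagSeq = 1; dagSeq <= 7; dagSeq++) {
            System.out.println("dag " + dagSeq + " -> " + groupName(appId, dagSeq, 5));
        }
    }
}
```

With a group size of 5, DAGs 1 through 5 share one group and DAG 6 starts the next, so the number of history files grows with the number of groups rather than the number of DAGs.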
> Group ATSLogs for multiple DAGs into one file.
> ----------------------------------------------
>
> Key: TEZ-3358
> URL: https://issues.apache.org/jira/browse/TEZ-3358
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Harish Jaiprakash
> Assignee: Harish Jaiprakash
> Attachments: TEZ-3358.01.patch, TEZ-3358.02.patch
>
>
> Currently we create one history log file per DAG. Change this to use one group for multiple DAGs to prevent the creation of too many files on HDFS.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)