Posted to yarn-dev@hadoop.apache.org by "aimahou (Jira)" <ji...@apache.org> on 2020/06/01 09:42:00 UTC
[jira] [Created] (YARN-10298) TimeLine entity information only
stored in one region when use apache HBase as backend storage
aimahou created YARN-10298:
------------------------------
Summary: TimeLine entity information only stored in one region when use apache HBase as backend storage
Key: YARN-10298
URL: https://issues.apache.org/jira/browse/YARN-10298
Project: Hadoop YARN
Issue Type: Improvement
Components: ATSv2, timelineservice
Affects Versions: 3.1.1
Reporter: aimahou
h2. Issue
Timeline entity information is stored in only one region when Apache HBase is used as the backend storage.
h2. Probable cause
In the source code, we found that when the HBase timeline writer stores timeline entity info, the rowKey is composed of clusterId, userId, flowName, flowRunId and appId. Because HBase sorts rows lexicographically by rowKey, entities that share the same leading components are probably stored next to each other. As a result, timeline entities may end up in only one region or a few adjacent regions.
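The effect described above can be illustrated with a minimal sketch. It assumes a simplified string encoding with a hypothetical "!" separator and made-up ids; it is not the actual ATSv2 row key encoding, and the class and method names are illustrative only. For ASCII strings, Java's String ordering matches the byte-wise ordering HBase applies to row keys.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: not part of Hadoop or HBase.
class RowKeyOrderDemo {

  // Hypothetical simplified encoding: components joined with "!".
  static String rowKey(String clusterId, String userId, String flowName,
      long flowRunId, String appId) {
    return String.join("!", clusterId, userId, flowName,
        Long.toString(flowRunId), appId);
  }

  // Builds a few keys and returns them in lexicographic order,
  // mimicking the order in which a sorted store would lay them out.
  static List<String> sortedKeys() {
    List<String> keys = new ArrayList<>();
    keys.add(rowKey("cluster1", "alice", "etl", 1L, "application_1_0002"));
    keys.add(rowKey("cluster1", "bob", "report", 1L, "application_1_0003"));
    keys.add(rowKey("cluster1", "alice", "etl", 1L, "application_1_0001"));
    Collections.sort(keys);
    return keys;
  }

  public static void main(String[] args) {
    // Keys sharing the cluster/user/flow prefix come out adjacent.
    for (String key : sortedKeys()) {
      System.out.println(key);
    }
  }
}
```

With a single cluster and only a few users and flows, most keys share a long common prefix, so they fall into a narrow key range.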
h2. Related code snippet
HBaseTimelineWriterImpl.java
public TimelineWriteResponse write(TimelineCollectorContext context,
    TimelineEntities data, UserGroupInformation callerUgi)
    throws IOException {
  ...
  boolean isApplication = ApplicationEntity.isApplicationEntity(te);
  byte[] rowKey;
  if (isApplication) {
    ApplicationRowKey applicationRowKey =
        new ApplicationRowKey(clusterId, userId, flowName, flowRunId,
            appId);
    rowKey = applicationRowKey.getRowKey();
    store(rowKey, te, flowVersion, Tables.APPLICATION_TABLE);
  } else {
    EntityRowKey entityRowKey =
        new EntityRowKey(clusterId, userId, flowName, flowRunId, appId,
            te.getType(), te.getIdPrefix(), te.getId());
    rowKey = entityRowKey.getRowKey();
    store(rowKey, te, flowVersion, Tables.ENTITY_TABLE);
  }
  if (!isApplication && SubApplicationEntity.isSubApplicationEntity(te)) {
    SubApplicationRowKey subApplicationRowKey =
        new SubApplicationRowKey(subApplicationUser, clusterId,
            te.getType(), te.getIdPrefix(), te.getId(), userId);
    rowKey = subApplicationRowKey.getRowKey();
    store(rowKey, te, flowVersion, Tables.SUBAPPLICATION_TABLE);
  }
  ...
}
h2. Suggestion
We could use a hash of the original rowKey as the rowKey when storing and reading timeline entity data, so that writes are spread across regions.
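One possible way to apply this idea is sketched below. It is a variation, not the exact proposal: instead of replacing the rowKey with its hash, it prepends a one-byte salt derived from a hash of the original rowKey, so the original key can still be recovered. The class name, method names and BUCKETS value are hypothetical and are not part of the existing timeline service code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: not part of Hadoop or HBase.
class SaltedRowKey {

  // Hypothetical bucket count; it would likely be chosen to match
  // the number of pre-split regions of the table.
  static final int BUCKETS = 16;

  // Prepends a one-byte salt computed from a hash of the original key.
  // The same original key always maps to the same salted key.
  static byte[] salt(byte[] originalKey) {
    int bucket = Math.floorMod(Arrays.hashCode(originalKey), BUCKETS);
    byte[] salted = new byte[originalKey.length + 1];
    salted[0] = (byte) bucket;
    System.arraycopy(originalKey, 0, salted, 1, originalKey.length);
    return salted;
  }

  // Recovers the original key by dropping the leading salt byte.
  static byte[] unsalt(byte[] saltedKey) {
    return Arrays.copyOfRange(saltedKey, 1, saltedKey.length);
  }

  public static void main(String[] args) {
    // Made-up key in a simplified string form, for demonstration only.
    byte[] key = "cluster1!alice!etl!1!application_1_0001"
        .getBytes(StandardCharsets.UTF_8);
    byte[] salted = salt(key);
    System.out.println("bucket = " + salted[0]);
    System.out.println("round trip ok = "
        + Arrays.equals(key, unsalt(salted)));
  }
}
```

A read by exact key can recompute the salt from the original rowKey. A scan over a key prefix (for example, all entities of one flow run) would presumably have to be issued once per bucket, which is the main trade-off of this kind of scheme.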