Posted to dev@atlas.apache.org by Ashutosh Mestry via Review Board <no...@reviews.apache.org> on 2020/03/02 18:57:16 UTC
Re: Review Request 71025: Import Service: Support Concurrent Ingest
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
-----------------------------------------------------------
(Updated March 2, 2020, 6:57 p.m.)
Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
Updates include:
- Reduces the size of the patch by breaking it into smaller implementations.
Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320
Repository: atlas
Description
-------
**Approach**
- Use the existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring the number of workers and the batch size within _AtlasImportRequest_.
- The existing import implementation continues to function as before, which maintains backward compatibility.
- The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces the memory requirement during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25
  }
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
  "options": {
    "numWorkers": 8,
    "batchSize": 25,
    "format": "zipDirect",
    "migration": "true"
  }
}
```
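As a rough illustration of how `numWorkers` and `batchSize` could interact, here is a minimal sketch that splits entities into batches and hands them to a fixed pool of worker threads. It is not the Atlas producer-consumer framework: the class and method names are hypothetical, a plain `ExecutorService` (whose internal queue plays the producer-consumer role) stands in for it, and each batch merely stands in for one unit of committed work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the Atlas WorkItemConsumer API.
final class BatchedIngestSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    /** Processes every batch on a pool of numWorkers threads; returns the item count. */
    static int ingest(List<String> entities, int numWorkers, int batchSize) {
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);
        AtomicInteger processed = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<String> batch : toBatches(entities, batchSize)) {
                // The "work" here is only a counter update; a real consumer
                // would write the batch to the graph and commit.
                pending.add(pool.submit(() -> { processed.addAndGet(batch.size()); }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for each batch and surface worker failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a worker failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        List<String> entities = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            entities.add("entity-" + i);
        }
        System.out.println("processed: " + ingest(entities, 8, 25));
    }
}
```

In this sketch, a larger batch size means fewer, larger units of work per worker, while more workers increase parallelism; how the two should be tuned for a given backend is not something the sketch can answer.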
**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```
Diffs (updated)
-----
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/StatusReporter.java PRE-CREATION
Diff: https://reviews.apache.org/r/71025/diff/7/
Changes: https://reviews.apache.org/r/71025/diff/6-7/
Testing
-------
**Unit tests**
Existing tests.
**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.
**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292
**Volume tests**
- Measure performance with large data.
+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+
Thanks,
Ashutosh Mestry
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219749
-----------------------------------------------------------
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java
Lines 384 (patched)
<https://reviews.apache.org/r/71025/#comment307948>
Can you avoid this null check? Consider initializing 'entityChangeNotifier' to a no-op implementation.
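For illustration, the no-op idea can be sketched as a null-object pattern. All names below are hypothetical stand-ins, not the actual Atlas classes: the field is normalised once in the constructor, so call sites never need a null check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; not the actual Atlas types.
interface ChangeNotifier {
    void onEntitiesAdded(List<String> guids);

    /** Shared do-nothing instance used when notifications are not wanted. */
    ChangeNotifier NO_OP = guids -> { };
}

/** Simple notifier that records what it was told, for demonstration. */
final class RecordingNotifier implements ChangeNotifier {
    final List<String> seen = new ArrayList<>();

    @Override
    public void onEntitiesAdded(List<String> guids) {
        seen.addAll(guids);
    }
}

final class MapperSketch {
    private final ChangeNotifier notifier;

    MapperSketch(ChangeNotifier notifier) {
        // Normalise once here so that no later call site checks for null.
        this.notifier = (notifier != null) ? notifier : ChangeNotifier.NO_OP;
    }

    void createEntities(List<String> guids) {
        // ... mapping work would happen here ...
        notifier.onEntitiesAdded(guids);
    }

    public static void main(String[] args) {
        new MapperSketch(null).createEntities(Arrays.asList("g1")); // silently ignored
        RecordingNotifier recorder = new RecordingNotifier();
        new MapperSketch(recorder).createEntities(Arrays.asList("g1", "g2"));
        System.out.println("recorded: " + recorder.seen);
    }
}
```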
- Sarath Subramanian
On March 2, 2020, 9:13 p.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> -----------------------------------------------------------
>
> (Updated March 2, 2020, 9:13 p.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Approach**
> - Use the existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring the number of workers and the batch size within _AtlasImportRequest_.
> - The existing import implementation continues to function as before, which maintains backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces the memory requirement during import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
>
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25
>   }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25,
>     "format": "zipDirect",
>     "migration": "true"
>   }
> }
> ```
>
>
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
>
>
> Diffs
> -----
>
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
> intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
> repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
> repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
> repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
> repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
> repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
> repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/71025/diff/9/
>
>
> Testing
> -------
>
> **Unit tests**
> Existing tests.
>
> **Functional tests**
> - Verified import for pre-1.0 and post-1.0 exported ZIP files.
>
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292
>
> **Volume tests**
> - Measure performance with large data.
>
> +----------+----------+----------+------------------------+
> | File     | Before   | After    | Configuration          |
> +----------+----------+----------+------------------------+
> | smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
> | (2.2 MB) |          |          |                        |
> +----------+----------+----------+------------------------+
> | largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
> | (40 MB)  |          |          |                        |
> +----------+----------+----------+------------------------+
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Nikhil Bonte <ni...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219746
-----------------------------------------------------------
Ship it!
- Nikhil Bonte
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Nixon Rodrigues <ni...@freestoneinfotech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219732
-----------------------------------------------------------
Ship it!
- Nixon Rodrigues
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219784
-----------------------------------------------------------
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java
Line 114 (original), 117 (patched)
<https://reviews.apache.org/r/71025/#comment307980>
nit: casting to String is not needed.
repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java
Line 56 (original), 56 (patched)
<https://reviews.apache.org/r/71025/#comment307982>
Add the '@Override' annotation to methods that implement interface methods.
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java
Line 69 (original), 69 (patched)
<https://reviews.apache.org/r/71025/#comment307983>
Add the '@Override' annotation to methods that implement interface methods.
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java
Line 73 (original), 64 (patched)
<https://reviews.apache.org/r/71025/#comment307981>
The ternary operation here is long and not intuitive. Consider refactoring it into a method:
ImportStrategy importStrategy = initImportStrategy(importResult);
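A minimal sketch of what such an extracted method might look like follows. The types are simplified stand-ins rather than the real Atlas classes, and two things are assumptions made for illustration: that the method receives the request options map (the suggestion above passes `importResult`), and that the choice is keyed on the "migration" option from the request JSON.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins; the real Atlas types have different members.
interface ImportStrategySketch {
    String name();
}

final class RegularImportSketch implements ImportStrategySketch {
    @Override
    public String name() {
        return "regular";
    }
}

final class MigrationImportSketch implements ImportStrategySketch {
    @Override
    public String name() {
        return "migration";
    }
}

final class StrategySelector {
    /** Option key as it appears in the request JSON in this thread. */
    static final String OPTION_MIGRATION = "migration";

    /** Named replacement for an inline ternary: one decision, one place. */
    static ImportStrategySketch initImportStrategy(Map<String, String> options) {
        // Boolean.parseBoolean(null) is false, so a missing key means "regular".
        boolean migration = options != null
                && Boolean.parseBoolean(options.get(OPTION_MIGRATION));
        if (migration) {
            return new MigrationImportSketch();
        }
        return new RegularImportSketch();
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        options.put(OPTION_MIGRATION, "true");
        System.out.println(initImportStrategy(options).name());
    }
}
```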
- Sarath Subramanian
On March 4, 2020, 10:09 p.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> -----------------------------------------------------------
>
> (Updated March 4, 2020, 10:09 p.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Approach**
> - Use the existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring the number of workers and the batch size within _AtlasImportRequest_.
> - The existing import implementation continues to function as before, which maintains backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces the memory requirement during import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
>
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25
>   }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25,
>     "format": "zipDirect",
>     "migration": "true"
>   }
> }
> ```
>
>
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
>
>
> Diffs
> -----
>
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
> intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
> repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
> repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
> repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
> repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
> repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
> repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
> repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
>
>
> Diff: https://reviews.apache.org/r/71025/diff/12/
>
>
> Testing
> -------
>
> **Unit tests**
> Existing tests.
>
> **Functional tests**
> - Verified import for pre-1.0 and post-1.0 exported ZIP files.
>
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/
>
> **Volume tests**
> - Measure performance with large data.
>
> +----------+----------+----------+------------------------+
> | File     | Before   | After    | Configuration          |
> +----------+----------+----------+------------------------+
> | smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
> | (2.2 MB) |          |          |                        |
> +----------+----------+----------+------------------------+
> | largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
> | (40 MB)  |          |          |                        |
> +----------+----------+----------+------------------------+
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
> On March 5, 2020, 9:30 a.m., Sarath Subramanian wrote:
> > repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
> > Lines 34 (patched)
> > <https://reviews.apache.org/r/71025/diff/12/?file=2212930#file2212930line34>
> >
> > The methods defined here look more like helper methods than interface methods.
Since this is a drop-in replacement intended to minimize impact, it needs to have the same signatures as the original concrete implementation. Changing this would involve refactoring the original code; I can take that up after this commit.
- Ashutosh
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219785
-----------------------------------------------------------
On March 5, 2020, 5:43 p.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> -----------------------------------------------------------
>
> (Updated March 5, 2020, 5:43 p.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Approach**
> - Use the existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring the number of workers and the batch size within _AtlasImportRequest_.
> - The existing import implementation continues to function as before, which maintains backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces the memory requirement during import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
>
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25
>   }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
>   "options": {
>     "numWorkers": 8,
>     "batchSize": 25,
>     "format": "zipDirect",
>     "migration": "true"
>   }
> }
> ```
>
>
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
>
>
> Diffs
> -----
>
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
> intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
> repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
> repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
> repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
> repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
> repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
> repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
> repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
>
>
> Diff: https://reviews.apache.org/r/71025/diff/13/
>
>
> Testing
> -------
>
> **Unit tests**
> Existing tests.
>
> **Functional tests**
> - Verified import for pre-1.0 and post-1.0 exported ZIP files.
>
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/
>
> **Volume tests**
> - Measure performance with large data.
>
> +----------+----------+----------+------------------------+
> | File     | Before   | After    | Configuration          |
> +----------+----------+----------+------------------------+
> | smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
> | (2.2 MB) |          |          |                        |
> +----------+----------+----------+------------------------+
> | largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
> | (40 MB)  |          |          |                        |
> +----------+----------+----------+------------------------+
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219785
-----------------------------------------------------------
repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java
Lines 34 (patched)
<https://reviews.apache.org/r/71025/#comment307984>
The methods defined here look more like helper methods than interface methods.
- Sarath Subramanian
On March 4, 2020, 10:09 p.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> -----------------------------------------------------------
>
> (Updated March 4, 2020, 10:09 p.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within _AtlasImportRequest_.
> - Existing import implementation continues to function as before. This is maintained for backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces memory requirements during import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
>
> _AtlasImportRequest_
> ```
> {
> "options": {
> "numWorkers": 8,
> "batchSize": 25
> }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
> "options": {
> "numWorkers": 8,
> "batchSize": 25,
> "format": "zipDirect",
> "migration": "true"
> }
> }
> ```
>
>
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
>
>
> Diffs
> -----
>
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
> intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
> repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
> repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
> repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
> repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
> repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
> repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
> repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
>
>
> Diff: https://reviews.apache.org/r/71025/diff/12/
>
>
> Testing
> -------
>
> **Unit tests**
> Existing tests.
>
> **Functional tests**
> - Verified import for pre-1.0 and post-1.0 exported ZIP files.
>
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/
>
> **Volume tests**
> - Measure performance with large data.
>
> +----------+----------+----------+------------------------+
> | File     | Before   | After    | Configuration          |
> +----------+----------+----------+------------------------+
> | smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
> | (2.2 MB) |          |          |                        |
> +----------+----------+----------+------------------------+
> | largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
> | (40 MB)  |          |          |                        |
> +----------+----------+----------+------------------------+
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Sarath Subramanian <sa...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/#review219859
-----------------------------------------------------------
Ship it!
- Sarath Subramanian
On March 5, 2020, 9:43 a.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71025/
> -----------------------------------------------------------
>
> (Updated March 5, 2020, 9:43 a.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
>
>
> Bugs: ATLAS-3320
> https://issues.apache.org/jira/browse/ATLAS-3320
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Approach**
> - Use existing producer-consumer (PC) framework.
> - Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
> - Add support for configuring number of workers and batch size within _AtlasImportRequest_.
> - Existing import implementation continues to function as before. This is maintained for backward compatibility.
> - The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces memory requirements during import.
> - The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
>
> _AtlasImportRequest_
> ```
> {
> "options": {
> "numWorkers": 8,
> "batchSize": 25
> }
> }
> ```
> Support for ZipDirect format:
> _AtlasImportRequest_
> ```
> {
> "options": {
> "numWorkers": 8,
> "batchSize": 25,
> "format": "zipDirect",
> "migration": "true"
> }
> }
> ```
>
>
> **CURL**
> ```
> curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
> ```
>
>
> Diffs
> -----
>
> graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
> intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
> repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
> repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
> repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
> repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
> repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
> repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
> repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
> repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
>
>
> Diff: https://reviews.apache.org/r/71025/diff/14/
>
>
> Testing
> -------
>
> **Unit tests**
> Existing tests.
>
> **Functional tests**
> - Verified import for pre-1.0 and post-1.0 exported ZIP files.
>
> **Pre-commit**
> https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/
>
> **Volume tests**
> - Measure performance with large data.
>
> +----------+----------+----------+------------------------+
> | File     | Before   | After    | Configuration          |
> +----------+----------+----------+------------------------+
> | smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
> | (2.2 MB) |          |          |                        |
> +----------+----------+----------+------------------------+
> | largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
> | (40 MB)  |          |          |                        |
> +----------+----------+----------+------------------------+
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
-----------------------------------------------------------
(Updated March 5, 2020, 5:43 p.m.)
Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
Updates include:
- Addressed review comments.
Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320
Repository: atlas
Description
-------
**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within _AtlasImportRequest_.
- Existing import implementation continues to function as before. This is maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces memory requirements during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25
}
}
```
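To make the roles of "numWorkers" and "batchSize" concrete, here is a minimal sketch of batched work handed to a fixed pool of workers. It is an illustration under stated assumptions, not the Atlas producer-consumer (PC) framework or _BulkImporterImpl_: the class name _BatchedIngestSketch_, its methods, and the "guid-N" strings are hypothetical, and each batch only counts its items where a real consumer would create entities and commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: hypothetical names, not the Atlas PC framework. */
public class BatchedIngestSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    public static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Runs every batch on a pool of numWorkers threads; returns the number of items handled. */
    public static int ingest(List<String> entityGuids, int numWorkers, int batchSize) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);
        AtomicInteger processed = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<String> batch : toBatches(entityGuids, batchSize)) {
                // One batch stands in for one unit of work (assumption: one commit per batch).
                pending.add(workers.submit(() -> {
                    processed.addAndGet(batch.size());
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any worker exception
            }
        } finally {
            workers.shutdown();
        }
        return processed.get();
    }

    public static void main(String[] args) throws Exception {
        List<String> guids = new ArrayList<>();
        for (int i = 0; i < 60; i++) {
            guids.add("guid-" + i);
        }
        // 60 items with batchSize 25 give batches of 25, 25 and 10.
        System.out.println(ingest(guids, 8, 25)); // prints 60
    }
}
```

In this sketch a larger batch size means fewer, larger units of work, while more workers means more batches in flight at once; the best values depend on the backing store.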
Support for ZipDirect format:
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25,
"format": "zipDirect",
"migration": "true"
}
}
```
**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```
Diffs (updated)
-----
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
Diff: https://reviews.apache.org/r/71025/diff/13/
Changes: https://reviews.apache.org/r/71025/diff/12-13/
Testing
-------
**Unit tests**
Existing tests.
**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.
**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/
**Volume tests**
- Measure performance with large data.
+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+
Thanks,
Ashutosh Mestry
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
-----------------------------------------------------------
(Updated March 5, 2020, 6:09 a.m.)
Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
Updates include:
- Fixed the failing unit test (UT).
Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320
Repository: atlas
Description
-------
**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within _AtlasImportRequest_.
- Existing import implementation continues to function as before. This is maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces memory requirements during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25
}
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25,
"format": "zipDirect",
"migration": "true"
}
}
```
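This description does not say how the "zipDirect" format lowers memory use. One common way to keep memory low when reading an archive is to stream it entry by entry instead of loading it whole; the sketch below shows only that general idea and should not be read as the _ZipSourceDirect_ implementation. The class _ZipStreamingSketch_, its methods, and the entry names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Illustrative only: hypothetical names, not Atlas code. */
public class ZipStreamingSketch {

    /** Visits entries in order; at most one 4 KB buffer of entry data is held at a time. */
    public static List<String> streamEntries(InputStream in) throws IOException {
        List<String> seen = new ArrayList<>();
        byte[] buf = new byte[4096];
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                long size = 0;
                int n;
                // read() returns -1 at the end of the current entry, not the end of the archive.
                while ((n = zip.read(buf)) != -1) {
                    size += n; // a real importer would parse the bytes here
                }
                seen.add(entry.getName() + ":" + size);
            }
        }
        return seen;
    }

    /** Builds a tiny in-memory archive so the sketch runs without external files. */
    public static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("entity-1.json"));
            zip.write("{\"guid\":\"1\"}".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("entity-2.json"));
            zip.write("{\"guid\":\"22\"}".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        for (String line : streamEntries(new ByteArrayInputStream(sampleZip()))) {
            System.out.println(line);
        }
    }
}
```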
**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```
Diffs (updated)
-----
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapperV2.java 0f2b4bfae
repository/src/main/java/org/apache/atlas/repository/graph/IFullTextMapper.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityChangeNotifier.java d7020a702
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/IAtlasEntityChangeNotifier.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
repository/src/test/java/org/apache/atlas/TestModules.java 06e0ebc6c
Diff: https://reviews.apache.org/r/71025/diff/12/
Changes: https://reviews.apache.org/r/71025/diff/11-12/
Testing (updated)
-------
**Unit tests**
Existing tests.
**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.
**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1712/
**Volume tests**
- Measure performance with large data.
+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+
Thanks,
Ashutosh Mestry
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
-----------------------------------------------------------
(Updated March 4, 2020, 6:30 a.m.)
Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
Updates include:
- Addressed review comments.
Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320
Repository: atlas
Description
-------
**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within _AtlasImportRequest_.
- Existing import implementation continues to function as before. This is maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces memory requirements during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25
}
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25,
"format": "zipDirect",
"migration": "true"
}
}
```
**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```
Diffs (updated)
-----
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/EntityChangeNotifierNop.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/FullTextMapperV2Nop.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
Diff: https://reviews.apache.org/r/71025/diff/10/
Changes: https://reviews.apache.org/r/71025/diff/9-10/
Testing
-------
**Unit tests**
Existing tests.
**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.
**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292
**Volume tests**
- Measure performance with large data.
+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+
Thanks,
Ashutosh Mestry
Re: Review Request 71025: Import Service: Support Concurrent Ingest
Posted by Ashutosh Mestry via Review Board <no...@reviews.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71025/
-----------------------------------------------------------
(Updated March 3, 2020, 5:13 a.m.)
Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues, and Sarath Subramanian.
Changes
-------
Updates include:
- Modified the approach for getting the zip file size during migration import.
Bugs: ATLAS-3320
https://issues.apache.org/jira/browse/ATLAS-3320
Repository: atlas
Description
-------
**Approach**
- Use existing producer-consumer (PC) framework.
- Modify _BulkImporterImpl_ to implement _WorkItemConsumer_.
- Add support for configuring number of workers and batch size within _AtlasImportRequest_.
- Existing import implementation continues to function as before. This is maintained for backward compatibility.
- The new implementation supports an additional, more memory-efficient zip format (_ZipDirect_), which drastically reduces memory requirements during import.
- The new import strategy, _MigrationImport_, uses the _bulkLoading_ mode of _JanusGraph_, thereby achieving high ingest rates.
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25
}
}
```
Support for ZipDirect format:
_AtlasImportRequest_
```
{
"options": {
"numWorkers": 8,
"batchSize": 25,
"format": "zipDirect",
"migration": "true"
}
}
```
**CURL**
```
curl -v -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@./import-options.json -F data=@./Default-3-pre.zip http://localhost:21000/api/atlas/admin/import
```
Diffs (updated)
-----
graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraph.java 4acb371f1
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 3362bf158
repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java bbe0dc5ba
repository/src/main/java/org/apache/atlas/repository/impexp/AuditsWriter.java 55990f780
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 1964ade9a
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSourceDirect.java cb5a7acd0
repository/src/main/java/org/apache/atlas/repository/migration/ZipFileMigrationImporter.java f552525a4
repository/src/main/java/org/apache/atlas/repository/store/graph/AtlasEntityStore.java 39ea3f82e
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasEntityStoreV2.java 30f5e5a7c
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/AtlasRelationshipStoreV2.java fdf117a25
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/BulkImporterImpl.java 54c32c5e8
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/EntityGraphMapper.java 2f3aad06b
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/ImportStrategy.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/MigrationImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/RegularImport.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumer.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityConsumerBuilder.java PRE-CREATION
repository/src/main/java/org/apache/atlas/repository/store/graph/v2/bulkimport/pc/EntityCreationManager.java PRE-CREATION
Diff: https://reviews.apache.org/r/71025/diff/9/
Changes: https://reviews.apache.org/r/71025/diff/8-9/
Testing
-------
**Unit tests**
Existing tests.
**Functional tests**
- Verified import for pre-1.0 and post-1.0 exported ZIP files.
**Pre-commit**
https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1292
**Volume tests**
- Measure performance with large data.
+----------+----------+----------+------------------------+
| File     | Before   | After    | Configuration          |
+----------+----------+----------+------------------------+
| smalldb  | 6 min    | 2 min    | Shards: 4, Threads: 8  |
| (2.2 MB) |          |          |                        |
+----------+----------+----------+------------------------+
| largedb  | 3 hrs    | 10 mins  | Shards: 4, Threads: 16 |
| (40 MB)  |          |          |                        |
+----------+----------+----------+------------------------+
Thanks,
Ashutosh Mestry