Posted to dev@atlas.apache.org by Ashutosh Mestry <am...@hortonworks.com> on 2017/06/02 20:52:17 UTC
Review Request 59722: Import API: Support for resuming Import operation
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/
-----------------------------------------------------------
Review request for atlas and Madhan Neethiraj.
Bugs: ATLAS-1851
https://issues.apache.org/jira/browse/ATLAS-1851
Repository: atlas
Description
-------
**Implementation**
- Added additional options to _AtlasImportRequest_.
- Additional options:
- _startGuid_
- _startPosition_
- Added method for percentage calculation to _AtlasEntityStoreV1_.
- Updated logging message to include entity guid, type and position.
**CURL**
Create a file with these contents and call it _importTransform.json_:
```javascript
{ "options": {
"startGuid": "bd97c78e-3fa5-4f9c-9f48-3683ca3d1fb1"
}
}
```
```
curl -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@../docs/importTransform.json -F data=@../docs/Stocks-2.zip "http://localhost:21000/api/atlas/admin/import"
```
Steps to use the behavior:
- Start an import (using the CURL above) that is fairly long, say about 1000+ entities.
- While the import is in progress, stop atlas server (using _atlas_stop.py_).
- From the log file located at _/var/log/atlas/application.log_ get the last successfully imported entity GUID or position.
- Update the _importTransform.json_ with the guid.
- Restart import.
You should see that import resumes from where it left off.
**Highlights**
Specify the _startGuid_ option and notice that the operation resumes from the correct percentage and not 0%.
Diffs
-----
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 4f2c1fbc
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 8a7e3585
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSource.java 76451c98
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java 27c0b5d4
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java 69140e69
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/EntityImportStream.java 0f711db4
repository/src/test/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1BulkImportPercentTest.java PRE-CREATION
Diff: https://reviews.apache.org/r/59722/diff/1/
Testing
-------
**Unit tests**
Added tests to cover the new functionality. Note the usage of mock for _Logger_.
**Volume tests**
- Performed large imports with resume.
- Noted the numbers against baseline. Did not observe significant deviation.
**Functional tests**
- Used common scenarios from test suite.
**Accuracy testing**
- Not done.
Thanks,
Ashutosh Mestry
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Ashutosh Mestry <am...@hortonworks.com>.
> On June 5, 2017, 11:20 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/impexp/ZipSource.java
> > Lines 247 (patched)
> > <https://reviews.apache.org/r/59722/diff/2/?file=1740892#file1740892line247>
> >
> > This assumes that the iterator is at the beginning when setPosition() is called. Consider resetting the iterator to the beginning here, to remove this assumption.
Good find!
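To make the suggestion concrete, here is a minimal sketch of a position setter that rewinds before it skips ahead. `GuidSourceSketch` and all of its members are hypothetical stand-ins for illustration, not the actual ZipSource code:

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a GUID-ordered import source; not the real ZipSource.
final class GuidSourceSketch {
    private final List<String> guids;
    private Iterator<String> iterator;
    private int currentPosition;

    GuidSourceSketch(List<String> guids) {
        this.guids = guids;
        reset();
    }

    // Rewind to the first entry.
    void reset() {
        iterator = guids.iterator();
        currentPosition = 0;
    }

    boolean hasNext() {
        return iterator.hasNext();
    }

    String next() {
        String guid = iterator.next();
        currentPosition++;
        return guid;
    }

    // Number of entries handed out since the last reset.
    int getPosition() {
        return currentPosition;
    }

    // Skip ahead so that the following next() returns the entry at 'index' (0-based).
    // Resetting first makes the outcome independent of any earlier next() calls.
    void setPosition(int index) {
        reset();
        while (currentPosition < index && iterator.hasNext()) {
            next();
        }
    }
}
```

Without the `reset()` call inside `setPosition()`, a caller that had already consumed a few entries would end up further along than requested.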
> On June 5, 2017, 11:20 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java
> > Lines 213 (patched)
> > <https://reviews.apache.org/r/59722/diff/2/?file=1740893#file1740893line223>
> >
> > Wouldn't this cause excessive in-progress messages in the log, like 1 message for every entity imported, up to a million entities in an import stream? It might be good to limit this to either "1%" or 1000 entities, whichever occurs first.
Just see the unit test. After 100 entities, the increment will be 1% at a time. In short, it will be limited to 100 entries.
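For illustration only, here is a sketch of this kind of throttling, assuming a progress line is written only when the whole-number percentage advances. `ImportProgressSketch` and its method names are hypothetical and are not taken from AtlasEntityStoreV1 or its unit test:

```java
// Hypothetical helper; illustrates logging at most once per whole percent.
final class ImportProgressSketch {
    private final int total;
    private int lastLoggedPercent = 0;
    private int messagesLogged = 0;

    ImportProgressSketch(int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        this.total = total;
    }

    // 'position' is the 1-based count of entities processed so far.
    // Returns true when a progress line would be written for this entity.
    boolean onEntityProcessed(int position) {
        int percent = (int) ((long) position * 100 / total);
        if (percent > lastLoggedPercent) {
            lastLoggedPercent = percent;
            messagesLogged++;
            return true;
        }
        return false;
    }

    int messagesLogged() {
        return messagesLogged;
    }
}
```

Under these assumptions a stream of 1,000,000 entities produces 100 progress lines, while a stream of 50 entities produces one line per entity because every entity advances the percentage.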
- Ashutosh
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/#review176965
-----------------------------------------------------------
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/#review176965
-----------------------------------------------------------
Fix it, then Ship it!
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSource.java
Lines 247 (patched)
<https://reviews.apache.org/r/59722/#comment250501>
This assumes that the iterator is at the beginning when setPosition() is called. Consider resetting the iterator to the beginning here, to remove this assumption.
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java
Lines 213 (patched)
<https://reviews.apache.org/r/59722/#comment250502>
Wouldn't this cause excessive in-progress messages in the log, like 1 message for every entity imported, up to a million entities in an import stream? It might be good to limit this to either "1%" or 1000 entities, whichever occurs first.
- Madhan Neethiraj
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/#review176996
-----------------------------------------------------------
Ship it!
Ship It!
- Madhan Neethiraj
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Ashutosh Mestry <am...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/
-----------------------------------------------------------
(Updated June 6, 2017, 4:17 a.m.)
Review request for atlas and Madhan Neethiraj.
Changes
-------
Updates:
- Addressed review comments.
- Added unit test for ZipSource.
Bugs: ATLAS-1851
https://issues.apache.org/jira/browse/ATLAS-1851
Repository: atlas
Description
-------
**Implementation**
- Added additional options to _AtlasImportRequest_.
- Additional options:
- _startGuid_
- _startPosition_
- Added method for percentage calculation to _AtlasEntityStoreV1_.
- Updated logging message to include entity guid, type and position.
**CURL**
Create a file with these contents and call it _importTransform.json_:
```javascript
{ "options": {
"startGuid": "bd97c78e-3fa5-4f9c-9f48-3683ca3d1fb1"
}
}
```
```
curl -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@../docs/importTransform.json -F data=@../docs/Stocks-2.zip "http://localhost:21000/api/atlas/admin/import"
```
Steps to use the behavior:
- Start an import (using the CURL above) that is fairly long, say about 1000+ entities.
- While the import is in progress, stop atlas server (using _atlas_stop.py_).
- From the log file located at _/var/log/atlas/application.log_ get the last successfully imported entity GUID or position.
- Update the _importTransform.json_ with the guid.
- Restart import.
You should see that import resumes from where it left off.
**Highlights**
Specify the _startGuid_ option and notice that the operation resumes from the correct percentage and not 0%.
Diffs (updated)
-----
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 4f2c1fbc
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 8a7e3585
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSource.java 76451c98
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java 27c0b5d4
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java 69140e69
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/EntityImportStream.java 0f711db4
repository/src/test/java/org/apache/atlas/repository/impexp/ZipSourceTest.java be9c20b0
repository/src/test/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1BulkImportPercentTest.java PRE-CREATION
Diff: https://reviews.apache.org/r/59722/diff/3/
Changes: https://reviews.apache.org/r/59722/diff/2-3/
Testing
-------
**Unit tests**
Added tests to cover the new functionality. Note the usage of mock for _Logger_.
**Volume tests**
- Performed large imports with resume.
- Noted the numbers against baseline. Did not observe significant deviation.
**Functional tests**
- Used common scenarios from test suite.
**Accuracy testing**
- Not done.
Thanks,
Ashutosh Mestry
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Ashutosh Mestry <am...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/
-----------------------------------------------------------
(Updated June 2, 2017, 10:18 p.m.)
Review request for atlas and Madhan Neethiraj.
Changes
-------
Updates:
- Addressed review comments.
Bugs: ATLAS-1851
https://issues.apache.org/jira/browse/ATLAS-1851
Repository: atlas
Description
-------
**Implementation**
- Added additional options to _AtlasImportRequest_.
- Additional options:
- _startGuid_
- _startPosition_
- Added method for percentage calculation to _AtlasEntityStoreV1_.
- Updated logging message to include entity guid, type and position.
**CURL**
Create a file with these contents and call it _importTransform.json_:
```javascript
{ "options": {
"startGuid": "bd97c78e-3fa5-4f9c-9f48-3683ca3d1fb1"
}
}
```
```
curl -g -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F request=@../docs/importTransform.json -F data=@../docs/Stocks-2.zip "http://localhost:21000/api/atlas/admin/import"
```
Steps to use the behavior:
- Start an import (using the CURL above) that is fairly long, say about 1000+ entities.
- While the import is in progress, stop atlas server (using _atlas_stop.py_).
- From the log file located at _/var/log/atlas/application.log_ get the last successfully imported entity GUID or position.
- Update the _importTransform.json_ with the guid.
- Restart import.
You should see that import resumes from where it left off.
**Highlights**
Specify the _startGuid_ option and notice that the operation resumes from the correct percentage and not 0%.
Diffs (updated)
-----
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java 4f2c1fbc
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java 8a7e3585
repository/src/main/java/org/apache/atlas/repository/impexp/ZipSource.java 76451c98
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java 27c0b5d4
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java 69140e69
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/EntityImportStream.java 0f711db4
repository/src/test/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1BulkImportPercentTest.java PRE-CREATION
Diff: https://reviews.apache.org/r/59722/diff/2/
Changes: https://reviews.apache.org/r/59722/diff/1-2/
Testing
-------
**Unit tests**
Added tests to cover the new functionality. Note the usage of mock for _Logger_.
**Volume tests**
- Performed large imports with resume.
- Noted the numbers against baseline. Did not observe significant deviation.
**Functional tests**
- Used common scenarios from test suite.
**Accuracy testing**
- Not done.
Thanks,
Ashutosh Mestry
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Ashutosh Mestry <am...@hortonworks.com>.
> On June 2, 2017, 9:28 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java
> > Lines 208 (patched)
> > <https://reviews.apache.org/r/59722/diff/1/?file=1740869#file1740869line218>
> >
> > Why 'currentIndex + 1'? ZipSource.currentIndex seems to have the number of entities processed so far. Please review. Also 'ZipSource.currentPosition' might be a better name, instead of currentIndex. Please review.
There is a small accuracy problem that comes up when starting with 0 to calculate the percentage. Starting with 1 seems to handle all cases well.
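One way to read this, shown as a hypothetical sketch (the class and method names are assumptions, not the patch's code): if the index of the entity being processed is 0-based, dividing it by the total never reaches 100%, while dividing index + 1 by the total does.

```java
// Hypothetical helper for illustration; not the method added to AtlasEntityStoreV1.
final class PercentSketch {
    // Percentage of 'count' out of 'total'; 0 when there is nothing to import.
    static float percent(int count, int total) {
        if (total <= 0) {
            return 0f;
        }
        return count * 100f / total;
    }

    // Progress reported while handling the entity at a 0-based index.
    static float percentForIndex(int zeroBasedIndex, int total) {
        return percent(zeroBasedIndex + 1, total);
    }
}
```

For a stream of 4 entities, the last entity has index 3: `percent(3, 4)` gives 75, whereas `percentForIndex(3, 4)` gives 100.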
> On June 2, 2017, 9:28 p.m., Madhan Neethiraj wrote:
> > repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java
> > Lines 48 (patched)
> > <https://reviews.apache.org/r/59722/diff/1/?file=1740870#file1740870line48>
> >
> > Wouldn't size be always "1"? Given this object is initialized with an instance of AtlasEntityWithExtInfo?
Intent here is to discourage use of this class. Your observation is valid. I will change it.
- Ashutosh
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/#review176849
-----------------------------------------------------------
Re: Review Request 59722: Import API: Support for resuming Import operation
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59722/#review176849
-----------------------------------------------------------
intg/src/main/java/org/apache/atlas/model/impexp/AtlasImportRequest.java
Lines 76 (patched)
<https://reviews.apache.org/r/59722/#comment250321>
Consider adding @JsonIgnore annotation here, so that 'startGuid' and 'startPosition' will not be serialized.
repository/src/main/java/org/apache/atlas/repository/impexp/ImportService.java
Lines 93 (patched)
<https://reviews.apache.org/r/59722/#comment250324>
It might be clearer if this method only set the start position and the caller then called processEntities(). Consider the following:
Replace:
processEntitiesUsingStartOption(request, source, result);
With:
setStartPosition(source, request);
processEntities(source, result);
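As a rough sketch of what the position-setting half could look like once it is separated from processing: the helper below only resolves a start index from the request options and leaves the processing call to the caller. `StartOptionSketch`, `resolveStartIndex`, string-valued options, and the choice to resume at the given GUID (rather than after it) are all assumptions made for illustration:

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper; resolves where an import should start, nothing more.
final class StartOptionSketch {
    static final String START_GUID = "startGuid";
    static final String START_POSITION = "startPosition";

    // Returns the 0-based index to start from; 0 when neither option is present.
    // Assumes option values are strings and that startGuid takes precedence.
    static int resolveStartIndex(Map<String, String> options, List<String> guidsInImportOrder) {
        String guid = options.get(START_GUID);
        if (guid != null) {
            int index = guidsInImportOrder.indexOf(guid);
            if (index < 0) {
                throw new IllegalArgumentException("startGuid not found in import: " + guid);
            }
            return index;
        }
        String position = options.get(START_POSITION);
        if (position != null) {
            return Integer.parseInt(position);
        }
        return 0;
    }
}
```

The caller would then position the source with the resolved index and invoke entity processing as a separate step, which keeps each method doing one thing.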
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStoreV1.java
Lines 208 (patched)
<https://reviews.apache.org/r/59722/#comment250326>
Why 'currentIndex + 1'? ZipSource.currentIndex seems to have the number of entities processed so far. Please review. Also 'ZipSource.currentPosition' might be a better name, instead of currentIndex. Please review.
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java
Lines 48 (patched)
<https://reviews.apache.org/r/59722/#comment250328>
Wouldn't size be always "1"? Given this object is initialized with an instance of AtlasEntityWithExtInfo?
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java
Lines 53 (patched)
<https://reviews.apache.org/r/59722/#comment250330>
Consider adding a comment like:
// not applicable for a single entity stream
repository/src/main/java/org/apache/atlas/repository/store/graph/v1/AtlasEntityStreamForImport.java
Lines 58 (patched)
<https://reviews.apache.org/r/59722/#comment250329>
It will be useful for getNextEntityWithExtInfo() to maintain a counter, which can be returned from getPosition() - similar to the implementation in ZipSource.
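A minimal sketch of the three comments on this class taken together, with a fixed size of 1, a no-op position setter, and a counter behind getPosition(). `SingleEntityStreamSketch` is hypothetical and the type parameter `E` stands in for AtlasEntityWithExtInfo; this is not the actual AtlasEntityStreamForImport code:

```java
// Hypothetical single-entity stream; E stands in for AtlasEntityWithExtInfo.
final class SingleEntityStreamSketch<E> {
    private final E entity;
    private int position = 0;

    SingleEntityStreamSketch(E entity) {
        this.entity = entity;
    }

    boolean hasNext() {
        return position == 0;
    }

    // Hands out the single entity once, then null; counts what has been returned.
    E getNextEntityWithExtInfo() {
        if (!hasNext()) {
            return null;
        }
        position++;
        return entity;
    }

    // The stream wraps exactly one entity.
    int size() {
        return 1;
    }

    int getPosition() {
        return position;
    }

    void setPosition(int newPosition) {
        // not applicable for a single entity stream
    }
}
```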
- Madhan Neethiraj