You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@accumulo.apache.org by GitBox <gi...@apache.org> on 2021/09/10 18:08:33 UTC

[GitHub] [accumulo-testing] DomGarguilo opened a new pull request #152: Added check for existing output directory in /bin/cingest verify

DomGarguilo opened a new pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152


   Fixes #92 
   
   Added a check to see if the output directory exists. If so, the user is prompted to see if they want the directory to be removed and the verify job rerun.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918330635


   > One way to handle this which might be nice would be by default, create a new directory for each verify run. That way the user would not have to be prompted and no results would be lost.
   
   The only concern there is how large each directory could get when running verify on a larger cluster. I think prompting the user for each run is the safest approach. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo edited a comment on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo edited a comment on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-919463821


   With the changes in 3e6926d, a new subdirectory gets created for each run:
   ```
   dgarguilo@thor:~/github/fluo-uno$ hadoop fs -ls /tmp/ci-verify
   Found 5 items
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:14 /tmp/ci-verify/ContinuousVerify_ci_1631646755241
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:18 /tmp/ci-verify/ContinuousVerify_ci_1631646992174
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:33 /tmp/ci-verify/ContinuousVerify_ci_1631647887518
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:38 /tmp/ci-verify/ContinuousVerify_ci_1631648160103
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:44 /tmp/ci-verify/ContinuousVerify_ci_1631648563940
   ```
   EDIT:
   In 4b2ee0a, I also added a logger so we could let the user know where the results will be stored.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 edited a comment on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 edited a comment on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918313294


   Currently, If you choose to keep the directory then it will exit the operation. I do think we could add other options such as overwriting the existing directory or create a new directory to hold the results of the new verify. I do think prompting the user is necessary. ~~I do not want to accidentally overwrite my previous verify results and then have to run the entire map-reduce job over again.~~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] milleruntime commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
milleruntime commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918156778


   Does it resume properly if you chose to keep the directory? If not, you could just have it always delete the directory and not have to bother prompting the user.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918313294


   Currently, If you choose to keep the directory then it will exit the operation. I do think we could add other options such as overwriting the existing directory or create a new directory to hold the results of the new verify. I do think prompting the user is necessary. I do not want to accidentally overwrite my previous verify results and then have to run the entire map-reduce job over again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918334437


   > The only concern there is how large each directory could get when running verify on a larger cluster. I think prompting the user for each run is the safest approach.
   
   Good point. Then maybe the user could be prompted to choose from three options:
   1. Stop the operation
   2. Delete existing directory and rerun
   3. Create a new directory in which to store the results
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918328206


   One way to handle this which might be nice would be by default, create a new directory for each verify run. That way the user would not have to be prompted and no results would be lost. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918337006


   Those options seem reasonable to me. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918478582


   After further investigation, it does seem safe to just delete the directory after each run or before the next run. Though a prompt to the user isn't a terrible idea, it might not be necessary. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 edited a comment on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 edited a comment on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918313294


   Currently, If you choose to keep the directory then it will exit the operation. I do think we could add other options such as overwriting the existing directory or create a new directory to hold the results of the new verify.  ~~I do think prompting the user is necessary. I do not want to accidentally overwrite my previous verify results and then have to run the entire map-reduce job over again.~~


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-920051363


   Your last two commits are interesting. In d828f3d7b57535d341748c12ebbbd73ee5043b1f, you reverted the logging changes but 4b2ee0a brought those back. Was that intended?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918502956


   The changes in 9160790 deletes the output directory (if exists) before each run. With this, the user is not prompted and `verify` can be run multiple times in succession.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-920056156


   > Your last two commits are interesting. In [d828f3d](https://github.com/apache/accumulo-testing/commit/d828f3d7b57535d341748c12ebbbd73ee5043b1f), you reverted the logging changes but [4b2ee0a](https://github.com/apache/accumulo-testing/commit/4b2ee0a402e73f91f9489209738e507a7ec3d428) brought those back. Was that intended?
   
   Yes. I initially removed the logger since it was unused and then realized it would be helpful to have and use it so I added it back.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-917119963


   Changes look good so far. I think having a case for a user inputting 'n' would be nice as currently, it will throw an IllegalStateException instead of job closing cleanly. Screenshot below
   
   ![image](https://user-images.githubusercontent.com/29436247/132900461-a4aa563d-3c9a-4a72-8d16-bdba0441f355.png)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-917121455


   > Changes look good so far. I think having a case for a user inputting 'n' would be nice as currently, it will throw an IllegalStateException instead of job closing cleanly. Screenshot below
   
   Yea I wasn't too sure how to best handle that case. Closing cleanly would probably be best.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918481541


   > After further investigation, it does seem safe to just delete the directory after each run or before the next run. Though a prompt to the user isn't a terrible idea, it might not be necessary.
   
   I agree. It looks like the files produced by running verify are empty temporary files that don't have any use. To me it seems like the best course of action would to be to delete the directory before each run without a prompt to the user. I'll work on adding those changes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] keith-turner commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
keith-turner commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918502301


   >  It looks like the files produced by running verify are empty temporary files that don't have any use.
   
   When the verify job does not detect any missing data the files are empty.  When missing data is detected, the files contain information about the missing data that is useful in debugging the problem of why Accumulo lost data.
   
   >  To me it seems like the best course of action would to be to delete the directory before each run without a prompt to the user.
   
   I like that behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-919437868


   It was suggested that a unique directory could be created for each run. This would allow for multiple runs to be compared to one another. I will push some changes that make a new sub-directory who's name reflects the unique name of the mapreduce job carried out by `verify`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] Manno15 edited a comment on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
Manno15 edited a comment on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-918330635


   > One way to handle this which might be nice would be by default, create a new directory for each verify run. That way the user would not have to be prompted and no results would be lost.
   
   ~~The only concern there is how large each directory could get when running verify on a larger cluster. I think prompting the user for each run is the safest approach.~~ 
   
   EDIT: I did not realize the files in the temp directory in hdfs gets cleared out after execution is completed.  
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo commented on pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo commented on pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152#issuecomment-919463821


   With the changes in 3e6926d, a new subdirectory gets created for each run:
   ```
   dgarguilo@thor:~/github/fluo-uno$ hadoop fs -ls /tmp/ci-verify
   Found 5 items
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:14 /tmp/ci-verify/ContinuousVerify_ci_1631646755241
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:18 /tmp/ci-verify/ContinuousVerify_ci_1631646992174
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:33 /tmp/ci-verify/ContinuousVerify_ci_1631647887518
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:38 /tmp/ci-verify/ContinuousVerify_ci_1631648160103
   drwxr-xr-x   - dgarguilo supergroup          0 2021-09-14 15:44 /tmp/ci-verify/ContinuousVerify_ci_1631648563940
   ```
   I also added a print statement letting the user know where the results will be stored.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [accumulo-testing] DomGarguilo merged pull request #152: Added check for existing output directory in /bin/cingest verify

Posted by GitBox <gi...@apache.org>.
DomGarguilo merged pull request #152:
URL: https://github.com/apache/accumulo-testing/pull/152


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@accumulo.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org