Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/22 17:58:31 UTC

[GitHub] [beam] TheNeuralBit opened a new issue, #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

TheNeuralBit opened a new issue, #22413:
URL: https://github.com/apache/beam/issues/22413

   ### What happened?
   
   `:torchInferenceTest` is broken in the Python 3.9 and 3.7 PostCommits. Interestingly, 3.8 seems to be OK.
   
   I dug into one of the failures (https://ci-beam.apache.org/job/beam_PostCommit_Python39/635/) and found the following error log:
   ```
   05:20:28 > Task :sdks:python:test-suites:direct:py39:torchInferenceTest
   05:20:28 
   05:20:28 [gw1] PASSED apache_beam/ml/inference/pytorch_inference_it_test.py::PyTorchInference::test_torch_run_inference_imagenet_mobilenetv2 
   05:20:41 [gw0] PASSED apache_beam/ml/inference/pytorch_inference_it_test.py::PyTorchInference::test_torch_run_inference_bert_for_masked_lm 
   05:20:43 [gw2] FAILED apache_beam/ml/inference/pytorch_inference_it_test.py::PyTorchInference::test_torch_run_inference_coco_maskrcnn_resnet50_fpn 
   05:20:43 
   05:20:43 =================================== FAILURES ===================================
   05:20:43 _____ PyTorchInference.test_torch_run_inference_coco_maskrcnn_resnet50_fpn _____
   05:20:43 [gw2] linux -- Python 3.9.10 /home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python39/src/build/gradleenv/1398941893/bin/python3.9
   05:20:43 
   05:20:43 self = <apache_beam.ml.inference.pytorch_inference_it_test.PyTorchInference testMethod=test_torch_run_inference_coco_maskrcnn_resnet50_fpn>
   05:20:43 
   05:20:43     @pytest.mark.uses_pytorch
   05:20:43     @pytest.mark.it_postcommit
   05:20:43     def test_torch_run_inference_coco_maskrcnn_resnet50_fpn(self):
   05:20:43       test_pipeline = TestPipeline(is_integration_test=True)
   05:20:43       # text files containing absolute path to the coco validation data on GCS
   05:20:43       file_of_image_names = 'gs://apache-beam-ml/testing/inputs/it_coco_validation_inputs.txt'  # pylint: disable=line-too-long
   05:20:43       output_file_dir = 'gs://apache-beam-ml/testing/predictions'
   05:20:43       output_file = '/'.join([output_file_dir, str(uuid.uuid4()), 'result.txt'])
   05:20:43     
   05:20:43       model_state_dict_path = 'gs://apache-beam-ml/models/torchvision.models.detection.maskrcnn_resnet50_fpn.pth'  # pylint: disable=line-too-long
   05:20:43       images_dir = 'gs://apache-beam-ml/datasets/coco/raw-data/val2017'
   05:20:43       extra_opts = {
   05:20:43           'input': file_of_image_names,
   05:20:43           'output': output_file,
   05:20:43           'model_state_dict_path': model_state_dict_path,
   05:20:43           'images_dir': images_dir,
   05:20:43       }
   05:20:43       pytorch_image_segmentation.run(
   05:20:43           test_pipeline.get_full_options_as_args(**extra_opts),
   05:20:43           save_main_session=False)
   05:20:43     
   05:20:43       self.assertEqual(FileSystems().exists(output_file), True)
   05:20:43       predictions = process_outputs(filepath=output_file)
   05:20:43       actuals_file = 'gs://apache-beam-ml/testing/expected_outputs/test_torch_run_inference_coco_maskrcnn_resnet50_fpn_actuals.txt'  # pylint: disable=line-too-long
   05:20:43       actuals = process_outputs(filepath=actuals_file)
   05:20:43     
   05:20:43       predictions_dict = {}
   05:20:43       for prediction in predictions:
   05:20:43         filename, prediction_labels = prediction.split(';')
   05:20:43         predictions_dict[filename] = prediction_labels
   05:20:43     
   05:20:43       for actual in actuals:
   05:20:43         filename, actual_labels = actual.split(';')
   05:20:43         prediction_labels = predictions_dict[filename]
   05:20:43 >       self.assertEqual(actual_labels, prediction_labels)
   05:20:43 E       AssertionError: "['refrigerator', 'oven', 'orange', 'dining [534 chars]or']" != "['traffic light', 'car', 'traffic light', '[15 chars]ar']"
   05:20:43 E       Diff is 650 characters long. Set self.maxDiff to None to see it.
   05:20:43 
   05:20:43 apache_beam/ml/inference/pytorch_inference_it_test.py:128: AssertionError
   ```
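
The log above truncates the diff (unittest suggests setting `self.maxDiff` to `None` to see it in full). As a rough sketch of another way to surface the failing entries, the comparison loop from the test could be pulled into a helper that reports every mismatching filename instead of stopping at the first one. The helper names `parse_lines` and `find_mismatches` are hypothetical and not part of Beam; the `filename;labels` line format is taken from the test body shown in the log.

```python
def parse_lines(lines):
    """Map filename -> labels string for lines of the form 'filename;labels'."""
    result = {}
    for line in lines:
        # Split once so a ';' inside the labels would not break parsing
        # (an assumption; the original test uses a plain split(';')).
        filename, labels = line.strip().split(';', 1)
        result[filename] = labels
    return result


def find_mismatches(predictions, actuals):
    """Return {filename: (expected_labels, predicted_labels)} for every
    expected entry whose prediction differs or is missing."""
    predicted = parse_lines(predictions)
    expected = parse_lines(actuals)
    mismatches = {}
    for filename, expected_labels in expected.items():
        got = predicted.get(filename)  # None if no prediction was written
        if got != expected_labels:
            mismatches[filename] = (expected_labels, got)
    return mismatches
```

A test could then assert that `find_mismatches(predictions, actuals)` is empty, so the failure message lists each affected file.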
   
   ### Issue Priority
   
   Priority: 1
   
   ### Issue Component
   
   Component: sdk-py-core


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] tvalentyn commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192823596

   We recently enabled inference tests in https://github.com/apache/beam/pull/22324, but they were passing. Possibly the test is flaky.




[GitHub] [beam] tvalentyn closed issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by "tvalentyn (via GitHub)" <gi...@apache.org>.
tvalentyn closed issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits
URL: https://github.com/apache/beam/issues/22413




[GitHub] [beam] TheNeuralBit commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192819346

   I'm able to replicate locally with `./gradlew :sdks:python:test-suites:direct:py37:torchInferenceTest`




[GitHub] [beam] Abacn commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1654245152

   This is happening again.




[GitHub] [beam] Abacn commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1654247649

   CC: @AnandInguva @tvalentyn 




[GitHub] [beam] github-actions[bot] closed issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits
URL: https://github.com/apache/beam/issues/22413




[GitHub] [beam] TheNeuralBit commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192812855

   @yeandy could you take a look?




[GitHub] [beam] yeandy commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
yeandy commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1234291509

   Yes




[GitHub] [beam] AnandInguva commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1234235800

   Can we close this?




[GitHub] [beam] AnandInguva commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192853560

   Maybe. The model could have changed. I haven't checked yet, but the test was passing before: https://ci-beam.apache.org/job/beam_PostCommit_Python39/627/consoleText




[GitHub] [beam] AnandInguva commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192853682

   I will take a look. 




[GitHub] [beam] TheNeuralBit commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192957972

   @AnandInguva discovered that the input dataset was changed on GCS. We're skipping the test until we can determine why the dataset was changed and what we should do about it (likely either change the input data, or change the expected value).
   
   Another follow-up action might be to see if we can lock down these files somehow.
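
One possible way to at least detect such changes, sketched here under assumptions, is to pin a checksum for each test fixture in the repository and compare it before running the assertions. The names `sha256_of`, `check_fixture`, and `FixtureChangedError` are hypothetical, and reading the bytes from GCS is left to the caller; nothing here is an existing Beam utility.

```python
import hashlib


class FixtureChangedError(AssertionError):
    """Raised when a fixture's content no longer matches its pinned hash."""


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def check_fixture(name: str, data: bytes, pinned: dict) -> None:
    """Compare the fixture content against the hash pinned in the repo.

    `pinned` maps a fixture name to its expected SHA-256 hex digest and
    would live next to the test code, so it changes only through a PR.
    """
    actual = sha256_of(data)
    expected = pinned[name]
    if actual != expected:
        raise FixtureChangedError(
            f'{name} changed: expected sha256 {expected}, got {actual}')
```

With a check like this, a modified file on GCS would fail with a message naming the fixture, rather than with a label mismatch that looks like a model regression.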




[GitHub] [beam] Abacn commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1654247321

   FAILED apache_beam/ml/inference/pytorch_inference_it_test.py::PyTorchInference::test_torch_run_inference_bert_for_masked_lm
   
   ```
           if len(error_msgs) > 0:
   08:22:26 >           raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
   08:22:26                                self.__class__.__name__, "\n\t".join(error_msgs)))
   08:22:26 E           RuntimeError: Error(s) in loading state_dict for BertForMaskedLM:
   08:22:26 E           	Unexpected key(s) in state_dict: "bert.embeddings.position_ids".
   ```
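
The thread does not state the root cause of this error. One plausible explanation, offered only as an assumption, is that the checkpoint was saved by a library version that stored `bert.embeddings.position_ids` while the version now loading it no longer expects that key. As an illustration of a workaround, the sketch below filters a state dict down to the keys a model expects before loading. The helper `drop_unexpected_keys` is hypothetical and works on plain dicts; whether dropping a key is safe depends on the model. PyTorch's `load_state_dict` also accepts `strict=False`, which ignores unexpected keys but hides missing ones as well.

```python
def drop_unexpected_keys(state_dict, expected_keys):
    """Split a state dict into entries the model expects and the rest.

    Returns (kept, dropped): `kept` is a new dict restricted to
    `expected_keys`; `dropped` is the sorted list of removed key names
    so the caller can log them instead of discarding them silently.
    """
    expected = set(expected_keys)
    kept = {k: v for k, v in state_dict.items() if k in expected}
    dropped = sorted(k for k in state_dict if k not in expected)
    return kept, dropped
```

In a real pipeline, `expected_keys` could come from `model.state_dict().keys()`, and `kept` would then be passed to `model.load_state_dict`.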




[GitHub] [beam] yeandy commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
yeandy commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1234291603

   .close-issue




[GitHub] [beam] TheNeuralBit commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1194315653

   Lowering to P2 since the test has been skipped. @yeandy can you help with the following?
   - Deciding whether we should restore the original input or change the assertion
   - Finding out why the input dataset changed: was it intentional or accidental?




[GitHub] [beam] tvalentyn commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192823114

   @AnandInguva can you look into this failure?




[GitHub] [beam] TheNeuralBit commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1192850111

   I've reproduced it several times locally and have yet to see it pass. Is it possible an externally hosted model changed?




[GitHub] [beam] yeandy commented on issue #22413: [Bug]: `:sdks:python:test-suites:direct:py39:torchInferenceTest` and `:sdks:python:test-suites:direct:py37:torchInferenceTest` failing in Python PostCommits

Posted by GitBox <gi...@apache.org>.
yeandy commented on issue #22413:
URL: https://github.com/apache/beam/issues/22413#issuecomment-1198095427

   Sorry about this. I made a change to that test [1], which required changing the expected outputs file, but I updated the file in GCS before the PR was merged. (Is there a better mechanism to automatically change an input file upon merge?)
   
   [1] https://github.com/apache/beam/pull/22371
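
One possible answer to the question above, sketched under assumptions: rather than overwriting the expected-outputs file in place, each revision could be uploaded under a new versioned name, with the version recorded in the test source. The old file stays untouched, so tests at HEAD keep passing until the PR that bumps the version is merged. The `_v{version}` naming scheme and the helper `versioned_expected_path` are hypothetical; only the directory comes from the test shown in this thread.

```python
# Directory taken from the failing test in this thread.
EXPECTED_OUTPUTS_DIR = 'gs://apache-beam-ml/testing/expected_outputs'


def versioned_expected_path(test_name: str, version: int) -> str:
    """Build the path of one immutable revision of an expected-outputs file.

    The `_v<version>` suffix is a hypothetical convention: each revision
    is uploaded once under a new name and never modified afterwards.
    """
    return f'{EXPECTED_OUTPUTS_DIR}/{test_name}_actuals_v{version}.txt'
```

A PR that changes the test would upload the new file first, then bump the version constant in the same commit as the test change.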

