Posted to commits@sdap.apache.org by gi...@apache.org on 2023/03/13 17:48:35 UTC

[incubator-sdap-in-situ-data-services] branch dependabot/pip/pyspark-3.2.2 updated (9862c31 -> 40b707c)

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/pip/pyspark-3.2.2
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-in-situ-data-services.git


    omit 9862c31  chore(deps): bump pyspark from 3.1.2 to 3.2.2
     add aba83f0  feat: add CLI script
     add 4f72eb3  chore: update readme
     add 8d78b1b  chore: add changelog
     add 0c1c4d1  fix: added `meta` column as defaul column
     add 7221ffe  fix: add default column in correct place
     add c2a6d2a  chore: update changelog + swagger
     add c97b9b2  chore: merged from apache master
     add 65c4651  chore: merge from apache master
     add bc2e14a  chore: switch to SDAP ticket
     add 57ed8dc  Merge branch 'master' of github.com:wphyojpl/incubator-sdap-in-situ-data-services
     add 786d92a  Merge branch 'master' of github.com:apache/incubator-sdap-in-situ-data-services
     add 982d498  feat: add ci/cd to build lambda zip file
     add 3d0fe1c  breaking:Elasticsearch Logic (#1)
     add 576b1c1  fix: get lambda function working
     add 8d374b0  fix: throw runtime err when ES ingest fail
     add f2095f3  chore: add lambda logger
     add e304425  feat: use cdms_schema json to create spark struct obj
     add 9824c6e  chore: merge from master
     add c45898c  chore: use insitu schema json to get column names
     add c001092  feat: add observation type counts
     add f33bf06  fix: add missig param in calling stats retriever
     add eeff2b9  fix: add observation agg in query
     add 6398704  fix: get parquet stat to ES working for SQS multiple records input
     add 69c4e39  fix: not throwing error when deleting items that do not exist in ES
     add 9e30423  fix: allow config w/o checking mandatory variables
     add 0fbb6a0  feat: download small parquet file to extract stat locally
     add 07f6000  fix: use singleton to re-use session to reduce time
     add e31e246  fix: return NULL if query not found in ES + enhance statistics endpoint to include depth, time and bbox
     add c77c943  fix: use unquote_plus to replace `+` to ` ` for s3 url
     add acc410f  feat: not waiting for ingest to finish
     add 9800a38  fix: ci/cd for lambda-docker
     add fbd7f20  fix: get lambda with ECS working with pyspark
     add f7d28d6  fix: add missing files
     add 040776c  fix: update makefile
     add 3adf08a  fix: use pandas to avoid int + float in source data
     add 1d83a52  fix:s3 ingestion lambda works as a zip now
     add 09f82be  feat: make 30x30 tiles + different log level for spark vs cdms flask
     add 4d8a2e8  fix: query-missing-depth = float, log statement when deleting ES, ingest w/o pandas + mandatory depth conversion to float
     add 8b046e0  fix: UnsupportedOperationException:org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainLongDictionary workaround
     add 858e23a  fix: wind from & to is in long which screws with the schema
     add feaf344  fix: make unique spark-app-name
     add cefb0f0  feat: use alias instead of real index
     add f3d7466  fix: update test constants to use alias
     add 2daba45  feat: Configurable partitioning (#3)
     add 69da174  Merge branch 'es.branch' of github.com:wphyojpl/incubator-sdap-in-situ-data-services into es.branch
     add 5c70764  fix: config is in string form. not in int form
     add e03a862  fix: config is in string form. not in int form
     add c8779b1  feat: add elasticsearch index for ddb table
     add 7eab73f  chore: add missing test file
     add 140d017  feat: add ES based metadata table
     add 32de57e  feat: using ES for metadata
     add 8c6b786  fix: need to pass empty str, not None
     add 1209603  fix: parallel validator bug
     add c59d8db  Merge pull request #12 from wphyojpl/es.branch
     add 40b707c  chore(deps): bump pyspark from 3.1.2 to 3.2.2

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (9862c31)
            \
             N -- N -- N   refs/heads/dependabot/pip/pyspark-3.2.2 (40b707c)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.
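
As an illustration of the markers described above, the revision list in this email can be grouped by its leading keyword ("omit", "add", or "discard"). The following is only a minimal sketch: the function name group_revisions and the regular expression are assumptions made for this example and are not part of the repository or of the gitbox tooling. The sample lines are copied from the list above.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   "    omit 9862c31  chore(deps): bump pyspark from 3.1.2 to 3.2.2"
# The pattern is an assumption based on the layout of this email.
REVISION_LINE = re.compile(r"^\s*(omit|discard|add)\s+([0-9a-f]{7,40})\s+(.*)$")


def group_revisions(lines):
    """Group (hash, subject) pairs by their marker: omit, discard, or add."""
    groups = defaultdict(list)
    for line in lines:
        match = REVISION_LINE.match(line)
        if match:
            action, commit, subject = match.groups()
            groups[action].append((commit, subject))
    return dict(groups)


sample = [
    "    omit 9862c31  chore(deps): bump pyspark from 3.1.2 to 3.2.2",
    "     add aba83f0  feat: add CLI script",
    "     add 40b707c  chore(deps): bump pyspark from 3.1.2 to 3.2.2",
]
revisions = group_revisions(sample)
print(sorted(revisions))   # marker keywords found in the sample
print(revisions["omit"])   # revisions dropped from the branch but still referenced elsewhere
```

Lines that do not start with one of the three markers are skipped, so the whole email body can be passed in without trimming it first.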

No new revisions were added by this update.

Summary of changes:
 .gitignore                                         |    4 +-
 ci.cd/Makefile                                     |   41 +
 ci.cd/create_s3_zip.sh                             |   27 +
 ci.cd/lambda_docker_upload.sh                      |    6 +
 ci.cd/local_upload.sh                              |    8 +
 docker/parquet.lambda.Dockerfile                   |   43 +
 documentations/navair.demo.md                      |  106 ++
 etc/elasticsearch/all_alias.json                   |    7 +
 etc/elasticsearch/entry_file_records.json          |   20 +
 etc/elasticsearch/parquet_stats_v1.json            |   64 +
 etc/elasticsearch/setup_es.txt                     |   15 +
 etc/lambda-spark/spark-class                       |    8 +
 etc/lambda-spark/spark-defaults.conf               |    4 +
 k8s_spark/k8s_spark/org.alues.yaml                 |  731 ++++++++
 k8s_spark/nohup.out                                |    4 +
 .../parquet.spark.helm/charts/spark-5.9.4.tgz      |  Bin 0 -> 36223 bytes
 k8s_spark/parquet.spark.helm/nohup.out             | 1973 ++++++++++++++++++++
 .../parquet.spark.helm/templates/deployment.yaml   |    4 +
 k8s_spark/parquet.spark.helm/values.yaml           |   10 +-
 nohup.out                                          |    4 +
 one_offs/local_flask.py                            |   16 +
 one_offs/py_geo_hash_test.py                       |   12 +
 one_offs/trigger.s3.ingest.py                      |   43 +
 parquet_flask/__init__.py                          |    5 +-
 parquet_flask/__main__.py                          |    9 +-
 parquet_flask/aws/aws_cred.py                      |   23 +-
 parquet_flask/aws/es_abstract.py                   |   55 +
 parquet_flask/aws/es_factory.py                    |   16 +
 parquet_flask/aws/es_middleware.py                 |  202 ++
 parquet_flask/aws/es_middleware_aws.py             |   30 +
 .../cdms_lambda_func/cdms_lambda_constants.py      |    8 +
 .../cdms_lambda_func/index_to_es}/__init__.py      |    0
 .../cdms_lambda_func/index_to_es/execute_lambda.py |   18 +
 .../index_to_es/parquet_file_es_indexer.py         |   85 +
 .../index_to_es/parquet_stat_extractor.py          |   35 +
 .../index_to_es/s3_stat_extractor.py               |  202 ++
 .../ingest_s3_to_cdms/ingest_s3_to_cdms.py         |   66 +-
 parquet_flask/cdms_lambda_func/lambda_func_env.py  |    3 +
 .../cdms_lambda_func/lambda_logger_generator.py    |   30 +
 .../cdms_lambda_func/s3_records}/__init__.py       |    0
 .../cdms_lambda_func/s3_records/s3_2_sqs.py        |  165 ++
 .../s3_records/s3_event_validator_abstract.py      |   19 +
 parquet_flask/io_logic/cdms_constants.py           |   21 +-
 parquet_flask/io_logic/cdms_schema.py              |   89 +
 parquet_flask/io_logic/ingest_new_file.py          |   92 +-
 .../{metadata_tbl_io.py => metadata_tbl_es.py}     |   44 +-
 parquet_flask/io_logic/metadata_tbl_interface.py   |    4 +
 parquet_flask/io_logic/metadata_tbl_io.py          |    4 +
 .../io_logic/parquet_paths_es_retriever.py         |  114 ++
 .../parquet_query_condition_management_v3.py       |   19 +-
 ...py => parquet_query_condition_management_v4.py} |   79 +-
 parquet_flask/io_logic/partitioned_parquet_path.py |   48 +
 parquet_flask/io_logic/query_v4.py                 |   51 +-
 parquet_flask/io_logic/raw_query.py                |    2 +-
 parquet_flask/io_logic/replace_file.py             |    2 +-
 .../io_logic/sub_collection_statistics.py          |  290 +++
 .../parquet_stat_extractor}/__init__.py            |    0
 .../parquet_stat_extractor/local_spark_session.py  |   16 +
 .../local_statistics_retriever.py                  |   34 +
 .../parquet_stat_extractor/statistics_retriever.py |  206 ++
 .../statistics_retriever_wrapper.py                |   39 +
 parquet_flask/utils/config.py                      |   16 +-
 parquet_flask/utils/factory_abstract.py            |    7 +
 parquet_flask/utils/general_utils.py               |    8 +
 parquet_flask/utils/parallel_json_validator.py     |   15 +-
 parquet_flask/utils/spatial_utils.py               |   30 +
 parquet_flask/utils/time_utils.py                  |   12 +-
 parquet_flask/v1/__init__.py                       |   10 +-
 .../v1/extract_statistics_from_parquet_file.py     |   47 +
 parquet_flask/v1/ingest_aws_json.py                |   17 +-
 .../v1/insitu_query_swagger/insitu-spec-0.0.1.yml  |    2 +-
 .../v1/query_data_doms_custom_pagination.py        |   18 +-
 .../v1/sub_collection_statistics_endpoint.py       |   80 +
 rotate_keys.bash                                   |   28 +
 setup.py                                           |    5 +-
 setup.py => setup_lambda.py                        |   18 +-
 tests/back_to_basis/Test1/._SUCCESS.crc            |  Bin 0 -> 8 bytes
 .../back_to_basis/Test1/_SUCCESS                   |    0
 {parquet_cli => tests/back_to_basis}/__init__.py   |    0
 tests/back_to_basis/local_spark.py                 |   54 +
 tests/back_to_basis/s3_read.py                     |   28 +
 tests/back_to_basis/s3_spark.py                    |   51 +
 tests/bench_mark/bench_mark.py                     |   52 +-
 tests/bench_mark/bench_parallel_process.py         |   32 +-
 tests/get_aws_creds.py                             |   16 +
 .../parquet_flask/aws}/__init__.py                 |    0
 .../aws/manual_test_es_middleware_aws.py           |   31 +
 .../parquet_flask/cdms_lambda_func}/__init__.py    |    0
 .../cdms_lambda_func/index_to_es}/__init__.py      |    0
 .../manual_test_parquet_file_es_indexer.py         |   75 +
 .../index_to_es/test_parquet_stat_extractor.py     |   37 +
 .../index_to_es/test_s3_stat_extractor.py          |   44 +
 .../cdms_lambda_func/s3_records}/__init__.py       |    0
 .../cdms_lambda_func/s3_records/test_s3_s2_sqs.py  |   32 +
 .../manual_test_parquet_paths_es_retriever.py      |   33 +
 tests/parquet_flask/io_logic/test_cdms_schema.py   |   29 +
 .../parquet_flask/io_logic/test_ingest_new_file.py |   20 +
 .../parquet_flask/io_logic/test_metadata_tbl_es.py |   54 +
 .../test_parquet_query_condition_management_v3.py  |  147 +-
 .../io_logic/test_partitioned_parquet_path.py      |   53 +-
 .../parquet_stat_extractor}/__init__.py            |    0
 .../parquet_stat_extractor/in_situ_schema.json     |    0
 ...882-3536-435b-b736-96bf3be9ee29.c000.gz.parquet |  Bin 0 -> 17393 bytes
 .../test_local_statistics_retriever.py             |  170 ++
 .../test_statistics_retriever.py                   |   63 +
 tests/parquet_flask/utils/test_general_utils.py    |    7 +
 tests/parquet_flask/utils/test_spatial_utils.py    |   23 +
 107 files changed, 6367 insertions(+), 272 deletions(-)
 create mode 100644 ci.cd/Makefile
 create mode 100755 ci.cd/create_s3_zip.sh
 create mode 100644 ci.cd/lambda_docker_upload.sh
 create mode 100755 ci.cd/local_upload.sh
 create mode 100644 docker/parquet.lambda.Dockerfile
 create mode 100644 documentations/navair.demo.md
 create mode 100644 etc/elasticsearch/all_alias.json
 create mode 100644 etc/elasticsearch/entry_file_records.json
 create mode 100644 etc/elasticsearch/parquet_stats_v1.json
 create mode 100644 etc/elasticsearch/setup_es.txt
 create mode 100644 etc/lambda-spark/spark-class
 create mode 100644 etc/lambda-spark/spark-defaults.conf
 create mode 100644 k8s_spark/k8s_spark/org.alues.yaml
 create mode 100644 k8s_spark/nohup.out
 create mode 100644 k8s_spark/parquet.spark.helm/charts/spark-5.9.4.tgz
 create mode 100644 k8s_spark/parquet.spark.helm/nohup.out
 create mode 100644 nohup.out
 create mode 100644 one_offs/local_flask.py
 create mode 100644 one_offs/py_geo_hash_test.py
 create mode 100644 one_offs/trigger.s3.ingest.py
 create mode 100644 parquet_flask/aws/es_abstract.py
 create mode 100644 parquet_flask/aws/es_factory.py
 create mode 100644 parquet_flask/aws/es_middleware.py
 create mode 100644 parquet_flask/aws/es_middleware_aws.py
 create mode 100644 parquet_flask/cdms_lambda_func/cdms_lambda_constants.py
 copy {parquet_cli => parquet_flask/cdms_lambda_func/index_to_es}/__init__.py (100%)
 create mode 100644 parquet_flask/cdms_lambda_func/index_to_es/execute_lambda.py
 create mode 100644 parquet_flask/cdms_lambda_func/index_to_es/parquet_file_es_indexer.py
 create mode 100644 parquet_flask/cdms_lambda_func/index_to_es/parquet_stat_extractor.py
 create mode 100644 parquet_flask/cdms_lambda_func/index_to_es/s3_stat_extractor.py
 create mode 100644 parquet_flask/cdms_lambda_func/lambda_logger_generator.py
 copy {parquet_cli => parquet_flask/cdms_lambda_func/s3_records}/__init__.py (100%)
 create mode 100644 parquet_flask/cdms_lambda_func/s3_records/s3_2_sqs.py
 create mode 100644 parquet_flask/cdms_lambda_func/s3_records/s3_event_validator_abstract.py
 copy parquet_flask/io_logic/{metadata_tbl_io.py => metadata_tbl_es.py} (57%)
 create mode 100644 parquet_flask/io_logic/parquet_paths_es_retriever.py
 copy parquet_flask/io_logic/{parquet_query_condition_management_v3.py => parquet_query_condition_management_v4.py} (64%)
 create mode 100644 parquet_flask/io_logic/sub_collection_statistics.py
 copy {parquet_cli => parquet_flask/parquet_stat_extractor}/__init__.py (100%)
 create mode 100644 parquet_flask/parquet_stat_extractor/local_spark_session.py
 create mode 100644 parquet_flask/parquet_stat_extractor/local_statistics_retriever.py
 create mode 100644 parquet_flask/parquet_stat_extractor/statistics_retriever.py
 create mode 100644 parquet_flask/parquet_stat_extractor/statistics_retriever_wrapper.py
 create mode 100644 parquet_flask/utils/factory_abstract.py
 create mode 100644 parquet_flask/utils/spatial_utils.py
 create mode 100644 parquet_flask/v1/extract_statistics_from_parquet_file.py
 create mode 100644 parquet_flask/v1/sub_collection_statistics_endpoint.py
 create mode 100755 rotate_keys.bash
 copy setup.py => setup_lambda.py (65%)
 create mode 100644 tests/back_to_basis/Test1/._SUCCESS.crc
 copy parquet_cli/__init__.py => tests/back_to_basis/Test1/_SUCCESS (100%)
 copy {parquet_cli => tests/back_to_basis}/__init__.py (100%)
 create mode 100644 tests/back_to_basis/local_spark.py
 create mode 100644 tests/back_to_basis/s3_read.py
 create mode 100644 tests/back_to_basis/s3_spark.py
 create mode 100644 tests/get_aws_creds.py
 copy {parquet_cli => tests/parquet_flask/aws}/__init__.py (100%)
 create mode 100644 tests/parquet_flask/aws/manual_test_es_middleware_aws.py
 copy {parquet_cli => tests/parquet_flask/cdms_lambda_func}/__init__.py (100%)
 copy {parquet_cli => tests/parquet_flask/cdms_lambda_func/index_to_es}/__init__.py (100%)
 create mode 100644 tests/parquet_flask/cdms_lambda_func/index_to_es/manual_test_parquet_file_es_indexer.py
 create mode 100644 tests/parquet_flask/cdms_lambda_func/index_to_es/test_parquet_stat_extractor.py
 create mode 100644 tests/parquet_flask/cdms_lambda_func/index_to_es/test_s3_stat_extractor.py
 copy {parquet_cli => tests/parquet_flask/cdms_lambda_func/s3_records}/__init__.py (100%)
 create mode 100644 tests/parquet_flask/cdms_lambda_func/s3_records/test_s3_s2_sqs.py
 create mode 100644 tests/parquet_flask/io_logic/manual_test_parquet_paths_es_retriever.py
 create mode 100644 tests/parquet_flask/io_logic/test_cdms_schema.py
 create mode 100644 tests/parquet_flask/io_logic/test_ingest_new_file.py
 create mode 100644 tests/parquet_flask/io_logic/test_metadata_tbl_es.py
 copy {parquet_cli => tests/parquet_flask/parquet_stat_extractor}/__init__.py (100%)
 copy in_situ_schema.json => tests/parquet_flask/parquet_stat_extractor/in_situ_schema.json (100%)
 create mode 100644 tests/parquet_flask/parquet_stat_extractor/part-00000-74ebb882-3536-435b-b736-96bf3be9ee29.c000.gz.parquet
 create mode 100644 tests/parquet_flask/parquet_stat_extractor/test_local_statistics_retriever.py
 create mode 100644 tests/parquet_flask/parquet_stat_extractor/test_statistics_retriever.py
 create mode 100644 tests/parquet_flask/utils/test_spatial_utils.py