Posted to commits@wayang.apache.org by be...@apache.org on 2021/09/05 12:48:09 UTC

[incubator-wayang] branch WAYANG-36 created (now b174f9c)

This is an automated email from the ASF dual-hosted git repository.

bertty pushed a change to branch WAYANG-36
in repository https://gitbox.apache.org/repos/asf/incubator-wayang.git.


      at b174f9c  [WAYANG-36] Addition of the benchmark from "https://github.com/rheem-ecosystem/rheem-benchmark.git" and rebranding

This branch includes the following new commits:

     new 0ed02d7  add TPC-H Query 1
     new c6f2709  feat(wordcount): add WordCount app
     new 5016d57  fix(RheemApps): set Rheem version correctly and register UDF jars in apps
     new 9afc0e0  refactor(wordcount): use stdout sink instead of collecting sink
     new ee22c38  fix(wordcount): remove target Platforms from RheemPlan
     new 16e0485  fix(wordcount,tpch): add rheem-java JAR as UDF JAR
     new cffeb84  feat(kmeans): implement basic k-means
     new 2aecc6e  feat(kmeans): improve output on faulty platform
     new bd73125  test(kmeans): improve k-means test
     new 7729c96  feat(kmeans): "resurrect" lost centroids
     new bc2eb95  Merge branch 'master' of https://github.com/daqcri/RheemApps
     new 9ce990a  chore(RheemApps): remove Flink dependencies
     new d88283a  chore(RheemApps): use Scala Maven Plugin
     new ac095e7  fix(kmeans): fix main method signature
     new 48b7257  fix(kmeans): use random cluster IDs
     new 44f850d  fix(kmeans): print out usage only if no args are passed
     new b6f1c42  fix(kmeans): annotate UDF JARs
     new 9edcdf7  feat(simwords): build tool to find similar words
     new 5511055  fix(simwords): fix sampling and main method
     new 8afea62  feat(simwords): generate centroids instead of sampling them
     new 1669fc9  feat(wordcount): add Scala implementation
     new 8e3cc36  fix(wordcount): add UDF JAR to Scala implementation
     new f0235b4  feat(crocopr): add cross-community PageRank
     new 2f818ff  feat(kmeans): make point resurrection optional
     new dd4405b  feat(RheemApps): add +spark as warmed up Spark
     new 4b3dc35  feat(rheem-core,rheem-spark,rheem-java): keep track of ChannelInstance lineage
     new 67c1e1a  feat(util): create Parameters utility class
     new 4554572  fix(util): remove optional argument from overloaded method
     new 6819fbb  feat(*): add Job names for all Scala-based apps
     new a308e91  feat(*): add optimization hints here and there
     new f2daef8  fix(kmeans): use correct UDF load function
     new e680b70  feat(simwords): improve plan
     new 7e8ad17  feat(kmeans): override CardinalityEstimator for centroid resurrection
     new c1364e2  feat(wordcount): let user specify words per line
     new d266de4  feat(simwords): improve optimization hints
     new 71a6b5a  feat(wordcount): specify selectivity of "Filter empty words"
     new a740441  feat(rheem-api): add SQL support
     new 7060bbb  feat(tpch): add Scala/HDFS implementation of TPC-H query 3
     new 866e8a1  fix(tpch): register UDF JAR
     new ab9e086  feat(tpch): add Query3Sqlite
     new 046e0d9  feat(tpch): add Query3Hybrid
     new 0662623  feat(tpch): make Query3Hybrid executable via TpcH
     new b2588eb  feat(tpch): add Record-based projection
     new 8d7ff4b  refactor(rheem): introduce plug-ins
     new eec0c17  feat(Parameters): allow yaml(...) as plugin parameter
     new 0f679d3  feat(crocopr): limit the printed page ranks
     new ff3a671  fix(util): fix StdOut.printLimited signature
     new 708ce24  refactor(rheem-graphchi): refactor module
     new ec10927  feat(*): use novel Spark/Java/Sqlite3/... objects
     new 20d30c3  refactor(rheem-basic,rheem-java,rheem-tests): use Long as vertex IDs
     new c01068e  refactor(crocopr): use pageRank(...) function
     new a5ac421  feat(util): register java-conversions as plugin name
     new 00896c6  feat(util): register spark-graph as plugin
     new 852c85d  feat(*): integrate with profiledb
     new cddf06c  feat(util): enhance experiment parameter
     new bd7f0c9  feat(crocopr): add configuration to experiment
     new b8ae5c3  feat(*): add configurations for ProfileDB
     new 10d3bf3  refactor(*): adapt to Rheem API changes
     new 54aaae1  fix(kmeans): use only a single PlanBuilder
     new 35bfd4e  feat(sindy): add Rheem-based SINDY implementation
     new 64d7f1a  chore(*): update Rheem and Scala version
     new 712d08c  chore(*): switch to stable ProfileDB version
     new fadce8d  fix(sindy): set name, experiment, and jars of job correctly
     new 0657374  refactor(*): do some adjustments according to dependencies
     new 7176a9a  feat(crocopr,simwords,wordcount): add input file size as experiment configuration
     new 051f661  feat(crocopr): add input file size as experiment configuration
     new b536725  feat(*): bounce to Rheem 0.2.1-SNAPSHOT
     new 6604581  feat(sgd): add SGD implementation as new Rheem app
     new eb81b58  improved SGD with map partitions and pre-aggregation
     new de85d5c  fix(sgd): force the sampling to be executed inside of the loop
     new 2807280  feat(sgd): allow to choose SGD implementation from command line
     new 1639f26  feat(*): enable conversion plugins for Spark, SQLite3, and PostgreSQL
     new 3dd16da  feat(simwords): allow to specify wordsPerLine confidence
     new c0aa382  feat(wordcount): allow to specify confidence of the words per line
     new e1ba7bb  feat(simwords): add UDF CPU load for the word vector creation
     new 1eba944  refactor(kmeans,simwords): tag UDF load functions via keys
     new 6341bc6  allow to use Postgres for the TPC-H queries
     new dbf47ac  add TPC-H query 1
     new abd6223  store input sizes for most apps
     new ae31c63  use pipe character to parse TPC-H files (rather than semicolon)
     new dfbb994  allow no databases in TpcH for file-based queries
     new 93e5ca5  fix parsing error in TpcH
     new 0dbab3f  fix parsing error in TpcH
     new 87f519b  feat(sindy): allow to select CSV separator
     new 507a72f  feat(tpch): allow to specify DB schema
     new 4fe6853  declare UDF jar files in SGD
     new 4b779e6  pass Experiment to SGD to obtain measurements
     new 13f13a0  feat(word2nvec): add app to turn words into vectors
     new f82ae10  feat(optimizer-scalability): add app to measure the optimizer scalability
     new 375eb5d  feat(optimizer-scalability): add plugin and plan type to experiment data
     new 9ea24e8  make SGD robust for skipped executions
     new 1dd25b6  add PostgreSQL-based version of k-means
     new 931345b  adapt command-line for PSQL-based k-means
     new 8542688  add expected number of iterations to SGD
     new 8c53770  store experiments in k-means (PostgreSQL version)
     new 7daba41  amend simplelogger.properties
     new 252600e  Create README.md
     new d6451f0  Update README.md
     new dc2feae  Update README.md
     new 2594d2b  Update README.md
     new 11f5317  Update README.md
     new 049ba53  Update README.md
     new 2e77606  Update README.md
     new 338b344  Update README.md
     new b2b8820  Update README.md
     new 732aea1  Update README.md
     new 78d38bb  rename artifactId to rheem-benchmark
     new b3b66f1  Added rheem.properties so that tests pass
     new a9ecdaf  Update README.md
     new b45fd11  Update readme
     new 5be2b58  everything to one folder
     new 58e2efa  Merge branch 'moved' into WAYANG-36
     new c7594c3  [WAYANG-36] Change name of input file to "*.input"
     new b174f9c  [WAYANG-36] Addition of the benchmark from "https://github.com/rheem-ecosystem/rheem-benchmark.git" and rebranding

The 114 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.