Posted to commits@wayang.apache.org by be...@apache.org on 2021/09/05 12:48:09 UTC
[incubator-wayang] branch WAYANG-36 created (now b174f9c)
This is an automated email from the ASF dual-hosted git repository.
bertty pushed a change to branch WAYANG-36
in repository https://gitbox.apache.org/repos/asf/incubator-wayang.git.
at b174f9c [WAYANG-36] Addition of the benchmark from "https://github.com/rheem-ecosystem/rheem-benchmark.git" and rebranding
This branch includes the following new commits:
new 0ed02d7 add TPC-H Query 1
new c6f2709 feat(wordcount): add WordCount app
new 5016d57 fix(RheemApps): set Rheem version correctly and register UDF jars in apps
new 9afc0e0 refactor(wordcount): use stdout sink instead of collecting sink
new ee22c38 fix(wordcount): remove target Platforms from RheemPlan
new 16e0485 fix(wordcount,tpch): add rheem-java JAR as UDF JAR
new cffeb84 feat(kmeans): implement basic k-means
new 2aecc6e feat(kmeans): improve output on faulty platform
new bd73125 test(kmeans): improve k-means test
new 7729c96 feat(kmeans): "resurrect" lost centroids
new bc2eb95 Merge branch 'master' of https://github.com/daqcri/RheemApps
new 9ce990a chore(RheemApps): remove Flink dependencies
new d88283a chore(RheemApps): use Scala Maven Plugin
new ac095e7 fix(kmeans): fix main method signature
new 48b7257 fix(kmeans): use random cluster IDs
new 44f850d fix(kmeans): print out usage only if no args are passed
new b6f1c42 fix(kmeans): annotate UDF JARs
new 9edcdf7 feat(simwords): build tool to find similar words
new 5511055 fix(simwords): fix sampling and main method
new 8afea62 feat(simwords): generate centroids instead of sampling them
new 1669fc9 feat(wordcount): add Scala implementation
new 8e3cc36 fix(wordcount): add UDF JAR to Scala implementation
new f0235b4 feat(crocopr): add cross-community PageRank
new 2f818ff feat(kmeans): make point resurrection optional
new dd4405b feat(RheemApps): add +spark as warmed up Spark
new 4b3dc35 feat(rheem-core,rheem-spark,rheem-java): keep track of ChannelInstance lineage
new 67c1e1a feat(util): create Parameters utility class
new 4554572 fix(util): remove optional argument from overloaded method
new 6819fbb feat(*): add Job names for all Scala-based apps
new a308e91 feat(*): add optimization hints here and there
new f2daef8 fix(kmeans): use correct UDF load function
new e680b70 feat(simwords): improve plan
new 7e8ad17 feat(kmeans): override CardinalityEstimator for centroid resurrection
new c1364e2 feat(wordcount): let user specify words per line
new d266de4 feat(simwords): improve optimization hints
new 71a6b5a feat(wordcount): specify selectivity of "Filter empty words"
new a740441 feat(rheem-api): add SQL support
new 7060bbb feat(tpch): add Scala/HDFS implementation of TPC-H query 3
new 866e8a1 fix(tpch): register UDF JAR
new ab9e086 feat(tpch): add Query3Sqlite
new 046e0d9 feat(tpch): add Query3Hybrid
new 0662623 feat(tpch): make Query3Hybrid executable via TpcH
new b2588eb feat(tpch): add Record-based projection
new 8d7ff4b refactor(rheem): introduce plug-ins
new eec0c17 feat(Parameters): allow yaml(...) as plugin parameter
new 0f679d3 feat(crocopr): limit the printed page ranks
new ff3a671 fix(util): fix StdOut.printLimited signature
new 708ce24 refactor(rheem-graphchi): refactor module
new ec10927 feat(*): use novel Spark/Java/Sqlite3/... objects
new 20d30c3 refactor(rheem-basic,rheem-java,rheem-tests): use Long as vertex IDs
new c01068e refactor(crocopr): use pageRank(...) function
new a5ac421 feat(util): register java-conversions as plugin name
new 00896c6 feat(util): register spark-graph as plugin
new 852c85d feat(*): integrate with profiledb
new cddf06c feat(util): enhance experiment parameter
new bd7f0c9 feat(crocopr): add configuration to experiment
new b8ae5c3 feat(*): add configurations for ProfileDB
new 10d3bf3 refactor(*): adapt to Rheem API changes
new 54aaae1 fix(kmeans): use only a single PlanBuilder
new 35bfd4e feat(sindy): add Rheem-based SINDY implementation
new 64d7f1a chore(*): update Rheem and Scala version
new 712d08c chore(*): switch to stable ProfileDB version
new fadce8d fix(sindy): set name, experiment, and jars of job correctly
new 0657374 refactor(*): do some adjustments according to dependencies
new 7176a9a feat(crocopr,simwords,wordcount): add input file size as experiment configuration
new 051f661 feat(crocopr): add input file size as experiment configuration
new b536725 feat(*): bounce to Rheem 0.2.1-SNAPSHOT
new 6604581 feat(sgd): add SGD implementation as new Rheem app
new eb81b58 improved SGD with map partitions and pre-aggregation
new de85d5c fix(sgd): force the sampling to be executed inside of the loop
new 2807280 feat(sgd): allow to choose SGD implementation from command line
new 1639f26 feat(*): enable conversion plugins for Spark, SQLite3, and PostgreSQL
new 3dd16da feat(simwords): allow to specify wordsPerLine confidence
new c0aa382 feat(wordcount): allow to specify confidence of the words per line
new e1ba7bb feat(simwords): add UDF CPU load for the word vector creation
new 1eba944 refactor(kmeans,simwords): tag UDF load functions via keys
new 6341bc6 allow to use Postgres for the TPC-H queries
new dbf47ac add TPC-H query 1
new abd6223 store input sizes for most apps
new ae31c63 use pipe character to parse TPC-H files (rather than semicolon)
new dfbb994 allow no databases in TpcH for file-based queries
new 93e5ca5 fix parsing error in TpcH
new 0dbab3f fix parsing error in TpcH
new 87f519b feat(sindy): allow to select CSV separator
new 507a72f feat(tpch): allow to specify DB schema
new 4fe6853 declare UDF jar files in SGD
new 4b779e6 pass Experiment to SGD to obtain measurements
new 13f13a0 feat(word2nvec): add app to turn words into vectors
new f82ae10 feat(optimizer-scalability): add app to measure the optimizer scalability
new 375eb5d feat(optimizer-scalability): add plugin and plan type to experiment data
new 9ea24e8 make SGD robust for skipped executions
new 1dd25b6 add PostgreSQL-based version of k-means
new 931345b adapt command-line for PSQL-based k-means
new 8542688 add expected number of iterations to SGD
new 8c53770 store experiments in k-means (PostgreSQL version)
new 7daba41 amend simplelogger.properties
new 252600e Create README.md
new d6451f0 Update README.md
new dc2feae Update README.md
new 2594d2b Update README.md
new 11f5317 Update README.md
new 049ba53 Update README.md
new 2e77606 Update README.md
new 338b344 Update README.md
new b2b8820 Update README.md
new 732aea1 Update README.md
new 78d38bb rename artifactId to rheem-benchmark
new b3b66f1 Added rheem.properties so that tests pass
new a9ecdaf Update README.md
new b45fd11 Update readme
new 5be2b58 everything to one folder
new 58e2efa Merge branch 'moved' into WAYANG-36
new c7594c3 [WAYANG-36] Change name of input file to "*.input"
new b174f9c [WAYANG-36] Addition of the benchmark from "https://github.com/rheem-ecosystem/rheem-benchmark.git" and rebranding
The 114 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.