You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2020/05/14 20:27:04 UTC
[GitHub] [incubator-gobblin] arjun4084346 opened a new pull request #2988: spec catalog new table
arjun4084346 opened a new pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988
added migration configs
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
### JIRA
- [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
- https://issues.apache.org/jira/browse/GOBBLIN-XXX
### Description
- [ ] Here are some details about my PR, including screenshots (if applicable):
### Tests
- [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
### Commits
- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
1. Subject is separated from body by a blank line
2. Subject is limited to 50 characters
3. Subject does not end with a period
4. Subject uses the imperative mood ("add", not "adding")
5. Body wraps at 72 characters
6. Body explains "what" and "why", not "how"
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426310141
##########
File path: .travis.yml
##########
@@ -46,7 +44,7 @@ env:
- RUN_TEST_GROUP=coverage
jdk:
- - oraclejdk8
+ - openjdk8
Review comment:
Ok. I am not opinionated about this. Would like to hear what others think @htran1 ?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426214966
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -62,32 +62,34 @@
public class MysqlSpecStore implements SpecStore {
public static final String CONFIG_PREFIX = "mysqlSpecStore";
public static final String DEFAULT_TAG_VALUE = "";
+ private static final String NEW_COLUMN = "spec_json";
private static final String CREATE_TABLE_STATEMENT =
- "CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
- private static final String CREATE_TABLE_STATEMENT_V2 =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, flow_group VARCHAR(128), flow_name VARCHAR(128), "
- + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), destination_identifier VARCHAR(128), "
- + "schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
- + "spec JSON, PRIMARY KEY (spec_uri))";
+ + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), "
+ + "destination_identifier VARCHAR(128), schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, "
+ + "modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
+ + "isRunImmediately BOOLEAN, timezone VARCHAR(128), owning_group VARCHAR(128), "
+ + "spec LONGBLOB, " + NEW_COLUMN + " JSON, PRIMARY KEY (spec_uri))";
private static final String EXISTS_STATEMENT = "SELECT EXISTS(SELECT * FROM %s WHERE spec_uri = ?)";
- private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec)";
+ private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec, " + NEW_COLUMN + ") "
+ + "VALUES (?, ?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec), " + NEW_COLUMN + " = VALUES(" + NEW_COLUMN + ")";
private static final String DELETE_STATEMENT = "DELETE FROM %s WHERE spec_uri = ?";
- private static final String GET_STATEMENT = "SELECT spec FROM %s WHERE spec_uri = ?";
- private static final String GET_ALL_STATEMENT = "SELECT spec_uri, spec FROM %s";
- private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri, spec FROM %s WHERE tag = ?";
- static final String WRITE_TO_OLD_TABLE_KEY = "write.to.old.table";
- static final String READ_FROM_OLD_TABLE_KEY = "read.from.old.table";
+ private static final String GET_STATEMENT = "SELECT %s FROM %s WHERE spec_uri = ?";
+ private static final String GET_ALL_STATEMENT = "SELECT spec_uri, %s FROM %s";
+ private static final String GET_ALL_URIS_STATEMENT = "SELECT spec_uri FROM %s";
+ private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri FROM %s WHERE tag = ?";
+ static final String WRITE_TO_OLD_COLUMN = "write.to.old.column";
Review comment:
Sorry - my earlier comment could have been clearer. Yeah, writes to both columns is simpler. For read, we can check if the new column is not null and read from the new column else default to the old column. Would be nice to avoid having to pass configurations.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] codecov-io edited a comment on pull request #2988: spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#issuecomment-628880052
# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=h1) Report
> Merging [#2988](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/76d5a3bab26b471b722787d735f3df696beecd68&el=desc) will **increase** coverage by `0.04%`.
> The diff coverage is `87.50%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2988 +/- ##
============================================
+ Coverage 45.63% 45.67% +0.04%
- Complexity 9230 9247 +17
============================================
Files 1947 1947
Lines 73968 73999 +31
Branches 8180 8189 +9
============================================
+ Hits 33757 33801 +44
+ Misses 37049 37035 -14
- Partials 3162 3163 +1
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...rg/apache/gobblin/runtime/AbstractJobLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvQWJzdHJhY3RKb2JMYXVuY2hlci5qYXZh) | `60.36% <ø> (ø)` | `37.00 <0.00> (ø)` | |
| [...n/runtime/job\_exec/JobLauncherExecutionDriver.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvam9iX2V4ZWMvSm9iTGF1bmNoZXJFeGVjdXRpb25Ecml2ZXIuamF2YQ==) | `66.66% <ø> (ø)` | `14.00 <0.00> (ø)` | |
| [...he/gobblin/util/callbacks/CallbacksDispatcher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvY2FsbGJhY2tzL0NhbGxiYWNrc0Rpc3BhdGNoZXIuamF2YQ==) | `69.56% <ø> (ø)` | `14.00 <0.00> (ø)` | |
| [...main/java/org/apache/gobblin/util/ConfigUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvQ29uZmlnVXRpbHMuamF2YQ==) | `60.66% <66.66%> (+0.26%)` | `41.00 <3.00> (+1.00)` | |
| [...che/gobblin/runtime/spec\_store/MysqlSpecStore.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19zdG9yZS9NeXNxbFNwZWNTdG9yZS5qYXZh) | `75.20% <88.88%> (+5.64%)` | `22.00 <2.00> (+9.00)` | |
| [...ache/gobblin/runtime/spec\_serde/JavaSpecSerDe.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19zZXJkZS9KYXZhU3BlY1NlckRlLmphdmE=) | `42.85% <0.00%> (-28.58%)` | `3.00% <0.00%> (ø%)` | |
| [...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=) | `85.71% <0.00%> (-7.15%)` | `3.00% <0.00%> (ø%)` | |
| [...lin/service/modules/core/GitMonitoringService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9jb3JlL0dpdE1vbml0b3JpbmdTZXJ2aWNlLmphdmE=) | `57.06% <0.00%> (-0.53%)` | `9.00% <0.00%> (ø%)` | |
| [...apache/gobblin/kafka/tool/SimpleKafkaConsumer.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtMDgvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4va2Fma2EvdG9vbC9TaW1wbGVLYWZrYUNvbnN1bWVyLmphdmE=) | `0.00% <0.00%> (ø)` | `0.00% <0.00%> (ø%)` | |
| [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `80.37% <0.00%> (+0.93%)` | `24.00% <0.00%> (ø%)` | |
| ... and [4 more](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=footer). Last update [76d5a3b...539f98a](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] codecov-io commented on pull request #2988: spec catalog new table
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#issuecomment-628880052
# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=h1) Report
> Merging [#2988](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/76d5a3bab26b471b722787d735f3df696beecd68&el=desc) will **decrease** coverage by `38.46%`.
> The diff coverage is `0.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #2988 +/- ##
============================================
- Coverage 45.63% 7.16% -38.47%
+ Complexity 9230 1246 -7984
============================================
Files 1947 1947
Lines 73968 73998 +30
Branches 8180 8189 +9
============================================
- Hits 33757 5304 -28453
- Misses 37049 68186 +31137
+ Partials 3162 508 -2654
```
| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...gobblin/runtime/spec\_serde/FlowSpecSerializer.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19zZXJkZS9GbG93U3BlY1NlcmlhbGl6ZXIuamF2YQ==) | `4.00% <ø> (-88.00%)` | `1.00 <0.00> (-5.00)` | |
| [...che/gobblin/runtime/spec\_store/MysqlSpecStore.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19zdG9yZS9NeXNxbFNwZWNTdG9yZS5qYXZh) | `0.00% <0.00%> (-69.57%)` | `0.00 <0.00> (-13.00)` | |
| [...main/java/org/apache/gobblin/util/ConfigUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvQ29uZmlnVXRpbHMuamF2YQ==) | `3.33% <0.00%> (-57.07%)` | `3.00 <0.00> (-37.00)` | |
| [...c/main/java/org/apache/gobblin/util/FileUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvRmlsZVV0aWxzLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-4.00%)` | |
| [...n/java/org/apache/gobblin/fork/CopyableSchema.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2ZvcmsvQ29weWFibGVTY2hlbWEuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-2.00%)` | |
| [...java/org/apache/gobblin/stream/ControlMessage.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc3RyZWFtL0NvbnRyb2xNZXNzYWdlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...va/org/apache/gobblin/dataset/DatasetResolver.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YXNldC9EYXRhc2V0UmVzb2x2ZXIuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [...va/org/apache/gobblin/converter/EmptyIterable.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbnZlcnRlci9FbXB0eUl0ZXJhYmxlLmphdmE=) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| [...org/apache/gobblin/ack/BasicAckableForTesting.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vYWNrL0Jhc2ljQWNrYWJsZUZvclRlc3RpbmcuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-3.00%)` | |
| [.../org/apache/gobblin/yarn/HelixMessageSubTypes.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vSGVsaXhNZXNzYWdlU3ViVHlwZXMuamF2YQ==) | `0.00% <0.00%> (-100.00%)` | `0.00% <0.00%> (-1.00%)` | |
| ... and [1070 more](https://codecov.io/gh/apache/incubator-gobblin/pull/2988/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=footer). Last update [76d5a3b...f4edc97](https://codecov.io/gh/apache/incubator-gobblin/pull/2988?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] asfgit closed pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426310400
##########
File path: .travis.yml
##########
@@ -46,7 +44,7 @@ env:
- RUN_TEST_GROUP=coverage
jdk:
- - oraclejdk8
+ - openjdk8
Review comment:
Other option is to remain at trusty , but do not use pre installed mysql.
My opinion is that we should make efforts to support newer version of OS and java.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426288259
##########
File path: .travis.yml
##########
@@ -46,7 +44,7 @@ env:
- RUN_TEST_GROUP=coverage
jdk:
- - oraclejdk8
+ - openjdk8
Review comment:
xenial wanted oraclejdk9+
by chance it accepted openjdk8 though.
we should start supporting jdk9+ also, but i found a small portion of gobblin code is not compatible with jdk9
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426206992
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -140,11 +172,22 @@ public boolean deleteSpec(Spec spec) throws IOException {
@Override
public boolean deleteSpec(URI specUri) throws IOException {
try (Connection connection = this.dataSource.getConnection();
- PreparedStatement statement = connection.prepareStatement(String.format(DELETE_STATEMENT, this.tableName))) {
+ PreparedStatement statement = connection.prepareStatement(String.format(DELETE_STATEMENT, this.tableName));
+ PreparedStatement statementV2 = connection.prepareStatement(String.format(DELETE_STATEMENT, this.tableNameV2))) {
+
statement.setString(1, specUri.toString());
- int result = statement.executeUpdate();
+ statementV2.setString(1, specUri.toString());
+
+ int resultV2 = statementV2.executeUpdate();
+ if (this.writeToOldTable) {
+ int result = statement.executeUpdate();
+ if (resultV2 != result) {
Review comment:
yes
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426203594
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -65,31 +65,49 @@
private static final String CREATE_TABLE_STATEMENT =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
+ private static final String CREATE_TABLE_STATEMENT_V2 =
Review comment:
Isn't this schema backwards compatible with the V1 schema i.e. you are simply adding new columns? We can retain the blob format for the "spec" column as before and define a new column "specV2" which stores the spec in JSON format. If so, do we need to support two different versions of the table?
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -65,31 +65,49 @@
private static final String CREATE_TABLE_STATEMENT =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
+ private static final String CREATE_TABLE_STATEMENT_V2 =
+ "CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, flow_group VARCHAR(128), flow_name VARCHAR(128), "
Review comment:
I think the schema can be enhanced with a few additional fields such as requesterId, a boolean field "isRunImmediately" to indicate if the scheduled flow is also a run immediately flow, a futuristic field "owningGroup" for flows which are owned by a group, "timezone" to indicate the timezone the cron schedule applies to. Also, does user_to_proxy need to be a top level field? It applies only when the orchestrator is Azkaban and can probably be queried from the JSON spec.
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -140,11 +172,22 @@ public boolean deleteSpec(Spec spec) throws IOException {
@Override
public boolean deleteSpec(URI specUri) throws IOException {
try (Connection connection = this.dataSource.getConnection();
- PreparedStatement statement = connection.prepareStatement(String.format(DELETE_STATEMENT, this.tableName))) {
+ PreparedStatement statement = connection.prepareStatement(String.format(DELETE_STATEMENT, this.tableName));
+ PreparedStatement statementV2 = connection.prepareStatement(String.format(DELETE_STATEMENT, this.tableNameV2))) {
+
statement.setString(1, specUri.toString());
- int result = statement.executeUpdate();
+ statementV2.setString(1, specUri.toString());
+
+ int resultV2 = statementV2.executeUpdate();
+ if (this.writeToOldTable) {
+ int result = statement.executeUpdate();
+ if (resultV2 != result) {
Review comment:
Should we throw an exception and fail the deleteSpec, instead of logging and moving on?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426215484
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -62,32 +62,34 @@
public class MysqlSpecStore implements SpecStore {
public static final String CONFIG_PREFIX = "mysqlSpecStore";
public static final String DEFAULT_TAG_VALUE = "";
+ private static final String NEW_COLUMN = "spec_json";
private static final String CREATE_TABLE_STATEMENT =
- "CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
- private static final String CREATE_TABLE_STATEMENT_V2 =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, flow_group VARCHAR(128), flow_name VARCHAR(128), "
- + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), destination_identifier VARCHAR(128), "
- + "schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
- + "spec JSON, PRIMARY KEY (spec_uri))";
+ + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), "
+ + "destination_identifier VARCHAR(128), schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, "
+ + "modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
+ + "isRunImmediately BOOLEAN, timezone VARCHAR(128), owning_group VARCHAR(128), "
+ + "spec LONGBLOB, " + NEW_COLUMN + " JSON, PRIMARY KEY (spec_uri))";
private static final String EXISTS_STATEMENT = "SELECT EXISTS(SELECT * FROM %s WHERE spec_uri = ?)";
- private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec)";
+ private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec, " + NEW_COLUMN + ") "
+ + "VALUES (?, ?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec), " + NEW_COLUMN + " = VALUES(" + NEW_COLUMN + ")";
private static final String DELETE_STATEMENT = "DELETE FROM %s WHERE spec_uri = ?";
- private static final String GET_STATEMENT = "SELECT spec FROM %s WHERE spec_uri = ?";
- private static final String GET_ALL_STATEMENT = "SELECT spec_uri, spec FROM %s";
- private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri, spec FROM %s WHERE tag = ?";
- static final String WRITE_TO_OLD_TABLE_KEY = "write.to.old.table";
- static final String READ_FROM_OLD_TABLE_KEY = "read.from.old.table";
+ private static final String GET_STATEMENT = "SELECT %s FROM %s WHERE spec_uri = ?";
+ private static final String GET_ALL_STATEMENT = "SELECT spec_uri, %s FROM %s";
+ private static final String GET_ALL_URIS_STATEMENT = "SELECT spec_uri FROM %s";
+ private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri FROM %s WHERE tag = ?";
+ static final String WRITE_TO_OLD_COLUMN = "write.to.old.column";
Review comment:
Okay. Yea it will reduce complexity, but we would be reading 2 columns till we remove the old column.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426214202
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -164,17 +182,17 @@ public Spec updateSpec(Spec spec) throws IOException, SpecNotFoundException {
@Override
public Spec getSpec(URI specUri) throws IOException, SpecNotFoundException {
try (Connection connection = this.dataSource.getConnection();
- PreparedStatement statement = connection.prepareStatement(String.format(GET_STATEMENT, this.tableName))) {
-
+ PreparedStatement statement = connection.prepareStatement(
+ String.format(GET_STATEMENT, this.readFromOldColumn ? "spec" : NEW_COLUMN, this.tableName))) {
statement.setString(1, specUri.toString());
try (ResultSet rs = statement.executeQuery()) {
if (!rs.next()) {
throw new SpecNotFoundException(specUri);
}
-
- Blob blob = rs.getBlob(1);
- return this.specSerDe.deserialize(ByteStreams.toByteArray(blob.getBinaryStream()));
+ return this.readFromOldColumn
Review comment:
Same comment as earlier.
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -62,32 +62,34 @@
public class MysqlSpecStore implements SpecStore {
public static final String CONFIG_PREFIX = "mysqlSpecStore";
public static final String DEFAULT_TAG_VALUE = "";
+ private static final String NEW_COLUMN = "spec_json";
private static final String CREATE_TABLE_STATEMENT =
- "CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
- private static final String CREATE_TABLE_STATEMENT_V2 =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, flow_group VARCHAR(128), flow_name VARCHAR(128), "
- + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), destination_identifier VARCHAR(128), "
- + "schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
- + "spec JSON, PRIMARY KEY (spec_uri))";
+ + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), "
+ + "destination_identifier VARCHAR(128), schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, "
+ + "modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
+ + "isRunImmediately BOOLEAN, timezone VARCHAR(128), owning_group VARCHAR(128), "
+ + "spec LONGBLOB, " + NEW_COLUMN + " JSON, PRIMARY KEY (spec_uri))";
private static final String EXISTS_STATEMENT = "SELECT EXISTS(SELECT * FROM %s WHERE spec_uri = ?)";
- private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec)";
+ private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec, " + NEW_COLUMN + ") "
+ + "VALUES (?, ?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec), " + NEW_COLUMN + " = VALUES(" + NEW_COLUMN + ")";
private static final String DELETE_STATEMENT = "DELETE FROM %s WHERE spec_uri = ?";
- private static final String GET_STATEMENT = "SELECT spec FROM %s WHERE spec_uri = ?";
- private static final String GET_ALL_STATEMENT = "SELECT spec_uri, spec FROM %s";
- private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri, spec FROM %s WHERE tag = ?";
- static final String WRITE_TO_OLD_TABLE_KEY = "write.to.old.table";
- static final String READ_FROM_OLD_TABLE_KEY = "read.from.old.table";
+ private static final String GET_STATEMENT = "SELECT %s FROM %s WHERE spec_uri = ?";
+ private static final String GET_ALL_STATEMENT = "SELECT spec_uri, %s FROM %s";
+ private static final String GET_ALL_URIS_STATEMENT = "SELECT spec_uri FROM %s";
+ private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri FROM %s WHERE tag = ?";
+ static final String WRITE_TO_OLD_COLUMN = "write.to.old.column";
Review comment:
Do you think we can mark the old column as deprecated and write to both old and new columns? Eventually, in a future release, the old column can be dropped. We won't need the user to provide configs to write to old/new columns.
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -87,6 +101,9 @@ public MysqlSpecStore(Config config, SpecSerDe specSerDe) throws IOException {
this.specStoreURI = URI.create(config.getString(ConfigurationKeys.STATE_STORE_DB_URL_KEY));
this.specSerDe = specSerDe;
+ this.writeToOldColumn = ConfigUtils.getBoolean(config, WRITE_TO_OLD_COLUMN, true);
Review comment:
See comment earlier about writing to both old and new columns.
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -194,15 +212,18 @@ public Spec getSpec(URI specUri, String version) throws IOException, SpecNotFoun
@Override
public Collection<Spec> getSpecs() throws IOException {
try (Connection connection = this.dataSource.getConnection();
- PreparedStatement statement = connection.prepareStatement(String.format(GET_ALL_STATEMENT, this.tableName))) {
-
+ PreparedStatement statement = connection.prepareStatement(
+ String.format(GET_ALL_STATEMENT, this.readFromOldColumn ? "spec" : NEW_COLUMN, this.tableName))) {
List<Spec> specs = new ArrayList<>();
try (ResultSet rs = statement.executeQuery()) {
while (rs.next()) {
try {
- Blob blob = rs.getBlob(2);
- specs.add(this.specSerDe.deserialize(ByteStreams.toByteArray(blob.getBinaryStream())));
+ specs.add(
+ this.readFromOldColumn
Review comment:
same comment as above.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
sv2000 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426276162
##########
File path: .travis.yml
##########
@@ -46,7 +44,7 @@ env:
- RUN_TEST_GROUP=coverage
jdk:
- - oraclejdk8
+ - openjdk8
Review comment:
Curious: can we make do without this change?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426206547
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -65,31 +65,49 @@
private static final String CREATE_TABLE_STATEMENT =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
+ private static final String CREATE_TABLE_STATEMENT_V2 =
Review comment:
That's right. will change
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426214502
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -62,32 +62,34 @@
public class MysqlSpecStore implements SpecStore {
public static final String CONFIG_PREFIX = "mysqlSpecStore";
public static final String DEFAULT_TAG_VALUE = "";
+ private static final String NEW_COLUMN = "spec_json";
private static final String CREATE_TABLE_STATEMENT =
- "CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
- private static final String CREATE_TABLE_STATEMENT_V2 =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, flow_group VARCHAR(128), flow_name VARCHAR(128), "
- + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), destination_identifier VARCHAR(128), "
- + "schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
- + "spec JSON, PRIMARY KEY (spec_uri))";
+ + "template_uri VARCHAR(128), user_to_proxy VARCHAR(128), source_identifier VARCHAR(128), "
+ + "destination_identifier VARCHAR(128), schedule VARCHAR(128), tag VARCHAR(128) NOT NULL, "
+ + "modified_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, "
+ + "isRunImmediately BOOLEAN, timezone VARCHAR(128), owning_group VARCHAR(128), "
+ + "spec LONGBLOB, " + NEW_COLUMN + " JSON, PRIMARY KEY (spec_uri))";
private static final String EXISTS_STATEMENT = "SELECT EXISTS(SELECT * FROM %s WHERE spec_uri = ?)";
- private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec) VALUES (?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec)";
+ private static final String INSERT_STATEMENT = "INSERT INTO %s (spec_uri, tag, spec, " + NEW_COLUMN + ") "
+ + "VALUES (?, ?, ?, ?) ON DUPLICATE KEY UPDATE spec = VALUES(spec), " + NEW_COLUMN + " = VALUES(" + NEW_COLUMN + ")";
private static final String DELETE_STATEMENT = "DELETE FROM %s WHERE spec_uri = ?";
- private static final String GET_STATEMENT = "SELECT spec FROM %s WHERE spec_uri = ?";
- private static final String GET_ALL_STATEMENT = "SELECT spec_uri, spec FROM %s";
- private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri, spec FROM %s WHERE tag = ?";
- static final String WRITE_TO_OLD_TABLE_KEY = "write.to.old.table";
- static final String READ_FROM_OLD_TABLE_KEY = "read.from.old.table";
+ private static final String GET_STATEMENT = "SELECT %s FROM %s WHERE spec_uri = ?";
+ private static final String GET_ALL_STATEMENT = "SELECT spec_uri, %s FROM %s";
+ private static final String GET_ALL_URIS_STATEMENT = "SELECT spec_uri FROM %s";
+ private static final String GET_ALL_STATEMENT_WITH_TAG = "SELECT spec_uri FROM %s WHERE tag = ?";
+ static final String WRITE_TO_OLD_COLUMN = "write.to.old.column";
Review comment:
Yea, these configs are only for transition period, will be removed later.
Do we want to always write to both the columns and one day stop writing to old column?
similarly do we want to start reading from the new column immediately?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2988: [GOBBLIN-1150] spec catalog table schema change
Posted by GitBox <gi...@apache.org>.
arjun4084346 commented on a change in pull request #2988:
URL: https://github.com/apache/incubator-gobblin/pull/2988#discussion_r426206969
##########
File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_store/MysqlSpecStore.java
##########
@@ -65,31 +65,49 @@
private static final String CREATE_TABLE_STATEMENT =
"CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, tag VARCHAR(128) NOT NULL, spec LONGBLOB, PRIMARY KEY (spec_uri))";
+ private static final String CREATE_TABLE_STATEMENT_V2 =
+ "CREATE TABLE IF NOT EXISTS %s (spec_uri VARCHAR(128) NOT NULL, flow_group VARCHAR(128), flow_name VARCHAR(128), "
Review comment:
will 'owningGroup' be a string, or a json having some more information like a requester list? if we are not sure, we can leave this column for implementation time, otherwise we might have to add additional column for 'owningGroup' if we choose a wrong type now?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org