Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2021/06/08 02:37:11 UTC

[GitHub] [nifi] chihhanyu opened a new pull request #5135: GCP BigQuery processors support using designate project resource for ingestion

chihhanyu opened a new pull request #5135:
URL: https://github.com/apache/nifi/pull/5135


   <!--
     Licensed to the Apache Software Foundation (ASF) under one or more
     contributor license agreements.  See the NOTICE file distributed with
     this work for additional information regarding copyright ownership.
     The ASF licenses this file to You under the Apache License, Version 2.0
     (the "License"); you may not use this file except in compliance with
     the License.  You may obtain a copy of the License at
         http://www.apache.org/licenses/LICENSE-2.0
     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
   -->
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
   #### Description of PR
   
   Please refer to JIRA ticket for details: 
   https://issues.apache.org/jira/browse/NIFI-8611
   This change is already used in our production environment, where it properly supports configuring different source and ingestion projects in the properties of the GCP processors. 
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [x] Has your PR been rebased against the latest commit within the target branch (typically `main`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [x] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [ ] Have you written or updated unit tests to verify your changes?
   - [x] Have you verified that the full build is successful on JDK 8?
   - [ ] Have you verified that the full build is successful on JDK 11?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] chihhanyu commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
chihhanyu commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-856639421


   > I'm not sure to fully understand the difference between project ID and designate project ID. Can you give me an example to understand the difference on how resources will be used on the two projects?
   
   Hi @pvillard31, thanks for your reply. The main point is that when a job is created to load data into BigQuery, the job uses the resources of the project it runs in. In most cases, a single project holds both the data in a BigQuery dataset and the (limited) resources for load jobs. In our case, however, we split GCP projects into two layers for security control, a data project and a resource project: 
   * The data project only stores data and provides data access; it has no resources for creating jobs. 
   * The resource project is used to create jobs; users can only create jobs through the resource project, not the data project. 
   
   That's why we need to configure two project IDs for the BigQuery processors: the designate project ID specifies the target project that stores the data, and the other project ID specifies the resource project. 
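
The two-project split described above can be sketched in plain Java. This is a minimal illustration, not the PR's implementation and not the BigQuery client API: the class names, the `planLoad` helper, and the project, dataset, and table names are all hypothetical. It also assumes that when no designate project is configured, the data lands in the same project that runs the job.

```java
import java.util.Objects;

/**
 * Toy model of the data-project / resource-project split.
 * Hypothetical types for illustration only; not NiFi or BigQuery client classes.
 */
class TwoProjectLoadSketch {

    /** Destination table, qualified by the project that stores the data. */
    static final class TableRef {
        final String project;
        final String dataset;
        final String table;

        TableRef(String project, String dataset, String table) {
            this.project = project;
            this.dataset = dataset;
            this.table = table;
        }

        String qualifiedName() {
            return project + "." + dataset + "." + table;
        }
    }

    /** A load job: it runs in jobProject and writes into destination. */
    static final class LoadJob {
        final String jobProject;
        final TableRef destination;

        LoadJob(String jobProject, TableRef destination) {
            this.jobProject = jobProject;
            this.destination = destination;
        }
    }

    /**
     * projectId is the resource project that runs the job.
     * designateProjectId is the data project that stores the table.
     * Assumption: a missing or blank designate project falls back to projectId.
     */
    static LoadJob planLoad(String projectId, String designateProjectId,
                            String dataset, String table) {
        Objects.requireNonNull(projectId, "projectId");
        boolean hasDesignate = designateProjectId != null
                && !designateProjectId.trim().isEmpty();
        String dataProject = hasDesignate ? designateProjectId.trim() : projectId;
        return new LoadJob(projectId, new TableRef(dataProject, dataset, table));
    }

    public static void main(String[] args) {
        LoadJob split = planLoad("resource-project", "data-project", "sales", "orders");
        // prints "resource-project -> data-project.sales.orders"
        System.out.println(split.jobProject + " -> " + split.destination.qualifiedName());

        LoadJob single = planLoad("resource-project", null, "sales", "orders");
        // prints "resource-project -> resource-project.sales.orders"
        System.out.println(single.jobProject + " -> " + single.destination.qualifiedName());
    }
}
```

Keeping the job project and the table's project as separate fields makes it explicit which project runs the job and which one holds the data.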
   





[GitHub] [nifi] chihhanyu commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
chihhanyu commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-865498616


   > Is "Designate Project ID" a common formulation for BigQuery? If I understand correctly the new property would be used to define the project ID where the data will be loaded while the existing property is used to determine where the BigQuery "job" will be running to load the data and where associated costs will be charged. Correct?
   
   Yes, you're correct. As far as I know, BigQuery has no specific term for something like a designate project ID. If we'd like a more meaningful name, it could be called "Target Project ID" or "Data Project ID", and the existing property could be called "Resource Project ID". 
   
   





[GitHub] [nifi] joewitt commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
joewitt commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-874928814


   @pvillard31 are we good to go on this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] chihhanyu commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
chihhanyu commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-1046048053


   @joewitt @pvillard31 Sorry to bother you. Since it's been a while, I'd like to know how I can keep this PR moving and whether it could be released in a future version. I checked the contribution guide, but I'm not sure what the next step is. 
   Any suggestions are appreciated! 





[GitHub] [nifi] pvillard31 commented on a change in pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on a change in pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#discussion_r647219125



##########
File path: nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/bigquery/PutBigQueryBatch.java
##########
@@ -257,17 +257,19 @@ public void onTrigger(ProcessContext context, ProcessSession session) throws Pro
         }
 
         final String projectId = context.getProperty(PROJECT_ID).evaluateAttributeExpressions().getValue();
+        final String designateProjectId = context.getProperty(DESIGNATE_PROJECT_ID).evaluateAttributeExpressions().getValue();

Review comment:
       ```suggestion
           final String designateProjectId = context.getProperty(DESIGNATE_PROJECT_ID).evaluateAttributeExpressions(flowFile).getValue();
   ```
   
   I believe this is required since you defined FlowFile Attributes scope in the property definition.
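
The point of the suggestion can be illustrated with a toy resolver. This is a sketch only, not the NiFi API: `AttributeScopeSketch` and its `evaluate` methods are hypothetical stand-ins, the property value `${bq.designate.project.id}` is an assumed configuration, and resolving unknown references to an empty string is an assumption made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Toy stand-in for attribute-scoped property evaluation.
 * It only mimics the idea that ${name} references need attributes to resolve against.
 */
class AttributeScopeSketch {

    private static final Pattern REFERENCE = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its attribute value; unknown names become "". */
    static String evaluate(String rawValue, Map<String, String> attributes) {
        Matcher matcher = REFERENCE.matcher(rawValue);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = attributes.getOrDefault(matcher.group(1), "");
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    /** Mirrors a call made without a FlowFile: no attributes are available. */
    static String evaluate(String rawValue) {
        return evaluate(rawValue, Collections.<String, String>emptyMap());
    }

    public static void main(String[] args) {
        Map<String, String> flowFileAttributes = new HashMap<>();
        flowFileAttributes.put("bq.designate.project.id", "data-project");

        String configured = "${bq.designate.project.id}";

        // No attributes to resolve against: prints "[]"
        System.out.println("[" + evaluate(configured) + "]");
        // Attributes supplied: prints "[data-project]"
        System.out.println("[" + evaluate(configured, flowFileAttributes) + "]");
    }
}
```

In this model, the no-argument call can never see a per-FlowFile value, which is the mismatch the review comment points out for a property declared with FlowFile Attributes scope.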







[GitHub] [nifi] chihhanyu commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
chihhanyu commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-878943101


   Thanks @joewitt and @pvillard31! I hope this feature can be merged; it should give users more flexibility when ingesting into BigQuery. 





[GitHub] [nifi] github-actions[bot] closed pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #5135:
URL: https://github.com/apache/nifi/pull/5135


   





[GitHub] [nifi] github-actions[bot] commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-965860700


   We're marking this PR as stale due to lack of updates in the past few months. If after another couple of weeks the stale label has not been removed this PR will be closed. This stale marker and eventual auto close does not indicate a judgement of the PR just lack of reviewer bandwidth and helps us keep the PR queue more manageable.  If you would like this PR re-opened you can do so and a committer can remove the stale tag.  Or you can open a new PR.  Try to help review other PRs to increase PR review bandwidth which in turn helps yours.





[GitHub] [nifi] pvillard31 commented on a change in pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on a change in pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#discussion_r647215118



##########
File path: nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/bigquery/BigQueryAttributes.java
##########
@@ -51,6 +51,10 @@ private BigQueryAttributes() {
             + "the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. By default "
             + "no bad record is ignored.";
 
+    public static final String DESIGNATE_PROJECT_ID_ATTR = "bq.designate.project.id";
+    public static final String DESIGNATE_PROJECT_ID_DESC = "Sets the designate project id that BigqQuery can use current project resource to "
+            + "manipulate the designate proejct.";

Review comment:
       ```suggestion
               + "manipulate the designate project.";
   ```

##########
File path: nifi-nar-bundles/nifi-gcp-bundle/nifi-gcp-processors/src/main/java/org/apache/nifi/processors/gcp/bigquery/BigQueryAttributes.java
##########
@@ -51,6 +51,10 @@ private BigQueryAttributes() {
             + "the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. By default "
             + "no bad record is ignored.";
 
+    public static final String DESIGNATE_PROJECT_ID_ATTR = "bq.designate.project.id";
+    public static final String DESIGNATE_PROJECT_ID_DESC = "Sets the designate project id that BigqQuery can use current project resource to "

Review comment:
       ```suggestion
       public static final String DESIGNATE_PROJECT_ID_DESC = "Sets the designate project id that BigQuery can use current project resource to "
   ```







[GitHub] [nifi] joewitt commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
joewitt commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-874955160


   I'm going to remove this from 1.14 for now so we can keep going on the RC, but by all means let's keep progressing this.





[GitHub] [nifi] pvillard31 commented on pull request #5135: NIFI-8611: GCP BigQuery processors support using designate project resource for ingestion

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on pull request #5135:
URL: https://github.com/apache/nifi/pull/5135#issuecomment-861728334


   Is "Designate Project ID" a common formulation for BigQuery? If I understand correctly the new property would be used to define the project ID where the data will be loaded while the existing property is used to determine where the BigQuery "job" will be running to load the data and where associated costs will be charged. Correct?




