You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/03 02:31:05 UTC

[GitHub] [airflow] bugraoz93 opened a new pull request #19377: Providers facebook hook multiple account

bugraoz93 opened a new pull request #19377:
URL: https://github.com/apache/airflow/pull/19377


   closes: #19269 
   Following changes have made in this PR.
   - Multiple Facebook Ads account_id for Facebook Ads Hook support added. 
   - To provide a better use of multiple account_id, FacebookAdsReportToGcsOperator has now supported file export for each account_id. (upload_as_account option presented to control this feature. The default value is False which mean all the account_id data will be exported to the same file)
   
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #19377:
URL: https://github.com/apache/airflow/pull/19377


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753331206



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +125,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type

Review comment:
       I have made the changes, I hope this will be enough to explain it. If it isn't, please let me know.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r742028280



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the changes. I think I was too focused on the feature itself, I missed this one. Thank you very much.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r741922209



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       It looks like a breaking change. What can we do to prevent this from happening?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958605991


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702692


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753616453



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -105,24 +110,79 @@ def bulk_facebook_report(
         params: Dict[str, Any],
         fields: List[str],
         sleep_time: int = 5,
-    ) -> List[AdsInsights]:
+    ) -> Union[List[AdsInsights], Dict[str, List[AdsInsights]]]:
         """
-        Pulls data from the Facebook Ads API
+        Pulls data from the Facebook Ads API regarding Account ID with matching return type
+
+        .. seealso::
+            If the type of Account ID is a List account_id -> list

Review comment:
       I am grateful to you for your time to review too many changes and your help. I have changed the comment same as your comment since you have already described it very well.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958605991


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702692


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702692


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958758081


   Hey @potiuk, thank you very much for your review and your comment. It is strange that static checks give errors for the flake and the black. I have run for all my changes before opening the PR. I have checked again the static code problem in the operator and rerun it. It should be fine now. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r742028280



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the change. I think I was too focused on the feature itself, I missed this one. Thank you very much.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r741921242



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       ```suggestion
               self.log.info(
                   "%s Account Id used to extract data from Facebook Ads Iterators successfully", account_id
               )
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r744543108



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       You are right. Your comment about returning two different data structures made me realize some of these outcomes and these use cases. I was thinking of a platform that uses operators to run a certain task, in the end, it is not just a tool (It is also a set of useful decoupled tools and libraries). Also, we are using some of those hooks in our stack. From now on, I will be more careful about backward compatibility even if that means a method return two different data structure.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #19377:
URL: https://github.com/apache/airflow/pull/19377


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #19377:
URL: https://github.com/apache/airflow/pull/19377


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r747352222



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +122,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type
+        :rtype: List[AdsInsights] or Dict[str, List[AdsInsights]]
         """
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
+        if self.is_backward:
+            return self._facebook_report(
+                account_id=self.facebook_ads_config["account_id"],
+                api=api,
+                params=params,
+                fields=fields,
+                sleep_time=sleep_time,
+            )

Review comment:
       This logic does not look… backward to me. Why is the flag called `is_backward` in the first place? What does backward mean in this context?

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -98,14 +99,17 @@ def facebook_ads_config(self) -> Dict:
         if missing_keys:
             message = f"{missing_keys} fields are missing"
             raise AirflowException(message)
+        # Solve a way to check in bulk_facebook_report to handle list and dict return there
+        if type(config["account_id"]) is not list:

Review comment:
       ```suggestion
           if isinstance(config["account_id"], list):
   ```
   
   This `is_backward` logic is nebulous to me though. Is this “backward if `account_id` is specified” behaviour specified in Facebook’s API documentation?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-963768252


   > As discussed - it needs backwards compatibiltty - so i am changing it to "request changes" so that it does not get merged accidentally.
   
   Hi @potiuk, I have made the changes regarding backward compatibility. I have made little changes in the hook and operator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r749117689



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +122,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type
+        :rtype: List[AdsInsights] or Dict[str, List[AdsInsights]]
         """
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
+        if self.is_backward:
+            return self._facebook_report(
+                account_id=self.facebook_ads_config["account_id"],
+                api=api,
+                params=params,
+                fields=fields,
+                sleep_time=sleep_time,
+            )

Review comment:
       The implementation still looks wrong to me. The main issue is that `_get_service` should not _mutate` the hook object.
   
   A better implementation would tie the backward compatibility switch directly against `facebook_ads_config`, instead of relying on calling `_get_service` to modify the flag. One way to achieve this is to make the flag a `cached_property` like `facebook_ads_config`:
   
   ```python
   @cached_property
   def multiple_accounts(self) -> bool:
       return isinstance(self.config["account_id"], list)
   ```
   
   `is_backwards` is also a very bad name because it is unclear what “backwards” means in the context. It would be better to directly describe what the difference is instead. The “new” behaviour allows using multiple account, while the existing one allows only one account, so the property can be called `multiple_accounts` to signify the difference. (Or it can be called `single_account` with the semantics flipped.)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r747624569



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +122,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type
+        :rtype: List[AdsInsights] or Dict[str, List[AdsInsights]]
         """
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
+        if self.is_backward:
+            return self._facebook_report(
+                account_id=self.facebook_ads_config["account_id"],
+                api=api,
+                params=params,
+                fields=fields,
+                sleep_time=sleep_time,
+            )

Review comment:
       Hi @uranusjr, In this context, backward means it is returning the same as the old version which is `List[AdsInsights]`. In the current version, Facebook Ads Hook only supports one `account_id` (string) at the time meaning if you have multiple `account_id` (array|list) for ads, you have to maintain multiple Facebook Ads Connection in the Airflow.  To support multiple `account_id` in the same Facebook Ads Connection, I corollated both `account_id` and the list of AdsInsights with returning `Dict[str, List[AdsInsights]]` but this was a breaking change since the return type was changed. After that, I updated the logic with this check if the `account_id` in the provided Facebook Connection as a string (single `account_id`), it returns the same format which is `List[AdsInsights]` and in the operator this situation also handled. If the multiple `account_id` is provided in the Facebook Ads Connection as an array, then the check provide to return `Dict[str, List[AdsInsights]]` which ev
 ery report correlated with `account_id` to export them regarding `account_id`. For that reason, I called that check `is_backward` which provides the check of the old connection format to work as it is.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702187






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702187


   There were some static code cancelled so I reopened it, but it looks pretty good


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r744524493



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       We are VERY strict about backwards compatibility. And Hooks are equally part of the "public" API of providers as operators. Airflow is a very specific project in the sense that users are supposed to write code and use our Hooks and Operators in the way they think is the best. Due to their nature operators are very limited in what they can do but Hooks provide a much better interface for the users especially if they want to build some kind of transfer between two completely different services. I'd even say that in "modern" Airflow, Hoooks are WAY more important than Operators.
   
   Especially with the TaskFlow Api, where you can do something like that:
   
   ```
   @task 
   def trasnfer_from_fb_to_whatever():
      fbHook = FBHook()
      local_data = fbhook.read_data()
      transform_the_data( local_data)
      whatever_hook = WhateverHook()
      whatever_hook.send_data(local_data)
   ```
   
   This is fully-fledged Python task that uses Hooks and no operators and is the way modern tasks should be written in Airflow.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958758081


   Hey @potiuk, thank you very much for your review and your comment. It is strange that static checks give errors for the flake and the black. I have run for all my changes before opening the PR. I have checked again the static code problem in the operator and rerun it. It should be fine now. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #19377:
URL: https://github.com/apache/airflow/pull/19377


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r747624569



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +122,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type
+        :rtype: List[AdsInsights] or Dict[str, List[AdsInsights]]
         """
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
+        if self.is_backward:
+            return self._facebook_report(
+                account_id=self.facebook_ads_config["account_id"],
+                api=api,
+                params=params,
+                fields=fields,
+                sleep_time=sleep_time,
+            )

Review comment:
       Hi @uranusjr, In this context, backward means it is returning the same as the old version which is List[AdsInsights]. In the current version, Facebook Ads Hook only supports one account_id (string) at the time meaning if you have multiple account_id (array|list) for ads, you have to maintain multiple Facebook Ads Connection in the Airflow.  To support multiple account_id in the same Facebook Ads Connection, I corollated both account_id and the list of AdsInsights with returning Dict[str, List[AdsInsights]] but this was a breaking change since the return type was changed. After that, I updated the logic with this check if the account_id in the provided Facebook Connection as a string (single account_id), it returns the same format which is List[AdsInsights] and in the operator this situation also handled. If the multiple account_id is provided in the Facebook Ads Connection as an array, then the check provide to return Dict[str, List[AdsInsights]] which every report correlated 
 with account_id to export them regarding account_id. For that reason, I called that check as is_backward which provides the check of the old connection format to work as it is.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r744334285



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Well, not necessarily - you could easily make it backwards-compatible. You could return List in case just one account is passed, but if you send more than one, you would return a Dict. That sounds rather easy to implement and rather logical.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753332173



##########
File path: airflow/providers/google/cloud/transfers/facebook_ads_to_gcs.py
##########
@@ -128,11 +145,79 @@ def execute(self, context: dict):
         service = FacebookAdsReportingHook(
             facebook_conn_id=self.facebook_conn_id, api_version=self.api_version
         )
-        rows = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
+        bulk_report = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
 
+        if isinstance(bulk_report, list):
+            converted_rows_with_action = self._generate_rows_with_action(False)
+            converted_rows_with_action = self._prepare_rows_for_upload(
+                rows=bulk_report, converted_rows_with_action=converted_rows_with_action, account_id=None
+            )
+        elif isinstance(bulk_report, dict):
+            converted_rows_with_action = self._generate_rows_with_action(True)
+            for account_id in bulk_report.keys():
+                rows = bulk_report.get(account_id, [])
+                if rows:
+                    converted_rows_with_action = self._prepare_rows_for_upload(
+                        rows=rows,
+                        converted_rows_with_action=converted_rows_with_action,
+                        account_id=account_id,
+                    )
+                else:
+                    self.log.warning("account_id: %s returned empty report", str(account_id))
+        else:
+            message = "Facebook Ads Hook returned different type than expected"
+            raise AirflowException(message)
+        total_row_count = self._decide_and_flush(converted_rows_with_action=converted_rows_with_action)
+        self.log.info("Facebook Returned %s data points in total: ", total_row_count)
+
+    def _generate_rows_with_action(self, type_check: bool):
+        if type_check and self.upload_as_account:
+            return {FlushAction.EXPORT_EVERY_ACCOUNT: []}
+        else:
+            return {FlushAction.EXPORT_ONCE: []}
+
+    def _prepare_rows_for_upload(
+        self,
+        rows: List[AdsInsights],
+        converted_rows_with_action: Dict[FlushAction, list],
+        account_id: Optional[str],
+    ):
         converted_rows = [dict(row) for row in rows]
-        self.log.info("Facebook Returned %s data points", len(converted_rows))
+        if account_id is not None and self.upload_as_account:
+            converted_rows_with_action[FlushAction.EXPORT_EVERY_ACCOUNT].append(
+                {"account_id": account_id, "converted_rows": converted_rows}
+            )
+            self.log.info(
+                "Facebook Returned %s data points for account_id: %s", len(converted_rows), account_id
+            )
+        else:
+            converted_rows_with_action[FlushAction.EXPORT_ONCE].extend(converted_rows)
+            self.log.info("Facebook Returned %s data points ", len(converted_rows))
+        return converted_rows_with_action
 
+    def _decide_and_flush(self, converted_rows_with_action: Dict[FlushAction, list]):
+        total_data_count = 0
+        if FlushAction.EXPORT_ONCE in converted_rows_with_action:
+            self._flush_rows(
+                converted_rows=converted_rows_with_action.get(FlushAction.EXPORT_ONCE),
+                object_name=self.object_name,
+            )
+            total_data_count += len(converted_rows_with_action.get(FlushAction.EXPORT_ONCE))
+        elif FlushAction.EXPORT_EVERY_ACCOUNT in converted_rows_with_action:
+            for converted_rows in converted_rows_with_action.get(FlushAction.EXPORT_EVERY_ACCOUNT):
+                self._flush_rows(
+                    converted_rows=converted_rows.get("converted_rows"),
+                    object_name=self._transform_object_name_with_account_id(
+                        account_id=converted_rows.get("account_id")
+                    ),
+                )
+                total_data_count += len(converted_rows.get("converted_rows"))
+        else:
+            message = "FlushAction not found in the data"
+            raise AirflowException(message)

Review comment:
       I have updated the message and added the data to investigate whether it returned the correct keys or not.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r747630615



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -98,14 +99,17 @@ def facebook_ads_config(self) -> Dict:
         if missing_keys:
             message = f"{missing_keys} fields are missing"
             raise AirflowException(message)
+        # Solve a way to check in bulk_facebook_report to handle list and dict return there
+        if type(config["account_id"]) is not list:

Review comment:
       Normally, the `account_id` was given as a string in the Facebook Ads Connection in the Airflow, with this check it is controlling rather the connection is the old version or is a new version which is an array of `account_id`. Also, I have updated this part regarding your suggestion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753922348



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -105,24 +110,74 @@ def bulk_facebook_report(
         params: Dict[str, Any],
         fields: List[str],
         sleep_time: int = 5,
-    ) -> List[AdsInsights]:
+    ) -> Union[List[AdsInsights], Dict[str, List[AdsInsights]]]:
         """
-        Pulls data from the Facebook Ads API
+        Pulls data from the Facebook Ads API regarding Account ID with matching return type.
+        The return type and value depends on the ``account_id`` configuration. If the
+        configuration is a str representing a single Account ID, the return value is the
+        list of reports for that ID. If the configuration is a list of str representing
+        multiple Account IDs, the return value is a dict of Account IDs and their
+        respective list of reports.

Review comment:
       ```suggestion
           """Pulls data from the Facebook Ads API regarding Account ID with matching return type.
           
           The return type and value depends on the ``account_id`` configuration. If the
           configuration is a str representing a single Account ID, the return value is the
           list of reports for that ID. If the configuration is a list of str representing
           multiple Account IDs, the return value is a dict of Account IDs and their
           respective list of reports.
   ```
   
   > Multi-line docstrings consist of a summary line just like a one-line docstring, followed by a blank line, followed by a more elaborate description.
   
   https://www.python.org/dev/peps/pep-0257/




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r749935017



##########
File path: airflow/providers/google/cloud/transfers/facebook_ads_to_gcs.py
##########
@@ -128,11 +145,79 @@ def execute(self, context: dict):
         service = FacebookAdsReportingHook(
             facebook_conn_id=self.facebook_conn_id, api_version=self.api_version
         )
-        rows = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
+        bulk_report = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
 
+        if isinstance(bulk_report, list):
+            converted_rows_with_action = self._generate_rows_with_action(False)
+            converted_rows_with_action = self._prepare_rows_for_upload(
+                rows=bulk_report, converted_rows_with_action=converted_rows_with_action, account_id=None
+            )
+        elif isinstance(bulk_report, dict):
+            converted_rows_with_action = self._generate_rows_with_action(True)
+            for account_id in bulk_report.keys():
+                rows = bulk_report.get(account_id, [])
+                if rows:
+                    converted_rows_with_action = self._prepare_rows_for_upload(
+                        rows=rows,
+                        converted_rows_with_action=converted_rows_with_action,
+                        account_id=account_id,
+                    )
+                else:
+                    self.log.warning("account_id: %s returned empty report", str(account_id))
+        else:
+            message = "Facebook Ads Hook returned different type than expected"
+            raise AirflowException(message)

Review comment:
       Probably a good idea to log what exactly was returned to help debugging.

##########
File path: airflow/providers/google/cloud/transfers/facebook_ads_to_gcs.py
##########
@@ -128,11 +145,79 @@ def execute(self, context: dict):
         service = FacebookAdsReportingHook(
             facebook_conn_id=self.facebook_conn_id, api_version=self.api_version
         )
-        rows = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
+        bulk_report = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
 
+        if isinstance(bulk_report, list):
+            converted_rows_with_action = self._generate_rows_with_action(False)
+            converted_rows_with_action = self._prepare_rows_for_upload(
+                rows=bulk_report, converted_rows_with_action=converted_rows_with_action, account_id=None
+            )
+        elif isinstance(bulk_report, dict):
+            converted_rows_with_action = self._generate_rows_with_action(True)
+            for account_id in bulk_report.keys():
+                rows = bulk_report.get(account_id, [])
+                if rows:
+                    converted_rows_with_action = self._prepare_rows_for_upload(
+                        rows=rows,
+                        converted_rows_with_action=converted_rows_with_action,
+                        account_id=account_id,
+                    )
+                else:
+                    self.log.warning("account_id: %s returned empty report", str(account_id))
+        else:
+            message = "Facebook Ads Hook returned different type than expected"
+            raise AirflowException(message)
+        total_row_count = self._decide_and_flush(converted_rows_with_action=converted_rows_with_action)
+        self.log.info("Facebook Returned %s data points in total: ", total_row_count)
+
+    def _generate_rows_with_action(self, type_check: bool):
+        if type_check and self.upload_as_account:
+            return {FlushAction.EXPORT_EVERY_ACCOUNT: []}
+        else:
+            return {FlushAction.EXPORT_ONCE: []}
+
+    def _prepare_rows_for_upload(
+        self,
+        rows: List[AdsInsights],
+        converted_rows_with_action: Dict[FlushAction, list],
+        account_id: Optional[str],
+    ):
         converted_rows = [dict(row) for row in rows]
-        self.log.info("Facebook Returned %s data points", len(converted_rows))
+        if account_id is not None and self.upload_as_account:
+            converted_rows_with_action[FlushAction.EXPORT_EVERY_ACCOUNT].append(
+                {"account_id": account_id, "converted_rows": converted_rows}
+            )
+            self.log.info(
+                "Facebook Returned %s data points for account_id: %s", len(converted_rows), account_id
+            )
+        else:
+            converted_rows_with_action[FlushAction.EXPORT_ONCE].extend(converted_rows)
+            self.log.info("Facebook Returned %s data points ", len(converted_rows))
+        return converted_rows_with_action
 
+    def _decide_and_flush(self, converted_rows_with_action: Dict[FlushAction, list]):
+        total_data_count = 0
+        if FlushAction.EXPORT_ONCE in converted_rows_with_action:
+            self._flush_rows(
+                converted_rows=converted_rows_with_action.get(FlushAction.EXPORT_ONCE),
+                object_name=self.object_name,
+            )
+            total_data_count += len(converted_rows_with_action.get(FlushAction.EXPORT_ONCE))
+        elif FlushAction.EXPORT_EVERY_ACCOUNT in converted_rows_with_action:
+            for converted_rows in converted_rows_with_action.get(FlushAction.EXPORT_EVERY_ACCOUNT):
+                self._flush_rows(
+                    converted_rows=converted_rows.get("converted_rows"),
+                    object_name=self._transform_object_name_with_account_id(
+                        account_id=converted_rows.get("account_id")
+                    ),
+                )
+                total_data_count += len(converted_rows.get("converted_rows"))
+        else:
+            message = "FlushAction not found in the data"
+            raise AirflowException(message)

Review comment:
       Same

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +125,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type

Review comment:
       Probably want to describe a bit more what account ID type triggers what kinds of results. It may be a good idea to provide a couple of examples above.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753457213



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -105,24 +110,79 @@ def bulk_facebook_report(
         params: Dict[str, Any],
         fields: List[str],
         sleep_time: int = 5,
-    ) -> List[AdsInsights]:
+    ) -> Union[List[AdsInsights], Dict[str, List[AdsInsights]]]:
         """
-        Pulls data from the Facebook Ads API
+        Pulls data from the Facebook Ads API regarding Account ID with matching return type
+
+        .. seealso::
+            If the type of Account ID is a List account_id -> list

Review comment:
       This `->` syntax is a bit difficult to understand to me. It’s probably better to write this is prose, something like
   
   ```rst
   The return type and value depends on the ``account_id`` configuration. If the
   configuration is a str representing a single Account ID, the return value is the
   list of reports for that ID. If the configuration is a list of str representing
   multiple Account IDs, the return value is a dict of Account IDs and their
   respective list of reports.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753458174



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -105,24 +110,79 @@ def bulk_facebook_report(
         params: Dict[str, Any],
         fields: List[str],
         sleep_time: int = 5,
-    ) -> List[AdsInsights]:
+    ) -> Union[List[AdsInsights], Dict[str, List[AdsInsights]]]:
         """
-        Pulls data from the Facebook Ads API
+        Pulls data from the Facebook Ads API regarding Account ID with matching return type
+
+        .. seealso::
+            If the type of Account ID is a List account_id -> list

Review comment:
       Also this part does not belong in ``seealso``, which is generally used to reference external content.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-988631333


   Awesome work, congrats on your first merged pull request!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r742105387



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more resources than now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 edited a comment on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 edited a comment on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-963768252


   > As discussed - it needs backwards compatibiltty - so i am changing it to "request changes" so that it does not get merged accidentally.
   
   Hi @potiuk, I have made the changes regarding backward compatibility. I have made little changes in the hook and operator. 
   Hook:
   I added a return type of Union which returns as follows.
   - If account_id is string (old format), returns List[AdsInsights]
   - If account_id is list (new format), returns Dict[str, List[AdsInsights]]
   Operator:
   I add a check for the return type of hook. 
   - if List[AdsInsights] returned, the operator exports still in the same format and upload_as_account parameter useless.
   - if Dict[str, List[AdsInsights]] returned, user could choose export format with upload_as_account parameter.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r741921242



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       ```suggestion
               self.log.info(
                   "%s Account Id used to extract data from Facebook Ads Iterators successfully", account_id
               )
   ```

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       It looks like a breaking change. What can we do to prevent this from happening?

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       ```suggestion
               self.log.info(
                   "%s Account Id used to extract data from Facebook Ads Iterators successfully", account_id
               )
   ```

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       It looks like a breaking change. What can we do to prevent this from happening?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r744343709



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hi @potiuk, I have not thought that way. For sure, I could easily support that with a simple implementation. Since this hook is only used by FacebookAdsReportToGcsOperator, I thought that while I am changing the return type of the hook only affects this operator. For that reason, since I have changed that operator, I thought the return type could be a single type and not affect anywhere else. I will make that change to provide backward compatibility as soon as possible.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r753331696



##########
File path: airflow/providers/google/cloud/transfers/facebook_ads_to_gcs.py
##########
@@ -128,11 +145,79 @@ def execute(self, context: dict):
         service = FacebookAdsReportingHook(
             facebook_conn_id=self.facebook_conn_id, api_version=self.api_version
         )
-        rows = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
+        bulk_report = service.bulk_facebook_report(params=self.parameters, fields=self.fields)
 
+        if isinstance(bulk_report, list):
+            converted_rows_with_action = self._generate_rows_with_action(False)
+            converted_rows_with_action = self._prepare_rows_for_upload(
+                rows=bulk_report, converted_rows_with_action=converted_rows_with_action, account_id=None
+            )
+        elif isinstance(bulk_report, dict):
+            converted_rows_with_action = self._generate_rows_with_action(True)
+            for account_id in bulk_report.keys():
+                rows = bulk_report.get(account_id, [])
+                if rows:
+                    converted_rows_with_action = self._prepare_rows_for_upload(
+                        rows=rows,
+                        converted_rows_with_action=converted_rows_with_action,
+                        account_id=account_id,
+                    )
+                else:
+                    self.log.warning("account_id: %s returned empty report", str(account_id))
+        else:
+            message = "Facebook Ads Hook returned different type than expected"
+            raise AirflowException(message)

Review comment:
       I have updated the message and added the type of the returned object from the hook.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r759392861



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -105,24 +110,74 @@ def bulk_facebook_report(
         params: Dict[str, Any],
         fields: List[str],
         sleep_time: int = 5,
-    ) -> List[AdsInsights]:
+    ) -> Union[List[AdsInsights], Dict[str, List[AdsInsights]]]:
         """
-        Pulls data from the Facebook Ads API
+        Pulls data from the Facebook Ads API regarding Account ID with matching return type.
+        The return type and value depends on the ``account_id`` configuration. If the
+        configuration is a str representing a single Account ID, the return value is the
+        list of reports for that ID. If the configuration is a list of str representing
+        multiple Account IDs, the return value is a dict of Account IDs and their
+        respective list of reports.

Review comment:
       Hi @uranusjr, sorry for the late reply. I have made the changes. Thank you,




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #19377:
URL: https://github.com/apache/airflow/pull/19377


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r741921242



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       ```suggestion
               self.log.info(
                   "%s Account Id used to extract data from Facebook Ads Iterators successfully", account_id
               )
   ```

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       It looks like a breaking change. What can we do to prevent this from happening?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702692


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r742105387



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more required resources than now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r742028280



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the changes. I think I was too focused on the feature itself, I missed this one. Thank you very much.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more required resources than now.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the change. I think I was too focused on the feature itself, I missed this one. Thank you very much.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more resources than now.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the changes. I think I was too focused on the feature itself, I missed this one. Thank you very much.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more required resources than now.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the change. I think I was too focused on the feature itself, I missed this one. Thank you very much.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more resources than now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958758081


   Hey @potiuk, thank you very much for your review and your comment. It is strange that static checks give errors for the flake and the black. I have run for all my changes before opening the PR. I have checked again the static code problem in the operator and rerun it. It should be fine now. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958702187


   There were some static code cancelled so I reopened it, but it looks pretty good


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r742028280



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the changes. I think I was too focused on the feature itself, I missed this one. Thank you very much.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more required resources than now.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]
         """
+        all_insights = {}
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
-        _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
-        while True:
-            request = _async.api_get()
-            async_status = request[AdReportRun.Field.async_status]
-            percent = request[AdReportRun.Field.async_percent_completion]
-            self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
-            if async_status == JobStatus.COMPLETED.value:
-                self.log.info("Job run completed")
-                break
-            if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
-                message = f"{async_status}. Please retry."
-                raise AirflowException(message)
-            time.sleep(sleep_time)
-        report_run_id = _async.api_get()["report_run_id"]
-        report_object = AdReportRun(report_run_id, api=api)
-        insights = report_object.get_insights()
-        self.log.info("Extracting data from returned Facebook Ads Iterators")
-        return list(insights)
+        for account_id in self.facebook_ads_config["account_id"]:
+            ad_account = AdAccount(account_id, api=api)
+            _async = ad_account.get_insights(params=params, fields=fields, is_async=True)
+            while True:
+                request = _async.api_get()
+                async_status = request[AdReportRun.Field.async_status]
+                percent = request[AdReportRun.Field.async_percent_completion]
+                self.log.info("%s %s completed, async_status: %s", percent, "%", async_status)
+                if async_status == JobStatus.COMPLETED.value:
+                    self.log.info("Job run completed")
+                    break
+                if async_status in [JobStatus.SKIPPED.value, JobStatus.FAILED.value]:
+                    message = f"{async_status}. Please retry."
+                    raise AirflowException(message)
+                time.sleep(sleep_time)
+            report_run_id = _async.api_get()["report_run_id"]
+            report_object = AdReportRun(report_run_id, api=api)
+            self.log.info("Extracting data from returned Facebook Ads Iterators")
+            all_insights[account_id] = list(report_object.get_insights())
+            self.log.info(
+                str(account_id) + " Account Id used to extract data from Facebook Ads Iterators successfully"
+            )

Review comment:
       Hey @mik-laj, I made the change. I think I was too focused on the feature itself, I missed this one. Thank you very much.

##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -119,25 +120,30 @@ def bulk_facebook_report(
         :type sleep_time: int
 
         :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :rtype: Dict[str, List[AdsInsights]]

Review comment:
       Hey @mik-laj, I did this due to we are getting the account_id from the connection extras. We can get the connection from a hook and cannot in operator as I understand from my tracing the code. On the other hand, I think that if we are supporting multiple account_id at once, we need to separate those files while we are exporting to provide distinct files (optional parameter -> upload_as_account). To do this, I need to pass the account_id or correlate it with the list of Ad Insight data.
   I read from Facebook documentation, the Ad Insight data hold the account_id information, we can get from there but if we get account_id from the data most probably the data transformation part in the operator will be more complex. This would cause consuming more resources than now.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958758081


   Hey @potiuk, thank you very much for your review and your comment. It is strange that static checks give errors for the flake and the black. I have run for all my changes before opening the PR. I have checked again the static code problem in the operator and rerun it. It should be fine now. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958605991


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#issuecomment-958605991


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] bugraoz93 commented on a change in pull request #19377: Providers facebook hook multiple account

Posted by GitBox <gi...@apache.org>.
bugraoz93 commented on a change in pull request #19377:
URL: https://github.com/apache/airflow/pull/19377#discussion_r749826953



##########
File path: airflow/providers/facebook/ads/hooks/ads.py
##########
@@ -118,11 +122,32 @@ def bulk_facebook_report(
         :param sleep_time: Time to sleep when async call is happening
         :type sleep_time: int
 
-        :return: Facebook Ads API response, converted to Facebook Ads Row objects
-        :rtype: List[AdsInsights]
+        :return: Facebook Ads API response,
+            converted to Facebook Ads Row objects regarding given Account ID type
+        :rtype: List[AdsInsights] or Dict[str, List[AdsInsights]]
         """
         api = self._get_service()
-        ad_account = AdAccount(api.get_default_account_id(), api=api)
+        if self.is_backward:
+            return self._facebook_report(
+                account_id=self.facebook_ads_config["account_id"],
+                api=api,
+                params=params,
+                fields=fields,
+                sleep_time=sleep_time,
+            )

Review comment:
       I have changed the controlling flag to a `cached_property`. You are right. It is definitely more readable and safer to use `cached_property` and not mutate any hook object in `_get_service` in this way.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org