Posted to user@beam.apache.org by Austin Bennett <wh...@gmail.com> on 2022/08/04 18:24:24 UTC

PubSub Lite IO & Python?

Hi Users/Devs,

Drew, copied, reported having troubles with PubSub Lite:

"we just weren’t able to get PubSub Lite working with PyBeam. It’s been a
few weeks since we last tried, but we were just trying to use
`apache_beam.io.gcp.pubsublite.ReadFromPubSubLite` (here
<https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.pubsublite.html>
) in PyBeam and couldn’t get it to import so we just gave up. From the
looks of the repo we couldn’t tell if it was ever actually fully
implemented and published"

I haven't used it myself, and figured others might be able to comment on
whether anyone has had success with it, or at least on whether the IO is
fully tested/implemented ( whether available via cross-language or 'native'
Python ).

Please share any thoughts here.

Cheers,
Austin
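
For anyone landing here with the same import question, a quick check along
these lines may help tell "module missing" apart from "symbol missing". It is
only a sketch: `has_symbol` is a hypothetical helper, not part of Beam, and it
assumes apache-beam (with the gcp extra) is installed in the active
environment.

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True if module_name imports cleanly and exposes symbol."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module, or one of its own dependencies, is not installed.
        return False
    return hasattr(module, symbol)
```

Calling `has_symbol("apache_beam.io.gcp.pubsublite", "ReadFromPubSubLite")`
with the names quoted above should show whether the class is reachable in a
given install.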

Re: PubSub Lite IO & Python?

Posted by Chamikara Jayalath via user <us...@beam.apache.org>.
Do you see anything under the "kubelet" log? For example, this might show
errors related to container startup (you have to select that log name
specifically to see its entries).

If you are actually not seeing any errors, you might have to go
through Google Cloud support so that they can look at your specific
Dataflow job.
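
If clicking through the console is awkward, the same logs can probably be
pulled with a Cloud Logging query. The sketch below only builds the filter
string. The list of log names is an assumption (check the "log names"
dropdown for your job), `dataflow_log_filter` is a hypothetical helper, and
the project and job IDs are placeholders.

```python
# Assumed set of Dataflow log names; the console dropdown is authoritative.
DATAFLOW_LOG_NAMES = (
    "kubelet",
    "worker-startup",
    "harness-startup",
    "worker",
    "harness",
    "docker",
)


def dataflow_log_filter(project_id, job_id, log_names=DATAFLOW_LOG_NAMES):
    """Build a Cloud Logging filter for the given Dataflow job and log names."""
    names = " OR ".join(
        f'"projects/{project_id}/logs/dataflow.googleapis.com%2F{name}"'
        for name in log_names
    )
    return (
        'resource.type="dataflow_step" '
        f'AND resource.labels.job_id="{job_id}" '
        f"AND logName=({names})"
    )


# Placeholder IDs, for illustration only.
print(dataflow_log_filter("my-project", "my-job-id", log_names=("kubelet",)))
```

The resulting string can be pasted into the Logs Explorer query box.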

On Thu, Aug 25, 2022 at 5:57 PM Drew Forbes <dr...@thatgamecompany.com>
wrote:

> Attached screenshot is all I get from the Worker Logs on debug. The job
> logs all look pretty normal, it's even got a "Autoscaling: Reduced the
> number of workers to 0 based on low average worker CPU utilization, and the
> pipeline having sufficiently low backlog and keeping up with input rate.",
> which seems to me like it's just not really reading things correctly
>
> On Thu, Aug 25, 2022 at 8:44 PM Chamikara Jayalath <ch...@google.com>
> wrote:
>
>> Do you see any errors in Dataflow Cloud Console logs ?
>> Note that you have to click the Worker Logs tab and select all "log
>> names" under Dataflow to see all logs.
>>
>> I suspect your job might not be starting up properly but hard to say
>> without looking at details.
>>
>> On Thu, Aug 25, 2022 at 5:36 PM Drew Forbes <
>> drew.forbes@thatgamecompany.com> wrote:
>>
>>> Yeah I'm not sure, I've tried a couple different ways to run it and
>>> still no luck. I'm able to read from that subscription with a Java beam job
>>> (which is likely what you were seeing in the internal metrics) but I just
>>> can't get the python job to do anything.
>>>
>>> Unfortunately my Java skills have atrophied since college, and I'm
>>> having a lot of trouble developing within the Beam paradigm in Java, so
>>> we're going to try moving in a different direction for now.
>>>
>>> On Wed, Aug 24, 2022 at 11:53 AM Daniel Collins <dp...@google.com>
>>> wrote:
>>>
>>>> Hello Drew,
>>>>
>>>> The object type is the SequencedMessage type here
>>>> https://github.com/googleapis/python-pubsublite/blob/b77cf6ddeaae4e950ed069b652a22a1fc79f74ea/google/cloud/pubsublite_v1/types/common.py#L109,
>>>> so the correct lambda is likely `lambda x: json.loads(x.message.data)`
>>>>
>>>> It does appear from internal metrics that your client is reading data
>>>> from that subscription. If your json.loads call fails, I'm unsure why this
>>>> wouldn't surface as an error in the runtime.
>>>>
>>>> -Daniel
>>>>
>>>> On Tuesday, August 23, 2022 at 2:52:54 PM UTC-4 Drew Forbes wrote:
>>>>
>>>>> Hey all, thank you for these updates. We ran into some other issues
>>>>> with our Spark - PubSubLite pipeline so I've had time to re-evaluate Beam
>>>>> with PubSubLite. I was able to get the package importing correctly and the
>>>>> job to build using "from apache_beam.io.gcp.pubsublite import *". Not sure
>>>>> why that worked when the other permutations on that failed, but I'll take
>>>>> it.
>>>>>
>>>>> At risk of turning this into a troubleshooting thread (please feel
>>>>> free to turn me away if that's not something y'all have interest in), I'm
>>>>> going to ask if you've got any ideas on why this pipeline isn't actually
>>>>> reading from PSL. I'm submitting jobs either to DirectRunner or
>>>>> DataflowRunner and they are spinning up and staying up without error, but
>>>>> they're not actually reading anything from the PSL subscription and thus
>>>>> not doing any work. I've got several Java Beam jobs that read correctly
>>>>> from PSL subscriptions as well as a test Python Beam job that can read from
>>>>> regular PubSub, but I'm not sure what's happening here. There aren't any
>>>>> logs to go off of either.
>>>>>
>>>>> Below is the pipeline code, I actually expect this to fail since I'm
>>>>> not sure how to do the parsing, this is the same code I used with
>>>>> ReadFromPubSub except for with Lite. But I'm not even getting errors yet
>>>>> because it's not reading from the Subscription. Any rough ideas I can try?
>>>>>
>>>>>     with beam.Pipeline(options=pipeline_options) as p:
>>>>>         (p
>>>>>          | 'Read From PubSubLite' >> ReadFromPubSubLite(
>>>>>              subscription_path='projects/starwatch/locations/us-west1-a/subscriptions/sw-test-sky-events-timescale')
>>>>>          | 'JSONParse' >> beam.Map(lambda x: json.loads(x))
>>>>>          | 'Extract JSON Columns' >> beam.Map(extract_json_columns)
>>>>>          | 'To string' >> beam.ToString.Element()
>>>>>          | beam.Map(print)
>>>>>          | 'Writing to DB' >> relational_db.Write(
>>>>>              source_config=source_config,
>>>>>              table_config=table_config))
>>>>>
>>>>>
>>>>> On Fri, Aug 5, 2022 at 1:11 AM Austin Bennett <
>>>>> whatwouldaustindo@gmail.com> wrote:
>>>>>
>>>>>> @cham thanks for bringing the conversation back to the list ( esp.
>>>>>> for anyone else searching/wondering in the future )!
>>>>>>
>>>>>> From what I understand/summary:  Python should be able to call via
>>>>>> X-Lang the [ Java ] PubSubLite IO for use with any underlying runner (
>>>>>> well, that utilizes portable runner, ex: Spark, Flink, DataflowV2, etc  )
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 4, 2022 at 5:49 PM Chamikara Jayalath via user <
>>>>>> user@beam.apache.org> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 4, 2022 at 5:29 PM Daniel Collins <dp...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello Drew,
>>>>>>>>
>>>>>>>> > I upgraded to apache-beam 2.40.0 and tried to access
>>>>>>>> apache_beam.io.gcp.pubsublite.ReadFromPubSubLite
>>>>>>>>
>>>>>>>> Make sure to import `apache_beam.io.gcp.pubsublite.*`. I
>>>>>>>> have no idea why the specific import isn't working, but that should work.
>>>>>>>> If it's not, I'll look into it more.
>>>>>>>>
>>>>>>>> > writing native Spark code to pull from PubSub Lite
>>>>>>>>
>>>>>>>> Note that we have a spark native source you can use. I'm unsure if
>>>>>>>> spark works with beam python however, Chamikara would know that better.
>>>>>>>> https://github.com/googleapis/java-pubsublite-spark
>>>>>>>>
>>>>>>>
>>>>>>> It should be supported. See instructions here under "Portable
>>>>>>> (Java/Python/Go)":
>>>>>>> https://beam.apache.org/documentation/runners/spark/
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -Daniel
>>>>>>>>
>>>>>>>> On Thu, Aug 4, 2022 at 7:48 PM Drew Forbes <
>>>>>>>> drew.forbes@thatgamecompany.com> wrote:
>>>>>>>>
>>>>>>>>> I've actually not used PyBeam, I just meant writing Beam code with
>>>>>>>>> Python. Didn't realize there was a whole separate PyBeam package.
>>>>>>>>>
>>>>>>>>
>>>>>>> Thanks for clarifying.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cham
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> I feel dumb asking, but basically we just couldn't get the import
>>>>>>>>> to work. I upgraded to apache-beam 2.40.0 and tried to access
>>>>>>>>> apache_beam.io.gcp.pubsublite.ReadFromPubSubLite through various
>>>>>>>>> means (regular import, proto_api, something like .external., etc) within
>>>>>>>>> Python and determined that there just wasn't anything to access. We could
>>>>>>>>> definitely have been wrong about that but it wasn't clear how to move
>>>>>>>>> forward so we just switched our focus to writing native Spark code to pull
>>>>>>>>> from PubSub Lite
>>>>>>>>>
>>>>>>>>> On Thu, Aug 4, 2022 at 6:46 PM Chamikara Jayalath <
>>>>>>>>> chamikara@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> I believe this should be fully working. I'm not familiar with
>>>>>>>>>> PyBeam though. Is the execution mechanism the same as running a regular
>>>>>>>>>> Beam pipeline ? Also, note that for multi-language, you need to use a
>>>>>>>>>> portable Beam runner.
>>>>>>>>>>
>>>>>>>>>> +Daniel Collins <dp...@google.com> who implemented this.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Cham
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 4, 2022 at 11:24 AM Austin Bennett <
>>>>>>>>>> whatwouldaustindo@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Users/Devs,
>>>>>>>>>>>
>>>>>>>>>>> Drew, copied, reported having troubles with PubSub Lite:
>>>>>>>>>>>
>>>>>>>>>>> "we just weren’t able to get PubSub Lite working with PyBeam.
>>>>>>>>>>> It’s been a few weeks since we last tried, but we were just trying to use
>>>>>>>>>>> `apache_beam.io.gcp.pubsublite.ReadFromPubSubLite` (here
>>>>>>>>>>> <https://beam.apache.org/releases/pydoc/current/apache_beam.io.gcp.pubsublite.html>
>>>>>>>>>>> ) in PyBeam and couldn’t get it to import so we just gave up. From the
>>>>>>>>>>> looks of the repo we couldn’t tell if it was ever actually fully
>>>>>>>>>>> implemented and published"
>>>>>>>>>>>
>>>>>>>>>>> I haven't used myself, and figured others might be able to
>>>>>>>>>>> comment/share at least if any have had success using and/or at least
>>>>>>>>>>> whether fully tested/implemented IO ( whether available via cross-language
>>>>>>>>>>> or 'native' python ).
>>>>>>>>>>>
>>>>>>>>>>> Please share any thoughts here.
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Austin
>>>>>>>>>>>
>>>>>>>>>>>
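
The lambda Daniel suggests in the quoted reply above can be tried without a
running pipeline. This is a sketch under assumptions: `SimpleNamespace` stands
in for the real SequencedMessage (assuming the payload bytes live at
`.message.data`, as in the linked common.py), and `parse_event` plus the
sample payload are hypothetical.

```python
import json
from types import SimpleNamespace


def parse_event(sequenced_message):
    """Decode the JSON payload carried in message.data (bytes)."""
    return json.loads(sequenced_message.message.data)


# Stand-in shaped like SequencedMessage -> message -> data.
fake = SimpleNamespace(
    message=SimpleNamespace(data=b'{"event": "login", "count": 3}')
)
print(parse_event(fake))
```

In the pipeline this would take the place of the 'JSONParse' step, e.g.
`beam.Map(parse_event)`, assuming the elements really are of that type.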

Re: PubSub Lite IO & Python?

Posted by Chamikara Jayalath via user <us...@beam.apache.org>.
Do you see any errors in Dataflow Cloud Console logs ?
Note that you have to click the Worker Logs tab and select all "log names"
under Dataflow to see all logs.

I suspect your job might not be starting up properly but hard to say
without looking at details.


Re: PubSub Lite IO & Python?

Posted by Austin Bennett <wh...@gmail.com>.
@cham thanks for bringing the conversation back to the list ( esp. for
anyone else searching/wondering in the future )!

From what I understand/summary:  Python should be able to call via X-Lang
the [ Java ] PubSubLite IO for use with any underlying runner ( well, that
utilizes portable runner, ex: Spark, Flink, DataflowV2, etc  )
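
As a rough illustration of that summary, the flags below are the ones I
understand the Beam docs to list for a portable runner and for Dataflow
Runner v2. Treat them as assumptions to verify against your Beam version; the
helper names, endpoint, project and region are placeholders.

```python
def portable_runner_args(job_endpoint="localhost:8099"):
    """Flags for submitting to a portable job server (e.g. Spark or Flink)."""
    return [
        "--runner=PortableRunner",
        f"--job_endpoint={job_endpoint}",
        "--environment_type=LOOPBACK",
    ]


def dataflow_v2_args(project, region):
    """Flags for a streaming Dataflow job on Runner v2."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        "--experiments=use_runner_v2",
        "--streaming",
    ]


print(portable_runner_args())
```

Either list can be handed to
`apache_beam.options.pipeline_options.PipelineOptions(...)` when building the
pipeline.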




Re: PubSub Lite IO & Python?

Posted by Chamikara Jayalath via user <us...@beam.apache.org>.
On Thu, Aug 4, 2022 at 5:29 PM Daniel Collins <dp...@google.com> wrote:

> Hello Drew,
>
> > I upgraded to apache-beam 2.40.0 and tried to access
> apache_beam.io.gcp.pubsublite.ReadFromPubSubLite
>
> Make sure to import `apache_beam.io.gcp.pubsublite.*`. I have no
> idea why the specific import isn't working, but that should work. If
> it's not, I'll look into it more.
>
> > writing native Spark code to pull from PubSub Lite
>
> Note that we have a spark native source you can use. I'm unsure if spark
> works with beam python however, Chamikara would know that better.
> https://github.com/googleapis/java-pubsublite-spark
>

It should be supported. See instructions here under "Portable
(Java/Python/Go)": https://beam.apache.org/documentation/runners/spark/


>
>
> -Daniel
>
> On Thu, Aug 4, 2022 at 7:48 PM Drew Forbes <
> drew.forbes@thatgamecompany.com> wrote:
>
>> I've actually not used PyBeam, I just meant writing Beam code with
>> Python. Didn't realize there was a whole separate PyBeam package.
>>
>
Thanks for clarifying.

Thanks,
Cham


>
>> I feel dumb asking, but basically we just couldn't get the import to
>> work. I upgraded to apache-beam 2.40.0 and tried to access
>> apache_beam.io.gcp.pubsublite.ReadFromPubSubLite through various means
>> (regular import, proto_api, something like .external., etc) within Python
>> and determined that there just wasn't anything to access. We could
>> definitely have been wrong about that but it wasn't clear how to move
>> forward so we just switched our focus to writing native Spark code to pull
>> from PubSub Lite

Re: PubSub Lite IO & Python?

Posted by Chamikara Jayalath via user <us...@beam.apache.org>.
I believe this should be fully working. I'm not familiar with PyBeam
though. Is the execution mechanism the same as running a regular Beam
pipeline ? Also, note that for multi-language, you need to use a portable
Beam runner.

+Daniel Collins <dp...@google.com> who implemented this.

Thanks,
Cham
