Posted to user@beam.apache.org by André Rocha Silva <a....@portaltelemedicina.com.br> on 2020/01/09 14:48:31 UTC

dataflow and ImageMagick

Hi all!

I am trying to use ImageMagick on Dataflow [Apache Beam Python 3.7 SDK
2.17.0], but I am facing a problem. The function works properly locally, but
when I run it on Dataflow I receive this message:
  File "/usr/local/lib/python3.7/site-packages/wand/image.py", line 7888, in read
    raise WandRuntimeError(msg)
wand.exceptions.WandRuntimeError: MagickReadImage returns false, but did raise ImageMagick exception. This can occurs when a delegate is missing, or returns EXIT_SUCCESS without generating a raster.
From what I have found, the cause may be that Ghostscript is not installed.
So I would like to know:

1) Is there a way to change the image that Dataflow loads for its workers?
If so, I could install the programs I need for the job.

2) Can I ask Dataflow to install a program on all workers? It is not
feasible to run os.system("apt-get install ghostscript") for every
single element, is it?

If someone has faced this problem and solved it another way, I am totally
open to suggestions.

Thank you
André Rocha Silva

Re: dataflow and ImageMagick

Posted by André Rocha Silva <a....@portaltelemedicina.com.br>.
Luke, it worked!

I changed CUSTOM_COMMANDS in the Juliaset example to:

    CUSTOM_COMMANDS = [
        ['apt-get', 'update'],
        ['apt-get', 'install', 'ghostscript', '-y']]

Thank you very much!!
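
To confirm on a worker that the install step took effect, a check along
these lines could run before Wand is called. This is only a minimal sketch and
not something posted in the thread: it assumes the Ghostscript executable is
named gs (as on Debian-based images), and find_missing_delegates is a
hypothetical helper name.

```python
import shutil


def find_missing_delegates(executables=('gs',)):
    """Return the names in `executables` that cannot be found on PATH.

    `find_missing_delegates` is a hypothetical helper; 'gs' is assumed to be
    the Ghostscript binary that ImageMagick delegates to.
    """
    return [name for name in executables if shutil.which(name) is None]
```

An empty list means every listed executable was found. A non-empty list could
be logged or raised as an error, so the failure names the missing program
instead of surfacing as a WandRuntimeError.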

On Thu, Jan 9, 2020 at 3:03 PM Luke Cwik <lc...@google.com> wrote:

> Andre, add the required installation commands (e.g. the apt-get install
> commands) for the non-Python dependencies to the list of CUSTOM_COMMANDS in
> your setup.py file. See the Juliaset setup.py [1] for an example. Note: You
> must make sure that these commands are runnable on the remote worker (e.g.
> if you use apt-get, the remote worker needs apt-get support).
>
> See the website about managing Python dependencies [2] for more details.
>
> 1:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py
> 2:
> https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#nonpython

Re: dataflow and ImageMagick

Posted by Luke Cwik <lc...@google.com>.
Andre, add the required installation commands (e.g. the apt-get install
commands) for the non-Python dependencies to the list of CUSTOM_COMMANDS in
your setup.py file. See the Juliaset setup.py [1] for an example. Note: You
must make sure that these commands are runnable on the remote worker (e.g.
if you use apt-get, the remote worker needs apt-get support).

See the website about managing Python dependencies [2] for more details.

1:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/complete/juliaset/setup.py
2:
https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/#nonpython
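
As a rough illustration of the CUSTOM_COMMANDS idea described above, the
sketch below runs a list of commands in order and stops on the first failure.
It is not the Juliaset file itself: run_custom_commands is a hypothetical
helper name, the apt-get entries are the ones reported to work in this thread,
and they assume a Debian-based worker where the install step has enough
privileges to run apt-get.

```python
import subprocess

# Commands for the non-Python dependencies. These two entries are the ones
# reported to work in this thread; they assume apt-get exists on the worker.
CUSTOM_COMMANDS = [
    ['apt-get', 'update'],
    ['apt-get', 'install', 'ghostscript', '-y'],
]


def run_custom_commands(commands):
    """Run each command in order; raise RuntimeError on a non-zero exit code.

    `run_custom_commands` is a hypothetical name used for illustration only.
    """
    for command in commands:
        print('Running command: %s' % command)
        result = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        print('Command output: %s' % result.stdout.decode(errors='replace'))
        if result.returncode != 0:
            raise RuntimeError(
                'Command %s failed: exit code %s' % (command, result.returncode))
```

In the Juliaset setup.py [1], comparable logic sits inside a setuptools
command that runs as part of the package build, so the commands execute when
a worker installs the package given to the pipeline with --setup_file.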

On Thu, Jan 9, 2020 at 7:36 AM Leonardo Campos <le...@gameduell.de>
wrote:

> Hi, Andre,
>
> On a very different topic, I was trying to find a way to change the JVM
> default encoding and could not find a way to do so.
> In this sense, it would also be of my interest to be able to influence the
> image used by the workers.
>
> Sorry for having no help,
> Leonardo Campos

Re: dataflow and ImageMagick

Posted by Leonardo Campos <le...@gameduell.de>.
Hi, Andre,

On a very different topic, I was trying to change the JVM default
encoding and could not find a way to do so.
So I would also be interested in being able to influence
the image used by the workers.

Sorry I have no help to offer,
Leonardo Campos
