Posted to dev@systemds.apache.org by Mark Dokter <md...@know-center.at> on 2021/06/16 17:50:32 UTC
Jars in the release artifacts
Hey there!
When testing the release candidate, Shafaq and I found that the release
artifacts are missing some jars without which SystemDS crashes under
certain conditions.
* janino*.jar is needed for map() and for codegen.
* spark-core*.jar is needed for -exec hybrid when running locally
(otherwise provided by Spark)
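A quick way to check an unpacked release for the two jars above could look
like the following minimal sketch. The function name, the default pattern
list, and the assumption that jars sit directly inside a single lib
directory are illustrative and not part of the SystemDS release tooling.

```python
from fnmatch import fnmatch
from pathlib import Path

# Patterns taken from the list above; extend as further jars turn up.
REQUIRED_PATTERNS = ["janino*.jar", "spark-core*.jar"]


def missing_jars(lib_dir, patterns=REQUIRED_PATTERNS):
    """Return the patterns that match no file directly inside lib_dir."""
    names = [p.name for p in Path(lib_dir).iterdir() if p.is_file()]
    return [pat for pat in patterns
            if not any(fnmatch(name, pat) for name in names)]
```

Calling missing_jars("lib") from the root of an unpacked release would
return an empty list if both patterns are matched, and the unmatched
patterns otherwise.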
These problems do not occur with the source distribution, because Maven
pulls in everything that is needed.
The safe solution is to ship everything that Maven pulls in as part of
the lib directory. However, that directory is 190 MB in size.
So aside from any legal redistribution issues (I hope there are none?),
the release artifacts would grow considerably.
Any thoughts?
Regards, Mark
Re: Jars in the release artifacts
Posted by Matthias Boehm <mb...@gmail.com>.
Thanks for catching this issue. We have to distinguish two scenarios:
1) Deployed Spark cluster: here we need none of these libraries
(Spark's assembly jar is enough), only the SystemDS jar, which
includes the very few libraries that are not provided.
2) Standalone: here we need all libraries that are reachable through
local operations, including pseudo-distributed Spark operations. In this
mode, for example, DFS interactions are redirected to the local file
system, which does not require all libraries.
I like Arnab's proposal to additionally include the must-have libraries
like Janino and Spark core. In the past we took an orthogonal approach
and created SystemML-lite as a minimal self-contained jar by running a
variety of algorithms under different configurations and tracing the
class loader to get a fine-grained list of things to pack in. However,
for the binary release we should stay at jar level.
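To illustrate how a class-loader trace can be rolled up to jar level, here
is a minimal sketch that groups loaded classes by the jar they came from.
It assumes the trace is the text output of the JVM's -verbose:class option
in its usual HotSpot formats (Java 8 and Java 9+); the function name and
the exact line formats are assumptions, and this is not the tooling that
was used for SystemML-lite.

```python
import re
from collections import defaultdict

# Assumed Java 8 style line:
#   [Loaded org.foo.Bar from file:/opt/lib/foo.jar]
_JAVA8 = re.compile(r"^\[Loaded (?P<cls>\S+) from (?P<src>\S+)\]$")
# Assumed Java 9+ unified-logging line:
#   [0.512s][info][class,load] org.foo.Bar source: file:/opt/lib/foo.jar
_JAVA9 = re.compile(r"\[class,load\]\s+(?P<cls>\S+) source: (?P<src>\S+)")


def classes_by_jar(log_lines):
    """Map jar file name -> set of class names loaded from that jar."""
    jars = defaultdict(set)
    for line in log_lines:
        line = line.strip()
        match = _JAVA8.match(line) or _JAVA9.search(line)
        if match is None:
            continue  # not a class-load line
        source = match.group("src")
        if source.endswith(".jar"):  # skip JDK modules and other sources
            jar_name = source.rsplit("/", 1)[-1]
            jars[jar_name].add(match.group("cls"))
    return dict(jars)
```

The keys of the returned dictionary give the jar-level list; the class
sets are only needed for a finer-grained, lite-style packaging.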
Regards,
Matthias
On 6/16/2021 7:56 PM, arnab phani wrote:
> Today we include libraries that are needed to run in standalone mode. I
> like this balance.
> However, we might be missing libraries such as Janino, which is needed for
> standalone execution.
>
> Regards,
> Arnab..
Re: Jars in the release artifacts
Posted by Shafaq Siddiqi <sh...@tugraz.at.INVALID>.
If there are no size constraints, I would suggest adding Janino to
the lib directory, as we use it to provide string-processing
functionality in DML.
Shafaq Siddiqi
Re: Jars in the release artifacts
Posted by arnab phani <ph...@gmail.com>.
Today we include libraries that are needed to run in standalone mode. I
like this balance.
However, we might be missing libraries such as Janino, which is needed for
standalone execution.
Regards,
Arnab..