Posted to user@spark.apache.org by Matt Narrell <ma...@gmail.com> on 2015/07/28 22:32:27 UTC

[Spark ML] HasInputCol, etc.

Hey,

Our ML ETL pipeline has several complex steps that I’d like to address with custom Transformers in an ML Pipeline.  Looking at the Tokenizer and HashingTF transformers I see these handy traits (HasInputCol, HasLabelCol, HasOutputCol, etc.) but they have strict access modifiers.  How can I use these with custom Transformer/Estimator implementations?

I’m stuck placing my implementations in the org.apache.spark.ml package, which is tolerable for now, but I’m wondering if I’m missing some pattern?

Thanks,
mn
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: [Spark ML] HasInputCol, etc.

Posted by Feynman Liang <fl...@databricks.com>.
Unfortunately, AFAIK those shared param traits (HasInputCol and friends) are
private[ml] and not part of the public API, so you will have to continue with
what you're doing.
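For reference, the pattern those traits implement (a mixin contributing a
column-name parameter plus its setter) is easy to re-create in your own
package, without touching Spark internals. A minimal, Spark-free sketch of
that idea, where all trait/class names are hypothetical and a real
implementation would instead extend org.apache.spark.ml.Transformer and
operate on DataFrames:

```scala
// User-defined equivalents of Spark's private[ml] shared-param traits.
// Each trait contributes one parameter and a chainable setter.
trait HasInputCol {
  private var in: String = "input"
  def inputCol: String = in
  def setInputCol(value: String): this.type = { in = value; this }
}

trait HasOutputCol {
  private var out: String = "output"
  def outputCol: String = out
  def setOutputCol(value: String): this.type = { out = value; this }
}

// A toy transformer mixing in both traits; it lower-cases the value in
// inputCol and writes the result under outputCol. A Map stands in for a Row.
class LowerCaseTransformer extends HasInputCol with HasOutputCol {
  def transform(row: Map[String, String]): Map[String, String] =
    row + (outputCol -> row(inputCol).toLowerCase)
}
```

Usage is the familiar fluent style: new
LowerCaseTransformer().setInputCol("text").setOutputCol("textLower"), then
transform. The same mixin shape is what the real HasInputCol gives Tokenizer
and HashingTF; declaring your own versions keeps custom code out of Spark's
namespace.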

On Tue, Jul 28, 2015 at 1:32 PM, Matt Narrell <ma...@gmail.com>
wrote:

> Hey,
>
> Our ML ETL pipeline has several complex steps that I’d like to address
> with custom Transformers in an ML Pipeline.  Looking at the Tokenizer and
> HashingTF transformers I see these handy traits (HasInputCol, HasLabelCol,
> HasOutputCol, etc.) but they have strict access modifiers.  How can I use
> these with custom Transformer/Estimator implementations?
>
> I’m stuck placing my implementations in the org.apache.spark.ml package,
> which is tolerable for now, but I’m wondering if I’m missing some pattern?
>
> Thanks,
> mn