Posted to user@spark.apache.org by Nirav Patel <np...@xactlycorp.com> on 2016/11/08 22:41:03 UTC
spark ml - ngram - how to preserve single word (1-gram)
Is it possible to preserve the single tokens (1-grams) when using the n-gram
feature transformer?
e.g.
Array("Hi", "I", "heard", "about", "Spark")
Becomes
Array("Hi", "I", "heard", "about", "Spark", "Hi I", "I heard",
"heard about", "about Spark")
Currently, to do this I have to first transform the column manually with
the existing NGram implementation and then join the 1-gram tokens onto each
column value. Basically, I have to do this outside of the pipeline.
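
For illustration, here is a rough sketch of one way this might be kept inside
a pipeline: run NGram with n = 2, then append its output to the original
tokens with a SQLTransformer stage. This is only a sketch under assumptions,
not a confirmed solution: it assumes Spark 2.4 or later, where the SQL
concat function accepts array columns, and the column names "words",
"bigrams" and "tokens" as well as the object name are made up for the
example.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{NGram, SQLTransformer}
import org.apache.spark.sql.SparkSession

// Hypothetical example object; names are illustrative only.
object UnigramsPlusBigrams {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("unigrams-plus-bigrams")
      .master("local[*]")
      .getOrCreate()

    // One row holding the already-tokenized sentence from the question.
    val df = spark.createDataFrame(Seq(
      (0, Array("Hi", "I", "heard", "about", "Spark"))
    )).toDF("id", "words")

    // Stage 1: the standard NGram transformer, producing 2-grams only.
    val bigrams = new NGram()
      .setN(2)
      .setInputCol("words")
      .setOutputCol("bigrams")

    // Stage 2: append the 2-grams to the original 1-gram tokens.
    // Assumption: concat on array columns requires Spark 2.4+.
    val merge = new SQLTransformer()
      .setStatement("SELECT *, concat(words, bigrams) AS tokens FROM __THIS__")

    // Both steps are ordinary pipeline stages, so nothing runs outside it.
    val pipeline = new Pipeline().setStages(Array(bigrams, merge))

    pipeline.fit(df).transform(df)
      .select("tokens")
      .show(truncate = false)

    spark.stop()
  }
}
```

On older Spark versions, where concat does not accept arrays, the same idea
would presumably need a UDF or a small custom Transformer in place of the
SQLTransformer stage.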