Posted to user@spark.apache.org by Sandeep Nemuri <nh...@gmail.com> on 2017/12/18 19:43:06 UTC

Mapping words to vector sparkml CountVectorizerModel

Hi All,

I've used CountVectorizerModel in Spark ML and computed the TF-IDF weights of the words.

The output column of the DataFrame looks like:

*(63709,[0,1,2,3,6,7,8,10,11,13],[0.6095235999680518,0.9946971867717818,0.5151611294911758,0.4371112749198506,3.4968901993588046,0.06806241719930584,1.1156025996012633,3.0425756717399217,0.3760235829400124])*

I want to get the top n words that correspond to the highest weights in this vector.

Any pointers on how to achieve this?
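
For reference, here is a minimal sketch of the mapping I have in mind, not a tested Spark job. It assumes the fitted model's vocabulary list is ordered so that position i is the term for vector index i, and that the indices and values of one row's sparse vector are available as plain sequences (in PySpark these would presumably come from the model's `vocabulary` attribute and the SparseVector's `indices` and `values`). The function name `top_n_terms` and the toy data are hypothetical.

```python
def top_n_terms(vocabulary, indices, values, n):
    """Return the n (term, weight) pairs with the largest weights.

    vocabulary: list of terms, where position i is the term for vector index i
    indices:    active indices of one sparse vector
    values:     weights aligned with `indices`
    """
    # Pair each active index's term with its weight.
    # zip stops at the shorter of the two sequences.
    pairs = [(vocabulary[int(i)], float(w)) for i, w in zip(indices, values)]
    # Sort by weight, largest first, and keep the first n pairs.
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:n]


# Toy stand-ins (hypothetical) for a model vocabulary and one row's vector.
vocabulary = ["spark", "ml", "vector", "word", "count"]
indices = [0, 2, 4]
values = [0.61, 3.50, 1.12]

top = top_n_terms(vocabulary, indices, values, 2)
print(top)
```

On a real DataFrame the same function could be applied per row, for example inside a UDF, with the vocabulary passed in from the fitted model.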

-- 
*  Regards*
*  Sandeep Nemuri*