[SPARK-2864][MLLIB] fix random seed in word2vec; move model to local #1790
mengxr wants to merge 2 commits
This fixes the seed not just for unit tests but for every call. Would it not be a bit better to make the RNG injectable via a discreet package-private setter and let the tests inject a seeded RNG?
Added setters and made seed configurable.
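To illustrate the idea of a configurable seed discussed above, here is a minimal sketch in plain Scala. It is not the code from this PR: `SeededInitializer`, `setVectorSize`, and `initialVectors` are hypothetical names, and the initialization formula is only an assumption chosen for illustration. The point is that a setter lets tests pin the seed while ordinary callers keep a random default.

```scala
import scala.util.Random

// Hypothetical stand-in for a trainer with a configurable seed;
// this is NOT the Word2Vec class touched by the PR.
class SeededInitializer {
  private var vectorSize: Int = 4
  // Default seed is drawn once at construction, so unseeded runs still vary.
  private var seed: Long = Random.nextLong()

  /** Fixes the seed, e.g. from a unit test that needs reproducible output. */
  def setSeed(seed: Long): this.type = {
    this.seed = seed
    this
  }

  def setVectorSize(vectorSize: Int): this.type = {
    require(vectorSize > 0, "vectorSize must be positive")
    this.vectorSize = vectorSize
    this
  }

  /** Builds one initial vector per word; deterministic for a given seed. */
  def initialVectors(vocab: Seq[String]): Map[String, Array[Float]] = {
    val rng = new Random(seed)
    vocab.map { word =>
      // Assumed initialization: small values centered on zero.
      word -> Array.fill(vectorSize)((rng.nextFloat() - 0.5f) / vectorSize)
    }.toMap
  }
}
```

A test would call `new SeededInitializer().setSeed(42L)` twice and compare the results, while production code would simply skip `setSeed`.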
@mengxr LGTM. We may need a better implementation of TopK. It is also worth trying to change the starting alpha in each iteration.
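The comment does not say what a better TopK would look like. One common option, sketched here purely as an assumption, is a bounded min-heap that keeps only the k best-scoring entries instead of sorting every candidate. `TopK.topK` is a hypothetical helper, not an MLlib API.

```scala
import scala.collection.mutable

// Hypothetical helper; one possible approach, not the implementation used in MLlib.
object TopK {
  /** Returns the k highest-scoring entries, best first, using a size-k min-heap. */
  def topK[A](scored: Iterator[(A, Double)], k: Int): Seq[(A, Double)] = {
    require(k >= 0, "k must be non-negative")
    // PriorityQueue is a max-heap; reversing the ordering keeps the smallest score at the head.
    val byScore = Ordering.by[(A, Double), Double](_._2)
    val heap = mutable.PriorityQueue.empty[(A, Double)](byScore.reverse)
    scored.foreach { entry =>
      if (heap.size < k) {
        heap.enqueue(entry)
      } else if (k > 0 && entry._2 > heap.head._2) {
        // The new entry beats the current worst of the kept k, so swap it in.
        heap.dequeue()
        heap.enqueue(entry)
      }
    }
    heap.toSeq.sortBy(-_._2)
  }
}
```

With n candidates this does O(n log k) work and holds only k entries in memory, which matters when k is small relative to the vocabulary.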
Jenkins, test this please.
Jenkins, where are you?
Jenkins, test this please.
Jenkins, retest this please.
Jenkins, test this please.
QA tests have started for PR 1790. This patch merges cleanly.
QA results for PR 1790:
Merged into both master and branch-1.1. |
It also moves the model to local in order to map `RDD[String]` to `RDD[Vector]`.

@Ishiihara

Author: Xiangrui Meng <meng@databricks.com>

Closes #1790 from mengxr/word2vec-fix and squashes the following commits:

a87146c [Xiangrui Meng] add setters and make a default constructor
e5c923b [Xiangrui Meng] fix random seed in word2vec; move model to local

(cherry picked from commit cc491f6)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
It also moves the model to local in order to map `RDD[String]` to `RDD[Vector]`. @Ishiihara