Posted to commits@mahout.apache.org by pa...@apache.org on 2014/06/05 03:44:26 UTC

git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Repository: mahout
Updated Branches:
  refs/heads/mahout-1541 8a4b4347d -> 2f87f5433


MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet


Project: http://git-wip-us.apache.org/repos/asf/mahout/repo
Commit: http://git-wip-us.apache.org/repos/asf/mahout/commit/2f87f543
Tree: http://git-wip-us.apache.org/repos/asf/mahout/tree/2f87f543
Diff: http://git-wip-us.apache.org/repos/asf/mahout/diff/2f87f543

Branch: refs/heads/mahout-1541
Commit: 2f87f5433f90fa2c49ef386ca245943e1fc73beb
Parents: 8a4b434
Author: pferrel <pa...@occamsmachete.com>
Authored: Wed Jun 4 18:44:16 2014 -0700
Committer: pferrel <pa...@occamsmachete.com>
Committed: Wed Jun 4 18:44:16 2014 -0700

----------------------------------------------------------------------
 .../src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala  | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mahout/blob/2f87f543/spark/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala
----------------------------------------------------------------------
diff --git a/spark/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala b/spark/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala
index 1179eef..9201c81 100644
--- a/spark/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala
+++ b/spark/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala
@@ -149,6 +149,10 @@ trait TDIndexedDatasetWriter extends Writer[IndexedDataset]{
       val matrix: DrmLike[Int] = indexedDataset.matrix
       val rowIDDictionary: BiMap[String, Int] = indexedDataset.rowIDs
       val columnIDDictionary: BiMap[String, Int] = indexedDataset.columnIDs
+      // below doesn't compile: the rdd is only reachable from a CheckpointedDrmSpark, and I don't know how
+      // to turn a CheckpointedDrmSpark[Int] into the DrmLike[Int] that I need to pass to CooccurrenceAnalysis#cooccurrence.
+      // This seems to be fallout from the refactoring that abstracts away Spark: the Read and Write are Spark-specific,
+      // but the non-specific DrmLike is no longer attached to a CheckpointedDrmSpark. I could be missing something, though.
       matrix.rdd.map({ case (rowID, itemVector) =>
         var line: String = rowIDDictionary.inverse.get(rowID) + outDelim1
         for (item <- itemVector.nonZeroes()) {
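
For reference, a minimal sketch of the access the new comment is asking about, meant to sit inside the writer shown above. It assumes the Spark-bindings API of this branch: that DrmLike#checkpoint() still returns a CheckpointedDrm, that the Spark-side implementation is CheckpointedDrmSpark (assumed to live under org.apache.mahout.sparkbindings.drm) and still exposes the rdd the old code used, and that a CheckpointedDrmSpark[Int] is still a DrmLike[Int], so it can be handed to CooccurrenceAnalysis#cooccurrence unchanged. The package path, the pattern match, and the error message are illustrative, not the committed fix:

  // Sketch only; anything marked "assumed" is not verified against this branch.
  import org.apache.mahout.math.drm._                               // DrmLike, CheckpointedDrm
  import org.apache.mahout.sparkbindings.drm.CheckpointedDrmSpark   // assumed package path

  val matrix: DrmLike[Int] = indexedDataset.matrix

  // Checkpointing pins the logical DRM to a concrete backend representation;
  // on the Spark side that concrete type should carry the underlying RDD.
  val cpMatrix: CheckpointedDrmSpark[Int] = matrix.checkpoint() match {
    case sparkDrm: CheckpointedDrmSpark[Int] => sparkDrm
    case other => throw new IllegalStateException(s"Expected a Spark-backed DRM, got ${other.getClass}")
  }

  // cpMatrix is still a DrmLike[Int], so it can be passed to CooccurrenceAnalysis#cooccurrence
  // as-is, while its rdd gives the writer one (rowID, itemVector) pair per row to serialize.
  cpMatrix.rdd.map { case (rowID, itemVector) =>
    // prepend the external row ID, as in the writer above, then append the item columns
    rowIDDictionary.inverse.get(rowID) + outDelim1
  }

If checkpoint() on this branch already returns the Spark-specific type, the pattern match collapses to a plain assignment.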


Re: git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
You need to do a PR from the pferrel/mahout branch "MAHOUT-1541" against
apache/mahout "master".


On Thu, Jun 5, 2014 at 10:10 AM, Pat Ferrel <pa...@gmail.com> wrote:

> I deleted the apache.git version of the branch but it did not get mirrored
> to github.
>
> 'git remote show apache' shows it as removed, but the github UI still has it;
> maybe it will disappear later.
>
> I’ll do a PR to github/apache/mahout, but by default it will try to merge
> with master. I can’t pick the branch the PR is targeted at. That makes me
> nervous, but if you say so...
>
>
> On Jun 5, 2014, at 10:02 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>
> sorry for confusion
>
>
> On Thu, Jun 5, 2014 at 9:59 AM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>
> > I probably meant to do a PR to github's "apache/mahout" MASTER, not push it
> > to git-wip-us.
> >
> >
> > On Thu, Jun 5, 2014 at 9:42 AM, Pat Ferrel <pa...@gmail.com> wrote:
> >
> >> Tried doing a PR to your repo and you asked for it to go to apache HEAD.
> >> I certainly didn’t want it to get into the master yet.
> >>
> >> Happy to delete it but isn’t the Apache git OK for WIP branches?
> >>
> >> On Jun 5, 2014, at 9:18 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
> >>
> >> I don't think we should be pushing this to apache git. I'd suggest keeping
> >> individual issue branches strictly on github and dropping this branch from
> >> apache git.
> >>
> >>

Re: git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Posted by Pat Ferrel <pa...@gmail.com>.
I deleted the apache.git version of the branch but it did not get mirrored to github.

'git remote show apache' shows it as removed, but the github UI still has it; maybe it will disappear later.

I’ll do a PR to github/apache/mahout, but by default it will try to merge with master. I can’t pick the branch the PR is targeted at. That makes me nervous, but if you say so...


On Jun 5, 2014, at 10:02 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

sorry for confusion


On Thu, Jun 5, 2014 at 9:59 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> I probably meant to do a PR to github's "apache/mahout" MASTER, not push it
> to git-wip-us.
> 
> 
> On Thu, Jun 5, 2014 at 9:42 AM, Pat Ferrel <pa...@gmail.com> wrote:
> 
>> Tried doing a PR to your repo and you asked for it to go to apache HEAD.
>> I certainly didn’t want it to get into the master yet.
>> 
>> Happy to delete it but isn’t the Apache git OK for WIP branches?
>> 
>> On Jun 5, 2014, at 9:18 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>> 
>> I don't think we should be pushing this to apache git. I'd suggest keeping
>> individual issue branches strictly on github and dropping this branch from
>> apache git.
>> 
>> 


Re: git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
sorry for confusion


On Thu, Jun 5, 2014 at 9:59 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

> I probably meant to do a PR to github's "apache/mahout" MASTER, not push it
> to git-wip-us.
>
>
> On Thu, Jun 5, 2014 at 9:42 AM, Pat Ferrel <pa...@gmail.com> wrote:
>
>> Tried doing a PR to your repo and you asked for it to go to apache HEAD.
>> I certainly didn’t want it to get into the master yet.
>>
>> Happy to delete it but isn’t the Apache git OK for WIP branches?
>>
>> On Jun 5, 2014, at 9:18 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>>
>> I don't think we should be pushing this to apache git. I'd suggest keeping
>> individual issue branches strictly on github and dropping this branch from
>> apache git.
>>
>>

Re: git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I probably meant to do a PR to github's "apache/mahout" MASTER, not push it
to git-wip-us.


On Thu, Jun 5, 2014 at 9:42 AM, Pat Ferrel <pa...@gmail.com> wrote:

> Tried doing a PR to your repo and you asked for it to go to apache HEAD. I
> certainly didn’t want it to get into the master yet.
>
> Happy to delete it but isn’t the Apache git OK for WIP branches?
>
> On Jun 5, 2014, at 9:18 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:
>
> I don't think we should be pushing this to apache git. I'd suggest keeping
> individual issue branches strictly on github and dropping this branch from
> apache git.
>
>

Re: git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Posted by Pat Ferrel <pa...@gmail.com>.
Tried doing a PR to your repo and you asked for it to go to apache HEAD. I certainly didn’t want it to get into the master yet.

Happy to delete it but isn’t the Apache git OK for WIP branches?

On Jun 5, 2014, at 9:18 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

I don't think we should be pushing this to apache git. I'd suggest keeping
individual issue branches strictly on github and dropping this branch from
apache git.



Re: git commit: MAHOUT-1541 still working on this, some refactoring in the DSL for abstracting away Spark has moved access to rdds; no Jira is closed yet

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I don't think we should be pushing this to apache git. I'd suggest keeping
individual issue branches strictly on github and dropping this branch from
apache git.

