Posted to dev@mahout.apache.org by Sean Owen <sr...@gmail.com> on 2011/03/27 18:33:16 UTC

Bug in VectorCache? or just outdated docs?

Quick one. Check this in VectorCache. The key isn't actually used to look
anything up; load() just reads the first Vector in the SequenceFile. I do see
code that calls it and seems to be trying to pass a meaningful key.
Everything "works", I guess. Does it only happen to work because all the
files in question contain just one vector anyway, in which case I just fix
the code and move on? Or is there something deeper to worry about?


  /**
   * Loads the vector with the specified key from the cache. Returns null
   * if nothing is found (up to the caller to handle this accordingly)
   */
  public static Vector load(Writable key, Configuration conf) throws IOException {
    URI [] files = DistributedCache.getCacheFiles(conf);
    if (files == null || files.length < 1) {
      return null;
    }
    return load(key, conf, new Path(files[0].getPath()));
  }

  /**
   * Loads a Vector from the specified path
   */
  public static Vector load(Writable key, Configuration conf, Path input) throws IOException {

    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, input, conf);
    VectorWritable retval = new VectorWritable();
    reader.next(key, retval);
    reader.close();
    return retval.get();
  }
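
For comparison, here is a minimal sketch of the difference between "read the first record" and "scan for the requested key". It is not Mahout code and uses only the JDK: the SequenceFile is stood in for by a list of key/value pairs, and the names KeyedLoadSketch, loadFirst and loadByKey are hypothetical. Against a real SequenceFile.Reader, the scan would presumably be a loop over reader.next(currentKey, value) that compares currentKey with the requested key, but that part is an assumption and is not shown here.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a list of (key, vector) pairs plays the role of the SequenceFile.
class KeyedLoadSketch {

    // Mirrors the behaviour described above: the key is ignored and the
    // first record's value is returned (null if there are no records).
    static double[] loadFirst(String key, List<Map.Entry<String, double[]>> records) {
        return records.isEmpty() ? null : records.get(0).getValue();
    }

    // What the Javadoc promises: scan the records and return the value whose
    // key equals the requested one, or null if nothing is found.
    static double[] loadByKey(String key, List<Map.Entry<String, double[]>> records) {
        for (Map.Entry<String, double[]> record : records) {
            if (record.getKey().equals(key)) {
                return record.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, double[]>> records = new ArrayList<>();
        records.add(new AbstractMap.SimpleEntry<>("a", new double[] {1.0}));
        records.add(new AbstractMap.SimpleEntry<>("b", new double[] {2.0}));

        // With more than one record, the two approaches disagree for key "b".
        System.out.println(loadFirst("b", records)[0]);
        System.out.println(loadByKey("b", records)[0]);
        // An unknown key yields null, leaving the caller to handle it.
        System.out.println(loadByKey("missing", records));
    }
}
```

With a single record per file the two methods return the same thing, which would explain why callers passing a meaningful key see no problem today.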