Posted to dev@lucene.apache.org by "Uwe Schindler (JIRA)" <ji...@apache.org> on 2015/06/09 12:12:00 UTC

[jira] [Comment Edited] (LUCENE-6536) Migrate HDFSDirectory from solr to lucene-hadoop

    [ https://issues.apache.org/jira/browse/LUCENE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578664#comment-14578664 ] 

Uwe Schindler edited comment on LUCENE-6536 at 6/9/15 10:11 AM:
----------------------------------------------------------------

bq. Personally, I think if someone wants to do this, a better integration point is to make it a java 7 filesystem provider. That is really how such a filesystem should work anyway.

I agree. This is how it should be. Once HDFS provides a Java 7 FileSystemProvider SPI (see http://docs.oracle.com/javase/7/docs/api/java/nio/file/spi/FileSystemProvider.html), you would just need to put the HDFS JAR file on the classpath; you could then create a standard FSDirectory (NIOFS, Simple, MMap) using Paths.get(URI), and you would be done. Not a single line of Lucene code would be needed. I have no idea why Hadoop does not yet provide a FileSystem implementation for Java 7 (maybe because they are still on Java 6).
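To make this concrete, here is a minimal sketch of the usage that would fall out of it. The hdfs:// URI, the namenode address, and the presence of an HDFS provider on the classpath are all assumptions; only the Lucene/NIO calls (Paths.get, FSDirectory.open) are standard API:

{code:java}
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class HdfsOverNioExample {
  public static void main(String[] args) throws Exception {
    // Hypothetical URI: Paths.get(URI) can resolve it only once an HDFS
    // FileSystemProvider is registered on the classpath via the SPI.
    Path indexPath = Paths.get(URI.create("hdfs://namenode:8020/lucene/index"));

    // FSDirectory.open() picks a concrete implementation (NIOFS, MMap, ...);
    // no HDFS-specific code inside Lucene would be required.
    try (Directory dir = FSDirectory.open(indexPath)) {
      // use dir with IndexWriter / DirectoryReader as usual
    }
  }
}
{code}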

I would suggest that you talk with the Hadoop people about doing this (including the block cache, which could be implemented as an off-heap ByteBuffer, like MappedByteBuffer, so it would automatically work with MMapDirectory in Lucene; I don't want to also take over responsibility for the block cache in Lucene). Or start your own project implementing the FileSystemProvider.
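As a toy illustration of that off-heap idea (this is not Hadoop's actual block cache API; the class and its shape are invented for this sketch), blocks could be held in direct ByteBuffers outside the Java heap, analogous to the MappedByteBuffers behind MMapDirectory:

{code:java}
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Toy sketch only: an off-heap block cache keyed by block id.
// A real implementation would fill the buffers from HDFS reads
// and add eviction; none of that is shown here.
public class OffHeapBlockCache {
  private final int blockSize;
  private final ConcurrentMap<Long, ByteBuffer> blocks = new ConcurrentHashMap<>();

  public OffHeapBlockCache(int blockSize) {
    this.blockSize = blockSize;
  }

  public ByteBuffer getBlock(long blockId) {
    ByteBuffer cached = blocks.get(blockId);
    if (cached == null) {
      // allocateDirect() places the buffer outside the Java heap,
      // just like the MappedByteBuffers used by MMapDirectory.
      ByteBuffer fresh = ByteBuffer.allocateDirect(blockSize);
      cached = blocks.putIfAbsent(blockId, fresh);
      if (cached == null) {
        cached = fresh;
      }
    }
    // duplicate() gives the caller an independent position and limit.
    return cached.duplicate();
  }
}
{code}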


> Migrate HDFSDirectory from solr to lucene-hadoop
> ------------------------------------------------
>
>                 Key: LUCENE-6536
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6536
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Greg Bowyer
>              Labels: hadoop, hdfs, lucene, solr
>         Attachments: LUCENE-6536.patch
>
>
> I am currently working on a search engine that is throughput-oriented and works entirely in Apache Spark.
> As part of this, I need a directory implementation that can operate on HDFS directly. This got me thinking: could I take the one that was worked on so hard for Solr's Hadoop integration?
> As such, I migrated the HDFS and block cache directories out to a lucene-hadoop module.
> Having done this work, I am not sure if it is actually a good change; it feels a bit messy, and I don't like how the Metrics class gets extended and abused.
> Thoughts, anyone?
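For context, a hedged sketch of how the migrated directory might be used. HdfsDirectory currently lives in Solr (org.apache.solr.store.hdfs); the package and constructor below are assumptions about the proposed lucene-hadoop module, not its actual API:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.lucene.hadoop.HdfsDirectory; // hypothetical package after migration
import org.apache.lucene.store.Directory;

public class HdfsDirectoryUsage {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://namenode:8020"); // assumed cluster address

    // Constructor shape is an assumption modeled on Solr's HdfsDirectory.
    try (Directory dir = new HdfsDirectory(new Path("/lucene/index"), conf)) {
      // hand dir to IndexWriter / DirectoryReader like any other Directory
    }
  }
}
{code}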


