You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "dhruba borthakur (JIRA)" <ji...@apache.org> on 2008/07/14 10:39:32 UTC

[jira] Created: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Support a Thrift Interface to access files/directories in HDFS
--------------------------------------------------------------

                 Key: HADOOP-3754
                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
             Project: Hadoop Core
          Issue Type: New Feature
            Reporter: dhruba borthakur
            Assignee: dhruba borthakur


Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626252#action_12626252 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

+1 on Owen's comment about including the generated code.  The compiled thrift will only work on compatible platforms, so people on Windows or other "weird" :) platforms will not be able to use this without compiling thrift themselves which is too much of a hassle i would think.





> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626245#action_12626245 ] 

Owen O'Malley commented on HADOOP-3754:
---------------------------------------

Actually, other contrib modules have a lib directory. We should probably move the thrift library stuff into src/contrib/thriftfs/lib/. (Although I still think the executable for thrift should be removed. We do not want to get into the business of distributing compilation tools.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620164#action_12620164 ] 

dhruba borthakur commented on HADOOP-3754:
------------------------------------------

Thanks Pete & Nitay for the detailed comments. Thanks a bunch.

1. The patch includes the thrift binary for Linux. See lib/thrift/thrift and lib/thrift/libthrift.jar. Thus, a Linux compile does not have to download any external libraries, utilities.

2. The proxy server uses the message from the hadoop.IOException to create its own exception. This is the best we can do for now. If we want to improve it later, we can do that. The application would see the real exception string, so it shoudl be enough for debugging purposes, won't it?

3. Added a note to chown to say that it is not-atomic. This is true for hdfs.py only and does not apply to the chown thrift interface.

4. I like your idea of using the checksum all the way from the client, but maybe we can postpone it to a later date.

5. The python command line needs more work. However, I am not targeting the python wrapper as a piece that an application will use as it is. It is there to demonstrate how to access HDFS from a python script. I

6. Added README that describes the approach, build and deployment process. I plan on writing a Wiki page once this patch gets committed.

7. performance measurement will come at a later date

8. Added default minimum number of threads to be 10.

9. The change to build-contrib.xml ensures that the generated jar file(s) are in the CLASSPATH while compiling HadoopThriftServer.java.

10. I would wait to include fb303. This is mostly for statistics management and process management and can be added at a later date. It might be useful to use HadoopMetrics or via HADOOP-3772. 

11. I added a new call setInactiveTimeoutPeriod() that allows an application to specify how long the proxy server should remain active starting from the last call to it. If this timer expires, then the proxy server closes all open files and shuts down. The default inactivity timeout is 1 hour. This does not completely address Nitay's problems, but maybe solves it to a certain extent. If Nitay could merge in his code for per-handle timer once this patch is committed, that will be great.

12. If, at a future time, we add Thrift APIs to Namenode, Datanode, etc, they would have to be located in src/hdfs and not in contrib.  Even if we decide to keep them in contrib, they could be src/contrib/thriftfs/namenode, src/contrib/thriftfs/datanode, etc. I think the API in this patch should try to resemble existing API in fs.FileSystem.

13. I added a getFileBlockLocations API to allow fetching the block locations of a file.


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620754#action_12620754 ] 

Doug Cutting commented on HADOOP-3754:
--------------------------------------

> Does use mean we can include it in svn with its runtime libraries?

Do you mean software distributed under the Thrift Software License?  Yes.  This looks like a standard BSD-style license, which is permitted under Apache's 3rd party licensing policy (http://www.apache.org/legal/3party.html).

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: thrift1.patch

Attached build.xml

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Open  (was: Patch Available)

Fix HadoopQA failures.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Patch Available  (was: Open)

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Open  (was: Patch Available)

Canceling patch because it went stale.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619950#action_12619950 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

oh one other thing is I would hope this generates lots of small/medium utilities (because they can be self contained - e.g., perl which only needs the address of the server and not Java and a config dir and hadoop itself), so do you have any thoughts on where these things would be contributed? e.g., src/contrib/thrfitfs/scripts or something ??

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift9.patch

Fixed warnings detected by HadoopQA.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621505#action_12621505 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

+1


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift2.patch

Implemented more FileSystem APIs, i.e. ls, setpermissions, setowner, setReplication, etc.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift7.patch

1. Removed author tags.
2. Added a unit test (needs more work to be comprehensive)
3. Changed lots of LOG messages from INFO to DEBUG level.


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Patch Available  (was: Open)

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623038#action_12623038 ] 

Hadoop QA commented on HADOOP-3754:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12388306/hadoopthrift7.patch
  against trunk revision 686362.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3061/console

This message is automatically generated.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift3.patch

Incorporated most of Pete's and Craig's comments. Thanks folks!

I have renamed the package to thriftfs. I have also incorporated the "append" API to this package. This package now generates HDFS interfaces for Ruby, Cocoa, perl, php, python, etc.etc.

The generated files are not part of the source code. An "ant package" generates the API in all of the above languages. Thus, a binary package should contain these APIs. Does it sound ok to you?

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621093#action_12621093 ] 

dhruba borthakur commented on HADOOP-3754:
------------------------------------------

I agree. Would it help if I start the object Id from a Random.nextLong() instead of 100?

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620930#action_12620930 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

RE: new objectids, there's nothing in the API that prevents a client from connecting, getting a thrift object id, closing the connection, and re-connecting to do the reads or writes. So, they wouldn't get any errors in that case, assuming the server did a restart between the close and the next open.



> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12624815#action_12624815 ] 

Hadoop QA commented on HADOOP-3754:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12388504/hadoopthrift8.patch
  against trunk revision 687868.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to cause Findbugs to fail.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3082/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3082/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3082/console

This message is automatically generated.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620301#action_12620301 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

Code review comments:

1. I didn't notice before, but if the server is restarted, it is highly likely that clients will have handles to the wrong files sitting around. ie the nextId counter will just start over at 100.  

2. what about Nitay's #2 comment about closing files if the client dies? Is it possible to implement the same semantics that the DFSClient would?

3. Steve's comments - esp about including thrift ...



> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619951#action_12619951 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

One other thought on the naming - I could very easily see adding thrift APIs for the NameNode and JobTracker for managing them and getting stats and such - e.g., for Nagios or proprietary monitoring systems and stats collection. In which case, naming this src/contrib/thriftapis/dfsclient would be more extensible.



> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift6.patch

I changed the number 100 to start from a random number. Submitting this patch to get HadoopQA  tests.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Chad Walters (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620755#action_12620755 ] 

Chad Walters commented on HADOOP-3754:
--------------------------------------

Hmm, that release is months old. There's been a lot of changes/progress on Thrift since that release was made, including improvements to the Java portion of the system. Doug, what are the relevant policies around using code that is in the Incubator?

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625654#action_12625654 ] 

Hadoop QA commented on HADOOP-3754:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12388784/hadoopthrift9.patch
  against trunk revision 688960.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    -1 release audit.  The applied patch generated 275 release audit warnings (more than the trunk's current 271 warnings).

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3104/testReport/
Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3104/artifact/trunk/current/releaseAuditDiffWarnings.txt
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3104/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3104/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3104/console

This message is automatically generated.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620871#action_12620871 ] 

dhruba borthakur commented on HADOOP-3754:
------------------------------------------

I included the thrift from http://developers.facebook.com/thrift/.

Also, make the shutdown from the python script more elegant. The python script creates a proxy server if the port is not specified. Also, the proxy server kills itself if the inactivity period is more than 1 hour (configurable). I would address Nitay's comments about particular files going inactive in a later release. 

If the proxy server is restarted, all existing client's connections would get an error. They would have to re-establish a new thrift connection to the proxy server. Thus, the proxy server does not have to persist the objectids, they get restarted from 100.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.19.0
           Status: Resolved  (was: Patch Available)

I just committed this.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Open  (was: Patch Available)

The problem is that the patch includes two binaries: a joar file for libthrift and an executable for "thrift". When the patch is extracted using "patch -i <filename> -p 0" the two binary fies do not get extracted. Is there a special 'svn diff" command that I need to use to package binary files? The two binary files show up as "M" in my local workspace.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619135#action_12619135 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------


Thrift namespacing stuff - 

https://issues.apache.org/jira/browse/THRIFT-34


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by Kunsheng Chen <ke...@yahoo.com>.

--- On Wed, 8/6/08, Doug Cutting (JIRA) <ji...@apache.org> wrote:
From: Doug Cutting (JIRA) <ji...@apache.org>
Subject: [jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS
To: core-dev@hadoop.apache.org
Date: Wednesday, August 6, 2008, 4:52 PM

    [
https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620317#action_12620317
] 

Doug Cutting commented on HADOOP-3754:
--------------------------------------

> until something comes out of incubation, apache can't release it. 

Right, we can use the version at http://developers.facebook.com/thrift/, under
the Thrift Software License, but we should not use the incubating software until
it is released by Apache.


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch,
hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code
generation for a variety of languages (Java, C++, python, PHP, etc) It would be
nice if HDFS APIs are exposed though Thirft. It will allow applications written
in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.




      

[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620317#action_12620317 ] 

Doug Cutting commented on HADOOP-3754:
--------------------------------------

> until something comes out of incubation, apache can't release it. 

Right, we can use the version at http://developers.facebook.com/thrift/, under the Thrift Software License, but we should not use the incubating software until it is released by Apache.


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Component/s: contrib/thiftfs

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/thiftfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619130#action_12619130 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

 A few quick comments - (1) I don't see the generated code from thrift in the patch and (2) I think including bindings for perl, java, python, ruby, C (if that exists in thrift now) and C++ would be good; (3) your name is in the header for the python script; (4) I think you definitely need a comment about opening a JIRA to change the com.facebook.thrift to org.apache.thrift with a dependency on the thrift JIRA to that effect (I can give you the url for it); (5) the name I don't think is so good?? Should it be contrib/thrift or contrib/thrift_dfsclient or something.

btw this could be a great building block for fuse-dfs - getting it out of the Java business could mean making it a real kernel module.


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Craig Macdonald (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619206#action_12619206 ] 

Craig Macdonald commented on HADOOP-3754:
-----------------------------------------

Some questions/comments:

 * Would it be desirable that the thrift API reflects in some way the libhdfs API?
  ** See also HADOOP-3264 (libhdfs does not provide permissions in API)
 * How is open for write, open for append signalled in the API?

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620361#action_12620361 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

Does use mean we can include it in svn with its runtime libraries?


> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625882#action_12625882 ] 

dhruba borthakur commented on HADOOP-3754:
------------------------------------------

The javadoc warning does not seem to be introduced by this patch. The warnings are:

     [exec]   [javadoc] Loading source files for package org.apache.hadoop.contrib.failmon...
     [exec]   [javadoc] Constructing Javadoc information...
     [exec]   [javadoc] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/core/org/apache/hadoop/fs/FileSystem.java:34: package org.apache.hadoop.hdfs does not exist
     [exec]   [javadoc] import org.apache.hadoop.hdfs.DistributedFileSystem;
     [exec]   [javadoc]                              ^
     [exec]   [javadoc] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/core/org/apache/hadoop/fs/FsShell.java:46: package org.apache.hadoop.hdfs does not exist
     [exec]   [javadoc] import org.apache.hadoop.hdfs.DistributedFileSystem;
     [exec]   [javadoc]                              ^
     [exec]   [javadoc] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/mapred/org/apache/hadoop/mapred/Task.java:36: package org.apache.hadoop.hdfs does not exist
     [exec]   [javadoc] import org.apache.hadoop.hdfs.DistributedFileSystem;
     [exec]   [javadoc]                              ^
     [exec]   [javadoc] Standard Doclet version 1.6.0_06
     [exec]   [javadoc] Building tree for all the packages and classes...
     [exec]   [javadoc] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/core/org/apache/hadoop/fs/FileSystem.java:56: warning - Tag @link: reference not found: DistributedFileSystem
     [exec]   [javadoc] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/core/org/apache/hadoop/metrics/MetricsUtil.java:34: warning - Tag @link: reference not found: org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics
     [exec]   [javadoc] Building index for all the packages and classes...
     [exec]   [javadoc] Building index for all classes...
     [exec]   [javadoc] /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/src/core/overview.html: warning - Tag @link: reference not found: org.apache.hadoop.hdfs.server.namenode.NameNode
     [exec]   [javadoc] Generating /zonestorage/hudson/home/hudson/hudson/jobs/Hadoop-Patch/workspace/trunk/build/docs/api/stylesheet.css...
     [exec]   [javadoc] 6 warnings
     [exec] There appear to be 1 javadoc warnings generated by the patched build.
     [exec] 

None of these files are touched by this patch. I think this patch can be safely committed. Please comment if you think otherwise.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Patch Available  (was: Reopened)

Owen can you pl review it? Thanks.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift5.patch

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift.tar.gz

This includes the changes proposed by Owen. This patch has binary files (jars), so a gzipped tar file is included as a patch.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620772#action_12620772 ] 

Doug Cutting commented on HADOOP-3754:
--------------------------------------

> Hmm, that release is months old. There's been a lot of changes/progress on Thrift since that release was made, including improvements to the Java portion of the system.

We should only include released code from other Apache projects.  So until Thrift makes a release from the incubator, we should not include it, since it's not yet legally Apache code.  Apache's fundamental law is that it's not licensed by Apache until a PMC has voted it so with 3 or more +1 votes and no -1 votes.  If Facebook wanted to produce and distribute a new release of "Facebook Thrift" under an acceptable license (e.g., the Apache license) then Hadoop could re-distribute it, but it could not be anywhere called "Apache Thrift".

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620295#action_12620295 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

Hi Steve,

Regarding #3, thrift is a thin library and as such is a lot lighter than requiring Hadoop RPC which uses reflection and thus must basically be kept in perfect sync with the server side's code (yes, 19 has separate jars, but the client jar isn't very full featured i bet).  Thrift gives you the advantage of tight integration (ie well-defined APIs) whilst still providing pretty loose coupling, although, of course, not as loose as REST.  With thrift, one can work on data structures and thrift does provide some REST-like interfaces too. I think REST may be good for some things like just asking for job status and such, but doesn't provide robust enough APIs as thrift, which could provide a better alternative RPC to remote reflection in addition to providing something for thin clients. And thrift has bindings in literally about 20 languages.

Although I guess protocol buffers are an alternative too now that they are open sourced :) but, I think they only have C, Java and Python bindings.

pete






> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift8.patch

Merged patch with latest trunk

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620033#action_12620033 ] 

Nitay Joffe commented on HADOOP-3754:
-------------------------------------

This is great work Dhruba!

I worked on a very similar system at Powerset and had a few questions/comments:

1) I wrote a C++ stream using boost::iostreams which used the thrift API underneath so that you can work with a standard stream yet read/write to hadoop. In order to make this work, I had to add a seek() method to the IDL. On input streams, I would allow arbitrary seeking. On output streams I would only allow it to get called with an offset of 0 (no actual seeking) which boost streams uses to find the current location.

2) I had situations where the client would not close the files appropriately. This meant that other users would not see the file even though the writing was done because the data did not appear until the file is closed. To fix this situation, I put a TimerTask on each fd which would timeout (and close the file) after some period. Whenever an operation was called on a Handle its TimerTask would reset.

I would attach code but we are currently being merged into Microsoft and in the process of figuring out how open source contributions will work.

Cheers,
-n

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment:     (was: thrift1.patch)

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12625901#action_12625901 ] 

dhruba borthakur commented on HADOOP-3754:
------------------------------------------

Also, the release-audit warnings are from thrift binary files and jars. However, the patch has a LICENSE file that describes the license for thrift.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620286#action_12620286 ] 

Steve Loughran commented on HADOOP-3754:
----------------------------------------

1. until something comes out of incubation, apache can't release it. So checking in the relevant artifacts and releasing downstream versions is going to be on the forbidden list.

2. it may/should be possible to do something about sticking the artifacts up on the maven repository, though there CPU versions and such like complicate binding. 

3. regarding a client that doesnt need java/hadoop.jar, well, you' ve just moved the dependencies, haven't you, and switched to a different kind of RPC. What would appeal to me more would be a RESTy interface that works long-haul, so you could submit work a remote cluster, then monitor that work through any Atom feed reader.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12619947#action_12619947 ] 

Pete Wyckoff commented on HADOOP-3754:
--------------------------------------

+1

but, a few more nits:

1. I do think that requiring people to download and compile thrift will be too much of a hassle given that the compiler is in C++ so checking in the generated code, really is the way to go - I think :)  And of course, this requires checking in the needed libraries in various langauges - the libthrift,.jar, thrift.so, thrift.py, ...  But, we can require it, it just makes it more of a hassle for the user, but in this case, I think we need to have a README that tells people how to do that. Also, why do we need to check in the limited_relection header if the user has to download thrift??

2. The exceptions thrown by the library are very general and do not match the client lib - e.g., IOException, ... although this could be a later add on.

3. A note saying the chown is not atomic - i.e., the group in theory could change between the get and the set

4. I think copy from local would be more robust if one could optionally add a checksum so the server could ensure it's looking at the right file and if not and/or the path does not exist, a meaningful exception is thrown but again could be a later add on

5. Not needed now, but the command line isn't very robust to errors or friendly about printing them out in a meaningful user friends way.

6. Generally a README that explains what this is and/or a bigger release note.

7. Not now, but I would be super, super interested in knowing the performance of read/writes from this server.

8. as we saw with the metastore, it would be cool to have an optional #of minimum threads in the worker pool.

9. I don't quite understand why src/contrib/build-contrib.xml needs to change for adding this??

10. would be better to inherit from thrift/src/contrib/fb303 but could be done later and then include counts for each operation. 

But, this is a killer application since no Java or Hadoop is needed on the client whatsoever! Congratulations! Would be cool even to use the Java bindings from a thin client to show no need for all of hadoop.

I would really, really love to see:

List<BlockAddresses> readBlocks(string filename) throws IOException ;
List<BlockAddresses> writeBlocks(string filename, i64 length) throws IOException;

which give you access to reading/writing directly from the data node over TCP :)

Overall looks very good on the first cut.

pete



> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: thrift1.patch

This patch exposes a Thrift API from the FileSystem client. This is not the most optimal solution. A better and more-performant solution would be to expose Thrift APIs from the Datanode and NameNode directly. 

This patch  requires that a FileSystemClientProxy server be started to expose the Thrift API. The code is, at  this point, more a demonstration and proof-of-concept, rather than a finished piece of development. The Thrift API is a subset of the FileSystem API at present.

The Hadoop Thrift API is in src/contrib/thrift/if. Running "ant package" should compile all that is required. The FileSystemProxyClient can be started by running src/contrib/thrift/start_thrift_server.sh. There is a python command line interface hdfs.py that is used to demonstrate that a python client can use the Thrift FileSystem API.

More information about thrift can be found at http://incubator.apache.org/thrift/index.html



> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Steve Loughran (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627053#action_12627053 ] 

Steve Loughran commented on HADOOP-3754:
----------------------------------------

what may make sense for the future releases of apache thrift is to make sure that the thrift team release versions on the relevant platform for hadoop (linux, windows, 32, 64 bit, etc), and provide source and binary RPMs. If there was a thrift RPM, it would be that much easy for a hadoop RPM to depend on it, and so get the right libraries onto the target OS. 

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12622079#action_12622079 ] 

Hadoop QA commented on HADOOP-3754:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12387895/hadoopthrift6.patch
  against trunk revision 685406.

    -1 @author.  The patch appears to contain 2 @author tags which the Hadoop community has agreed to not allow in code contributions.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to cause Findbugs to fail.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3049/testReport/
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3049/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3049/console

This message is automatically generated.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626312#action_12626312 ] 

Owen O'Malley commented on HADOOP-3754:
---------------------------------------

I rolled this back in Subversion until the binary is removed, the library is moved down into the contrib lib, and the README is added to the lib.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626590#action_12626590 ] 

Hudson commented on HADOOP-3754:
--------------------------------

Integrated in Hadoop-trunk #586 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/586/])

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626476#action_12626476 ] 

Hadoop QA commented on HADOOP-3754:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12389059/hadoopthrift.tar.gz
  against trunk revision 689733.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3133/console

This message is automatically generated.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Reopened: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley reopened HADOOP-3754:
-----------------------------------


I think we should revert this change.
  1. No one +1'ed it.
  2. It checks a binary executable into lib. I think it would be better to check in the generated files, but not the generator.
  3. It uses a different platform name than the rest of the build.
  4. There is no README in lib/thrift that states where the included binaries and jar came from.

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Status: Patch Available  (was: Open)

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-3754:
-------------------------------------

    Attachment: hadoopthrift4.patch

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley updated HADOOP-3754:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just recommitted this. Thanks, Dhruba!

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-3754) Support a Thrift Interface to access files/directories in HDFS

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627490#action_12627490 ] 

Hudson commented on HADOOP-3754:
--------------------------------

Integrated in Hadoop-trunk #589 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/589/])

> Support a Thrift Interface to access files/directories in HDFS
> --------------------------------------------------------------
>
>                 Key: HADOOP-3754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3754
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/thiftfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>
>         Attachments: hadoopthrift.tar.gz, hadoopthrift2.patch, hadoopthrift3.patch, hadoopthrift4.patch, hadoopthrift5.patch, hadoopthrift6.patch, hadoopthrift7.patch, hadoopthrift8.patch, hadoopthrift9.patch, thrift1.patch
>
>
> Thrift is a cross-language RPC framework. It supports automatic code generation for a variety of languages (Java, C++, python, PHP, etc) It would be nice if HDFS APIs are exposed though Thirft. It will allow applications written in any programming language to access HDFS.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.