You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@couchdb.apache.org by Bram Neijt <bn...@gmail.com> on 2010/12/27 17:09:42 UTC

Re: Need feedback on validate_doc_read

Hi all,

After getting to know erlang the hard way, I've found a place to put
in a patch and hook up validate_doc_read code.

I've got a patch which implements a validate_doc_read in the same
manner as validate_doc_update is implemented. The patch is available
at
https://github.com/bneijt/couchdb

Things that are still on the TODO are the following:
- Check the functioning when it comes to replication
- Test the performance hit this will have on the server
- Create a configuration option to enable or disable support for this

An even bigger question is: is this a feature we would ever want in
the mainstream couchdb releases? Is this something the couchdb team
would like to support?

But appart from that question, I would love some feedback on the
implementation, the erlang and structure of it all, so please consider
checking it out and posting some comments.

Greets,

Bram


On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
> Dear developers,
>
>
> After going into the theoretical depths[1] of what performance hits
> there may be and how replication will be affected, I've decided to
> just implement a simple solution and see how far I can get.
>
> I've decided to try to implement a validate_doc_get function, in the
> same manner as the validate_doc_update has been implemented. I've been
> reading the code and I've gotten as far as finding
> prep_and_validate_updates and handle_doc_show and I'm now thinking of
> copying and pasting some logic in to see where it gets me.
>
> I would love some advice on the matter and welcome any comments/feedback.
>
>
> Greets,
>
> Bram
>
> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
> PS If properly implemented, validate_doc_get will not fix all problems
> and you will still need a firewall like system, however it may give
> some insight into where to go from there.
>

Re: Need feedback on validate_doc_read

Posted by Jan Lehnardt <ja...@apache.org>.
Just clearing up an aside:

On 28 Dec 2010, at 12:56, Filipe David Manana wrote:

> Overall I'm not sure this as a feature that CouchDB needs, but that's
> up to the community/PMCs to decide.

The community decides and the committers ultimately commit. The PMC helps running the project, but doesn't dictate project future. Of course the PMC is all committers, but it is important to understand the different hats: 

  http://www.apache.org/foundation/how-it-works.html

Cheers
Jan
-- 


> 
> Nevertheless, thanks for your efforts.
> 
> On Mon, Dec 27, 2010 at 4:09 PM, Bram Neijt <bn...@gmail.com> wrote:
>> Hi all,
>> 
>> After getting to know erlang the hard way, I've found a place to put
>> in a patch and hook up validate_doc_read code.
>> 
>> I've got a patch which implements a validate_doc_read in the same
>> manner as validate_doc_update is implemented. The patch is available
>> at
>> https://github.com/bneijt/couchdb
>> 
>> Things that are still on the TODO are the following:
>> - Check the functioning when it comes to replication
>> - Test the performance hit this will have on the server
>> - Create a configuration option to enable or disable support for this
>> 
>> An even bigger question is: is this a feature we would ever want in
>> the mainstream couchdb releases? Is this something the couchdb team
>> would like to support?
>> 
>> But appart from that question, I would love some feedback on the
>> implementation, the erlang and structure of it all, so please consider
>> checking it out and posting some comments.
>> 
>> Greets,
>> 
>> Bram
>> 
>> 
>> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>>> Dear developers,
>>> 
>>> 
>>> After going into the theoretical depths[1] of what performance hits
>>> there may be and how replication will be affected, I've decided to
>>> just implement a simple solution and see how far I can get.
>>> 
>>> I've decided to try to implement a validate_doc_get function, in the
>>> same manner as the validate_doc_update has been implemented. I've been
>>> reading the code and I've gotten as far as finding
>>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>>> copying and pasting some logic in to see where it gets me.
>>> 
>>> I would love some advice on the matter and welcome any comments/feedback.
>>> 
>>> 
>>> Greets,
>>> 
>>> Bram
>>> 
>>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>>> PS If properly implemented, validate_doc_get will not fix all problems
>>> and you will still need a firewall like system, however it may give
>>> some insight into where to go from there.
>>> 
>> 
> 
> 
> 
> -- 
> Filipe David Manana,
> fdmanana@gmail.com, fdmanana@apache.org
> 
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."


Re: Need feedback on validate_doc_read

Posted by Paul Davis <pa...@gmail.com>.
On Tue, Dec 28, 2010 at 2:50 PM, Bram Neijt <bn...@gmail.com> wrote:
> Hi all,
>
> First of all, thank you all for your replies. I've had a great lot of
> fun working my way though Erlang, and I'm glad this patch may help in
> clearing up the question on whether this is something that should be
> on the CouchDB roadmap or not.
>
> Robert is completely right in pointing out that this does not protect
> views at all, (and I have not even tested it on show functions
> either). Protecting a view is not really possible as that would
> require a view per user/role, I can not think of any straight forward
> way of implementing that. Using a database per user is a solution
> here, because that forces a separate view cache as well.
>
> I'll clean up the patch using the comments given, which should make it
> possible for people to try the performance impact. I myself have never
> really cared for the performance impact. IMHO performance will be met
> as soon as the feature is accepted and used enough ;)
>
> Anybody wanting this feature should test the performance impact and
> convince as many people as possible!
>
> Greets,
>
> Bram

Me and Rob were doing random spitballing on IRC about how one might
use something like this to protect views. We ended up coming down on
having a function that takes the view URL and query string and does
the guards on that. As opposed to trying something ridiculous like a
row-based authorization based on doc ids or some such.

Also, for this to hit trunk its definitely going to need some
performance characterization for people to discuss on whether its a
good trade off. Anyone that wants to see this in trunk will have to
show that (especially when not used) it doesn't affect base
performance and that performance isn't *too* bad when it is being
used.

>
> On Tue, Dec 28, 2010 at 12:56 PM, Filipe David Manana
> <fd...@apache.org> wrote:
>> Hi Bram,
>>
>> I second what Paul and Robert said before.
>>
>> Here:
>> https://github.com/bneijt/couchdb/commit/5d53b79145a267d0f955668f0e7253b74cfda6cc#L3R127
>>
>> You're assuming Else is always {_, Doc} tuple, which is wrong. It
>> might be an error for e.g. You'll likely want to call
>> validate_doc_read for couch_db:open_doc_revs/3.
>>
>> Overall I'm not sure this as a feature that CouchDB needs, but that's
>> up to the community/PMCs to decide.
>>
>> Nevertheless, thanks for your efforts.
>>
>> On Mon, Dec 27, 2010 at 4:09 PM, Bram Neijt <bn...@gmail.com> wrote:
>>> Hi all,
>>>
>>> After getting to know erlang the hard way, I've found a place to put
>>> in a patch and hook up validate_doc_read code.
>>>
>>> I've got a patch which implements a validate_doc_read in the same
>>> manner as validate_doc_update is implemented. The patch is available
>>> at
>>> https://github.com/bneijt/couchdb
>>>
>>> Things that are still on the TODO are the following:
>>> - Check the functioning when it comes to replication
>>> - Test the performance hit this will have on the server
>>> - Create a configuration option to enable or disable support for this
>>>
>>> An even bigger question is: is this a feature we would ever want in
>>> the mainstream couchdb releases? Is this something the couchdb team
>>> would like to support?
>>>
>>> But appart from that question, I would love some feedback on the
>>> implementation, the erlang and structure of it all, so please consider
>>> checking it out and posting some comments.
>>>
>>> Greets,
>>>
>>> Bram
>>>
>>>
>>> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>>>> Dear developers,
>>>>
>>>>
>>>> After going into the theoretical depths[1] of what performance hits
>>>> there may be and how replication will be affected, I've decided to
>>>> just implement a simple solution and see how far I can get.
>>>>
>>>> I've decided to try to implement a validate_doc_get function, in the
>>>> same manner as the validate_doc_update has been implemented. I've been
>>>> reading the code and I've gotten as far as finding
>>>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>>>> copying and pasting some logic in to see where it gets me.
>>>>
>>>> I would love some advice on the matter and welcome any comments/feedback.
>>>>
>>>>
>>>> Greets,
>>>>
>>>> Bram
>>>>
>>>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>>>> PS If properly implemented, validate_doc_get will not fix all problems
>>>> and you will still need a firewall like system, however it may give
>>>> some insight into where to go from there.
>>>>
>>>
>>
>>
>>
>> --
>> Filipe David Manana,
>> fdmanana@gmail.com, fdmanana@apache.org
>>
>> "Reasonable men adapt themselves to the world.
>>  Unreasonable men adapt the world to themselves.
>>  That's why all progress depends on unreasonable men."
>>
>

Re: Need feedback on validate_doc_read

Posted by Bram Neijt <bn...@gmail.com>.
Hi all,

First of all, thank you all for your replies. I've had a great lot of
fun working my way though Erlang, and I'm glad this patch may help in
clearing up the question on whether this is something that should be
on the CouchDB roadmap or not.

Robert is completely right in pointing out that this does not protect
views at all, (and I have not even tested it on show functions
either). Protecting a view is not really possible as that would
require a view per user/role, I can not think of any straight forward
way of implementing that. Using a database per user is a solution
here, because that forces a separate view cache as well.

I'll clean up the patch using the comments given, which should make it
possible for people to try the performance impact. I myself have never
really cared for the performance impact. IMHO performance will be met
as soon as the feature is accepted and used enough ;)

Anybody wanting this feature should test the performance impact and
convince as many people as possible!

Greets,

Bram

On Tue, Dec 28, 2010 at 12:56 PM, Filipe David Manana
<fd...@apache.org> wrote:
> Hi Bram,
>
> I second what Paul and Robert said before.
>
> Here:
> https://github.com/bneijt/couchdb/commit/5d53b79145a267d0f955668f0e7253b74cfda6cc#L3R127
>
> You're assuming Else is always {_, Doc} tuple, which is wrong. It
> might be an error for e.g. You'll likely want to call
> validate_doc_read for couch_db:open_doc_revs/3.
>
> Overall I'm not sure this as a feature that CouchDB needs, but that's
> up to the community/PMCs to decide.
>
> Nevertheless, thanks for your efforts.
>
> On Mon, Dec 27, 2010 at 4:09 PM, Bram Neijt <bn...@gmail.com> wrote:
>> Hi all,
>>
>> After getting to know erlang the hard way, I've found a place to put
>> in a patch and hook up validate_doc_read code.
>>
>> I've got a patch which implements a validate_doc_read in the same
>> manner as validate_doc_update is implemented. The patch is available
>> at
>> https://github.com/bneijt/couchdb
>>
>> Things that are still on the TODO are the following:
>> - Check the functioning when it comes to replication
>> - Test the performance hit this will have on the server
>> - Create a configuration option to enable or disable support for this
>>
>> An even bigger question is: is this a feature we would ever want in
>> the mainstream couchdb releases? Is this something the couchdb team
>> would like to support?
>>
>> But appart from that question, I would love some feedback on the
>> implementation, the erlang and structure of it all, so please consider
>> checking it out and posting some comments.
>>
>> Greets,
>>
>> Bram
>>
>>
>> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>>> Dear developers,
>>>
>>>
>>> After going into the theoretical depths[1] of what performance hits
>>> there may be and how replication will be affected, I've decided to
>>> just implement a simple solution and see how far I can get.
>>>
>>> I've decided to try to implement a validate_doc_get function, in the
>>> same manner as the validate_doc_update has been implemented. I've been
>>> reading the code and I've gotten as far as finding
>>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>>> copying and pasting some logic in to see where it gets me.
>>>
>>> I would love some advice on the matter and welcome any comments/feedback.
>>>
>>>
>>> Greets,
>>>
>>> Bram
>>>
>>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>>> PS If properly implemented, validate_doc_get will not fix all problems
>>> and you will still need a firewall like system, however it may give
>>> some insight into where to go from there.
>>>
>>
>
>
>
> --
> Filipe David Manana,
> fdmanana@gmail.com, fdmanana@apache.org
>
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."
>

Re: Need feedback on validate_doc_read

Posted by Filipe David Manana <fd...@apache.org>.
Hi Bram,

I second what Paul and Robert said before.

Here:
https://github.com/bneijt/couchdb/commit/5d53b79145a267d0f955668f0e7253b74cfda6cc#L3R127

You're assuming Else is always {_, Doc} tuple, which is wrong. It
might be an error for e.g. You'll likely want to call
validate_doc_read for couch_db:open_doc_revs/3.

Overall I'm not sure this as a feature that CouchDB needs, but that's
up to the community/PMCs to decide.

Nevertheless, thanks for your efforts.

On Mon, Dec 27, 2010 at 4:09 PM, Bram Neijt <bn...@gmail.com> wrote:
> Hi all,
>
> After getting to know erlang the hard way, I've found a place to put
> in a patch and hook up validate_doc_read code.
>
> I've got a patch which implements a validate_doc_read in the same
> manner as validate_doc_update is implemented. The patch is available
> at
> https://github.com/bneijt/couchdb
>
> Things that are still on the TODO are the following:
> - Check the functioning when it comes to replication
> - Test the performance hit this will have on the server
> - Create a configuration option to enable or disable support for this
>
> An even bigger question is: is this a feature we would ever want in
> the mainstream couchdb releases? Is this something the couchdb team
> would like to support?
>
> But appart from that question, I would love some feedback on the
> implementation, the erlang and structure of it all, so please consider
> checking it out and posting some comments.
>
> Greets,
>
> Bram
>
>
> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>> Dear developers,
>>
>>
>> After going into the theoretical depths[1] of what performance hits
>> there may be and how replication will be affected, I've decided to
>> just implement a simple solution and see how far I can get.
>>
>> I've decided to try to implement a validate_doc_get function, in the
>> same manner as the validate_doc_update has been implemented. I've been
>> reading the code and I've gotten as far as finding
>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>> copying and pasting some logic in to see where it gets me.
>>
>> I would love some advice on the matter and welcome any comments/feedback.
>>
>>
>> Greets,
>>
>> Bram
>>
>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>> PS If properly implemented, validate_doc_get will not fix all problems
>> and you will still need a firewall like system, however it may give
>> some insight into where to go from there.
>>
>



-- 
Filipe David Manana,
fdmanana@gmail.com, fdmanana@apache.org

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."

Re: Need feedback on validate_doc_read

Posted by Paul Davis <pa...@gmail.com>.
On Mon, Dec 27, 2010 at 12:11 PM, Robert Newson <ro...@gmail.com> wrote:
> Firstly, the patch contains some whitespace changes (couch_db.erl line
> 122 and couch_httpd.erl line 66, and some spurious INFO logs,
> couch_httpd.erl line 129) that should be removed.
>
> Secondly, the patch appears not to protect document content held in
> views at all, correct? If so, I'm -1 on the patch as it stands. A view
> that emits the doc as a value would bypass this access control
> entirely, I think.

Also, include_docs=true should be considered as well.

>
> Thirdly, no tests.
>
> B.
>
>
> On Mon, Dec 27, 2010 at 4:46 PM, Paul Davis <pa...@gmail.com> wrote:
>> On Mon, Dec 27, 2010 at 11:09 AM, Bram Neijt <bn...@gmail.com> wrote:
>>> Hi all,
>>>
>>> After getting to know erlang the hard way, I've found a place to put
>>> in a patch and hook up validate_doc_read code.
>>>
>>> I've got a patch which implements a validate_doc_read in the same
>>> manner as validate_doc_update is implemented. The patch is available
>>> at
>>> https://github.com/bneijt/couchdb
>>>
>>> Things that are still on the TODO are the following:
>>> - Check the functioning when it comes to replication
>>> - Test the performance hit this will have on the server
>>> - Create a configuration option to enable or disable support for this
>>>
>>> An even bigger question is: is this a feature we would ever want in
>>> the mainstream couchdb releases? Is this something the couchdb team
>>> would like to support?
>>>
>>> But appart from that question, I would love some feedback on the
>>> implementation, the erlang and structure of it all, so please consider
>>> checking it out and posting some comments.
>>>
>>> Greets,
>>>
>>> Bram
>>>
>>
>> Skimming that last commit everything looks sane. The answer to whether
>> we'd pull something like that into trunk pretty much depends on your
>> three questions. It was never implemented because we assumed that it'd
>> be a performance killer and no one wanted to try it out to see how bad
>> it was.
>>
>> I'm not sure I'd like it to be configurable though. Being able to turn
>> something like that off doesn't seem like a good idea. Though, making
>> sure that the case where no design doc has a validate function doesn't
>> hurt performance would be close enough for most people I think.
>>
>>>
>>> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>>>> Dear developers,
>>>>
>>>>
>>>> After going into the theoretical depths[1] of what performance hits
>>>> there may be and how replication will be affected, I've decided to
>>>> just implement a simple solution and see how far I can get.
>>>>
>>>> I've decided to try to implement a validate_doc_get function, in the
>>>> same manner as the validate_doc_update has been implemented. I've been
>>>> reading the code and I've gotten as far as finding
>>>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>>>> copying and pasting some logic in to see where it gets me.
>>>>
>>>> I would love some advice on the matter and welcome any comments/feedback.
>>>>
>>>>
>>>> Greets,
>>>>
>>>> Bram
>>>>
>>>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>>>> PS If properly implemented, validate_doc_get will not fix all problems
>>>> and you will still need a firewall like system, however it may give
>>>> some insight into where to go from there.
>>>>
>>>
>>
>

Re: Need feedback on validate_doc_read

Posted by Robert Newson <ro...@gmail.com>.
Firstly, the patch contains some whitespace changes (couch_db.erl line
122 and couch_httpd.erl line 66, and some spurious INFO logs,
couch_httpd.erl line 129) that should be removed.

Secondly, the patch appears not to protect document content held in
views at all, correct? If so, I'm -1 on the patch as it stands. A view
that emits the doc as a value would bypass this access control
entirely, I think.

Thirdly, no tests.

B.


On Mon, Dec 27, 2010 at 4:46 PM, Paul Davis <pa...@gmail.com> wrote:
> On Mon, Dec 27, 2010 at 11:09 AM, Bram Neijt <bn...@gmail.com> wrote:
>> Hi all,
>>
>> After getting to know erlang the hard way, I've found a place to put
>> in a patch and hook up validate_doc_read code.
>>
>> I've got a patch which implements a validate_doc_read in the same
>> manner as validate_doc_update is implemented. The patch is available
>> at
>> https://github.com/bneijt/couchdb
>>
>> Things that are still on the TODO are the following:
>> - Check the functioning when it comes to replication
>> - Test the performance hit this will have on the server
>> - Create a configuration option to enable or disable support for this
>>
>> An even bigger question is: is this a feature we would ever want in
>> the mainstream couchdb releases? Is this something the couchdb team
>> would like to support?
>>
>> But appart from that question, I would love some feedback on the
>> implementation, the erlang and structure of it all, so please consider
>> checking it out and posting some comments.
>>
>> Greets,
>>
>> Bram
>>
>
> Skimming that last commit everything looks sane. The answer to whether
> we'd pull something like that into trunk pretty much depends on your
> three questions. It was never implemented because we assumed that it'd
> be a performance killer and no one wanted to try it out to see how bad
> it was.
>
> I'm not sure I'd like it to be configurable though. Being able to turn
> something like that off doesn't seem like a good idea. Though, making
> sure that the case where no design doc has a validate function doesn't
> hurt performance would be close enough for most people I think.
>
>>
>> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>>> Dear developers,
>>>
>>>
>>> After going into the theoretical depths[1] of what performance hits
>>> there may be and how replication will be affected, I've decided to
>>> just implement a simple solution and see how far I can get.
>>>
>>> I've decided to try to implement a validate_doc_get function, in the
>>> same manner as the validate_doc_update has been implemented. I've been
>>> reading the code and I've gotten as far as finding
>>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>>> copying and pasting some logic in to see where it gets me.
>>>
>>> I would love some advice on the matter and welcome any comments/feedback.
>>>
>>>
>>> Greets,
>>>
>>> Bram
>>>
>>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>>> PS If properly implemented, validate_doc_get will not fix all problems
>>> and you will still need a firewall like system, however it may give
>>> some insight into where to go from there.
>>>
>>
>

Re: Need feedback on validate_doc_read

Posted by Paul Davis <pa...@gmail.com>.
On Mon, Dec 27, 2010 at 11:09 AM, Bram Neijt <bn...@gmail.com> wrote:
> Hi all,
>
> After getting to know erlang the hard way, I've found a place to put
> in a patch and hook up validate_doc_read code.
>
> I've got a patch which implements a validate_doc_read in the same
> manner as validate_doc_update is implemented. The patch is available
> at
> https://github.com/bneijt/couchdb
>
> Things that are still on the TODO are the following:
> - Check the functioning when it comes to replication
> - Test the performance hit this will have on the server
> - Create a configuration option to enable or disable support for this
>
> An even bigger question is: is this a feature we would ever want in
> the mainstream couchdb releases? Is this something the couchdb team
> would like to support?
>
> But appart from that question, I would love some feedback on the
> implementation, the erlang and structure of it all, so please consider
> checking it out and posting some comments.
>
> Greets,
>
> Bram
>

Skimming that last commit everything looks sane. The answer to whether
we'd pull something like that into trunk pretty much depends on your
three questions. It was never implemented because we assumed that it'd
be a performance killer and no one wanted to try it out to see how bad
it was.

I'm not sure I'd like it to be configurable though. Being able to turn
something like that off doesn't seem like a good idea. Though, making
sure that the case where no design doc has a validate function doesn't
hurt performance would be close enough for most people I think.

>
> On Tue, Dec 7, 2010 at 1:18 PM, Bram Neijt <bn...@gmail.com> wrote:
>> Dear developers,
>>
>>
>> After going into the theoretical depths[1] of what performance hits
>> there may be and how replication will be affected, I've decided to
>> just implement a simple solution and see how far I can get.
>>
>> I've decided to try to implement a validate_doc_get function, in the
>> same manner as the validate_doc_update has been implemented. I've been
>> reading the code and I've gotten as far as finding
>> prep_and_validate_updates and handle_doc_show and I'm now thinking of
>> copying and pasting some logic in to see where it gets me.
>>
>> I would love some advice on the matter and welcome any comments/feedback.
>>
>>
>> Greets,
>>
>> Bram
>>
>> [1] http://wiki.apache.org/couchdb/PerDocumentAuthorization
>> PS If properly implemented, validate_doc_get will not fix all problems
>> and you will still need a firewall like system, however it may give
>> some insight into where to go from there.
>>
>