Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2018/10/08 21:11:05 UTC

[GitHub] cluxter commented on issue #1540: Streaming API for attachment data

URL: https://github.com/apache/couchdb/issues/1540#issuecomment-427980458
 
 
   **In an ideal situation, I would like to be able to**:
   1) **upload attachments of unlimited size**, i.e. limited only by the file system, not by the CouchDB storage layer (so nothing like this: https://github.com/apache/couchdb/pull/1253 );
   2) **have smooth replication of these attachments between CouchDB instances**, i.e. replicating huge attachments won't clog up CouchDB in any way (which doesn't mean the replication wouldn't be slowed down, obviously; we don't have unlimited bandwidth). Two small sketches of what 1) and 2) could look like with today's HTTP API follow this list.
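   
   For 1), here is a minimal sketch of a streaming upload against the existing attachment endpoint (`PUT /{db}/{docid}/{attname}`), using Python's `requests`, which streams a file object instead of buffering it in memory. The host, credentials, database, document id and revision are placeholders.
   
   ```python
   import requests
   
   # Stream a large file to CouchDB's attachment endpoint without
   # loading it into memory: passing a file object as `data=` makes
   # `requests` send the body incrementally. All names below are
   # placeholders.
   COUCH = "http://admin:password@localhost:5984"
   url = f"{COUCH}/mydb/mydoc/backup.img"
   
   with open("backup.img", "rb") as f:
       resp = requests.put(
           url,
           params={"rev": "1-967a00dff5e02add41819138abb3284d"},  # current doc revision (placeholder)
           headers={"Content-Type": "application/octet-stream"},
           data=f,  # streamed from disk, never fully in memory
       )
   resp.raise_for_status()
   ```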
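   
   And for 2), a sketch of how replication is started today: writing a document into the `_replicator` database makes CouchDB run a (here continuous) replication job, attachments included. Endpoints and credentials are again placeholders.
   
   ```python
   import requests
   
   # Create a continuous replication job by inserting a document
   # into the _replicator database. CouchDB picks it up and
   # replicates the database, attachments included.
   COUCH = "http://admin:password@localhost:5984"
   
   job = {
       "_id": "mydb-to-remote",  # any unique id (placeholder)
       "source": "http://admin:password@localhost:5984/mydb",
       "target": "http://admin:password@remote:5984/mydb",
       "continuous": True,
       "create_target": True,
   }
   resp = requests.post(f"{COUCH}/_replicator", json=job)
   resp.raise_for_status()
   ```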
   
   **This desire implies that**:
   1) **storing huge attachments in a database is not seen as bad practice**. I'm certain some people will say "Hey, storing files of thousands of gigabytes in a database is silly; it means your storage design is wrong, so go fix that instead of using CouchDB as a file system". Well, in 10 or 15 years, files of hundreds of gigabytes might be normal for some workloads, and I would like CouchDB to scale by design, not merely with whatever hardware happens to be available over time. The idea here is _not_ to use CouchDB as a file system, but to have one place in which _all_ the data of a software system can fit. I don't like having to use one storage system for small files (CouchDB) and another for big files, especially when the size limit is arbitrary and depends on the available bandwidth/CPU (or some vague notion of them). Putting a maximum size limit on attachments basically means we don't want to deal with this issue and leave it for another system to fix. Or worse: we let people believe they can use attachments when... actually, not really.
   2) **we need a strongly resilient and reliable replication system that can operate under bad conditions**. This would align with the resilience CouchDB already offers with regard to unexpected shutdowns. My instinct tells me that a P2P approach similar to Kazaa/eMule/BitTorrent (I mean the multi-source P2P paradigm, not the protocols per se) would be ideal, because it's fast, efficient and resilient; a toy sketch of the idea follows this list. But maybe it is not well suited to CouchDB. Or maybe we already do something like this (not what I've understood so far, though). I'm pretty sure this would require _a lot_ of work, but I would at least like to know that it's somewhere on the long-term roadmap.
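   
   To make the multi-source idea in 2) concrete, here is a purely hypothetical toy sketch: fetch different byte ranges of one attachment from several replicas in parallel and reassemble them in order. It leans on HTTP `Range` requests, which CouchDB supports for attachments, but this only illustrates the paradigm; it is not how CouchDB's replicator works today, and all hosts and sizes are made up.
   
   ```python
   import concurrent.futures
   import requests
   
   # Toy multi-source download: round-robin fixed-size byte ranges
   # of one attachment across several replicas and write the chunks
   # back in order. Purely illustrative; not CouchDB's replicator.
   SOURCES = [
       "http://replica-a:5984/mydb/mydoc/backup.img",
       "http://replica-b:5984/mydb/mydoc/backup.img",
       "http://replica-c:5984/mydb/mydoc/backup.img",
   ]
   TOTAL = 3 * 1024**3   # attachment size in bytes (placeholder)
   CHUNK = 64 * 1024**2  # 64 MiB per range request
   
   def fetch(job):
       start, url = job
       end = min(start + CHUNK, TOTAL) - 1
       r = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
       r.raise_for_status()
       return r.content
   
   # Assign each chunk to a source, round-robin.
   jobs = [(off, SOURCES[i % len(SOURCES)])
           for i, off in enumerate(range(0, TOTAL, CHUNK))]
   
   with open("backup.img", "wb") as out, \
        concurrent.futures.ThreadPoolExecutor(len(SOURCES)) as pool:
       for chunk in pool.map(fetch, jobs):  # map preserves input order
           out.write(chunk)
   ```
   
   A real design would, of course, verify chunk digests and retry failed ranges from another source; that is where the BitTorrent-style resilience would come from.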
   
   Now, **this is a personal vision** of what CouchDB should look like, and maybe it is not shared by many other people. Or maybe it is. Please don't hesitate to (respectfully and constructively) criticize my views and argue against them; I'm eager to learn more about why this should or should not be done.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services