Posted to user@couchdb.apache.org by Mehdi El Fadil <me...@mango-is.com> on 2011/06/21 13:15:03 UTC

data integration to couchdb

Hello,

I have to integrate data into CouchDB. The following situations are
possible:

   - extract data in real time (or near real time) from somewhere online,
   process it, and insert it into CouchDB
   - run batches to get data from some location (e.g. a flat file available
   online), process the data, and insert / update documents inside CouchDB

My initial plan was to use CouchDB both for storage and as the application
server, to keep the architecture simple. However, my data integration
requirements might not be easy to meet using only views, list functions, and
show functions.

   1. How would I listen to real-time messages from the outside world?
   2. Would there be any way to get reasonable performance for the data
   processing (filtering, aggregation) if it is done inside CouchDB?

Thanks in advance for your advice.

cheers,

mehdi

Re: data integration to couchdb

Posted by Mehdi El Fadil <me...@mango-is.com>.
Sean,

Thanks again for your answer and for the hint on OS daemons support. I'm
having a look at it.

Cheers,
m

On Tue, Jun 21, 2011 at 3:10 PM, Sean Copenhaver <se...@gmail.com> wrote:

> Oh, I'm sorry. I completely misunderstood. Your understanding of the
> _changes API is correct. It's for the outside world to listen to couch. I do
> not think that couch can listen to the outside world only be told things via
> HTTP requests.
>
> I believe this sort of thing would be done with an external process, which
> couch has a basic feature to help manage them. It doesn't have to be node.js
> just something to do the ETL process.
>
> Feature I mentioned above. Like I said very basic but does give you
> auto-restart and helps keep the pieces fairly couch centered.
>
> http://docs.couchbase.org/couchdb-release-1.1/index.html#couchdb-release-1.1-osprocess



-- 
Mehdi El Fadil
<http://www.mango-is.com>

Re: data integration to couchdb

Posted by Sean Copenhaver <se...@gmail.com>.
Oh, I'm sorry. I completely misunderstood. Your understanding of the
_changes API is correct. It's for the outside world to listen to couch. I do
not think that couch can listen to the outside world; it can only be told
things via HTTP requests.

I believe this sort of thing would be done with an external process, and
couch has a basic feature to help manage such processes. It doesn't have to
be node.js, just something to do the ETL process.

Here is the feature I mentioned above. Like I said, it's very basic, but it
does give you auto-restart and helps keep the pieces fairly couch-centered.
http://docs.couchbase.org/couchdb-release-1.1/index.html#couchdb-release-1.1-osprocess
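
For illustration, here is a minimal sketch of what such an external ETL process might look like in Python, assuming the source is a flat CSV file. The helper names (rows_to_docs, bulk_request, load), the "sku" column, and the host and database names are all hypothetical; _bulk_docs is couch's standard bulk-write endpoint. To have couch manage the script, the 1.1 docs linked above describe an [os_daemons] section in the ini file where each entry maps a name to a command line (for example "etl = /path/to/etl.py", a made-up path).

```python
import csv
import io
import json
import urllib.request


def rows_to_docs(csv_text, id_field):
    """Turn CSV text into a list of dicts, using one column as the document _id."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = dict(row)
        doc["_id"] = doc.pop(id_field)  # assumption: this column is unique per row
        docs.append(doc)
    return docs


def bulk_request(base, db, docs):
    """Build (but do not send) a POST to the database's _bulk_docs endpoint."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/{db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def load(base, db, csv_text, id_field):
    """Send one batch and return couch's per-document result list."""
    request = bulk_request(base, db, rows_to_docs(csv_text, id_field))
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical host, database, and data; adjust to your own setup.
    sample = "sku,price\nA1,3.50\nB2,7.00\n"
    print(load("http://localhost:5984", "mydb", sample, "sku"))
```

Documents that already exist need their current _rev in the payload; otherwise _bulk_docs reports a conflict for them instead of updating them.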

On Tue, Jun 21, 2011 at 8:57 AM, Mehdi El Fadil <me...@mango-is.com> wrote:

> Thanks for the answer Sean,
>
> but I am not sure whether this is what I need. From my understanding, the
> _changes API only listens to what happens inside couchdb. What I need to do
> is to listen to the outside world (via websites APIs, web services, RSS...)
> at server side, and then only insert or update my data in couchdb.
>
> Am I getting the _changes API wrong, would it allow to do so? I feel like I
> would need another server side technology such as node.js.
>
> Mehdi.



-- 
“The limits of language are the limits of one's world. “ -Ludwig von
Wittgenstein

Re: data integration to couchdb

Posted by Mehdi El Fadil <me...@mango-is.com>.
Thanks for the answer, Sean,

but I am not sure whether this is what I need. From my understanding, the
_changes API only listens to what happens inside CouchDB. What I need to do
is listen to the outside world (via website APIs, web services, RSS...)
on the server side, and only then insert or update my data in CouchDB.
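
To make the requirement concrete, here is a minimal sketch of the server-side piece in Python, assuming the outside source is an RSS 2.0 feed that gets polled. The function names and the "rss-" plus SHA-1 ID scheme are assumptions made for illustration, not anything CouchDB prescribes. Fetching the feed and the actual HTTP PUT are left out.

```python
import hashlib
import xml.etree.ElementTree as ET


def rss_to_docs(xml_text):
    """Turn each <item> of an RSS 2.0 feed into a dict ready to store."""
    docs = []
    for item in ET.fromstring(xml_text).iter("item"):
        link = item.findtext("link", default="")
        docs.append({
            # Hypothetical ID scheme: hashing the link means polling the
            # same item twice targets the same document.
            "_id": "rss-" + hashlib.sha1(link.encode("utf-8")).hexdigest(),
            "title": item.findtext("title", default=""),
            "link": link,
        })
    return docs


def with_current_rev(doc, existing):
    """Copy _rev from the stored document (or None) so a PUT updates it
    instead of being rejected as a conflict."""
    merged = dict(doc)
    if existing is not None:
        merged["_rev"] = existing["_rev"]
    return merged
```

The "insert or update" decision then comes down to whether a GET on the document ID finds an existing document: if it does, its _rev goes into the PUT; if not, the document is created.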

Am I getting the _changes API wrong? Would it allow me to do so? I feel like I
would need another server-side technology such as node.js.

Mehdi.


On Tue, Jun 21, 2011 at 2:18 PM, Sean Copenhaver <se...@gmail.com> wrote:

> I could not comment on the performance but you can use the _changes API to
> get a soft real-time stream of changes to the database. It has a few options
> including listening continuously and applying filters to only receive what
> you are interested in:
>
> http://guide.couchdb.org/draft/notifications.html

Re: data integration to couchdb

Posted by Sean Copenhaver <se...@gmail.com>.
I can't comment on the performance, but you can use the _changes API to
get a soft real-time stream of changes to the database. It has a few options,
including listening continuously and applying filters so that you only
receive what you are interested in:

http://guide.couchdb.org/draft/notifications.html
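
As a rough sketch of consuming that feed from Python: the query parameters (feed=continuous, since, heartbeat, filter) are the standard _changes options, while the function names, host, database, and the "app/important" filter are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode


def changes_url(base, db, since=0, filter_name=None):
    """Build a continuous _changes URL; filter_name is 'designdoc/filtername'."""
    params = {"feed": "continuous", "since": since, "heartbeat": 10000}
    if filter_name:
        params["filter"] = filter_name
    return f"{base}/{db}/_changes?{urlencode(params)}"


def parse_changes(lines):
    """Yield one dict per change row, skipping blank heartbeat lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line:
            continue  # heartbeat: couch sends an empty line to keep the socket open
        row = json.loads(line)
        if "id" in row:  # ignore a trailing {"last_seq": ...} line
            yield row


def follow(base, db, **kwargs):
    """Open the feed and yield changes as they arrive (blocks while waiting)."""
    with urllib.request.urlopen(changes_url(base, db, **kwargs)) as response:
        for change in parse_changes(response):
            yield change


if __name__ == "__main__":
    # Hypothetical host, database, and filter name.
    for change in follow("http://localhost:5984", "mydb", filter_name="app/important"):
        print(change["seq"], change["id"])
```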

On Tue, Jun 21, 2011 at 7:15 AM, Mehdi El Fadil <me...@mango-is.com> wrote:

> Hello,
>
> I have to integrate data into couchdb. The following situations will be
> possible:
>
>   - extract data in real time (or near real-time) from a somewhere online,
>   process it and insert in couchdb
>   - run batches to get data from some location (eg flat file available
>   online), process data and insert / update documents inside couchdb
>
> My initial plan was to use couchdb both for storage and as application
> server, to keep a simple architecture. However, my data integration
> requirement might not be feasible easily using only views, list, and show
> functions.
>
>   1. How would I listen to real-time messages from the outside world?
>   2. Would there be any way to have reasonable performance in the data
>   processing (filtering, aggregation) if done from couchdb?
>
> Thanks in advance for your advice.
>
> cheers,
>
> mehdi
>



-- 
“The limits of language are the limits of one's world. “ -Ludwig von
Wittgenstein