
Posted to solr-user@lucene.apache.org by Dorian Hoxha <do...@gmail.com> on 2016/11/17 07:46:22 UTC

Using solr(cloud) as source-of-truth for data (with no backing external db)

Hi,

Does anyone use Solr as the source of truth for data, with no `normal` db
(of course with normal backups/replication)?

Are there any drawbacks ?

Thank You

Re: Using solr(cloud) as source-of-truth for data (with no backing external db)

Posted by Alexandre Rafalovitch <ar...@gmail.com>.

Sure. And people do it, especially for their first deployment. I
have some prototypes/proof-of-concepts like that myself.

Just later don't say you didn't ask and we didn't tell :-)

Regards,
    Alex.
----
Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/


On 18 November 2016 at 20:45, Dorian Hoxha <do...@gmail.com> wrote:
> @alex
> That makes sense, but it can be ~fixed by just storing every field that you
> need.
>
> @Walter
> Many of those things are missing from many nosql dbs yet they're used as
> source of data.
> As long as the backup is "point in time", meaning consistent timestamp
> across all shards it ~should be ok for many usecases.
>
> The 1-line-curl may need a patch to be disabled from config.

Re: Using solr(cloud) as source-of-truth for data (with no backing external db)

Posted by Dorian Hoxha <do...@gmail.com>.

Yeah that looks like the _source that elasticsearch has.

On Mon, Nov 21, 2016 at 9:20 PM, Michael Joyner <mi...@newsrx.com> wrote:

> Have a "store only" text field that contains a serialized (JSON?) copy of
> the master object, for deserialization as part of the results parsing, if
> you want to save a DB lookup.
>
> I would still store everything in a DB though, to have a "master" copy of
> everything.

Re: Using solr(cloud) as source-of-truth for data (with no backing external db)

Posted by Michael Joyner <mi...@newsrx.com>.

Have a "store only" text field that contains a serialized (JSON?) copy of
the master object, for deserialization as part of the results parsing, if
you want to save a DB lookup.

I would still store everything in a DB though, to have a "master" copy of
everything.
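
A minimal sketch of the "store only" field idea, in Python. The field
name source_json and the dynamic-field suffixes _t and _ss are
assumptions for illustration, not something from this thread; in the
schema the source field would be declared with indexed="false" and
stored="true" so it is returned with results but never searched.

```python
import json

# Hypothetical field name; declared stored-but-not-indexed in the schema.
SOURCE_FIELD = "source_json"


def to_solr_doc(master: dict) -> dict:
    """Build a Solr document: searchable fields plus the serialized original."""
    return {
        "id": master["id"],
        "title_t": master["title"],            # searchable copy (assumed suffix)
        "tags_ss": master.get("tags", []),     # searchable copy (assumed suffix)
        SOURCE_FIELD: json.dumps(master, sort_keys=True),
    }


def from_solr_hit(hit: dict) -> dict:
    """Recover the master object from a search hit without a DB lookup."""
    return json.loads(hit[SOURCE_FIELD])


master = {
    "id": "42",
    "title": "Quarterly report",
    "tags": ["finance"],
    "author": {"name": "A. Writer"},  # nested data survives inside the JSON blob
}
doc = to_solr_doc(master)
restored = from_solr_hit(doc)
```

The round trip only preserves what JSON can represent, and it does not
replace the separate "master" copy recommended above.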


On 11/18/2016 04:45 AM, Dorian Hoxha wrote:
> @alex
> That makes sense, but it can be ~fixed by just storing every field that you
> need.
>
> @Walter
> Many of those things are missing from many nosql dbs yet they're used as
> source of data.
> As long as the backup is "point in time", meaning consistent timestamp
> across all shards it ~should be ok for many usecases.
>
> The 1-line-curl may need a patch to be disabled from config.

Re: Using solr(cloud) as source-of-truth for data (with no backing external db)

Posted by Dorian Hoxha <do...@gmail.com>.

@alex
That makes sense, but it can be ~fixed by just storing every field that you
need.

@Walter
Many of those things are missing from many nosql dbs yet they're used as
source of data.
As long as the backup is "point in time", meaning a consistent timestamp
across all shards, it ~should be ok for many use cases.

The 1-line-curl may need a patch to be disabled from config.

On Thu, Nov 17, 2016 at 6:29 PM, Walter Underwood <wu...@wunderwood.org>
wrote:

> I agree, it is a bad idea.
>
> Solr is missing nearly everything you want in a repository, because it is
> not designed to be a repository.
>
> Does not have:
>
> * access control
> * transactions
> * transactional backup
> * dump and load
> * schema migration
> * versioning
>
> And so on.
>
> Also, I’m glad to share a one-line curl command that will delete all the
> documents
> in your collection.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)

Re: Using solr(cloud) as source-of-truth for data (with no backing external db)

Posted by Walter Underwood <wu...@wunderwood.org>.

I agree, it is a bad idea.

Solr is missing nearly everything you want in a repository, because it is
not designed to be a repository.

Does not have:

* access control
* transactions
* transactional backup
* dump and load
* schema migration
* versioning

And so on.

Also, I’m glad to share a one-line curl command that will delete all the documents
in your collection.
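
The command itself is not given here. Presumably it is a
delete-by-query sent to the collection's update handler; the sketch
below builds (but deliberately does not send) such a request in Python.
The host, port and collection name are placeholders, and the point is
only that nothing but network access stands between a client and an
empty index.

```python
import json
import urllib.request


def build_delete_all_request(base_url: str, collection: str) -> urllib.request.Request:
    """Build a delete-by-query request matching every document (*:*)."""
    url = f"{base_url}/{collection}/update?commit=true"
    body = json.dumps({"delete": {"query": "*:*"}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and collection name, for illustration only.
req = build_delete_all_request("http://localhost:8983/solr", "mycollection")

# Not sent on purpose: urllib.request.urlopen(req) would execute the delete.
```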

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Nov 17, 2016, at 1:20 AM, Alexandre Rafalovitch <ar...@gmail.com> wrote:
> 
> I've heard of people doing it but it is not recommended.
> 
> One of the biggest implementation breakthroughs is that - after the
> initial learning curve - you will start mapping your input data to
> signals. Those signals will not look very much like your original data
> and therefore are not terribly suitable to be the source of it.
> 
> We are talking copyFields, UpdateRequestProcessor pre-processing,
> fields that are not stored, nested documents flattening,
> denormalization, etc. Getting back from that to original shape of data
> is painful.
> 
> Regards,
>   Alex.
> ----
> Solr Example reading group is starting November 2016, join us at
> http://j.mp/SolrERG
> Newsletter and resources for Solr beginners and intermediates:
> http://www.solr-start.com/

Re: Using solr(cloud) as source-of-truth for data (with no backing external db)

Posted by Alexandre Rafalovitch <ar...@gmail.com>.

I've heard of people doing it but it is not recommended.

One of the biggest implementation breakthroughs is that - after the
initial learning curve - you will start mapping your input data to
signals. Those signals will not look very much like your original data
and therefore are not terribly suitable to be the source of it.

We are talking copyFields, UpdateRequestProcessor pre-processing,
fields that are not stored, nested documents flattening,
denormalization, etc. Getting back from that to original shape of data
is painful.
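
A toy illustration of why the way back is painful, as a Python sketch.
The order structure and field names are made up for this example; the
flattening stands in for what denormalization and a copyField-style
catch-all do at index time. Two different source records end up as the
same indexed document, so the original cannot be reconstructed from it.

```python
def flatten_for_index(order: dict) -> dict:
    """Denormalize a nested order into flat, multi-valued search fields."""
    items = order["items"]
    return {
        "id": order["id"],
        "item_names": [item["name"] for item in items],
        # Colors of all items merged into one multi-valued field.
        "item_colors": [color for item in items for color in item["colors"]],
        # Catch-all text field, similar in spirit to a copyField target.
        "text_all": " ".join([order["customer"]] + [item["name"] for item in items]),
    }


# Two different originals: which item is blue differs between them.
order_a = {
    "id": "1",
    "customer": "Ann",
    "items": [{"name": "shirt", "colors": ["red", "blue"]},
              {"name": "hat", "colors": []}],
}
order_b = {
    "id": "1",
    "customer": "Ann",
    "items": [{"name": "shirt", "colors": ["red"]},
              {"name": "hat", "colors": ["blue"]}],
}

flat_a = flatten_for_index(order_a)
flat_b = flatten_for_index(order_b)

# The flattened forms are identical even though the sources are not.
same_index_doc = flat_a == flat_b
```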

Regards,
   Alex.
----
Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/

On 17 November 2016 at 18:46, Dorian Hoxha <do...@gmail.com> wrote:
> Hi,
>
> Anyone use solr for source-of-data with no `normal` db (of course with
> normal backups/replication) ?
>
> Are there any drawbacks ?
>
> Thank You