Posted to solr-user@lucene.apache.org by Jan Høydahl <ja...@cominvent.com> on 2011/04/06 11:44:07 UTC

Solr architecture diagram

Hi,

At Cominvent we've often had the need to visualize the internal architecture of Apache Solr in order to explain both the relationships of the components as well as the flow of data and queries. The result is a conceptual architecture diagram, clearly showing how Solr relates to the app-server, how cores relate to a Solr instance, how documents enter through an UpdateRequestHandler, through an UpdateChain and Analysis and into the Lucene index etc.

The drawing is created using Google draw, and the original is shared on Google Docs. We have licensed the diagram under the permissive Creative Commons "CC-by" license which lets you use, modify and re-distribute the diagram, even commercially, as long as you attribute us with a link.

Check it out at http://ow.ly/4sOTm
We'd love your comments.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com


Re: Solr architecture diagram

Posted by David MARTIN <dm...@gmail.com>.
Hi,

Thank you for this contribution. Such a diagram could be useful in the
official documentation.

David

On Thu, Apr 7, 2011 at 12:15 PM, Jeffrey Chang <jc...@gmail.com> wrote:

> This is awesome; thank you!

Re: Solr architecture diagram

Posted by Jeffrey Chang <jc...@gmail.com>.
This is awesome; thank you!

On Thu, Apr 7, 2011 at 6:09 PM, Jan Høydahl <ja...@cominvent.com> wrote:

> Hi,
>
> Glad you liked it. So you'd like the inner architecture of SolrJ modeled as
> well? Perhaps that should be a separate diagram.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com

Re: Solr architecture diagram

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

Glad you liked it. So you'd like the inner architecture of SolrJ modeled as well? Perhaps that should be a separate diagram.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 6. apr. 2011, at 12.06, Stevo Slavić wrote:

> Nice, thank you!
> 
> I wish there were something similar, or an addition to this one, depicting
> where SolrJ's CommonsHttpSolrServer and EmbeddedSolrServer fit in.
> 
> Regards,
> Stevo.


Re: Solr architecture diagram

Posted by Stevo Slavić <ss...@gmail.com>.
Nice, thank you!

I wish there were something similar, or an addition to this one, depicting
where SolrJ's CommonsHttpSolrServer and EmbeddedSolrServer fit in.

Regards,
Stevo.


Re: Solr architecture diagram

Posted by Lance Norskog <go...@gmail.com>.
Very cool! "The Life Cycle of the IndexSearcher" would also be a great
diagram. The whole dance that happens during a commit is hard to
explain. Also, it would help show why garbage collection can act up
around commits.

Lance

On Sun, Apr 10, 2011 at 2:05 AM, Jan Høydahl <ja...@cominvent.com> wrote:
>> Looks really good, but two bits that i think might confuse people are
>> the implications that a "Query Parser" then invokes a series of search
>> components; and that "analysis" (and the pieces of an analyzer chain)
>> are what do lookups in the underlying lucene index.
>>
>> the first might just be the ambiguity of "Query" .. using the term
>> "request parser" might make more sense, in comparison to the "update
>> parsing" from the other side of the diagram.
>
> Thanks for commenting.
>
> Yea, the purpose is more to show a conceptual rather than actual relation
> between the different components, focusing on the flow. A 100% technically
> correct diagram would be too complex for beginners to comprehend,
> although it could certainly be useful for developers.
>
> I've removed the arrow between QueryParser and search components to clarify.
> The boxes first and foremost show that query parsing and response writers
> are within the realm of the search request handler.
>
>> the analysis piece is a little harder to fix cleanly.  you really want the
>> end of the analysis chain to feed back up to the search components, and
>> then show it (most of the search components really) talking to the Lucene
>> index.
>
> Yea, I know. Showing how Faceting communicates with the main index and
> spellchecker with its spellchecker index could also be useful, but I think
> that would be for another more detailed diagram.
>
> I felt it was more important for beginners to realize visually that
> analysis happens both at index and search time, and that the analyzers
> align 1:1. At this stage in the diagram I often explain the importance
> of matching up the analysis on both sides to get a match in the index.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
>



-- 
Lance Norskog
goksron@gmail.com

Re: Solr architecture diagram

Posted by Jan Høydahl <ja...@cominvent.com>.
> Looks really good, but two bits that i think might confuse people are 
> the implications that a "Query Parser" then invokes a series of search 
> components; and that "analysis" (and the pieces of an analyzer chain) 
> are what do lookups in the underlying lucene index.
> 
> the first might just be the ambiguity of "Query" .. using the term 
> "request parser" might make more sense, in comparison to the "update 
> parsing" from the other side of the diagram.

Thanks for commenting.

Yea, the purpose is more to show a conceptual rather than actual relation
between the different components, focusing on the flow. A 100% technically
correct diagram would be too complex for beginners to comprehend,
although it could certainly be useful for developers.

I've removed the arrow between QueryParser and search components to clarify.
The boxes first and foremost show that query parsing and response writers
are within the realm of the search request handler.

> the analysis piece is a little harder to fix cleanly.  you really want the 
> end of the analysis chain to feed back up to the search components, and 
> then show it (most of the search components really) talking to the Lucene 
> index.

Yea, I know. Showing how Faceting communicates with the main index and
spellchecker with its spellchecker index could also be useful, but I think
that would be for another more detailed diagram.

I felt it was more important for beginners to realize visually that
analysis happens both at index and search time, and that the analyzers
align 1:1. At this stage in the diagram I often explain the importance
of matching up the analysis on both sides to get a match in the index.
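
As a rough illustration of that point, here is a toy sketch in Python. It is
not Solr's or Lucene's API: the analyze, build_index and search functions and
their options are hypothetical and only mimic an analysis chain feeding an
inverted index. It shows a query matching when the same chain runs on both
sides, and missing when the query side skips the index-time steps.

```python
import re

# Hypothetical toy analysis chain: tokenize, lowercase, strip a trailing "s".
# This only imitates the idea of an analyzer; it is not Solr's API.
def analyze(text, lowercase=True, strip_plural=True):
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    if lowercase:
        tokens = [t.lower() for t in tokens]
    if strip_plural:
        tokens = [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]
    return tokens

def build_index(docs, **analyzer_opts):
    """Build an inverted index: analyzed term -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text, **analyzer_opts):
            index.setdefault(term, set()).add(doc_id)
    return index

def search(index, query, **analyzer_opts):
    """Return ids of documents that contain every analyzed query term."""
    terms = analyze(query, **analyzer_opts)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

docs = {1: "Solr Cores", 2: "Lucene index"}
index = build_index(docs)  # index-time analysis: lowercase + plural stripping

# Same chain at query time: "CORES" becomes "core" and finds document 1.
matching = search(index, "CORES")

# Different chain at query time: the raw term "CORES" is not in the index.
mismatched = search(index, "CORES", lowercase=False, strip_plural=False)
```

With the same chain on both sides the query term is normalized to the form
stored in the index; with a different chain the lookup finds nothing.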

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com


Re: Solr architecture diagram

Posted by Chris Hostetter <ho...@fucit.org>.
: of the components as well as the flow of data and queries. The result is 
: a conceptual architecture diagram, clearly showing how Solr relates to 
: the app-server, how cores relate to a Solr instance, how documents enter 
: through an UpdateRequestHandler, through an UpdateChain and Analysis and 
: into the Lucene index etc.

Looks really good, but two bits that i think might confuse people are 
the implications that a "Query Parser" then invokes a series of search 
components; and that "analysis" (and the pieces of an analyzer chain) 
are what do lookups in the underlying lucene index.

the first might just be the ambiguity of "Query" .. using the term 
"request parser" might make more sense, in comparison to the "update 
parsing" from the other side of the diagram.

the analysis piece is a little harder to fix cleanly.  you really want the 
end of the analysis chain to feed back up to the search components, and 
then show it (most of the search components really) talking to the Lucene 
index.

FWIW: the last time i tried to do an architecture diagram for solr was my 
"Beyond the Box" talk a few years back, targeted at people interested in 
writing plugins.  I made my job a lot easier than what you tackled by 
keeping it at the 50,000 foot level where the SolrRequestHandler was the 
smallest unit of work i described.  From that view there are nice 
parallels that can be drawn with more traditional MVC architectures which 
make it a little easier for people to understand...

	http://people.apache.org/~hossman/apachecon2008us/btb/
	Slides #9 & 10


-Hoss