Posted to commits@stanbol.apache.org by rw...@apache.org on 2012/07/10 09:46:00 UTC

svn commit: r1359510 - in /incubator/stanbol/site/trunk/content/stanbol/docs/trunk: entityhub/ utils/

Author: rwesten
Date: Tue Jul 10 07:45:59 2012
New Revision: 1359510

URL: http://svn.apache.org/viewvc?rev=1359510&view=rev
Log:
ManagedSite documentation; Copied README.md from commons/solr to the webpage

Added:
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-clerezzayard-config.png   (with props)
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-yardsite-config.png   (with props)
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-manatedsite-solryard-config.png   (with props)
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/managedsite.mdtext
    incubator/stanbol/site/trunk/content/stanbol/docs/trunk/utils/commons-solr.mdtext

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-clerezzayard-config.png
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-clerezzayard-config.png?rev=1359510&view=auto
==============================================================================
Binary file - no diff available.

Propchange: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-clerezzayard-config.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-yardsite-config.png
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-yardsite-config.png?rev=1359510&view=auto
==============================================================================
Binary file - no diff available.

Propchange: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-managedsite-yardsite-config.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-manatedsite-solryard-config.png
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-manatedsite-solryard-config.png?rev=1359510&view=auto
==============================================================================
Binary file - no diff available.

Propchange: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/entityhub-manatedsite-solryard-config.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/managedsite.mdtext
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/managedsite.mdtext?rev=1359510&view=auto
==============================================================================
--- incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/managedsite.mdtext (added)
+++ incubator/stanbol/site/trunk/content/stanbol/docs/trunk/entityhub/managedsite.mdtext Tue Jul 10 07:45:59 2012
@@ -0,0 +1,105 @@
+Title: ManagedSite
+
+A ManagedSite allows users to manage a collection of Entities by using the RESTful API of the Entityhub. Unlike the ReferencedSite implementation, it cannot refer to remote services. Therefore all changes to Entities managed by a ManagedSite are performed via the RESTful API of the Entityhub.
+
+Users can configure multiple ManagedSites with the Stanbol Entityhub. They are identified by their id and share the id-space with other Sites (e.g. ReferencedSites). The RESTful services of a ManagedSite are available via the URL pattern
+
+    http://{stanbol-instance}/entityhub/site/{siteId}
+
+_NOTE:_ To make this documentation less abstract, it uses a scenario in which someone wants to manage the [IPTC Descriptive NewsCodes](http://www.iptc.org/cms/site/index.html?channel=CH0103#descrncd) by using a ManagedSite. Typical Stanbol users will want to manage their own Entities (e.g. Tags/Categories of their CMS) instead.
+
+### Manage Entities by using RESTful services
+
+The RESTful API of a ManagedSite is the same as that of other Sites, except that the "/entity" endpoint additionally supports creating, updating and deleting Entities.
+
+The following example shows how to upload a SKOS vocabulary to a ManagedSite:
+
+    :::bash
+    curl -i -X PUT -H "Content-Type: application/rdf+xml" -T subject-code.rdf \
+        "http://localhost:8080/entityhub/site/iptc/entity"
+
+This example assumes that Stanbol is running on 'localhost' port '8080' and that a ManagedSite with the id 'iptc' was configured. The uploaded file 'subject-code.rdf' contains the IPTC [subject-codes](http://cv.iptc.org/newscodes/subjectcode/). To also upload the vocabulary containing the [genre](http://cv.iptc.org/newscodes/genre/)s, call
+
+    :::bash
+    curl -i -X PUT -H "Content-Type: application/rdf+xml" -T genre.rdf "http://localhost:8080/entityhub/site/iptc/entity"
+
+Calls like these create/update all Entities contained in the parsed RDF data. To ensure that only a single Entity is created/updated, specify the 'id' parameter.
+
+    :::bash
+    curl -i -X PUT -H "Content-Type: application/rdf+xml" -T genre.rdf "http://localhost:8080/entityhub/site/iptc/entity?id=http://cv.iptc.org/newscodes/genre/Exclusive"
+
+This will ignore all other RDF data and update only the 'genre:Exclusive' entity.
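+
+When such requests are sent from Java rather than from curl, the Entity URI used as the 'id' value should be percent-encoded. The following is a minimal sketch that only builds the request URL, following the URL pattern `http://{stanbol-instance}/entityhub/site/{siteId}` given at the top of this page. The class `EntityUrls` and its method are hypothetical helpers for illustration and are not part of Stanbol.
+
+```java
+import java.net.URLEncoder;
+import java.nio.charset.StandardCharsets;
+
+/** Hypothetical helper (not part of Stanbol) that builds '/entity' request URLs. */
+final class EntityUrls {
+
+    /** Returns {base}/entityhub/site/{siteId}/entity, plus an encoded 'id' parameter if given. */
+    static String entityUrl(String base, String siteId, String entityId) {
+        String url = base + "/entityhub/site/" + siteId + "/entity";
+        if (entityId == null) {
+            return url; // addresses all Entities contained in the uploaded RDF data
+        }
+        // Percent-encode the Entity URI so that its ':' and '/' survive as a query value
+        return url + "?id=" + URLEncoder.encode(entityId, StandardCharsets.UTF_8);
+    }
+
+    public static void main(String[] args) {
+        System.out.println(entityUrl("http://localhost:8080", "iptc",
+            "http://cv.iptc.org/newscodes/genre/Exclusive"));
+    }
+}
+```
+
+The resulting string can be used as the target of a PUT request with the "Content-Type: application/rdf+xml" header, as in the curl calls above.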
+
+For the full documentation of the CRUD interface of the '/entity' endpoint of a ManagedSite please have a look at the RESTful API documentation served by the Web UI of the Stanbol Entityhub.
+
+### Configuration of ManagedSites
+
+Currently there is a single implementation of the ManagedSite interface, the YardSite, which uses a <code>Yard</code> instance for managing the entities.
+
+To use a YardSite, users need to configure two services:
+
+1. Yard: The Entityhub currently includes two Yard implementations: the SolrYard and the ClerezzaYard. The SolrYard is optimal for use with the Stanbol Enhancer as it allows very fast label-based retrieval of Entities. So if you plan to use the ManagedSite primarily with the Stanbol Enhancer, this is the Yard implementation to choose. The ClerezzaYard stores the managed Entities within a TripleStore. While the ClerezzaYard is not as efficient for use with the Stanbol Enhancer, its data can be queried by using the SPARQL endpoint of Apache Stanbol.
+2. YardSite: This configures the ManagedSite. This configuration links to the configured Yard via its id.
+
+#### Configuration of a SolrYard:
+
+This describes how to configure a SolrYard to be used with a YardSite by using the Configuration tab of the Apache Felix Webconsole [http://{stanbol-instance}/system/console/configMgr](http://localhost:8080/system/console/configMgr).
+
+![Typical SolrYard configuration for a YardSite](entityhub-manatedsite-solryard-config.png)
+
+The above figure shows a typical SolrYard configuration for a YardSite. Important properties are 
+
+* __ID__: This MUST BE unique among all Yards. It is recommended to use "{siteId}Yard".
+* __Solr Index/Core__: This is the name of the SolrCore that will be used to store the data. Here it is recommended to use the same name as the {siteId}. This is because the RESTful API of the SolrCore is published under <code>http://{stanbol-instance}/solr/default/{solrCore}</code>. So using the same name for {siteId} and {solrCore} makes it easier to map the RESTful API of the SolrCore to the ManagedSite published under <code>http://{stanbol-instance}/entityhub/site/{siteId}</code>.
+* __Use default SolrCore configuration__: If enabled, the SolrCore will be automatically created by using the default configuration. Users will typically want to use this option. Only users that want to use a special SolrCore configuration need to deactivate this option and provide a <code>{solrCore}.solrindex.zip</code> archive containing the special configuration in the <code>{stanbol-workingdir}/stanbol/datafiles</code> directory. See the [Managing Solr Indexes](../utils/commons-solr.html#managingsolrindexes) section for detailed information.
+
+#### Configuration of a ClerezzaYard:
+
+This describes how to configure a ClerezzaYard to be used with a YardSite by using the Configuration tab of the Apache Felix Webconsole [http://{stanbol-instance}/system/console/configMgr](http://localhost:8080/system/console/configMgr).
+
+![Typical ClerezzaYard configuration for a YardSite](entityhub-managedsite-clerezzayard-config.png)
+
+The above figure shows a typical ClerezzaYard configuration for a YardSite. Important properties are
+
+* __ID__: This MUST BE unique among all Yards. It is recommended to use "{siteId}Yard".
+* __Graph URI__: This configures the URI of the named graph used to store the RDF data. If a graph with this URI is already present, then it will be reused by this Yard. Otherwise an empty graph with this URI is created using the Clerezza [TcManager](http://incubator.apache.org/clerezza/mvn-site/rdf.core/apidocs/org/apache/clerezza/rdf/core/access/TcManager.html). If this field is empty, a URN will be used as the default graph URI.
+
+The ClerezzaYard also registers its RDF graph with the Apache Stanbol SPARQL service available at <code>http://{stanbol-instance}/sparql</code>.
+
+To query the RDF graph of a ClerezzaYard you need to specify its configured Graph URI in SPARQL queries posted to the Stanbol SPARQL endpoint:
+
+    :::bash
+    curl -i -X POST -d "graphuri=http://cv.iptc.org/newscodes" \
+        --data-urlencode "query@sparqlQuery.txt" \
+        "http://localhost:8080/sparql"
+
+where 'sparqlQuery.txt' refers to a file containing the SPARQL query e.g.
+
+    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
+    SELECT DISTINCT ?concept ?prefLabel ?altLabel ?parent
+    WHERE {
+        ?concept a skos:Concept .
+        ?concept skos:prefLabel ?prefLabel .
+        OPTIONAL {
+            ?concept skos:altLabel ?altLabel .
+        }
+        OPTIONAL {
+            ?concept skos:broader ?parent .
+        }
+    }
+
+#### Configuration of the YardSite
+
+Finally you need to configure the YardSite that uses the previously configured Yard instance (either SolrYard or ClerezzaYard). Again this will show how to configure the YardSite by using the Configuration tab of the Apache Felix Webconsole [http://{stanbol-instance}/system/console/configMgr](http://localhost:8080/system/console/configMgr).
+
+![Typical YardSite configuration](entityhub-managedsite-yardsite-config.png)
+
+The above figure shows the configuration of the YardSite. The important properties are
+
+* __ID__: This is the {siteId} used to map this ManagedSite to the RESTful API of the Stanbol Entityhub. Make sure that the ID is unique over all configured Sites.
+* __Yard ID__: Here you need to put the ID of the Yard configured in the previous step. If no Yard with that ID is active, the ManagedSite will not be initialized and will therefore not be available via the RESTful API.
+
+The __Entity Prefix(es)__ are an optional configuration. They are used by the SiteManager (the "/entityhub/sites" endpoint) to determine whether requested entities can be dereferenced via a registered site. If not present, the SiteManager will try to dereference every request by using this ManagedSite. Configuring them correctly may therefore slightly improve performance by avoiding unnecessary requests.
+
+The __Field Mappings__ can be used to copy property values of created/updated Entities to other properties. The mappings used in the above figure ensure that SKOS preferred/alternate labels, FOAF (Friend of a Friend) names, Dublin Core titles as well as the name property of the schema.org ontology are copied over to rdfs:label. This configuration is the default as the Stanbol Enhancer uses <code>rdfs:label</code> as default property for linking entities based on their names.
+
+After completing all those steps you should see a new empty ManagedSite under
+
+    http://{stanbol-instance}/entityhub/site/iptc

Added: incubator/stanbol/site/trunk/content/stanbol/docs/trunk/utils/commons-solr.mdtext
URL: http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/utils/commons-solr.mdtext?rev=1359510&view=auto
==============================================================================
--- incubator/stanbol/site/trunk/content/stanbol/docs/trunk/utils/commons-solr.mdtext (added)
+++ incubator/stanbol/site/trunk/content/stanbol/docs/trunk/utils/commons-solr.mdtext Tue Jul 10 07:45:59 2012
@@ -0,0 +1,311 @@
+Title: Stanbol Commons Solr
+
+Solr is used by several Apache Stanbol components. The Apache Stanbol Solr Commons artifacts provide a set of utilities that ease the use of Solr within OSGi, allow the initialization and management of Solr indexes, and publish Solr's RESTful interface on the OSGi HttpService.
+
+Although these utilities were implemented with the requirements of Apache Stanbol in mind, they do not depend on other Stanbol components that are not themselves part of
+"stanbol.commons".
+
+
+## Solr OSGi Bundle
+
+The "org.apache.stanbol.commons.solr.core" bundle currently includes all dependencies required by Solr and also exports the client as well as the server API. For details please have a look at the pom file of the "solr.core" artifact.
+
+Please also note the exclusion list: some libraries currently not directly used by Stanbol are explicitly excluded. Using features that depend on them within a "solrconfig.xml" or "schema.xml" will result in "ClassNotFoundException" and "NoClassDefFoundError".
+
+If you require an additional library that is currently not included, please notify us on the stanbol-dev mailing list.
+
+
+## Solr Server Components
+
+This section describes how to manage and access the server-side CoreContainer and SolrCore components of Solr.
+
+
+### Accessing CoreContainers and SolrCores
+
+All CoreContainers and SolrCores initialized by the Stanbol Solr framework are registered with the OSGi Service Registry. This means that other bundles can obtain them by using
+
+    CoreContainer defaultSolrServer;
+    ServiceReference ref = bundleContext.getServiceReference(
+        CoreContainer.class.getName());
+    if (ref != null) {
+        defaultSolrServer = (CoreContainer) bundleContext.getService(ref);
+    } else {
+        defaultSolrServer = null; //no SolrServer available
+    }
+
+It is also possible to track service registration and unregistration events by using the OSGi ServiceTracker utility.
+
+The above code snippet always returns the SolrServer with the highest priority (the highest value for the "service.ranking" property). However, the OSGi Service Registry also allows services to be obtained/tracked by using filters. To specify such filters it is important to know which metadata are provided when services are registered with the OSGi Service Registry.
+
+
+#### Metadata for CoreContainer:
+
+* **org.apache.solr.core.CoreContainer.name**: The name of the SolrServer. The name MUST BE provided for each Solr CoreContainer registered with this framework. It is a required field for each configuration. If two CoreContainers are registered with the same name, the "service.ranking" property is used to determine the currently active CoreContainer for a request. Others registered for the same name may be used as fallbacks. The container name is used as a URL path component when the `publishREST` parameter is true. It is recommended to use lowercase names without non-ASCII characters.
+* **org.apache.solr.core.CoreContainer.dir**: The directory of a CoreContainer. This is the directory containing the "solr.xml" file.
+* **org.apache.solr.core.CoreContainer.solrXml**: The name of the Solr CoreContainer configuration file. Currently always "solr.xml".
+* **org.apache.solr.core.CoreContainer.cores**: A read only collection of the names of all cores registered with the CoreContainer.
+* **service.ranking**: The OSGi "service.ranking" property is used to specify the ranking of a CoreContainer. The CoreContainer with the highest ranking is considered the default server and will be returned by calls to bundleContext.getServiceReference(..) without the use of a filter.
+* **org.apache.solr.core.CoreContainer.publishREST**: Boolean switch that allows to enable/disable the publishing of the Solr RESTful API on "http://{host}:{port}/solr/{server-name}". Requires the "SolrServerPublishingComponent" to be active.
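+
+The metadata keys above can be combined into an OSGi (LDAP-style) filter string. The following is a minimal sketch that only assembles such a string for a CoreContainer registered under a given name. The class `CoreContainerFilters` is a hypothetical helper written for this page and is not part of the Stanbol or OSGi APIs. The property keys are the ones listed above.
+
+```java
+/** Hypothetical helper (not part of Stanbol) that assembles OSGi filter strings. */
+final class CoreContainerFilters {
+
+    static final String OBJECT_CLASS = "objectClass";
+    static final String CONTAINER_CLASS = "org.apache.solr.core.CoreContainer";
+    static final String NAME_KEY = "org.apache.solr.core.CoreContainer.name";
+
+    /** Escapes the characters that are special inside a filter value. */
+    static String escape(String value) {
+        StringBuilder sb = new StringBuilder();
+        for (char c : value.toCharArray()) {
+            if (c == '\\' || c == '*' || c == '(' || c == ')') {
+                sb.append('\\');
+            }
+            sb.append(c);
+        }
+        return sb.toString();
+    }
+
+    /** Filter matching CoreContainer services registered with the given name. */
+    static String byName(String serverName) {
+        return "(&(" + OBJECT_CLASS + "=" + CONTAINER_CLASS + ")("
+            + NAME_KEY + "=" + escape(serverName) + "))";
+    }
+
+    public static void main(String[] args) {
+        System.out.println(byName("myserver"));
+    }
+}
+```
+
+A string built this way can be passed as the filter argument of `bundleContext.getServiceReferences(null, filter)`. Because several CoreContainers may share a name, the result can contain more than one ServiceReference.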
+
+
+#### Metadata for SolrCores:
+
+* **org.apache.solr.core.SolrCore.name**: The name of the SolrCore as registered with the CoreContainer
+* **org.apache.solr.core.SolrCore.dir**: The instance directory of the SolrCore
+* **org.apache.solr.core.SolrCore.datadir**: The data directory of the SolrCore
+* **org.apache.solr.core.SolrCore.indexdir**: The directory of the index used by this SolrCore
+* **org.apache.solr.core.SolrCore.schema**: The name (excluding the directory) of the Solr schema used by this core
+* **org.apache.solr.core.SolrCore.solrconf**: The name (excluding the directory) of the Solr core configuration file
+
+In addition the following metadata of the CoreContainer for this SolrCore are also available
+
+* **org.apache.solr.core.CoreContainer.id**: The `SERVICE_ID` of the CoreContainer this SolrCore is registered with. This is usually the easiest way to obtain the ServiceReference to the CoreContainer of a SolrCore.
+* **org.apache.solr.core.CoreContainer.name**: The name of the CoreContainer this SolrCore is registered with. Note that multiple CoreContainers may be registered for the same name. Therefore this property MUST NOT be used to filter for the ServiceReference to the CoreContainer of a SolrCore.
+* **org.apache.solr.core.CoreContainer.dir**: The Solr directory of the CoreContainer for this SolrCore.
+* **service.ranking**: The OSGi service.ranking of the CoreContainer this SolrCore is registered with. SolrCores do not define their own service.ranking but use the ranking of the CoreContainer they are registered with.
+
+The keys mentioned above for the metadata of registered CoreContainers and SolrCores are defined as public constants in the [SolrConstants](http://svn.apache.org/repos/asf/incubator/stanbol/trunk/commons/solr/core/src/main/java/org/apache/stanbol/commons/solr/SolrConstants.java) class.
+
+
+### ReferencedSolrServer
+
+This component initializes a Solr server running within the same JVM as Stanbol based on indexes provided by a directory on the local file system. It does not support management capabilities, but it initializes a Solr CoreContainer based on the data in the file system and registers it (including all SolrCores) with the OSGi Service Registry as described above.
+
+The ReferencedSolrServer uses the ManagedServiceFactory pattern. This means that instances are created by passing configurations to the OSGi ConfigurationAdmin service. Practically this means that:
+
+* users can create instances by using the Configuration tab of the Apache Felix Web Console
+* programmers can directly use the ConfigurationAdmin service to create/update and delete configurations
+* configurations can also be passed via the Apache Sling [OSGi installer](http://sling.apache.org/site/osgi-installer.html) framework. This means configurations can be included within the Stanbol launchers or Bundles, or copied to a directory configured for the [File Provider](http://svn.apache.org/repos/asf/sling/trunk/installer/providers/file/)
+
+Configurations need to include the following properties (see also section "Metadata for CoreContainer" for details about such properties)
+
+* **org.apache.solr.core.CoreContainer.name**: The name for the Solr Server
+* **org.apache.solr.core.CoreContainer.dir**: The path to the directory on the local file system that is used to initialize the CoreContainer
+* **service.ranking**: The OSGi service ranking used to register the CoreContainer and its SolrCores. If not specified '0' will be used as default. The value MUST BE an integer number.
+* **org.apache.solr.core.CoreContainer.publishREST**: Boolean switch that allows to enable/disable the publishing of the Solr RESTful API on "http://{host}:{port}/solr/{server-name}". Requires the "SolrServerPublishingComponent" to be active.
+
+**NOTE:** Keep in mind that if the RESTful API of the SolrServer is published, users might use the Admin Request handler to manipulate the Solr configuration. In such cases the metadata provided by the ServiceReferences for the CoreContainer and SolrCores might get out of sync with the actual configuration of the server.
+
+
+### ManagedSolrServer
+
+This component manages a multi-core Solr server. It provides an API to create, update and remove SolrCores. In addition, cores can be activated and deactivated.
+
+
+#### Creating ManagedServerInstances
+
+The ManagedSolrServer uses the ManagedServiceFactory pattern. This means that instances are created by passing configurations to the OSGi ConfigurationAdmin service. Practically this means that:
+
+* users can create instances by using the Configuration tab of the Apache Felix Web Console
+* programmers can directly use the ConfigurationAdmin service to create/update and delete configurations
+* configurations can also be passed via the Apache Sling [OSGi installer](http://sling.apache.org/site/osgi-installer.html) framework. This means configurations can be included within the Stanbol launchers or Bundles, or copied to a directory configured for the [File Provider](http://svn.apache.org/repos/asf/sling/trunk/installer/providers/file/)
+
+Configurations need to include the following properties (see also section "Metadata for CoreContainer" for details about such properties). Although the properties are the same as for the ReferencedSolrServer their semantics differs in some aspects.
+
+* **org.apache.solr.core.CoreContainer.name**: The name for the Solr Server
+* **org.apache.solr.core.CoreContainer.dir**: Optionally a directory to store the data. If not specified, the data will be stored in a directory with the configured server-name at the default location (currently "${sling.home}/indexes/" or "indexes/" if the environment variable 'sling.home' is not present). Users that want to create multiple ManagedSolrServers with the same name need to specify the directory, or the servers will overwrite each other's data.
+* **service.ranking**: The OSGi service ranking used to register the CoreContainer and its SolrCores. If not specified '0' will be used as default. The value MUST BE an integer number. In scenarios where a single ManagedSolrServer is expected it is highly recommended to specify `Integer.MAX_VALUE` (2147483647) as service ranking. This will ensure that this server cannot be overridden by others.
+* **org.apache.solr.core.CoreContainer.publishREST**: Boolean switch that allows to enable/disable the publishing of the Solr RESTful API on "http://{host}:{port}/solr/{server-name}". Requires the "SolrServerPublishingComponent" to be active.
+
+**NOTE:** Keep in mind that if the RESTful API of the SolrServer is published, users might use the Admin Request handler to manipulate the Solr configuration. In such cases the metadata provided by the ServiceReferences for the CoreContainer and SolrCores might get out of sync with the actual configuration of the server.
+
+
+#### Managing Solr Indexes
+
+This describes how to manage (create, update, remove, activate, deactivate) Indexes on a ManagedSolrServer.
+
+Not every managed index corresponds to a SolrCore registered on the CoreContainer. However, every SolrCore on the CoreContainer has a 1:1 mapping to a managed index on the ManagedSolrServer.
+
+A managed index can be in one of the following states (defined by the ManagedIndexState enumeration):
+
+* **UNINITIALISED**: An index that was created but is still missing the configuration and/or index data is in this state. The ManagedSolrServer API allows indexes to be created by referring to a Solr-Index-Archive. Such archives are then requested via the Stanbol DataFileProvider service. Usually users can provide them by copying the linked index archive to the "/sling/datafiles" folder.
+* **INACTIVE**: This indicates that an index was deactivated via the ManagedSolrServer API. The data are still kept, but the SolrCore was removed from the CoreContainer.
+* **ACTIVE**: This indicates that an index is active and can be used. Only indexes that are ACTIVE are registered with the CoreContainer.
+* **ERROR**: This state indicates an error during the initialization. The stack trace of the error is available in the IndexMetadata.
+
+Indexes can be managed not only by calls to the API of the ManagedSolrServer. The "org.apache.stanbol.commons.solr.install" bundle also supports installing/uninstalling indexes by using the Apache Sling [OSGi installer](http://sling.apache.org/site/osgi-installer.html) framework. This allows indexes to be installed by providing Solr-Index-Archives or Solr-Index-Archive-References to any available Provider. By default Apache Stanbol includes Providers for the Launchers and Bundles. The Sling Installer Framework additionally includes Providers for directories on the file system and for JCR repositories.
+
+Solr-Index-Archives do use the following name pattern:
+
+    {name}.solrindex[.zip|.gz|.bz2]
+
+* They are normal archives starting with the instance directory of a Solr Core.
+* The name of this instance directory MUST BE the same as the {name} of the archive.
+* The second extension specifies the type of the archive. If no extension is specified, the type of the archive might still be detected by reading the first few bytes of the archive.
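+
+The name pattern above can be checked with a regular expression. The following is a minimal sketch, assuming that only the name pattern matters (it does not look inside the archive). The class `IndexArchiveNames` is a hypothetical helper and is not part of the "org.apache.stanbol.commons.solr" bundles.
+
+```java
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+/** Hypothetical helper (not part of Stanbol) for Solr-Index-Archive file names. */
+final class IndexArchiveNames {
+
+    // {name}.solrindex followed by an optional .zip, .gz or .bz2 extension
+    private static final Pattern ARCHIVE =
+        Pattern.compile("(.+)\\.solrindex(?:\\.(zip|gz|bz2))?");
+
+    /**
+     * Returns {name, type} for a Solr-Index-Archive file name. The type is
+     * null if no second extension is present. Returns null if the file name
+     * does not follow the pattern at all.
+     */
+    static String[] parse(String fileName) {
+        Matcher m = ARCHIVE.matcher(fileName);
+        if (!m.matches()) {
+            return null;
+        }
+        return new String[] { m.group(1), m.group(2) };
+    }
+
+    public static void main(String[] args) {
+        String[] parts = parse("dbpedia.solrindex.zip");
+        System.out.println(parts[0] + " / " + parts[1]);
+    }
+}
+```
+
+Under this sketch a name such as `dbpedia.solrindex.ref` is not treated as an archive, which keeps archive references separate from the archives they point to.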
+
+Solr-Index-Archive-References are normal Java properties files and do use the following name pattern:
+
+    {name}.solrindex.ref
+
+The following keys are used (see also org.apache.stanbol.commons.solr.managed.ManagedIndexConstants):
+
+* **Index-Archive**: Comma separated list of Solr-Index-Archives that can be used for initializing this index. The first index archive in the list has the highest priority. Higher priority archives will replace the data of lower priority ones as soon as they become available. This feature is intended to allow the replacement of a small sample dataset (e.g. shipped within a Bundle or the Launcher) with the full dataset downloaded later from a remote Internet archive or pushed manually to the `sling/datafiles` folder of a previously installed Stanbol instance. For instance the `dbpedia.solrindex.ref` archive reference configuration provided in the default launcher has the line: `Index-Archive=dbpedia.solrindex.zip,dbpedia_43k.solrindex.zip` and only `dbpedia_43k.solrindex.zip` is shipped in the default launchers, allowing for override by any archive named `dbpedia.solrindex.zip`.
+* **Index-Name**: The name of the Index. If not specified the {name} part of the first Index-Archive in the list will be used.
+* **Server-Name**: The name of the ManagedSolrServer this Solr index MUST BE deployed on. If not present it will be deployed on the default ManagedSolrServer (the ManagedSolrServer with the highest priority).
+* **Synchronized**: Boolean switch. If enabled the index will be synchronized with the referenced Solr-Index-Archives. That means the DataFileTracker service will be used to periodically track the states of referenced Solr-Index-Archives. This allows managed Solr indexes to be initialized, updated and uninitialized by simply making Solr-Index-Archives available or unavailable to the DataFileProvider infrastructure (such as users copying/deleting files in the "/sling/datafiles" directory).
+* **other Properties**: All parsed properties are forwarded to the DataFileProvider/DataFileTracker service when looking for the referenced Solr-Index-Archives. These components might also define some special keys associated with specific functionalities. Please look at the documentation of these services for details.
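+
+Because Solr-Index-Archive-References are normal Java properties files, they can be read with `java.util.Properties`. The following is a minimal sketch of how the keys above could be interpreted. The class `IndexArchiveRef` is a hypothetical illustration, not the Stanbol implementation, and treating a missing `Synchronized` key as `false` is an assumption of this sketch.
+
+```java
+import java.io.IOException;
+import java.io.StringReader;
+import java.io.UncheckedIOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Properties;
+
+/** Hypothetical reader (not part of Stanbol) for {name}.solrindex.ref content. */
+final class IndexArchiveRef {
+    final List<String> archives = new ArrayList<>(); // highest priority first
+    final String indexName;
+    final String serverName;   // null = default ManagedSolrServer
+    final boolean synced;      // assumption: false if the key is missing
+
+    IndexArchiveRef(String content) {
+        Properties p = new Properties();
+        try {
+            p.load(new StringReader(content));
+        } catch (IOException e) {
+            throw new UncheckedIOException(e);
+        }
+        for (String a : p.getProperty("Index-Archive", "").split(",")) {
+            if (!a.trim().isEmpty()) {
+                archives.add(a.trim());
+            }
+        }
+        String name = p.getProperty("Index-Name");
+        if (name == null && !archives.isEmpty()) {
+            // Fall back to the {name} part of the first Index-Archive
+            String first = archives.get(0);
+            int end = first.indexOf(".solrindex");
+            name = end < 0 ? first : first.substring(0, end);
+        }
+        indexName = name;
+        serverName = p.getProperty("Server-Name");
+        synced = Boolean.parseBoolean(p.getProperty("Synchronized", "false"));
+    }
+
+    public static void main(String[] args) {
+        IndexArchiveRef ref = new IndexArchiveRef(
+            "Index-Archive=dbpedia.solrindex.zip,dbpedia_43k.solrindex.zip\n");
+        System.out.println(ref.indexName + " " + ref.archives);
+    }
+}
+```
+
+With the `dbpedia.solrindex.ref` line quoted above, this sketch yields the index name `dbpedia` and keeps `dbpedia.solrindex.zip` ahead of `dbpedia_43k.solrindex.zip` in the priority order.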
+
+
+#### Other interesting Notes
+
+* SolrCore directory names created by the ManagedSolrServer use the current date as suffix. If a directory with that name already exists (e.g. because the same index was already updated on the very same day) then an additional "-{count}" suffix will be added to the end.
+* The ManagedSolrServer stores its configuration within the persistent space of the Bundle provided by the OSGi environment. When using one of the default Stanbol launchers this is within "{sling.home}/felix/bundle{bundle-id}/data". The "{bundle-id}" of the "org.apache.stanbol.commons.solr.managed" bundle can be looked up in the [Bundle tab](http://localhost:8080/system/console/bundles) of the Apache Felix Webconsole. The actual configuration of a ManagedSolrServer is then in ".config/index-config/{service.pid}". The "{service.pid}" can also be looked up via the Apache Felix Webconsole in the [Configuration Status tab](http://localhost:8080/system/console/config). This folder contains the Solr index reference files (normal Java properties files) with all the information about the current state of the managed indexes.
+* Errors that occur during the asynchronous initialization of SolrCores are stored within the IndexingProperties. They can therefore be requested via the API of the ManagedSolrServer but also be looked up within the persistent state of the ManagedSolrServer (see above where such files are located).
+
+
+## Solr Client Components
+
+This section describes how to use Solr servers and indexes referenced and managed by the "org.apache.stanbol.commons.solr" framework.
+Principally there are two possibilities: (1) to directly access Solr indexes via the SolrServer Java API and (2) to publish locally managed indexes on the OSGi HttpService and then use such indexes via the Solr RESTful API.
+
+The Stanbol Solr framework does not provide utilities for accessing remote Solr servers, because this is already easily possible by using SolrJ.
+
+
+### Java API
+
+This describes how to look up and access a Solr Server initialized by the "org.apache.stanbol.commons.solr" framework. The client side Java API of Solr is defined by the SolrServer abstract class. The implementation used for accessing a SolrCore running in the same JVM is the EmbeddedSolrServer.
+
+All Solr servers (CoreContainer) and Solr indexes (SolrCore) initialized by the ReferencedSolrServer and/or ManagedSolrServer are registered with the OSGi service registry. More information about this can be found in the first part of the "Solr Server Components" section of this documentation.
+
+OSGi already provides APIs and utilities to look up and track registered services. The following examples show how to look up SolrServers registered as OSGi services.
+
+
+#### IndexReference
+
+The IndexReference is a Java class that manages a reference to an Index. It defines a constructor that takes a serverName and coreName. In addition there is a static parse(String ref) method that takes
+
+* file URLs
+* file paths and
+* [server-name:]core-name like references.
+
+The IndexMetadata class also defines a getter to get the IndexReference.
+
+The IndexReference also provides getters for the filters used to look up/track the referenced CoreContainer/SolrCore in the OSGi Service Registry. The returned filters include the constraint for the registered interface (OBJECTCLASS). Therefore when using these filters one can pass null for the class parameter.
+
+To lookup the CoreContainer of the referenced index:
+
+    bundleContext.getServiceReferences(null, indexReference.getServerFilter());
+
+To look up the SolrCore for the referenced index:
+
+    bundleContext.getServiceReferences(null, indexReference.getIndexFilter());
+
+
+#### Lookup Solr Indexes
+
+This example shows how to look up the default CoreContainer and create a SolrServer for the core "mydata".
+
+    ComponentContext context; // typically passed to the activate method
+    BundleContext bc = context.getBundleContext();
+    ServiceReference coreContainerRef =
+        bc.getServiceReference(CoreContainer.class.getName());
+    CoreContainer coreContainer = (CoreContainer) bc.getService(coreContainerRef);
+    SolrServer server = new EmbeddedSolrServer(coreContainer, "mydata");
+
+There may be cases where several CoreContainers are available and "mydata" is not available on the default one, where "default" refers to the CoreContainer with the highest "service.ranking" value. We then need to know a property that can be used to filter for the right CoreContainer. In this example we assume the index is on a CoreContainer registered with the name "myserver".
+
+    ComponentContext context; // typically passed to the activate method
+    BundleContext bc = context.getBundleContext();
+
+    // Now let's use the IndexReference to create the filter
+    IndexReference indexRef = new IndexReference("myserver", "mydata");
+    ServiceReference[] coreContainerRefs = bc.getServiceReferences(
+        null, indexRef.getServerFilter());
+
+    // TODO: check that coreContainerRefs != null AND not empty!
+    // Now we have all References to CoreContainers with the name "myserver"
+    // Yes one can register several for the same name (e.g. to have fallbacks)
+    // let's get the one with the highest service.ranking
+    Arrays.sort(coreContainerRefs, ServiceReferenceRankingComparator.INSTANCE);
+
+    // Create the SolrServer (same as above)
+    CoreContainer coreContainer = (CoreContainer) bc.getService(coreContainerRefs[0]);
+    SolrServer server = new EmbeddedSolrServer(coreContainer, indexRef.getIndex());
+
+In cases where one only knows the name of the SolrCore (and not the CoreContainer) the initialization looks like this.
+
+    ComponentContext context; // typically passed to the activate method
+    BundleContext bc = context.getBundleContext();
+    String nameFilter = String.format("(%s=%s)", SolrConstants.PROPERTY_CORE_NAME, "mydata");
+    ServiceReference[] solrCoreRefs = bc.getServiceReferences(
+        SolrCore.class.getName(), nameFilter);
+
+    // TODO: check that solrCoreRefs != null AND not empty!
+    // Now we have all references to SolrCores registered with the name "mydata"
+    // let's get the one with the highest service.ranking
+    Arrays.sort(solrCoreRefs, ServiceReferenceRankingComparator.INSTANCE);
+
+    // Now get the SolrCore and create the SolrServer
+    SolrCore core = (SolrCore) bc.getService(solrCoreRefs[0]);
+
+    // core.getCoreDescriptor() might be null if SolrCore is not
+    // registered with a CoreContainer
+    SolrServer server = new EmbeddedSolrServer(
+        core.getCoreDescriptor().getCoreContainer(), "mydata");
+
+
+#### Tracking Solr Indexes
+
+The above examples do a lookup at a single point in time. However, because OSGi is a dynamic environment where services can come and go at any time, in most cases users will want to track services instead. For this purpose OSGi provides the ServiceTracker utility.
+
+To ease the tracking of SolrServers, the "org.apache.stanbol.commons.solr.core" bundle provides the RegisteredSolrServerTracker. The following examples show how to create a managed Solr index and then track the SolrServer.
+
+First, during activation, we need to check whether "mydata" has already been created and create it if not. Then we can start tracking the index:
+
+    BundleContext bc;
+    // The ManagedSolrServer instance can be looked up manually using a service
+    // reference or using declarative services / SCR injection
+    IndexMetadata metadata = managedServer.getIndexMetadata("mydata");
+    if (metadata == null) {
+        // No index with that name:
+        // Asynchronously init the index as soon as the solrindex archive is available
+        metadata = managedServer.createSolrIndex("mydata", "mydata.solrindex.zip", null);
+    }
+    RegisteredSolrServerTracker indexTracker =
+        new RegisteredSolrServerTracker(bc, metadata.getIndexReference());
+
+    // Do not forget to close the tracker while deactivating
+    indexTracker.open();
+
+Now, every time we need the SolrServer, we can retrieve it from the indexTracker:
+
+    private SolrServer getServer() {
+        SolrServer server = indexTracker.getService();
+        if(server == null) {
+            // Report the missing server
+            throw new IllegalStateException("Server 'mydata' not active");
+        } else {
+            return server;
+        }
+    }
+
+The RegisteredSolrServerTracker takes "service.ranking" into account. If several services match the passed IndexReference, methods returning a single service always return the one with the highest "service.ranking", and methods returning arrays return them sorted accordingly.
+
+
+### RESTful API
+
+The following describes how to publish the RESTful API of CoreContainers registered as OSGi services on the OSGi HttpService. The functionality described in this section is provided by the "org.apache.stanbol.commons.solr.web" artifact.
+
+
+#### SolrServerPublishingComponent
+
+This is an OSGi component that starts immediately and does not require any configuration. Its main purpose is to track all CoreContainers with the property "org.apache.solr.core.CoreContainer.publishREST=true". For all such CoreContainers it publishes the RESTful API under the URL
+
+    http://{host}:{port}/solr/{server-name}
+
+If two CoreContainers with the same {server-name} (the value of the "org.apache.solr.core.CoreContainer.name" property) are registered, the one with the highest "service.ranking" is published.
+
+The root-prefix ("/solr" by default) can be configured by setting the "org.apache.stanbol.commons.solr.web.dispatchfilter.prefix" property.
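+
+For illustration, assume a Stanbol instance on localhost:8080, a CoreContainer registered with the name "myserver", and a core named "mydata". All three are example values, not defaults taken from this documentation. A query against the standard Solr select handler would then be expected under a URL such as:
+
+```text
+http://localhost:8080/solr/myserver/mydata/select?q=*:*
+```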
+
+
+#### SolrDispatchFilterComponent
+
+This component provides the same functionality as the SolrServerPublishingComponent, but can be configured specifically for a CoreContainer. It is intended to be used if one wants to publish the RESTful API of a specific CoreContainer under a specific location. To prevent the SolrServerPublishingComponent from also publishing the same CoreContainer, users need to set its "org.apache.solr.core.CoreContainer.publishREST" property to false.
+
+This component is configured by two properties:
+
+* **org.apache.stanbl.commons.solr.web.dispatchfilter.name**: The {server-name} of the CoreContainer to publish ({server-name} refers to the value of the "org.apache.solr.core.CoreContainer.name" property).
+* **org.apache.stanbl.commons.solr.web.dispatchfilter.prefix**: The path prefix under which the server is published. The {server-name} is NOT appended to the configured prefix. Note that a Servlet Filter with `{prefix}/.*` is registered with the OSGi HttpService.
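+
+As a minimal sketch, a configuration that publishes a CoreContainer registered under the name "myserver" at "/mysolr" could set the two properties as shown here. Both values are illustrative assumptions and the property names are copied verbatim from the list above. How the configuration is supplied to the component is not covered here.
+
+```properties
+# Assumed example values; adjust to the actual server name and path
+org.apache.stanbl.commons.solr.web.dispatchfilter.name=myserver
+org.apache.stanbl.commons.solr.web.dispatchfilter.prefix=/mysolr
+```
+
+With such a configuration the server would be expected at http://{host}:{port}/mysolr, without "myserver" appended to the path.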
+
+If two CoreContainers with the same {server-name} (the value of the "org.apache.solr.core.CoreContainer.name" property) are registered, the one with the highest "service.ranking" is published.
+