Posted to dev@clerezza.apache.org by Stephane Gamard <st...@gamard.net> on 2014/01/13 16:16:12 UTC

CRIS field names in Lucene

Hi all, 

I am trying to implement new conditions for CRIS and I’ve come across a peculiar problem. I’ve created a “BoostCondition” based on the same principle as the WildCardCondition. Here are its constructors and query method:

  public BoostCondition(VirtualProperty property, String value, Float boost) {
    this.property = property;
    this.value = value;
    this.boost = boost;
  }
  
  public BoostCondition(UriRef uriRefProperty, String value, Float boost) {
    this(new PropertyHolder(uriRefProperty,false), value,boost);
  }

  @Override
  protected Query query() {
    TermQuery termQuery = new TermQuery(new Term(property.getStringKey(), value));
    termQuery.setBoost(boost);
    return termQuery;
  }

Nothing fancy and here is how it is used: 

        conditions.add(new BoostCondition(RDF.type, "<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>", new Float(0.5)));

        final List<NonLiteral> matchingNodes = indexService.findResources(conditions, facetCollector);
        node.addPropertyValue(ECS.contentsCount, matchingNodes.size());

All is well EXCEPT that CRIS searches on the field ‘RDF.type’, while the field appears to be indexed as “_STORED_” + RDF.type. The resulting Lucene query is:
+J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5


Here is the log with and without the custom condition:

INDEXING
========== 
 
13.01.2014 16:05:50.165 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer CRIS Reindex Thread[386]: cache full or writes have ceased. Indexing...
13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing <http://fusepool.info/doc/pmc/3470790> considering 3 properties ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1, org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a, org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 1
13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(http://purl.org/dc/elements/1.1/subject) with value http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://fusepool.eu/ontologies/ecs#ContentItem
13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://purl.org/ontology/bibo/Document
13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2) with value Two barriers for sodium in vascular endothelium? Vascular endothelium plays a key role in blood pressure regulation. Recently, it has been shown that a 5% increase of plasma sodium concentration (sodium excess) stiffens endothelial cells by about 25%, leading to cellular dysfunction. Surface measurements demonstrated that the endothelial glycocalyx (eGC), an anionic biopolymer, deteriorates when sodium is elevated. In view of these results, a two-barrier model for sodium exiting the circulation across the endothelium is suggested. The first sodium barrier is the eGC which selectively buffers sodium ions with its negatively charged prote-oglycans.The second sodium barrier is the endothelial plasma membrane which contains sodium channels. Sodium excess, in the presence of aldosterone, leads to eGC break-down and, in parallel, to an up-regulation of plasma membrane sodium channels. The following hypothesis is postulated: Sodium excess increases vascular sodium permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day ingested sodium, instead of being readily buffered by the eGC and then rapidly excreted by the kidneys, is distributed in the whole body before being finally excreted. Gradually, the sodium overload damages the organism.
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing <http://fusepool.info/doc/pmc/3581062> considering 3 properties ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1, org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a, org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 2
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(http://purl.org/dc/elements/1.1/subject) with value http://fusepool.info/id/4cfa649e-5eca-4349-bbb5-f782b87089d4
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(http://purl.org/dc/elements/1.1/subject) with value http://fusepool.info/id/f421cc4a-619c-4189-a3ef-3c2025e50ac9
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://fusepool.eu/ontologies/ecs#ContentItem
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://purl.org/ontology/bibo/Document
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2) with value Diagnosis and treatment of mitochondrial myopathies Mitochondrial disorders are a heterogeneous group of disorders resulting from primary dysfunction of the respiratory chain. Muscle tissue is highly metabolically active, and therefore myopathy is a common element of the clinical presentation of these disorders, although this may be overshadowed by central neurological features. This review is aimed at a general medical and neurologist readership and provides a clinical approach to the recognition, investigation, and treatment of mitochondrial myopathies. Emphasis is placed on practical management considerations while including some recent updates in the field.
 
 
 
SEARCH WITHOUT CUSTOM CONDITION
===============================
 
13.01.2014 16:07:48.343 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery: +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium*
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_http://purl.org/dc/elements/1.1/subject : http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_http://www.w3.org/1999/02/22-rdf-syntax-ns#type : http://fusepool.eu/ontologies/ecs#ContentItem
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_http://www.w3.org/1999/02/22-rdf-syntax-ns#type : http://purl.org/ontology/bibo/Document
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_J683e9b57eca321d4a268d4b24df62c9bfb7169b2 : Two barriers for sodium in vascular endothelium? Vascular endothelium plays a key role in blood pressure regulation. Recently, it has been shown that a 5% increase of plasma sodium concentration (sodium excess) stiffens endothelial cells by about 25%, leading to cellular dysfunction. Surface measurements demonstrated that the endothelial glycocalyx (eGC), an anionic biopolymer, deteriorates when sodium is elevated. In view of these results, a two-barrier model for sodium exiting the circulation across the endothelium is suggested. The first sodium barrier is the eGC which selectively buffers sodium ions with its negatively charged prote-oglycans.The second sodium barrier is the endothelial plasma membrane which contains sodium channels. Sodium excess, in the presence of aldosterone, leads to eGC break-down and, in parallel, to an up-regulation of plasma membrane sodium channels. The following hypothesis is postulated: Sodium excess increases vascular sodium permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day ingested sodium, instead of being readily buffered by the eGC and then rapidly excreted by the kidneys, is distributed in the whole body before being finally excreted. Gradually, the sodium overload damages the organism.
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer resource-uri : http://fusepool.info/doc/pmc/3470790
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site registered for Entity http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site registered for Entity http://purl.org/ontology/bibo/Document


SEARCH WITH CUSTOM CONDITION
============================
 
13.01.2014 16:14:32.746 *INFO* [806435093@qtp-612005121-40] org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery: +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5 


Re: CRIS field names in Lucene

Posted by Stephane Gamard <st...@gamard.net>.
Hi Daniel, 

Thank you for your feedback. I have a few questions and need some clarifications; I’ve commented inline:

On 14 Jan 2014, at 16:12, Daniel Spicar <ds...@apache.org> wrote:

> First a little warning: I do not understand the workings of CRIS and
> Lucene 100% anymore, as it has been a while since I created the
> GenericCondition (Lucene query syntax) and the index structure, and it used
> to be a different version of Lucene. So double-check anything I say that
> sounds odd ;)
> 

Sounds excellent. I’ve made rdf.cris my new home :)

> On what level do you want to do this change? In general it is a possible
> approach to change the indexing to not index the raw URI but slugification
> loses information (in general). You don't want that as some applications
> may depend on the exact URIs (for example for ordering case sensitively).

Are there applications that order the values by key? From what I understand and have seen in the code, the field name (which is the key) is never used. The only key used is the reference to the result’s URI, which serves to “flourish” it later in the process. As far as I can see, slugifying the field name should not have any impact on the rest of ZZ.

I’ve tested (on my fork) slugifying the field name by simply adding a “toSlug()” method to the vProperty class. So far all works well and all the JUnit tests pass. Could you point me toward a specific case/impl that requires the field name to be an actual URI?


> 
> But I am not sure messing with the indexing is necessarily the right approach
> as some applications may depend on the current index values and their
> quirks. But you can always add a slugified URI field and perform queries on
> that. When you implement a custom condition, you need to make sure that you
> search on the correct field. Hence when the field name is a slugified URI,
> then you need to slugify the input URI and use that as the field name in
> the TermQuery.

Excellent. Custom conditions are working like a charm. I took the small liberty of extending the abstractCondition class by adding a boost and a boolean clause. So when constructing a query through the API, we can set the boost and the boolean clause (MUST vs SHOULD) individually on each condition.
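
A minimal sketch of the idea of a per-condition boost and boolean clause, in plain Java with no Lucene dependency. All names here (ClauseSketch, Clause, Occur, render) are hypothetical and are not the CRIS or Lucene API; the Occur enum only mirrors the MUST/SHOULD distinction of Lucene's BooleanClause.Occur, and the rendered string only imitates Lucene query syntax for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; not the CRIS or Lucene API.
final class ClauseSketch {

    /** Mirrors the MUST/SHOULD idea of Lucene's BooleanClause.Occur (MUST_NOT omitted). */
    enum Occur {
        MUST("+"), SHOULD("");

        final String prefix;

        Occur(String prefix) {
            this.prefix = prefix;
        }
    }

    /** One condition carrying its own boost and its own boolean clause. */
    static final class Clause {
        final String field;
        final String value;
        final float boost;
        final Occur occur;

        Clause(String field, String value, float boost, Occur occur) {
            this.field = field;
            this.value = value;
            this.boost = boost;
            this.occur = occur;
        }

        /** Renders the clause in a Lucene-like syntax; a boost of 1.0 is left implicit. */
        String render() {
            String s = occur.prefix + field + ":" + value;
            return boost == 1.0f ? s : s + "^" + boost;
        }
    }

    /** Joins all clauses with a single space. */
    static String render(List<Clause> clauses) {
        StringBuilder sb = new StringBuilder();
        for (Clause c : clauses) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(c.render());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Clause> clauses = new ArrayList<>();
        clauses.add(new Clause("body", "*sodium*", 1.0f, Occur.MUST));
        clauses.add(new Clause("type", "patent", 0.5f, Occur.SHOULD));
        System.out.println(render(clauses)); // prints "+body:*sodium* type:patent^0.5"
    }
}
```

Keeping the boost and the occurrence on each clause, rather than on the query as a whole, is what lets a required text match be combined with an optional, down-weighted type match.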

> 
> Another approach, without messing with the index directly, and when you
> want to use Lucene query syntax, is to just make sure to escape special
> characters with a backslash. This can be automated.
> 

Yes, this could be another solution, but I am also concerned that Lucene may not like very long field names, and that it is difficult to escape them in all the right places in the right manner (this can only be enforced by “good practices”).

> 
> 2014/1/13 Stephane Gamard <st...@gamard.net>
> 
>> Thanks Daniel,
>> 
>> Yes I saw that. Stupid mistake on my part, I thought they were all stored. I
>> think I found the problem with my boost. Currently the condition is
>> expressed as a Lucene query but fails the syntax check when the key is an RDF
>> UriRef: http://askagfdasd.jasd#toto:hello is not valid.
>> 
>> I’m thinking about slugifying the name of the field instead of having the
>> raw URI used as the field key. What do you think?
>> 
>> _Stephane
>> 
>> On 13 Jan 2014, at 17:34, Daniel Spicar <da...@gmail.com> wrote:
>> 
>>> Hi Stephane
>>> 
>>> This is a prefix added to Lucene stored fields (the fields that actually
>>> get stored "as is" or unmodified in the document and returned by Lucene
>>> when asking for Documents). Lucene also creates (or can be told to do so)
>>> fields which are not "stored", thus one can analyze, tokenize, etc the
>>> original value and create fields by which Lucene can search/sort - but
>>> those fields are not returned as part of the document.
>>> 
>>> We add them to all stored fields before indexing (in the
>>> GraphIndexer.resourceToDocument method). I am not sure anymore why exactly
>>> this was needed. I think there was a peculiar problem with the sort order
>>> when this was missing but I am not sure what exactly needed this
>>> "workaround".
>>> 
>>> Daniel


Re: CRIS field names in Lucene

Posted by Daniel Spicar <ds...@apache.org>.
First a little warning: I do not understand the workings of CRIS and
Lucene 100% anymore, as it has been a while since I created the
GenericCondition (Lucene query syntax) and the index structure, and it used
to be a different version of Lucene. So double-check anything I say that
sounds odd ;)

On what level do you want to do this change? In general it is a possible
approach to change the indexing to not index the raw URI but slugification
loses information (in general). You don't want that as some applications
may depend on the exact URIs (for example for ordering case sensitively).

But I am not sure messing with the indexing is necessarily the right approach
as some applications may depend on the current index values and their
quirks. But you can always add a slugified URI field and perform queries on
that. When you implement a custom condition, you need to make sure that you
search on the correct field. Hence when the field name is a slugified URI,
then you need to slugify the input URI and use that as the field name in
the TermQuery.
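
A minimal sketch of what slugifying a URI for use as a field name could look like, and of how it can lose information. The class FieldNameSlug and its toSlug method are hypothetical (the toSlug() mentioned in this thread is not shown, so this is not that implementation); the rule used here, replacing every run of non-alphanumeric characters with one underscore, is an assumption.

```java
// Hypothetical helper for illustration; not part of CRIS.
final class FieldNameSlug {

    private FieldNameSlug() {
    }

    /**
     * Replaces each run of characters outside [A-Za-z0-9] with a single
     * underscore, then trims leading and trailing underscores.
     * The same function must be applied at indexing time and at query time.
     */
    static String toSlug(String uri) {
        String slug = uri.replaceAll("[^A-Za-z0-9]+", "_");
        return slug.replaceAll("^_+|_+$", "");
    }

    public static void main(String[] args) {
        // prints "http_www_w3_org_1999_02_22_rdf_syntax_ns_type"
        System.out.println(toSlug("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"));

        // Two distinct URIs collapse to the same slug: information is lost.
        System.out.println(toSlug("http://example.org/a#b").equals(toSlug("http://example.org/a/b"))); // prints "true"
    }
}
```

The collision in the second call is the kind of information loss to weigh before replacing the raw URI key; adding the slug as an extra field, as suggested above, keeps the original key available.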

Another approach, without messing with the index directly, and when you
want to use Lucene query syntax, is to just make sure to escape special
characters with a backslash. This can be automated.
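
A minimal sketch of automating that escaping in plain Java. The class LuceneEscape and the exact set of special characters are assumptions made for illustration; Lucene's classic query parser also offers a static QueryParser.escape helper for the same purpose, and its character set may differ between Lucene versions, so check the version in use.

```java
// Hypothetical stand-alone helper; the character set below is an assumption
// modelled on the special characters of Lucene's classic query syntax.
final class LuceneEscape {

    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    private LuceneEscape() {
    }

    /** Prefixes every special character with a backslash. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The raw URI key breaks the "field:value" syntax because of its own ':'.
        String field = escape("http://askagfdasd.jasd#toto");
        System.out.println(field + ":hello"); // prints "http\:\/\/askagfdasd.jasd#toto:hello"
    }
}
```

With the colon and slashes of the URI escaped, only the final unescaped ':' separates the field name from the value.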


2014/1/13 Stephane Gamard <st...@gamard.net>

> Thanks Daniel,
>
> Yes I saw that. Stupid mistake on my part, I thought they were all stored. I
> think I found the problem with my boost. Currently the condition is
> expressed as a Lucene query but fails the syntax check when the key is an RDF
> UriRef: http://askagfdasd.jasd#toto:hello is not valid.
>
> I’m thinking about slugifying the name of the field instead of having the
> raw URI used as the field key. What do you think?
>
> _Stephane
>
> On 13 Jan 2014, at 17:34, Daniel Spicar <da...@gmail.com> wrote:
>
> > Hi Stephane
> >
> > This is a prefix added to Lucene stored fields (the fields that actually
> > get stored "as is" or unmodified in the document and returned by Lucene
> > when asking for Documents). Lucene also creates (or can be told to do so)
> > fields which are not "stored", thus one can analyze, tokenize, etc the
> > original value and create fields by which Lucene can search/sort - but
> > those fields are not returned as part of the document.
> >
> > We add them to all stored fields before indexing (in the
> > GraphIndexer.resourceToDocument method). I am not sure anymore why exactly
> > this was needed. I think there was a peculiar problem with the sort order
> > when this was missing but I am not sure what exactly needed this
> > "workaround".
> >
> > Daniel
> >
> >
> >
> > 2014/1/13 Stephane Gamard <st...@gamard.net>
> >

Re: CRIS field names in Lucene

Posted by Stephane Gamard <st...@gamard.net>.
Thanks Daniel, 

Yes, I saw that. Stupid mistake on my part: I thought they were all stored. I think I found the problem with my boost. Currently the condition is expressed as a Lucene query, but it fails the query syntax when the key is an RDF UriRef: http://askagfdasd.jasd#toto:hello is not valid, because the URI itself contains the ':' that separates field from value.

I’m thinking about slugifying the field name instead of using the raw URI as the field key. What do you think?

_Stephane
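
To make the slug idea above concrete, here is a minimal sketch in plain Java. It is not CRIS code: the class name `FieldKeys` and the method `slugify` are hypothetical, and the rule used (collapse every run of non-alphanumeric characters into one underscore) is just one possible choice.

```java
public final class FieldKeys {
    private FieldKeys() {}

    /**
     * Hypothetical helper: turns a URI into a field key that contains only
     * letters, digits and underscores, so it carries no ':' or '/' that
     * could clash with Lucene query syntax.
     */
    public static String slugify(String uri) {
        // Collapse each run of non-alphanumeric characters into one underscore.
        String slug = uri.replaceAll("[^A-Za-z0-9]+", "_");
        // Drop underscores left at either end (e.g. from surrounding '<' and '>').
        return slug.replaceAll("^_+|_+$", "");
    }

    public static void main(String[] args) {
        // Prints: http_www_w3_org_1999_02_22_rdf_syntax_ns_type
        System.out.println(slugify("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"));
    }
}
```

One caveat with this approach: different URIs can map to the same slug (for example `http://a.b/c#d` and `http://a.b/c/d`), so the same function would have to be applied at indexing and at query time, and a collision-free scheme such as a hash of the URI might be preferable. If the query string is only parsed, escaping the field name could be another option; that is not shown here.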

On 13 Jan 2014, at 17:34, Daniel Spicar <da...@gmail.com> wrote:

> Hi Stephane
> 
> This is a prefix added to Lucene stored fields (the fields that are
> stored "as is", unmodified, in the document and returned by Lucene when
> asking for Documents). Lucene can also create fields which are not
> "stored": one can analyze, tokenize, etc. the original value and create
> fields by which Lucene can search/sort, but those fields are not returned
> as part of the document.
> 
> We add it to all stored fields before indexing (in the
> GraphIndexer.resourceToDocument method). I no longer remember exactly why
> this was needed. I think there was a peculiar problem with the sort order
> when it was missing, but I am not sure what exactly needed this
> "workaround".
> 
> Daniel


Re: CRIS field names in Lucene

Posted by Daniel Spicar <da...@gmail.com>.
Hi Stephane

This is a prefix added to Lucene stored fields (the fields that are
stored "as is", unmodified, in the document and returned by Lucene when
asking for Documents). Lucene can also create fields which are not
"stored": one can analyze, tokenize, etc. the original value and create
fields by which Lucene can search/sort, but those fields are not returned
as part of the document.

We add it to all stored fields before indexing (in the
GraphIndexer.resourceToDocument method). I no longer remember exactly why
this was needed. I think there was a peculiar problem with the sort order
when it was missing, but I am not sure what exactly needed this
"workaround".

Daniel
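
Read together with the logs in this thread, the explanation above suggests that each property ends up under two field names: the plain key, which queries target (for example `J683e9b57eca321d4a268d4b24df62c9bfb7169b2` in the logged luceneQuery), and the key with the `_STORED_` prefix, which appears when a found document is printed. The sketch below only illustrates that naming convention in plain Java. It is an inference from the log output, not taken from the CRIS source, and the class and method names are hypothetical.

```java
public final class StoredFieldNames {
    // Prefix seen in the logged document fields of this thread.
    static final String STORED_PREFIX = "_STORED_";

    private StoredFieldNames() {}

    /** Name of the field a query would target (assumed: the plain key). */
    static String searchField(String key) {
        return key;
    }

    /** Name of the field returned with the document (assumed: prefix + key). */
    static String storedField(String key) {
        return STORED_PREFIX + key;
    }

    public static void main(String[] args) {
        String key = "J683e9b57eca321d4a268d4b24df62c9bfb7169b2";
        System.out.println(searchField(key)); // J683e9b57eca321d4a268d4b24df62c9bfb7169b2
        System.out.println(storedField(key)); // _STORED_J683e9b57eca321d4a268d4b24df62c9bfb7169b2
    }
}
```

Under this reading, a condition has to query the unprefixed key, because the prefixed field exists only to hand the original value back with the document.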



2014/1/13 Stephane Gamard <st...@gamard.net>

> Hi all,
>
> I am trying to implement new conditions for CRIS and I’ve come across a
> peculiar problem. I’ve created a “BoostCondition” based on the same
> principle as the WildCardCondition. Here are its constructor and query
> method:
>
>   public BoostCondition(VirtualProperty property, String value, Float boost) {
>     this.property = property;
>     this.value = value;
>     this.boost = boost;
>   }
>
>   public BoostCondition(UriRef uriRefProperty, String value, Float boost) {
>     this(new PropertyHolder(uriRefProperty, false), value, boost);
>   }
>
>   @Override
>   protected Query query() {
>     TermQuery termQuery = new TermQuery(new Term(property.getStringKey(), value));
>     termQuery.setBoost(boost);
>     return termQuery;
>   }
>
> Nothing fancy, and here is how it is used:
>
>   conditions.add(new BoostCondition(RDF.type, "<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>", new Float(0.5)));
>
>   final List<NonLiteral> matchingNodes = indexService.findResources(conditions, facetCollector);
>   node.addPropertyValue(ECS.contentsCount, matchingNodes.size());
>
> All is well EXCEPT that CRIS will look for the field ‘RDF.type’, whereas
> the field is indexed as “_STORED_”+RDF.type, as per the following Lucene
> query:
>
> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5
>
>
> Attached is the log with and without the custom condition:
>
> INDEXING
> ==========
>
> 13.01.2014 16:05:50.165 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer CRIS Reindex Thread[386]: cache
> full or writes have ceased. Indexing...
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing <
> http://fusepool.info/doc/pmc/3470790> considering 3 properties
> ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1,
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a,
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 1
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(
> http://purl.org/dc/elements/1.1/subject) with value
> http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://fusepool.eu/ontologies/ecs#ContentItem
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://purl.org/ontology/bibo/Document
> 13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
> 13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2)
> with value Two barriers for sodium in vascular endothelium? Vascular
> endothelium plays a key role in blood pressure regulation. Recently, it has
> been shown that a 5% increase of plasma sodium concentration (sodium
> excess) stiffens endothelial cells by about 25%, leading to cellular
> dysfunction. Surface measurements demonstrated that the endothelial
> glycocalyx (eGC), an anionic biopolymer, deteriorates when sodium is
> elevated. In view of these results, a two-barrier model for sodium exiting
> the circulation across the endothelium is suggested. The first sodium
> barrier is the eGC which selectively buffers sodium ions with its
> negatively charged prote-oglycans.The second sodium barrier is the
> endothelial plasma membrane which contains sodium channels. Sodium excess,
> in the presence of aldosterone, leads to eGC break-down and, in parallel,
> to an up-regulation of plasma membrane sodium channels. The following
> hypothesis is postulated: Sodium excess increases vascular sodium
> permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day
> ingested sodium, instead of being readily buffered by the eGC and then
> rapidly excreted by the kidneys, is distributed in the whole body before
> being finally excreted. Gradually, the sodium overload damages the organism.
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing <
> http://fusepool.info/doc/pmc/3581062> considering 3 properties
> ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1,
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a,
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 2
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(
> http://purl.org/dc/elements/1.1/subject) with value
> http://fusepool.info/id/4cfa649e-5eca-4349-bbb5-f782b87089d4
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(
> http://purl.org/dc/elements/1.1/subject) with value
> http://fusepool.info/id/f421cc4a-619c-4189-a3ef-3c2025e50ac9
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://fusepool.eu/ontologies/ecs#ContentItem
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://purl.org/ontology/bibo/Document
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2)
> with value Diagnosis and treatment of mitochondrial myopathies
> Mitochondrial disorders are a heterogeneous group of disorders resulting
> from primary dysfunction of the respiratory chain. Muscle tissue is highly
> metabolically active, and therefore myopathy is a common element of the
> clinical presentation of these disorders, although this may be overshadowed
> by central neurological features. This review is aimed at a general medical
> and neurologist readership and provides a clinical approach to the
> recognition, investigation, and treatment of mitochondrial myopathies.
> Emphasis is placed on practical management considerations while including
> some recent updates in the field.
>
>
>
>  SEARCH WITHOUT CUSTOM CONDITION
>  ===============================
>
> 13.01.2014 16:07:48.343 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery:
> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium*
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer _STORED_
> http://purl.org/dc/elements/1.1/subject :
> http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer _STORED_
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type :
> http://fusepool.eu/ontologies/ecs#ContentItem
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer _STORED_
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type :
> http://purl.org/ontology/bibo/Document
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer
> _STORED_J683e9b57eca321d4a268d4b24df62c9bfb7169b2 : Two barriers for sodium
> in vascular endothelium? Vascular endothelium plays a key role in blood
> pressure regulation. Recently, it has been shown that a 5% increase of
> plasma sodium concentration (sodium excess) stiffens endothelial cells by
> about 25%, leading to cellular dysfunction. Surface measurements
> demonstrated that the endothelial glycocalyx (eGC), an anionic biopolymer,
> deteriorates when sodium is elevated. In view of these results, a
> two-barrier model for sodium exiting the circulation across the endothelium
> is suggested. The first sodium barrier is the eGC which selectively buffers
> sodium ions with its negatively charged proteoglycans. The second sodium
> barrier is the endothelial plasma membrane which contains sodium channels.
> Sodium excess, in the presence of aldosterone, leads to eGC break-down and,
> in parallel, to an up-regulation of plasma membrane sodium channels. The
> following hypothesis is postulated: Sodium excess increases vascular sodium
> permeability. Under such conditions (e.g. high-sodium diet), day-by-day
> ingested sodium, instead of being readily buffered by the eGC and then
> rapidly excreted by the kidneys, is distributed in the whole body before
> being finally excreted. Gradually, the sodium overload damages the organism.
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer resource-uri :
> http://fusepool.info/doc/pmc/3470790
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site
> registered for Entity
> http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site
> registered for Entity http://purl.org/ontology/bibo/Document
>
>
> SEARCH WITH CUSTOM CONDITION
> ============================
>
> 13.01.2014 16:14:32.746 *INFO* [806435093@qtp-612005121-40]
> org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery:
> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5
>
>