Posted to solr-dev@lucene.apache.org by "Fergus McMenemie (JIRA)" <ji...@apache.org> on 2009/02/02 20:25:59 UTC

[jira] Commented: (SOLR-1001) using invariant request values from solrconfig.xml inside a data-config.xml regexp

    [ https://issues.apache.org/jira/browse/SOLR-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669705#action_12669705 ] 

Fergus McMenemie commented on SOLR-1001:
----------------------------------------

I could probably hack around this myself, given Shalin's clue as to the cause. However, two possible issues come to mind.

* Is it possible that an equivalent change needs to be made to other transformers?

* Is the construct ${XXX} a valid part of a regular expression? I don't think so, but...?
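
The second question can be checked directly against java.util.regex. The snippet below is a minimal sketch, not DIH code: the class name PlaceholderRegexCheck and the substituted path are made up for illustration. As far as I know, a standard JDK treats "{" after "$" as the start of a repetition count and rejects it when no digit follows, so the unresolved placeholder should fail to compile while the substituted form compiles.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PlaceholderRegexCheck {
    // Returns true if java.util.regex accepts the string as a pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unresolved placeholder, exactly as written in data-config.xml.
        System.out.println(compiles("${dataimporter.request.finstalldir}(.*)"));
        // Same pattern after substitution; the install dir here is a hypothetical value.
        System.out.println(compiles("/Volumes/spare/ts(.*)"));
    }
}
```

If that holds, resolving ${...} before compiling cannot clash with a legitimate regex, since an unescaped ${XXX} is not one to begin with.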

> using invariant request values from solrconfig.xml inside a data-config.xml regexp
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-1001
>                 URL: https://issues.apache.org/jira/browse/SOLR-1001
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.3
>            Reporter: Fergus McMenemie
>             Fix For: 1.4
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> As per several postings, I noted that I can define variables inside an invariants list section of the DIH handler in solrconfig.xml, and that I can reference these variables within data-config.xml. This works properly: the Solr field "test" is nicely populated. However, the variable is not substituted into my regex transformer. Here is my data-config.xml, which gives a hint of the use case.
>    <dataConfig>
>      <dataSource name="myfilereader" type="FileDataSource"/>
>      <document>
>        <entity name="jc"
>                processor="FileListEntityProcessor"
>                fileName="^.*\.xml$"
>                newerThan="'NOW-1000DAYS'"
>                recursive="true"
>                rootEntity="false"
>                dataSource="null"
>                baseDir="/Volumes/spare/ts/fords/dtd/fordsxml/data">
>          <entity name="x"
>                  dataSource="myfilereader"
>                  processor="XPathEntityProcessor"
>                  url="${jc.fileAbsolutePath}"
>                  stream="false"
>                  forEach="/record"
>                  transformer="DateFormatTransformer,TemplateTransformer,RegexTransformer,HTMLStripTransformer">
>            <field column="fileAbsolutePath" template="${jc.fileAbsolutePath}" />
>            <field column="fileWebPath"      regex="${dataimporter.request.finstalldir}(.*)" replaceWith="$1" sourceColName="fileAbsolutePath"/>
>            <field column="test"             template="${dataimporter.request.finstalldir}" />
>            <field column="title"            xpath="/record/title" />
>            <field column="para"             xpath="/record/sect1/para" stripHTML="true" />
>            <field column="date"             xpath="/record/metadata/date[@qualifier='Date']" dateTimeFormat="yyyyMMdd" />
>          </entity>
>        </entity>
>      </document>
>    </dataConfig>
> Shalin has pointed out that we are creating the regex Pattern without first resolving the variable. So we need to call VariableResolver.resolve on the 'regex' attribute's value before creating the Pattern object.
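
The resolve-then-compile order Shalin describes can be sketched as follows. This is only an illustration under stated assumptions: the resolve helper is a hypothetical stand-in for DIH's VariableResolver (whose real signature is not reproduced here), the class name ResolveThenCompile is invented, and the value given for finstalldir is a guess at what the invariant might hold.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ResolveThenCompile {
    // Matches ${name} placeholders. Hypothetical stand-in, not DIH's VariableResolver.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each known ${name} with its value; unknown placeholders are left as-is.
    static String resolve(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.get(m.group(1));
            // quoteReplacement keeps '$' and '\' in the inserted text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value != null ? value : m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Resolves variables in the regex attribute first, then builds the Pattern.
    static String apply(String regex, String replaceWith, String source, Map<String, String> vars) {
        Pattern p = Pattern.compile(resolve(regex, vars));
        return p.matcher(source).replaceAll(replaceWith);
    }

    public static void main(String[] args) {
        Map<String, String> vars = new HashMap<>();
        // Assumed value; the issue does not state what finstalldir is set to.
        vars.put("dataimporter.request.finstalldir", "/Volumes/spare/ts");
        System.out.println(apply("${dataimporter.request.finstalldir}(.*)", "$1",
                "/Volumes/spare/ts/fords/dtd/fordsxml/data/a.xml", vars));
    }
}
```

One design choice left open here: the resolved value is inserted into the regex as-is, so a directory name containing regex metacharacters would be interpreted as pattern syntax. Wrapping the value in Pattern.quote would avoid that, but whether that behaviour is wanted in the transformer is not something the issue settles.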
