Posted to solr-dev@lucene.apache.org by "Bill Bell (JIRA)" <ji...@apache.org> on 2009/11/09 08:07:32 UTC

[jira] Commented: (SOLR-1548) SolrPHP Library improvements

    [ https://issues.apache.org/jira/browse/SOLR-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774876#action_12774876 ] 

Bill Bell commented on SOLR-1548:
---------------------------------

addDocuments() - you can now pass cdataList and cdataSubList.
- cdataList is the list of exact field names to treat as CDATA, e.g. 'name'
- cdataSubList is the list of prefixes for dynamic fields; for example, sm_field would match sm_field_make and sm_field_model
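
A minimal sketch of how the two lists might be matched against a field name, assuming exact matching for cdataList and prefix matching for cdataSubList as described above. The helper name is_cdata_field() is hypothetical and is not part of the attached client:

```php
<?php
// Hypothetical helper, not part of SolrPhpClient: decides whether a field
// should be wrapped in CDATA.
function is_cdata_field($fieldName, array $cdataList, array $cdataSubList) {
  // Exact match against the plain field list, e.g. 'name'.
  if (in_array($fieldName, $cdataList, true)) {
    return true;
  }
  // Prefix match against the dynamic-field list, e.g. 'sm_field'
  // matches 'sm_field_make' and 'sm_field_model'.
  foreach ($cdataSubList as $prefix) {
    if ($prefix !== '' && strncmp($fieldName, $prefix, strlen($prefix)) === 0) {
      return true;
    }
  }
  return false;
}

$cdataList = array('description', 'name');
$cdataSubList = array('sm_field');

var_dump(is_cdata_field('name', $cdataList, $cdataSubList));          // bool(true)
var_dump(is_cdata_field('sm_field_make', $cdataList, $cdataSubList)); // bool(true)
var_dump(is_cdata_field('price', $cdataList, $cdataSubList));         // bool(false)
```

The empty-prefix check is a design choice in this sketch: without it, an empty string in cdataSubList would match every field.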

I didn't get to the control characters, but this function should be added too:

function strip_ctl_chars($text) {
  // See:  http://w3.org/International/questions/qa-forms-utf-8.html
  // Printable UTF-8 does not include the control characters below \x20,
  // apart from tab (\x09), line feed (\x0A) and carriage return (\x0D).
  // Replace each of the others with a space.
  return preg_replace('@[\x00-\x08\x0B\x0C\x0E-\x1F]@', ' ', $text);
}
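
As a rough illustration of how the function above could be applied only to chosen fields, here is a sketch that treats a document as a plain associative array. The helper strip_ctl_chars_in_fields() and the array-based document are assumptions for illustration; the attached client may represent documents differently.

```php
<?php
// Same replacement as strip_ctl_chars() above, repeated so this snippet
// runs on its own.
function strip_ctl_chars($text) {
  return preg_replace('@[\x00-\x08\x0B\x0C\x0E-\x1F]@', ' ', $text);
}

// Hypothetical helper: clean only the named fields of an array-based document.
function strip_ctl_chars_in_fields(array $doc, array $fields) {
  foreach ($fields as $field) {
    if (!isset($doc[$field])) {
      continue;
    }
    // Multi-valued fields are arrays of strings; single-valued are strings.
    $doc[$field] = is_array($doc[$field])
      ? array_map('strip_ctl_chars', $doc[$field])
      : strip_ctl_chars($doc[$field]);
  }
  return $doc;
}

$doc = array(
  'id' => '1',
  'name' => "Ford\x01Focus",
  'description' => "line one\nline two\x0B",
);
$clean = strip_ctl_chars_in_fields($doc, array('name', 'description'));
echo $clean['name'], "\n"; // Ford Focus
```

Fields not named in the list are returned untouched, and tabs and newlines survive in the cleaned fields.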

How to use -

$cdataList = array("description", "name");
$cdataSubList = array("sm_field");
$solr->addDocuments($documents, false, true, true, $cdataList, $cdataSubList);

To get XML:

$str = $solr->_documentToXmlFragment( $document, $cdataList, $cdataSubList);
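
For orientation, this sketch shows the kind of fragment a CDATA-wrapped field might serialize to in Solr's XML update format. The function field_to_xml() is hypothetical and is not the client's _documentToXmlFragment(); it only illustrates the difference between entity escaping and CDATA wrapping.

```php
<?php
// Hypothetical illustration, not the client's implementation.
function field_to_xml($name, $value, $useCdata) {
  $attr = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
  if ($useCdata) {
    // A CDATA section cannot contain ']]>', so split it across two sections.
    $body = '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $value) . ']]>';
  } else {
    // Without CDATA, markup characters must be escaped as entities.
    $body = htmlspecialchars($value, ENT_NOQUOTES, 'UTF-8');
  }
  return '<field name="' . $attr . '">' . $body . '</field>';
}

echo field_to_xml('name', 'Tom & Jerry <b>', true), "\n";
// <field name="name"><![CDATA[Tom & Jerry <b>]]></field>
echo field_to_xml('name', 'Tom & Jerry <b>', false), "\n";
// <field name="name">Tom &amp; Jerry &lt;b&gt;</field>
```

Note that CDATA only avoids entity escaping; control characters are still illegal inside it, which is why the stripping function is needed as well.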




> SolrPHP Library improvements
> ----------------------------
>
>                 Key: SOLR-1548
>                 URL: https://issues.apache.org/jira/browse/SOLR-1548
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - php
>    Affects Versions: 1.4
>         Environment: All
>            Reporter: Bill Bell
>         Attachments: SolrPhpClient.tar.gz
>
>
> 1. Adds the ability to specify CDATA fields
> 2. Since I like storing the XML as it is sent to Solr, makes the raw XML public
> 3. Adds the ability to specify the fields from which to strip UTF-8 control characters, to avoid problems

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.