Posted to xindice-dev@xml.apache.org by Gary Orser <or...@cns.montana.edu> on 2003/06/12 22:56:44 UTC

1.1 indexing performance

Hi,

The discussion about 1.1 indexing made me wonder if there is something
that I am doing wrong, or if there is a known bug with indexing and
performance in the 1.1 code.

I have a database with about 50 MB of data in roughly 600 XML files.
A query that took about 5 seconds in 1.0 now takes 115 seconds in 1.1.
The data was reloaded from scratch with the script below, which takes
about 1.5 hours to run.

#!/bin/sh

xindice delete_collection -l -c /db/neurosys/data -n ccblab
xindice add_collection -l -c /db/neurosys/data -n ccblab

xindice add_indexer -l -c /db/neurosys/data/ccblab -n starE -p "*"
xindice add_indexer -l -c /db/neurosys/data/ccblab -n starA -p "*@*"

# Neither add multiple nor import seems to work, so add the files
# one at a time.

for file in ccblab/*
do
  name=$(basename "$file")   # strip the leading "ccblab/"
  echo "xindice ad -l -c /db/neurosys/ccblab -f $file -n $name"
  xindice ad -l -c /db/neurosys/ccblab -f "$file" -n "$name"
done


After executing the above script:

ls -l ./db/neurosys/ccblab

-rw-r--r--    1 wwwrun   nogroup  53489664 Jun  9 13:29 ccblab.tbl
-rw-r--r--    1 wwwrun   nogroup  243196928 Jun  9 13:29 starA.idx
-rw-r--r--    1 wwwrun   nogroup      6144 Jun  9 13:29 starE.idx
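
As a rough sanity check on the listing, the index sizes can be compared
with the size of the data file. This is only a sketch: the byte counts
are copied from the ls output above, and the variable names are made up
for illustration.

```shell
#!/bin/sh
# Sizes in bytes, copied from the ls -l listing above.
tbl=53489664      # ccblab.tbl
starA=243196928   # starA.idx
starE=6144        # starE.idx

# Ratio of each index file to the data file.
ratioA=$(awk -v i="$starA" -v t="$tbl" 'BEGIN { printf "%.2f", i / t }')
ratioE=$(awk -v i="$starE" -v t="$tbl" 'BEGIN { printf "%.6f", i / t }')

echo "starA.idx / ccblab.tbl = $ratioA"
echo "starE.idx / ccblab.tbl = $ratioE"
```

With these numbers starA.idx comes out at about 4.55 times the size of
ccblab.tbl, while starE.idx is only 6144 bytes.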

xindice xpath -l -c /db/neurosys/ccblab -q "/XSIL/XSIL[@Name='system']"
(this takes 115 seconds)
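
To reproduce the timing, a small wrapper along these lines could be
used. It is a sketch under assumptions: time_cmd is a hypothetical
helper, not part of Xindice, and it relies on "date +%s", which is
widely available but not guaranteed by POSIX. It only measures whole
seconds of wall-clock time.

```shell
#!/bin/sh
# time_cmd (hypothetical helper): run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
time_cmd() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Usage with the query from this post (needs a running Xindice setup):
# time_cmd xindice xpath -l -c /db/neurosys/ccblab -q "/XSIL/XSIL[@Name='system']"
```

Running the same wrapper against the 1.0 and 1.1 installs would give
directly comparable numbers.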

Any ideas?
Are the "star" (wildcard) patterns still OK in 1.1?

Cheers, Gary
------------------------------------------------------
Gary Orser , (406) 994-6451, orser@cns.montana.edu
Montana State University
Center for Computational Biology
1 Lewis Hall, Bozeman MT, 59717