Posted to user@xmlbeans.apache.org by heikki doeleman <tr...@gmail.com> on 2006/08/08 12:50:24 UTC

Using XmlCursor and XmlBookmark

[sorry - this time with message subject]

Hello,

I'm thinking of using XmlCursor / XmlBookmark to create an index to a large
XML document. I'm wondering whether this is the right approach, as I have
found few real examples of XmlBookmark usage. In particular, I'm wondering
whether the solution sketched below is prone to memory leaks or otherwise
inefficient. Also, I'm not entirely sure of the exact nature of XmlCursor
vs. XmlBookmark, and it may well be that I'm doing superfluous work here; I
hope you would be willing to take a look.

Say the XML is like this:

<catalog>
    <recordlist>
        <record>
            <title>short text</title>
            <description>much longer text</description>
        </record>
        . . . . . (many records) 
    </recordlist>
</catalog>

My intent is to read this document into memory, then construct an index in
which each distinct word found in the description elements acts as a key
that maps to a list of XmlBookmarks representing all records where that word
occurs. The code is like this:

CatalogDocument catalogDoc = CatalogDocument.Factory.parse(xmlFile);
Catalog catalog = catalogDoc.getCatalog();
RecordList recordList = catalog.getRecordList();
Record[] recordArray = recordList.getRecordArray();
// the index to construct: word -> List of XmlCursors positioned at records
Map bookmarxIdx = new HashMap();
for(int i = 0; i < recordArray.length; i++) {
    Record record = recordArray[i];
    // one cursor per record, positioned at the start of that record
    XmlCursor recordCursor = record.newCursor();
    String description = record.getDescription();
    StringTokenizer st = new StringTokenizer(description);
    while(st.hasMoreTokens()) {
          String word = st.nextToken();
          List bookmarksForWord = (List) bookmarxIdx.get(word);
          if(bookmarksForWord == null) {
              bookmarksForWord = new ArrayList();
          }
          XmlCursor.XmlBookmark bookmark = new MyBookmark(word);
          recordCursor.setBookmark(bookmark);
          bookmarksForWord.add(recordCursor);
          bookmarxIdx.put(word, bookmarksForWord);
    }
}

// index is created. Now I can quickly retrieve the records that contain
// some word, for example "hello", doing this:

List cursorsForSearchterm = (List) bookmarxIdx.get("hello");
for(Iterator i = cursorsForSearchterm.iterator(); i.hasNext();) {
    XmlCursor c = (XmlCursor) i.next();
    XmlCursor.XmlBookmark bm = c.getBookmark(MyBookmark.class);
    // new, temporary cursor at the bookmark; the stored cursor stays in the index
    c = bm.createCursor();
    System.out.println("Found record:  " + c.getObject().toString() );
    c.dispose();
}

The catalog will remain in application memory during its entire lifetime, so
I figured that it would be okay to keep all the XmlCursors I'm storing in
the index for the application's lifetime as well, without ever disposing of
them. Then, when retrieving records, I retrieve the XmlCursors from the
index, get their bookmark, create a new XmlCursor positioned at that
bookmark, retrieve the record from there, and dispose of this new XmlCursor.

Although it works, I would like to know whether this is a valid way of
dealing with these XmlCursors and XmlBookmarks, or whether I'm way off the
mark here.

thank you so much, and regards
Heikki Doeleman


-- 
View this message in context: http://www.nabble.com/Using-XmlCursor-and-XmlBookmark-tf2071921.html#a5704059
Sent from the Xml Beans - User forum at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: user-help@xmlbeans.apache.org

RE: Using XmlCursor and XmlBookmark

Posted by Cezar Andrei <ce...@bea.com>.

Cursors are designed to move very quickly and efficiently through the
document. You can use more than one cursor, but creating a new one for
each record is not efficient.

In the for loop, avoid creating new cursors if possible: create only one
at the beginning and use it for all the records, storing only the
bookmarks (not the cursors) in the map and its lists. After indexing,
don't forget to dispose of the cursor.

For retrieval, you can likewise create a single new cursor, move it from
bookmark to bookmark ( c.toBookmark(bookmark) ), and dispose of it when
you are done.
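
A minimal sketch of the advice above, not a definitive implementation. It
assumes an untyped document parsed with XmlObject.Factory rather than the
generated CatalogDocument classes, elements in no namespace, and one
bookmark per record. The names BookmarkIndexSketch, RecordBookmark,
buildIndex, titlesFor, demo and the SAMPLE data are hypothetical and do not
come from the original code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

public class BookmarkIndexSketch {

    /** Hypothetical bookmark type; each instance marks exactly one record. */
    static class RecordBookmark extends XmlCursor.XmlBookmark { }

    // Made-up sample data in the shape shown in the question.
    static final String SAMPLE =
        "<catalog><recordlist>"
        + "<record><title>first</title><description>hello world</description></record>"
        + "<record><title>second</title><description>goodbye world</description></record>"
        + "<record><title>third</title><description>hello hello again</description></record>"
        + "</recordlist></catalog>";

    /** Builds word -> bookmarks using one cursor for the whole document. */
    static Map<String, List<XmlCursor.XmlBookmark>> buildIndex(XmlObject doc) {
        Map<String, List<XmlCursor.XmlBookmark>> index = new HashMap<>();
        XmlCursor cursor = doc.newCursor();
        try {
            // Walk from the document start down to the first <record>.
            if (cursor.toChild("catalog") && cursor.toChild("recordlist")
                    && cursor.toChild("record")) {
                do {
                    XmlCursor.XmlBookmark bookmark = new RecordBookmark();
                    cursor.setBookmark(bookmark);   // mark the record itself
                    cursor.push();                  // remember the record position
                    if (cursor.toChild("description")) {
                        StringTokenizer st = new StringTokenizer(cursor.getTextValue());
                        while (st.hasMoreTokens()) {
                            List<XmlCursor.XmlBookmark> list =
                                index.computeIfAbsent(st.nextToken(), k -> new ArrayList<>());
                            // Skip repeats of a word within the same record.
                            if (list.isEmpty() || list.get(list.size() - 1) != bookmark) {
                                list.add(bookmark);
                            }
                        }
                    }
                    cursor.pop();                   // back to the record
                } while (cursor.toNextSibling("record"));
            }
        } finally {
            cursor.dispose();                       // only the bookmarks are kept
        }
        return index;
    }

    /** Looks up a word, moving one cursor from bookmark to bookmark. */
    static List<String> titlesFor(XmlObject doc,
            Map<String, List<XmlCursor.XmlBookmark>> index, String word) {
        List<String> titles = new ArrayList<>();
        List<XmlCursor.XmlBookmark> bookmarks = index.get(word);
        if (bookmarks == null) {
            return titles;
        }
        XmlCursor cursor = doc.newCursor();
        try {
            for (XmlCursor.XmlBookmark bm : bookmarks) {
                if (cursor.toBookmark(bm) && cursor.toChild("title")) {
                    titles.add(cursor.getTextValue());
                }
            }
        } finally {
            cursor.dispose();
        }
        return titles;
    }

    /** Parses SAMPLE, indexes it, and returns the titles matching the word. */
    static List<String> demo(String word) {
        try {
            XmlObject doc = XmlObject.Factory.parse(SAMPLE);
            return titlesFor(doc, buildIndex(doc), word);
        } catch (XmlException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Records containing \"hello\": " + demo("hello"));
    }
}
```

In this sketch only two cursors ever exist, one while indexing and one per
lookup, and both are disposed in a finally block. The index itself holds
nothing but bookmark objects.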

Cezar
