Posted to c-dev@xerces.apache.org by Mark Horton <ma...@nostromo.net> on 2002/12/14 15:24:53 UTC

parse performance degradation

Hi,
I've encountered a problem that seems to be related to using sockets and 
xerces.

I'm listening on a socket receiving XML then parsing it using a SAX 
parser (see code below).  This socket will be receiving a lot of XML 
messages.  During load testing I noticed a serious performance 
degradation after a few thousand messages.  It goes from parsing 400 
messages per second to less than 15 per second.

I tried compiling xerces without threads in case it was a threading 
issue.  I have used both DOM and SAX parsers as well.  None of this helped.

I have also found that it runs with no performance degradation if I 
comment out the following line:

parser->parse(*memBufIS);

Somehow the call to parser->parse() is causing a problem.  I have also 
removed the socket code and simply read from the same file over and 
over.  This is very fast and does not degrade.

I could be wrong but it seems that somehow sockets (or possibly certain 
IO calls) and parsing cause some sort of resource contention.  The load 
on the box is negligible.  There is plenty of CPU and RAM.

One more odd bit of info:

The slowness manifests itself in the socket recv() call.  It sometimes 
takes up to 8 seconds to read from the socket.  Again, if I comment out 
the call to parser->parse() it works with no degradation.  I don't 
understand how calling parser->parse() repeatedly would cause recv() to 
be slow.
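
One way to confirm where the time goes is to time the two calls separately rather than the loop as a whole. The following is only a minimal sketch, assuming a C++11 compiler; CallTimer and measure() are hypothetical names invented for this illustration and are not part of Xerces-C++ or of the code in this post.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: accumulates wall-clock time spent in one kind of call.
struct CallTimer {
    typedef std::chrono::steady_clock Clock;

    Clock::duration total = Clock::duration::zero();  // sum over all calls
    Clock::duration worst = Clock::duration::zero();  // slowest single call
    std::size_t calls = 0;

    // Runs fn(), records how long it took, and returns whatever fn returns.
    // The guard's destructor does the bookkeeping, so void callables and
    // callables that throw are still counted.
    template <typename Fn>
    auto measure(Fn&& fn) -> decltype(fn()) {
        struct Guard {
            CallTimer& timer;
            Clock::time_point start;
            ~Guard() {
                const Clock::duration elapsed = Clock::now() - start;
                timer.total += elapsed;
                if (elapsed > timer.worst) timer.worst = elapsed;
                ++timer.calls;
            }
        } guard{*this, Clock::now()};
        return fn();
    }

    // Slowest single call, in milliseconds.
    double worstMillis() const {
        return std::chrono::duration<double, std::milli>(worst).count();
    }
};
```

With one CallTimer wrapped around recv() and a second around parser->parse(), printing worstMillis() for each every few hundred messages would show which of the two calls actually grows over time.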

Has anyone encountered anything like this or does anyone have any ideas?

Mark


void XSCon::serverLoop(int sockfd)
{
     unsigned long startMillis, endMillis;
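     string::size_type loc;   // same type as string::npos, so the comparison below is portable
     int  readSize, pos;
     const int BUF_SIZE = 10240;   // const, so the array size below is a constant expression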
     char buffer[BUF_SIZE + 1];
     string str, line;
     string data = "";
     string remainder = "";

     for (;;)
     {
         if ((readSize = recv(sockfd, buffer, BUF_SIZE - 1, 0)) == -1)
         {
             perror("recv");
             exit(1);
         }

         if (readSize ==0) break;

         // Build the string from the byte count so that data containing
         // a NUL byte is not silently truncated.
         str = string(buffer, readSize);

         pos = 0;
         for (;;) {
             loc = str.find("\n", pos);

             if (loc != string::npos)
             {
                 line = remainder + str.substr(pos, loc - pos + 1);
                 data += line;
                 pos = loc + 1;
                 remainder = "";
             }
             else
             {
                 remainder += str.substr(pos, str.length() - pos);
                 break;
             }
         }
     }

     data += remainder;

     MemBufInputSource* memBufIS = new MemBufInputSource
         ((const XMLByte*)data.data(), data.length(), "abc", false);

     try
     {
         parser->parse(*memBufIS);
     }
     catch (const XMLException& e)
     {
         cerr << e.getMessage() << endl;
     }

     xxx++;
     cout << "Count: " << xxx << endl;

     delete memBufIS;
}
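
The line-splitting part of the loop above can be separated from the socket code so that it can be exercised without a network connection. This is a sketch only: LineFramer and feed() are hypothetical names, and it assumes the incoming messages are newline-delimited, which the post does not state.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the original post or of Xerces-C++.
class LineFramer {
public:
    // Appends n received bytes and returns every complete line (including
    // its trailing '\n') that is now available. Bytes after the last '\n'
    // are held back until a later call completes the line.
    std::vector<std::string> feed(const char* bytes, std::size_t n) {
        pending_.append(bytes, n);  // length-based, so embedded NULs survive

        std::vector<std::string> lines;
        std::string::size_type pos = 0;
        for (;;) {
            const std::string::size_type loc = pending_.find('\n', pos);
            if (loc == std::string::npos) break;
            lines.push_back(pending_.substr(pos, loc - pos + 1));
            pos = loc + 1;
        }
        pending_.erase(0, pos);  // keep only the incomplete tail
        return lines;
    }

    // The partial line received so far, if any.
    const std::string& pending() const { return pending_; }

private:
    std::string pending_;
};
```

Each call to recv() would pass buffer and readSize to feed(). Replaying a captured byte stream through feed() from a file exercises the same string handling as the socket path, which may help separate string and allocation effects from socket effects.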


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: parse performance degradation

Posted by Mark Horton <ma...@nostromo.net>.
I just realized that I forgot to post details about my setup.  I'm using 
the following:

xerces-2.1.0 (from source)
g++ 3.2
linux redhat-8.0 (x86)
kernel 2.4.18-14

Mark




RE: parse performance degradation

Posted by David Featherstone <df...@taralnetworks.com>.
Have you considered memory fragmentation as a possible issue? (It's not
the sort of problem you could reproduce by processing the same file over
and over.)
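
One way to probe the fragmentation idea is to cut down on per-message allocations in the receive path and see whether the slowdown changes. The sketch below reuses a single buffer across messages. MessageBuffer is a hypothetical name made up for this illustration, and it only addresses allocations made by the application code, not any made inside the parser.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reusable message buffer, not part of Xerces-C++.
class MessageBuffer {
public:
    // Reserve storage once, up front.
    explicit MessageBuffer(std::size_t initialCapacity) {
        bytes_.reserve(initialCapacity);
    }

    // Append n received bytes to the current message.
    void append(const char* data, std::size_t n) {
        bytes_.insert(bytes_.end(), data, data + n);
    }

    // Start a new message. vector::clear() sets the size to zero and
    // leaves the capacity unchanged, so the storage is reused.
    void reset() { bytes_.clear(); }

    const char* data() const { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }
    std::size_t capacity() const { return bytes_.capacity(); }

private:
    std::vector<char> bytes_;
};
```

The data() and size() values could be handed to MemBufInputSource in the same way the posted code passes data.data() and data.length(). If throughput still degrades with a reused buffer, application-side string churn is less likely to be the cause.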
