Posted to c-dev@xerces.apache.org by Mark Horton <ma...@nostromo.net> on 2002/12/14 15:24:53 UTC
parse performance degradation
Hi,
I've encountered a problem that seems to be related to using sockets and
xerces.
I'm listening on a socket receiving XML then parsing it using a SAX
parser (see code below). This socket will be receiving a lot of XML
messages. During load testing I noticed a serious performance
degradation after a few thousand messages. It goes from parsing 400
messages per second to less than 15 per second.
I tried compiling xerces without threads in case it was a threading
issue. I have used both DOM and SAX parsers as well. None of this helped.
I have also found that it runs with no performance degradation if I
comment out the following line:
parser->parse(*memBufIS);
Somehow the call to parser->parse() is causing a problem. I have also
removed the socket code and simply read from the same file over and
over. This is very fast and does not degrade.
I could be wrong, but it seems that the combination of sockets (or
possibly certain IO calls) and parsing causes some sort of resource
contention. The load on the box is negligible. There is plenty of CPU
and RAM.
One more odd bit of info:
The slowness manifests itself in the socket recv() call. It sometimes
takes up to 8 seconds to read from the socket. Again, if I comment out
the call to parser->parse() it works with no degradation. I don't
understand how calling parser->parse() repeatedly would cause recv() to
be slow.
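One way to pin down where the time goes is to accumulate the time spent in recv() and in parser->parse() separately and compare the two as the message count grows. The following is only a minimal sketch, assuming a C++11 compiler; PhaseTimer and its members are hypothetical names introduced here and are not part of Xerces or of the posted code.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: accumulates wall-clock time spent in one phase
// (for example "recv" or "parse") across many calls.
class PhaseTimer {
public:
    using Clock = std::chrono::steady_clock;

    // Runs fn() and adds its elapsed time to the running total.
    template <typename Fn>
    void measure(Fn fn) {
        const Clock::time_point start = Clock::now();
        fn();
        total_ += Clock::now() - start;
        ++calls_;
    }

    std::size_t calls() const { return calls_; }

    double totalMillis() const {
        return std::chrono::duration<double, std::milli>(total_).count();
    }

    // Average per call; 0 when nothing has been measured yet.
    double averageMillis() const {
        return calls_ == 0 ? 0.0 : totalMillis() / static_cast<double>(calls_);
    }

private:
    Clock::duration total_{};
    std::size_t calls_ = 0;
};
```

With one timer wrapped around the recv() call and another around the parser->parse() call, printing both averages every thousand messages would show whether recv() is really the call that slows down or whether it is just waiting on a sender that has itself been stalled.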
Has anyone encountered anything like this or does anyone have any ideas?
Mark
void XSCon::serverLoop(int sockfd)
{
    unsigned long startMillis, endMillis;
    unsigned int loc;
    int readSize, pos;
    int BUF_SIZE = 10240;
    char buffer[BUF_SIZE + 1];
    string str, line;
    string data = "";
    string remainder = "";
    for (;;)
    {
        if ((readSize = recv(sockfd, buffer, BUF_SIZE - 1, 0)) == -1)
        {
            perror("recv");
            exit(1);
        }
        if (readSize == 0) break;
        buffer[readSize] = '\0';
        str = string(buffer);
        pos = 0;
        for (;;) {
            loc = str.find("\n", pos);
            if (loc != string::npos)
            {
                line = remainder + str.substr(pos, loc - pos + 1);
                data += line;
                pos = loc + 1;
                remainder = "";
            }
            else
            {
                remainder += str.substr(pos, str.length() - pos);
                break;
            }
        }
    }
    data += remainder;
    MemBufInputSource* memBufIS = new MemBufInputSource
        ((const XMLByte*)data.data(), data.length(), "abc", false);
    try
    {
        parser->parse(*memBufIS);
    }
    catch (const XMLException& e)
    {
        cerr << e.getMessage() << endl;
    }
    xxx++;
    cout << "Count: " << xxx << endl;
    delete memBufIS;
}
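The line-splitting part of the loop above can be pulled out of the socket code and exercised on its own, which makes it easier to rule out as a factor. This is a hedged sketch rather than the poster's method: LineAssembler, feed() and pending() are hypothetical names. It differs from the posted loop in two ways. It stores the result of find() in std::string::size_type, so the comparison with npos also holds where size_type is wider than unsigned int. It also appends by byte count, so data after an embedded NUL byte is not dropped.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collects complete newline-terminated lines from
// chunks of bytes, carrying any partial line over to the next chunk.
class LineAssembler {
public:
    // Appends len bytes from chunk and returns the lines completed by it.
    // Each returned line keeps its trailing '\n', as in the posted loop.
    std::vector<std::string> feed(const char* chunk, std::size_t len) {
        std::vector<std::string> lines;
        remainder_.append(chunk, len);

        std::string::size_type pos = 0;
        for (;;) {
            const std::string::size_type loc = remainder_.find('\n', pos);
            if (loc == std::string::npos) break;
            lines.push_back(remainder_.substr(pos, loc - pos + 1));
            pos = loc + 1;
        }
        // Keep only the unterminated tail for the next call.
        remainder_.erase(0, pos);
        return lines;
    }

    // Bytes received so far that do not yet end in '\n'.
    const std::string& pending() const { return remainder_; }

private:
    std::string remainder_;
};
```

In a receive loop, each buffer returned by recv() would be passed to feed() together with the byte count that recv() reported.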
---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org
Re: parse performance degradation
Posted by Mark Horton <ma...@nostromo.net>.
I just realized that I forgot to post details about my setup. I'm using
the following:
xerces-2.1.0 (from source)
g++ 3.2
linux redhat-8.0 (x86)
kernel 2.4.18-14
Mark
RE: parse performance degradation
Posted by David Featherstone <df...@taralnetworks.com>.
Have you considered memory fragmentation as a possible issue? (It's not
the sort of problem you could reproduce by processing the same file over
and over.)
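One rough way to test the fragmentation idea from the caller's side is to stop allocating fresh strings for every message and reuse a single buffer instead. The sketch below assumes a C++11 compiler; ReusableBuffer is a hypothetical name, and it only removes the per-message allocations made by the calling code. Whether allocations inside the parser itself matter here is not something this sketch can show.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: one message buffer that is allocated once and
// reused for every message instead of building new strings each time.
class ReusableBuffer {
public:
    explicit ReusableBuffer(std::size_t initialCapacity) {
        bytes_.reserve(initialCapacity);
    }

    // Starts a new message; vector::clear() keeps the allocated capacity.
    void reset() { bytes_.clear(); }

    // Appends len bytes to the current message.
    void append(const char* data, std::size_t len) {
        bytes_.insert(bytes_.end(), data, data + len);
    }

    const unsigned char* data() const { return bytes_.data(); }
    std::size_t size() const { return bytes_.size(); }
    std::size_t capacity() const { return bytes_.capacity(); }

private:
    std::vector<unsigned char> bytes_;
};
```

If throughput stays flat with a buffer like this while it degrades with the original string handling, that would point toward allocation patterns rather than the socket.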