Posted to doxia-commits@maven.apache.org by vs...@apache.org on 2008/12/06 15:41:44 UTC

svn commit: r723989 - /maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java

Author: vsiveton
Date: Sat Dec  6 06:41:44 2008
New Revision: 723989

URL: http://svn.apache.org/viewvc?rev=723989&view=rev
Log:
DOXIA-265: Add an EntityResolver in AbstractXmlParser#getXmlReader()

o added a simple cached file mechanism

Modified:
    maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java

Modified: maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java
URL: http://svn.apache.org/viewvc/maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java?rev=723989&r1=723988&r2=723989&view=diff
==============================================================================
--- maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java (original)
+++ maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java Sat Dec  6 06:41:44 2008
@@ -21,9 +21,12 @@
 
 import java.io.BufferedReader;
 import java.io.ByteArrayInputStream;
+import java.io.File;
 import java.io.IOException;
 import java.io.Reader;
 import java.io.StringReader;
+import java.net.URL;
+import java.util.Hashtable;
 import java.util.LinkedHashMap;
 import java.util.Map;
 import java.util.regex.Matcher;
@@ -35,11 +38,15 @@
 import org.apache.maven.doxia.markup.XmlMarkup;
 import org.apache.maven.doxia.sink.Sink;
 import org.apache.maven.doxia.sink.SinkEventAttributeSet;
+import org.codehaus.plexus.util.FileUtils;
 import org.codehaus.plexus.util.IOUtil;
+import org.codehaus.plexus.util.ReaderFactory;
 import org.codehaus.plexus.util.StringUtils;
+import org.codehaus.plexus.util.WriterFactory;
 import org.codehaus.plexus.util.xml.pull.MXParser;
 import org.codehaus.plexus.util.xml.pull.XmlPullParser;
 import org.codehaus.plexus.util.xml.pull.XmlPullParserException;
+import org.xml.sax.EntityResolver;
 import org.xml.sax.InputSource;
 import org.xml.sax.SAXException;
 import org.xml.sax.SAXNotRecognizedException;
@@ -570,8 +577,50 @@
             xmlReader.setFeature( "http://xml.org/sax/features/validation", true );
             xmlReader.setFeature( "http://apache.org/xml/features/validation/schema", true );
             xmlReader.setErrorHandler( errorHandler );
+            xmlReader.setEntityResolver( new CachedFileEntityResolver() );
         }
 
         return xmlReader;
     }
+
+    /**
+     * Implementation of the callback mechanism <code>EntityResolver</code>.
+     * Using a mechanism of cached files in temp dir to improve performance when using the <code>XMLReader</code>.
+     */
+    public static class CachedFileEntityResolver
+        implements EntityResolver
+    {
+        private static final Map cache = new Hashtable();
+
+        /** {@inheritDoc} */
+        public InputSource resolveEntity( String publicId, String systemId )
+            throws SAXException, IOException
+        {
+            byte[] res = (byte[]) cache.get( systemId );
+            // already cached?
+            if ( res == null )
+            {
+                File temp =
+                    new File( System.getProperty( "java.io.tmpdir" ), FileUtils.getFile( systemId ).getName() );
+                // maybe already as a temp file?
+                if ( !temp.exists() )
+                {
+                    res = IOUtil.toByteArray( new URL( systemId ).openStream() );
+                    IOUtil.copy( res, WriterFactory.newPlatformWriter( temp ) );
+                }
+                else
+                {
+                    res = IOUtil.toByteArray( ReaderFactory.newPlatformReader( temp ) );
+                }
+
+                cache.put( systemId, res );
+            }
+
+            InputSource is = new InputSource( new ByteArrayInputStream( res ) );
+            is.setPublicId( publicId );
+            is.setSystemId( systemId );
+
+            return is;
+        }
+    }
 }
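
The in-memory layer of the resolver in this diff can be sketched in isolation as follows. This is an illustration only, not the Doxia code: the class name InMemoryEntityResolver and the pluggable loader function are hypothetical, and the temp-file layer is left out. It assumes systemId is never null, because ConcurrentHashMap rejects null keys.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical sketch: caches entity bytes in memory, keyed by systemId. */
class InMemoryEntityResolver implements EntityResolver {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Function<String, byte[]> loader;

    /** The loader (an assumption of this sketch) fetches the raw bytes for a systemId. */
    InMemoryEntityResolver(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // The loader runs only on the first request for a given systemId.
        byte[] bytes = cache.computeIfAbsent(systemId, loader);

        // Hand the parser a fresh stream over the cached bytes on every call.
        InputSource source = new InputSource(new ByteArrayInputStream(bytes));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        InMemoryEntityResolver resolver = new InMemoryEntityResolver(id -> {
            loads.incrementAndGet();
            return "<!ELEMENT doc (#PCDATA)>".getBytes(StandardCharsets.US_ASCII);
        });

        resolver.resolveEntity(null, "http://example.invalid/sample.dtd");
        resolver.resolveEntity(null, "http://example.invalid/sample.dtd");

        System.out.println("loader calls: " + loads.get()); // prints "loader calls: 1"
    }
}
```

The resolver stores bytes rather than characters, so the parser can determine the encoding itself from the entity's own declaration.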



Re: svn commit: r723989 - /maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java

Posted by Vincent Siveton <vi...@gmail.com>.
Hi Benjamin,

Right, and I fixed it in r724309.

Thanks!

Vincent

2008/12/6 Benjamin Bentmann <be...@udo.edu>:
> Hi Vincent,
>
>> Author: vsiveton
>> Date: Sat Dec  6 06:41:44 2008
>> New Revision: 723989
>>
>> URL: http://svn.apache.org/viewvc?rev=723989&view=rev
>> Log:
>> DOXIA-265: Add an EntityResolver in AbstractXmlParser#getXmlReader()
>>
>> o added a simple cached file mechanism
>>
>> Modified:
>>
>>  maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java
>> [...]
>> +            byte[] res = (byte[]) cache.get( systemId );
>> +            // already cached?
>> +            if ( res == null )
>> +            {
>> +                File temp =
>> +                    new File( System.getProperty( "java.io.tmpdir" ), FileUtils.getFile( systemId ).getName() );
>> +                // maybe already as a temp file?
>> +                if ( !temp.exists() )
>> +                {
>> +                    res = IOUtil.toByteArray( new URL( systemId ).openStream() );
>> +                    IOUtil.copy( res, WriterFactory.newPlatformWriter( temp ) );
>> +                }
>> +                else
>> +                {
>> +                    res = IOUtil.toByteArray( ReaderFactory.newPlatformReader( temp ) );
>> +                }
>> +
>> +                cache.put( systemId, res );
>> +            }
>> +
>> +            InputSource is = new InputSource( new ByteArrayInputStream( res ) );
>> +            is.setPublicId( publicId );
>> +            is.setSystemId( systemId );
>> +
>
> Is it safe to use a reader here, especially a platform reader? Byte streams
> that don't match the intended encoding get corrupted, and is the encoding of
> the data even known here? Should this maybe just use
>  IOUtil.copy( byte[], OutputStream )
> and
>  IOUtil.toByteArray( InputStream )
> i.e. simply move bytes around instead of thinking about characters?
>
>
> Benjamin
>

Re: svn commit: r723989 - /maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java

Posted by Benjamin Bentmann <be...@udo.edu>.
Hi Vincent,

> Author: vsiveton
> Date: Sat Dec  6 06:41:44 2008
> New Revision: 723989
> 
> URL: http://svn.apache.org/viewvc?rev=723989&view=rev
> Log:
> DOXIA-265: Add an EntityResolver in AbstractXmlParser#getXmlReader()
> 
> o added a simple cached file mechanism
> 
> Modified:
>     maven/doxia/doxia/trunk/doxia-core/src/main/java/org/apache/maven/doxia/parser/AbstractXmlParser.java
> [...]
> +            byte[] res = (byte[]) cache.get( systemId );
> +            // already cached?
> +            if ( res == null )
> +            {
> +                File temp =
> +                    new File( System.getProperty( "java.io.tmpdir" ), FileUtils.getFile( systemId ).getName() );
> +                // maybe already as a temp file?
> +                if ( !temp.exists() )
> +                {
> +                    res = IOUtil.toByteArray( new URL( systemId ).openStream() );
> +                    IOUtil.copy( res, WriterFactory.newPlatformWriter( temp ) );
> +                }
> +                else
> +                {
> +                    res = IOUtil.toByteArray( ReaderFactory.newPlatformReader( temp ) );
> +                }
> +
> +                cache.put( systemId, res );
> +            }
> +
> +            InputSource is = new InputSource( new ByteArrayInputStream( res ) );
> +            is.setPublicId( publicId );
> +            is.setSystemId( systemId );
> +

Is it safe to use a reader here, especially a platform reader? Byte
streams that don't match the intended encoding get corrupted, and is the
encoding of the data even known here? Should this maybe just use
   IOUtil.copy( byte[], OutputStream )
and
   IOUtil.toByteArray( InputStream )
i.e. simply move bytes around instead of thinking about characters?


Benjamin
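
The concern above can be shown with a small sketch. It uses plain JDK calls (java.nio.file.Files) in place of the plexus-utils IOUtil methods named in the thread, and the class and method names are hypothetical. It compares moving bytes through a temp file untouched with pushing them through a character decode/encode step, which is roughly what a reader/writer pair does when the charset does not match the data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical sketch comparing byte-oriented and character-oriented copying. */
class ByteCacheSketch {

    /** Writes the bytes to a temp file and reads them back; no charset is involved. */
    static byte[] throughTempFile(byte[] data) {
        try {
            Path temp = Files.createTempFile("entity-cache", ".bin");
            try {
                Files.write(temp, data);
                return Files.readAllBytes(temp);
            } finally {
                Files.deleteIfExists(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Mimics a reader/writer round trip: decode with a charset, then encode again. */
    static byte[] throughCharacters(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // In ISO-8859-1, U+00E9 is the single byte 0xE9, which is malformed as UTF-8 here.
        byte[] latin1 = "<!-- caf\u00e9 -->".getBytes(StandardCharsets.ISO_8859_1);

        byte[] copied = throughTempFile(latin1);
        byte[] recoded = throughCharacters(latin1, StandardCharsets.UTF_8);

        System.out.println("byte copy intact: " + Arrays.equals(latin1, copied));         // true
        System.out.println("char round trip intact: " + Arrays.equals(latin1, recoded));  // false
    }
}
```

In the character round trip the decoder replaces the malformed byte with U+FFFD, so the re-encoded data no longer matches the original. The byte copy never interprets the content, so it does not need to know the encoding.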