Posted to commits@shindig.apache.org by jo...@apache.org on 2009/12/17 01:48:31 UTC
svn commit: r891496 [1/2] - in /incubator/shindig/trunk: ./
features/src/main/javascript/features/caja/ java/gadgets/
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/
java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/ java/ga...
Author: johnh
Date: Thu Dec 17 00:48:29 2009
New Revision: 891496
URL: http://svn.apache.org/viewvc?rev=891496&view=rev
Log:
Introduces Caja-based GadgetHtmlParser, and refactors/cleans up HtmlParser impl a bit.
Summary:
* Overhauls the GadgetHtmlParser base class and associated test cases
* Tweaks the Neko-based HTML parser implementation
* Introduces new Caja-based HTML parser
This fairly substantial CL reworks the HTML parsing system to better represent
(though not fully yet) the way that HTML is handled within gadgets: as tag soup,
cleaned up via custom rules after the fact into a legitimate, well-formed
document. It's a step toward treating concrete GadgetHtmlParser implementations
purely as fragment parsers.
Change detail:
* All parsing tests factored into base test classes with concrete tests largely
just providing a concrete parser implementation.
- HTML-equivalence method added, built on the (fantastic) diff_match_patch
library; the comparison ignores whitespace, case, and attribute-encoding
differences.
* GadgetHtmlParser now does significant cleanup of the DOM it retrieves from
parseDomImpl(...), which BTW will soon go the way of the dodo in favor of always
using parseFragmentImpl(...)
- Creates head element and populates it with all style elements (only), as
putting these here cannot break rendering and because HTML requires <style> in
head.
- Creates body element as well.
- Combines multiple <head> elements together, if present.
- Prepends to <head> any elements that occurred before the first <head> element
in the source, if any.
- Combines multiple <body> elements together, if present.
- Prepends to <body> any elements found after the first <head> tag but before
the first <body> tag, and appends to <body> any elements found after the first
<body> tag that have no <head> or <body> parent (that was a mouthful).
- As noted above, stuffs all <style> elements found in <body> at the end of
<head>
- If OpenSocial-type <script> elements were parsed per the HTML spec (i.e.
having only text, no element children), reprocesses this text as HTML and adds
the result as children for template processing.
* Introduces CajaHtmlParser
- Still has parseDomImpl method, mostly for API compatibility (short-term)
with the Neko-based HtmlParser implementation, which has subtle differences
between parseDomImpl and parseFragmentImpl that I want to clean up in a
follow-up CL (again, obviating the need for parseDomImpl altogether).
- Delegates to Caja's DomParser class's parseFragment() method for most
parsing needs
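Illustration of the <head>/<body> cleanup rules listed above, as a minimal
standalone sketch rather than the committed code. It assumes the input is
well-formed XML so that the JDK's own DOM parser can stand in for Neko or Caja.
HeadBodySketch, normalize, outline and normalizedOutline are hypothetical names
used only here; transferChildren and prependToNode mirror the helpers added to
GadgetHtmlParser in this revision. Script reprocessing and caching are left out.

```java
import java.io.StringReader;
import java.util.LinkedList;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of this commit.
class HeadBodySketch {
  // Moves every child of 'from' to the end of 'to'.
  static void transferChildren(Node to, Node from) {
    while (from.hasChildNodes()) {
      to.appendChild(from.removeChild(from.getFirstChild()));
    }
  }

  // Draining from the tail and inserting at the front keeps source order.
  static void prependToNode(Node to, LinkedList<Node> from) {
    while (!from.isEmpty()) {
      to.insertBefore(from.removeLast(), to.getFirstChild());
    }
  }

  static boolean isElement(Node n, String name) {
    return n.getNodeType() == Node.ELEMENT_NODE
        && name.equalsIgnoreCase(n.getNodeName());
  }

  // Simplified version of the cleanup applied to the <html> element.
  static void normalize(Document doc) {
    Node html = doc.getDocumentElement();
    Node head = null;
    Node body = null;
    LinkedList<Node> beforeHead = new LinkedList<>();
    LinkedList<Node> beforeBody = new LinkedList<>();
    while (html.hasChildNodes()) {
      Node child = html.removeChild(html.getFirstChild());
      if (isElement(child, "head")) {
        if (head == null) head = child; else transferChildren(head, child);
      } else if (isElement(child, "body")) {
        if (body == null) body = child; else transferChildren(body, child);
      } else if (head == null) {
        beforeHead.add(child);
      } else if (body == null) {
        beforeBody.add(child);
      } else {
        body.appendChild(child);  // both seen: append to tail of <body>
      }
    }
    if (head == null) {
      // No <head> in the source: everything collected belongs in <body>.
      beforeBody = beforeHead;
      beforeHead = new LinkedList<>();
      head = doc.createElement("head");
    }
    if (body == null) {
      body = doc.createElement("body");
    }
    html.appendChild(head);
    html.appendChild(body);
    prependToNode(head, beforeHead);
    prependToNode(body, beforeBody);

    // <style> children of <body> move to the end of <head>. Collect first,
    // because the child list changes while nodes are being moved.
    LinkedList<Node> styles = new LinkedList<>();
    for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (isElement(n, "style")) styles.add(n);
    }
    for (Node style : styles) {
      head.appendChild(body.removeChild(style));
    }
  }

  // Element-only outline such as html(head(title),body(p)).
  static String outline(Node n) {
    StringBuilder kids = new StringBuilder();
    for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
      if (c.getNodeType() != Node.ELEMENT_NODE) continue;
      if (kids.length() > 0) kids.append(',');
      kids.append(outline(c));
    }
    return kids.length() == 0 ? n.getNodeName() : n.getNodeName() + "(" + kids + ")";
  }

  static String normalizedOutline(String xml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      normalize(doc);
      return outline(doc.getDocumentElement());
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // prints html(head(link,title,meta,style),body(p,div))
    System.out.println(normalizedOutline(
        "<html><link/><head><title/></head><style/><body><p/></body>"
        + "<head><meta/></head><div/></html>"));
  }
}
```

In the sample input, <link/> precedes the first <head> and is prepended to it,
the second <head> is merged into the first, <style/> sits between <head> and
<body> and ends up at the end of <head>, and the trailing <div/> is appended to
<body>.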
Added:
incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/CajaHtmlParser.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractSocialMarkupHtmlParserTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaCompactHtmlSerializerTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaParserAndSerializerTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaSocialMarkupHtmlParserTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoCompactHtmlSerializerTest.java
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-socialmarkup.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-specialtags-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-specialtags.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test.html
Removed:
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fragment2-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fragment2.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-with-ampersands-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-with-ampersands.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-with-iecond-comments-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-with-iecond-comments.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-with-specialtags-expected.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-with-specialtags.html
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test.html
Modified:
incubator/shindig/trunk/features/src/main/javascript/features/caja/feature.xml
incubator/shindig/trunk/java/gadgets/pom.xml
incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java
incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/nekohtml/NekoSimplifiedHtmlParser.java
incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/CajaContentRewriter.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParserAndSerializerTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializerTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoParserAndSerializeTest.java
incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/SocialMarkupHtmlParserTest.java
incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html
incubator/shindig/trunk/pom.xml
Modified: incubator/shindig/trunk/features/src/main/javascript/features/caja/feature.xml
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/features/src/main/javascript/features/caja/feature.xml?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/features/src/main/javascript/features/caja/feature.xml (original)
+++ incubator/shindig/trunk/features/src/main/javascript/features/caja/feature.xml Thu Dec 17 00:48:29 2009
@@ -23,7 +23,7 @@
<gadget>
<script src="res://com/google/caja/plugin/domita-minified.js"/>
<script src="caja.js"/>
- <script src="res://com/google/caja/plugin/valija.co.js"/>
+ <script src="res://com/google/caja/plugin/valija.out.js"/>
<script src="taming.js"/>
</gadget>
</feature>
Modified: incubator/shindig/trunk/java/gadgets/pom.xml
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/pom.xml?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/pom.xml (original)
+++ incubator/shindig/trunk/java/gadgets/pom.xml Thu Dec 17 00:48:29 2009
@@ -129,6 +129,11 @@
<artifactId>json</artifactId>
</dependency>
<dependency>
+ <groupId>diff_match_patch</groupId>
+ <artifactId>diff_match_patch</artifactId>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
<groupId>caja</groupId>
<artifactId>caja</artifactId>
</dependency>
Modified: incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/GadgetHtmlParser.java Thu Dec 17 00:48:29 2009
@@ -20,20 +20,23 @@
import org.apache.shindig.common.cache.Cache;
import org.apache.shindig.common.cache.CacheProvider;
import org.apache.shindig.common.util.HashUtil;
-import org.apache.shindig.common.xml.DomUtil;
import org.apache.shindig.gadgets.GadgetException;
import org.apache.shindig.gadgets.parse.nekohtml.NekoSimplifiedHtmlParser;
import com.google.common.collect.BiMap;
import com.google.common.collect.ImmutableBiMap;
+import com.google.common.collect.Lists;
import com.google.inject.ImplementedBy;
import com.google.inject.Inject;
import com.google.inject.Provider;
+import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
+import java.util.LinkedList;
+
/**
* Parser for arbitrary HTML content
*/
@@ -89,28 +92,103 @@
key = HashUtil.rawChecksum(source.getBytes());
document = documentCache.getElement(key);
}
+
if (document == null) {
document = parseDomImpl(source);
HtmlSerialization.attach(document, serializerProvider.get(), source);
+ Node html = document.getDocumentElement();
+
+ Node head = null;
+ Node body = null;
+ LinkedList<Node> beforeHead = Lists.newLinkedList();
+ LinkedList<Node> beforeBody = Lists.newLinkedList();
+
+ while (html.hasChildNodes()) {
+ Node child = html.removeChild(html.getFirstChild());
+ if (child.getNodeType() == Node.ELEMENT_NODE &&
+ "head".equalsIgnoreCase(child.getNodeName())) {
+ if (head == null) {
+ head = child;
+ } else {
+ // Concatenate <head> elements together.
+ transferChildren(head, child);
+ }
+ } else if (child.getNodeType() == Node.ELEMENT_NODE &&
+ "body".equalsIgnoreCase(child.getNodeName())) {
+ if (body == null) {
+ body = child;
+ } else {
+ // Concatenate <body> elements together.
+ transferChildren(body, child);
+ }
+ } else if (head == null) {
+ beforeHead.add(child);
+ } else if (body == null) {
+ beforeBody.add(child);
+ } else {
+ // Both <head> and <body> are present. Append to tail of <body>.
+ body.appendChild(child);
+ }
+ }
+
// Ensure head tag exists
- if (DomUtil.getFirstNamedChildNode(document.getDocumentElement(), "head") == null) {
+ if (head == null) {
+ // beforeHead contains all elements that should be prepended to <body>. Switch them.
+ LinkedList<Node> temp = beforeBody;
+ beforeBody = beforeHead;
+ beforeHead = temp;
+
// Add as first element
- document.getDocumentElement().insertBefore(
- document.createElement("head"),
- document.getDocumentElement().getFirstChild());
- }
- // If body not found the document was entirely empty. Create the
- // element anyway
- if (DomUtil.getFirstNamedChildNode(document.getDocumentElement(), "body") == null) {
- document.getDocumentElement().appendChild(
- document.createElement("body"));
+ head = document.createElement("head");
+ html.insertBefore(head, html.getFirstChild());
+ } else {
+ // Re-append head node.
+ html.appendChild(head);
+ }
+
+ // Ensure body tag exists.
+ if (body == null) {
+ // Add immediately after head.
+ body = document.createElement("body");
+ html.insertBefore(body, head.getNextSibling());
+ } else {
+ // Re-append body node.
+ html.appendChild(body);
+ }
+
+ // Leftovers: nodes before the first <head> node found and the first <body> node found.
+ // Prepend beforeHead to the front of <head>, and beforeBody to beginning of <body>,
+ // in the order they were found in the document.
+ prependToNode(head, beforeHead);
+ prependToNode(body, beforeBody);
+
+ // One exception. <style> nodes from <body> end up at the end of <head>, since doing so
+ // is HTML compliant and can never break rendering due to ordering concerns.
+ LinkedList<Node> styleNodes = Lists.newLinkedList();
+ NodeList bodyKids = body.getChildNodes();
+ for (int i = 0; i < bodyKids.getLength(); ++i) {
+ Node bodyKid = bodyKids.item(i);
+ if (bodyKid.getNodeType() == Node.ELEMENT_NODE &&
+ "style".equalsIgnoreCase(bodyKid.getNodeName())) {
+ styleNodes.add(bodyKid);
+ }
}
+
+ for (Node styleNode : styleNodes) {
+ head.appendChild(body.removeChild(styleNode));
+ }
+
+ // Finally, reprocess all script nodes for OpenSocial purposes, as these
+ // may be interpreted (rightly, from the perspective of HTML) as containing text only.
+ reprocessScriptForOpenSocial(html);
+
if (shouldCache) {
documentCache.addElement(key, document);
}
}
+
if (shouldCache) {
Document copy = (Document)document.cloneNode(true);
HtmlSerialization.copySerializer(document, copy);
@@ -118,6 +196,18 @@
}
return document;
}
+
+ protected void transferChildren(Node to, Node from) {
+ while (from.hasChildNodes()) {
+ to.appendChild(from.removeChild(from.getFirstChild()));
+ }
+ }
+
+ protected void prependToNode(Node to, LinkedList<Node> from) {
+ while (from.size() > 0) {
+ to.insertBefore(from.removeLast(), to.getFirstChild());
+ }
+ }
/**
* Parses a snippet of markup and appends the result as children to the
@@ -139,6 +229,7 @@
}
}
DocumentFragment fragment = parseFragmentImpl(source);
+ reprocessScriptForOpenSocial(fragment);
if (shouldCache) {
fragmentCache.addElement(key, fragment);
}
@@ -157,13 +248,63 @@
protected boolean shouldCache() {
return documentCache != null && documentCache.getCapacity() != 0;
}
-
+
+ private void reprocessScriptForOpenSocial(Node root) throws GadgetException {
+ LinkedList<Node> nodeQueue = Lists.newLinkedList();
+ nodeQueue.add(root);
+ while (!nodeQueue.isEmpty()) {
+ Node next = nodeQueue.removeFirst();
+ if (next.getNodeType() == Node.ELEMENT_NODE &&
+ "script".equalsIgnoreCase(next.getNodeName())) {
+ Attr typeAttr = (Attr)next.getAttributes().getNamedItem("type");
+ if (typeAttr != null && SCRIPT_TYPE_TO_OSML_TAG.get(typeAttr.getValue()) != null) {
+ // The underlying parser impl may have already parsed these.
+ // Only re-parse with the coalesced text children if that's all there are.
+ boolean parseOs = true;
+ StringBuilder sb = new StringBuilder();
+ NodeList scriptKids = next.getChildNodes();
+ for (int i = 0; parseOs && i < scriptKids.getLength(); ++i) {
+ Node scriptKid = scriptKids.item(i);
+ if (scriptKid.getNodeType() != Node.TEXT_NODE) {
+ parseOs = false;
+ }
+ sb.append(scriptKid.getTextContent());
+ }
+ if (parseOs) {
+ // Clean out the script node.
+ while (next.hasChildNodes()) {
+ next.removeChild(next.getFirstChild());
+ }
+ DocumentFragment osFragment = parseFragmentImpl(sb.toString());
+ while (osFragment.hasChildNodes()) {
+ Node osKid = osFragment.removeChild(osFragment.getFirstChild());
+ osKid = next.getOwnerDocument().adoptNode(osKid);
+ if (osKid.getNodeType() == Node.ELEMENT_NODE) {
+ next.appendChild(osKid);
+ }
+ }
+ }
+ }
+ }
+
+ // Enqueue children for inspection.
+ NodeList children = next.getChildNodes();
+ for (int i = 0; i < children.getLength(); ++i) {
+ nodeQueue.add(children.item(i));
+ }
+ }
+ }
+
/**
- * @param source
- * @return a parsed document or document fragment
+ * TODO: remove the need for parseDomImpl as a parsing method. Gadget HTML is
+ * tag soup handled in custom fashion, or is a legitimate fragment. In either case,
+ * we can simply use the fragment parsing implementation and patch up in higher-level calls.
+ * @param source a piece of HTML
+ * @return a Document parsed from the HTML
* @throws GadgetException
*/
- protected abstract Document parseDomImpl(String source) throws GadgetException;
+ protected abstract Document parseDomImpl(String source)
+ throws GadgetException;
/**
* @param source a snippet of HTML markup
@@ -172,39 +313,6 @@
*/
protected abstract DocumentFragment parseFragmentImpl(String source)
throws GadgetException;
-
- /**
- * Normalize head and body tags in the passed fragment before including it
- * in the document
- * @param document
- * @param fragment
- */
- protected void normalizeFragment(Document document, DocumentFragment fragment) {
- Node htmlNode = DomUtil.getFirstNamedChildNode(fragment, "HTML");
- if (htmlNode != null) {
- document.appendChild(htmlNode);
- } else {
- Node bodyNode = DomUtil.getFirstNamedChildNode(fragment, "body");
- Node headNode = DomUtil.getFirstNamedChildNode(fragment, "head");
- if (bodyNode != null || headNode != null) {
- // We have either a head or body so put fragment into HTML tag
- Node root = document.appendChild(document.createElement("html"));
- if (headNode != null && bodyNode == null) {
- fragment.removeChild(headNode);
- root.appendChild(headNode);
- Node body = root.appendChild(document.createElement("body"));
- body.appendChild(fragment);
- } else {
- root.appendChild(fragment);
- }
- } else {
- // No head or body so put fragment into a body
- Node root = document.appendChild(document.createElement("html"));
- Node body = root.appendChild(document.createElement("body"));
- body.appendChild(fragment);
- }
- }
- }
private static class DefaultSerializerProvider implements Provider<HtmlSerializer> {
public HtmlSerializer get() {
Added: incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/CajaHtmlParser.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/CajaHtmlParser.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/CajaHtmlParser.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/caja/CajaHtmlParser.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,136 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations under the License.
+ */
+package org.apache.shindig.gadgets.parse.caja;
+
+import java.util.LinkedList;
+
+import com.google.caja.lexer.CharProducer;
+import com.google.caja.lexer.HtmlLexer;
+import com.google.caja.lexer.InputSource;
+import com.google.caja.lexer.ParseException;
+import com.google.caja.parser.html.DomParser;
+import com.google.caja.parser.html.Namespaces;
+import com.google.caja.reporting.Message;
+import com.google.caja.reporting.MessageLevel;
+import com.google.caja.reporting.MessageQueue;
+import com.google.caja.reporting.SimpleMessageQueue;
+import com.google.common.collect.Lists;
+import com.google.inject.Inject;
+
+import org.apache.shindig.gadgets.GadgetException;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
+import org.w3c.dom.DOMImplementation;
+import org.w3c.dom.Document;
+import org.w3c.dom.DocumentFragment;
+import org.w3c.dom.Node;
+
+public class CajaHtmlParser extends GadgetHtmlParser {
+ private final DOMImplementation documentFactory;
+
+ @Inject
+ public CajaHtmlParser(DOMImplementation documentFactory) {
+ this.documentFactory = documentFactory;
+ }
+
+ @Override
+ protected Document parseDomImpl(String source) throws GadgetException {
+ DocumentFragment fragment = parseFragmentImpl(source);
+
+ // TODO: remove parseDomImpl() altogether; only have subclasses
+ // support parseFragmentImpl() with base class cleaning up.
+ Document document = fragment.getOwnerDocument();
+ Node html = null;
+ LinkedList<Node> beforeHtml = Lists.newLinkedList();
+ while (fragment.hasChildNodes()) {
+ Node child = fragment.removeChild(fragment.getFirstChild());
+ if (child.getNodeType() == Node.ELEMENT_NODE &&
+ "html".equalsIgnoreCase(child.getNodeName())) {
+ if (html == null) {
+ html = child;
+ } else {
+ // Ignore the current (duplicated) html node but add its children
+ transferChildren(html, child);
+ }
+ } else if (html != null) {
+ html.appendChild(child);
+ } else {
+ beforeHtml.add(child);
+ }
+ }
+
+ if (html == null) {
+ html = document.createElement("html");
+ }
+
+ prependToNode(html, beforeHtml);
+
+ // Ensure document.getDocumentElement() is html node.
+ document.appendChild(html);
+
+ return document;
+ }
+
+ @Override
+ protected DocumentFragment parseFragmentImpl(String source)
+ throws GadgetException {
+ try {
+ MessageQueue mq = makeMessageQueue();
+ DomParser parser = getDomParser(source, mq);
+ DocumentFragment fragment = parser.parseFragment();
+ if (mq.hasMessageAtLevel(MessageLevel.ERROR)) {
+ StringBuilder err = new StringBuilder();
+ for (Message m : mq.getMessages()) {
+ err.append(m.toString()).append("\n");
+ }
+ throw new GadgetException(GadgetException.Code.HTML_PARSE_ERROR, err.toString());
+ }
+ return fragment;
+ } catch (ParseException e) {
+ throw new GadgetException(
+ GadgetException.Code.HTML_PARSE_ERROR, e.getCajaMessage().toString());
+ }
+ }
+
+ protected InputSource getInputSource() {
+ // Returns a default/dummy InputSource.
+ // We might consider adding the gadget URI to the GadgetHtmlParser API,
+ // but in the meantime this method is protected to allow overriding this
+ // with request-scoped retrieval of this same data.
+ return InputSource.UNKNOWN;
+ }
+
+ protected MessageQueue makeMessageQueue() {
+ return new SimpleMessageQueue();
+ }
+
+ protected boolean needsDebugData() {
+ return false;
+ }
+
+ private DomParser getDomParser(String source, final MessageQueue mq) throws ParseException {
+ InputSource is = getInputSource();
+ HtmlLexer lexer = new HtmlLexer(CharProducer.Factory.fromString(source, is));
+ final Namespaces ns = Namespaces.HTML_DEFAULT; // Includes OpenSocial
+ final boolean needsDebugData = needsDebugData();
+ DomParser parser = new DomParser(lexer, is, ns, mq);
+ parser.setDomImpl(documentFactory);
+ parser.setWantsComments(true);
+ parser.setNeedsDebugData(needsDebugData);
+ return parser;
+ }
+}
Modified: incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/nekohtml/NekoSimplifiedHtmlParser.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/nekohtml/NekoSimplifiedHtmlParser.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/nekohtml/NekoSimplifiedHtmlParser.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/parse/nekohtml/NekoSimplifiedHtmlParser.java Thu Dec 17 00:48:29 2009
@@ -107,8 +107,7 @@
}
Document document = handler.getDocument();
- DocumentFragment fragment = handler.getFragment();
- normalizeFragment(document, fragment);
+ document.appendChild(handler.getFragment().getFirstChild());
fixNekoWeirdness(document);
return document;
}
Modified: incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/CajaContentRewriter.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/CajaContentRewriter.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/CajaContentRewriter.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/main/java/org/apache/shindig/gadgets/servlet/CajaContentRewriter.java Thu Dec 17 00:48:29 2009
@@ -128,7 +128,6 @@
MessageQueue mq = new SimpleMessageQueue();
BuildInfo bi = BuildInfo.getInstance();
DefaultGadgetRewriter rw = new DefaultGadgetRewriter(bi, mq);
- rw.setValijaMode(true);
InputSource is = new InputSource(retrievedUri);
boolean safe = false;
Modified: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParserAndSerializerTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParserAndSerializerTest.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParserAndSerializerTest.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParserAndSerializerTest.java Thu Dec 17 00:48:29 2009
@@ -16,34 +16,74 @@
* specific language governing permissions and limitations
* under the License.
*/
-
package org.apache.shindig.gadgets.parse;
-import org.apache.commons.io.IOUtils;
-import org.apache.commons.lang.StringUtils;
-
-import org.junit.Assert;
-import org.w3c.dom.Document;
+import static org.junit.Assert.assertNull;
-import java.io.IOException;
+import org.junit.Before;
+import org.junit.Test;
/**
* Base test fixture for HTML parsing and serialization.
*/
-public abstract class AbstractParserAndSerializerTest extends Assert {
+public abstract class AbstractParserAndSerializerTest extends AbstractParsingTestBase {
+ protected GadgetHtmlParser parser;
+
+ protected abstract GadgetHtmlParser makeParser();
+
+ @Before
+ public void setUp() throws Exception {
+ parser = makeParser();
+ }
+
+ @Test
+ public void docWithDoctype() throws Exception {
+ // Note that doctype is properly retained
+ String content = loadFile("org/apache/shindig/gadgets/parse/test.html");
+ String expected = loadFile("org/apache/shindig/gadgets/parse/test-expected.html");
+ parseAndCompareBalanced(content, expected, parser);
+ }
- /** The vm line separator */
- private static final String EOL = System.getProperty("line.separator");
+ @Test
+ public void docNoDoctype() throws Exception {
+ // Note that no doctype is properly created when none specified
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html");
+ String expected =
+ loadFile("org/apache/shindig/gadgets/parse/test-fulldocnodoctype-expected.html");
+ assertNull(parser.parseDom(content).getDoctype());
+ parseAndCompareBalanced(content, expected, parser);
+ }
+
+ @Test
+ public void notADocument() throws Exception {
+ // Note that no doctype is injected for fragments
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-fragment.html");
+ String expected = loadFile("org/apache/shindig/gadgets/parse/test-fragment-expected.html");
+ parseAndCompareBalanced(content, expected, parser);
+ }
+
+ @Test
+ public void notADocument2() throws Exception {
+ // Note that no doctype is injected for fragments
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-fragment2.html");
+ String expected = loadFile("org/apache/shindig/gadgets/parse/test-fragment2-expected.html");
+ parseAndCompareBalanced(content, expected, parser);
+ }
- protected String loadFile(String path) throws IOException {
- return IOUtils.toString(this.getClass().getClassLoader().
- getResourceAsStream(path));
+ @Test
+ public void noBody() throws Exception {
+ // Note that no doctype is injected for fragments
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-headnobody.html");
+ String expected = loadFile("org/apache/shindig/gadgets/parse/test-headnobody-expected.html");
+ parseAndCompareBalanced(content, expected, parser);
}
- protected void parseAndCompareBalanced(String content, String expected, GadgetHtmlParser parser)
- throws Exception {
- Document document = parser.parseDom(content);
- expected = StringUtils.replace(expected, EOL, "\n");
- assertEquals(expected.trim(), HtmlSerialization.serialize(document).trim());
+ @Test
+ public void ampersand() throws Exception {
+ // Note that no doctype is injected for fragments
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-with-ampersands.html");
+ String expected =
+ loadFile("org/apache/shindig/gadgets/parse/test-with-ampersands-expected.html");
+ parseAndCompareBalanced(content, expected, parser);
}
}
Added: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractParsingTestBase.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,114 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.shindig.gadgets.parse;
+
+import static org.junit.Assert.assertEquals;
+
+import name.fraser.neil.plaintext.diff_match_patch;
+import name.fraser.neil.plaintext.diff_match_patch.Diff;
+import name.fraser.neil.plaintext.diff_match_patch.Operation;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.commons.lang.StringEscapeUtils;
+import org.apache.commons.lang.StringUtils;
+import org.w3c.dom.Document;
+
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.LinkedList;
+
+/**
+ * Simple base class providing test helpers for parsing/serializing tests.
+ */
+public abstract class AbstractParsingTestBase {
+ /** The JVM line separator. */
+ private static final String EOL = System.getProperty("line.separator");
+
+ protected String loadFile(String path) throws IOException {
+ InputStream is = this.getClass().getClassLoader().getResourceAsStream(path);
+ // Enable the block below if your IDE has trouble loading resources from the classpath.
+ /*
+ if (is == null) {
+ is = new FileInputStream(new File("/shindig/base/java/gadgets/src/test/resources/" + path));
+ }
+ */
+ return IOUtils.toString(is);
+ }
+
+ protected void parseAndCompareBalanced(String content, String expected, GadgetHtmlParser parser)
+ throws Exception {
+ Document document = parser.parseDom(content);
+ expected = StringUtils.replace(expected, EOL, "\n");
+ String serialized = HtmlSerialization.serialize(document);
+ assertHtmlEquals(expected, serialized);
+ }
+
+ private void assertHtmlEquals(String expected, String serialized) {
+ // Compute the diff of expected vs. serialized, and disregard constructs that we don't
+ // care about, such as whitespace deltas and differently-computed escape sequences.
+ diff_match_patch dmp = new diff_match_patch();
+ LinkedList<Diff> diffs = dmp.diff_main(expected, serialized);
+ while (diffs.size() > 0) {
+ Diff cur = diffs.removeFirst();
+ switch (cur.operation) {
+ case DELETE:
+ if (StringUtils.isBlank(cur.text) || "amp;".equalsIgnoreCase(cur.text)) {
+ continue;
+ }
+ if (diffs.size() == 0) {
+ // End of the set: assert known failure.
+ assertEquals(expected, serialized);
+ }
+ Diff next = diffs.removeFirst();
+ if (next.operation != Operation.INSERT) {
+ // Next operation isn't a paired insert: assert known failure.
+ assertEquals(expected, serialized);
+ }
+ if (!equivalentEntities(cur.text, next.text) &&
+ !cur.text.equalsIgnoreCase(next.text)) {
+ // Delete/insert pair: fail unless each's text is equivalent
+ // either in terms of case or entity equivalence.
+ assertEquals(expected, serialized);
+ }
+ break;
+ case INSERT:
+ // Assert known failure unless insert is whitespace/blank.
+ if (StringUtils.isBlank(cur.text) || "amp;".equalsIgnoreCase(cur.text)) {
+ continue;
+ }
+ assertEquals(expected, serialized);
+ break;
+ default:
+ // EQUAL: both sides match here, move on.
+ break;
+ }
+ }
+ }
+
+ private boolean equivalentEntities(String prev, String cur) {
+ if (!prev.endsWith(";") && !cur.endsWith(";")) {
+ return false;
+ }
+ String prevEnt = StringEscapeUtils.unescapeHtml(prev);
+ String curEnt = StringEscapeUtils.unescapeHtml(cur);
+ return prevEnt.equals(curEnt);
+ }
+}
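Note on equivalentEntities(...) above: it treats two diff fragments as interchangeable when they unescape to the same text, so that "&amp;" in the expected file matches "&#38;" in the serialized output. The sketch below illustrates that idea with the JDK alone. It is not part of this commit. The class name EntityEquivalenceSketch and its four-entry entity table are assumptions made for illustration, and it only handles strings that consist of one complete entity, whereas the real helper delegates to StringEscapeUtils.unescapeHtml.

```java
import java.util.Map;

/** Hypothetical JDK-only stand-in for the entity check; not part of Shindig. */
public class EntityEquivalenceSketch {
  // Deliberately tiny table of named entities (an assumption); real decoders know many more.
  private static final Map<String, String> NAMED =
      Map.of("amp", "&", "lt", "<", "gt", ">", "quot", "\"");

  /** Decodes one whole entity such as "&amp;" or "&#38;"; any other text is returned unchanged. */
  public static String decode(String text) {
    if (text.length() < 3 || !text.startsWith("&") || !text.endsWith(";")) {
      return text;
    }
    String name = text.substring(1, text.length() - 1);
    try {
      if (name.startsWith("#x") || name.startsWith("#X")) {
        // Hexadecimal character reference, e.g. &#x26;
        return new String(Character.toChars(Integer.parseInt(name.substring(2), 16)));
      }
      if (name.startsWith("#")) {
        // Decimal character reference, e.g. &#38;
        return new String(Character.toChars(Integer.parseInt(name.substring(1), 10)));
      }
    } catch (IllegalArgumentException e) {
      // Malformed or out-of-range numeric reference: leave the text as-is.
      return text;
    }
    // Named entity: fall back to the original text when the name is unknown.
    return NAMED.getOrDefault(name, text);
  }

  /** True when both strings decode to the same text. */
  public static boolean equivalent(String a, String b) {
    return decode(a).equals(decode(b));
  }

  public static void main(String[] args) {
    System.out.println(equivalent("&amp;", "&#38;"));
    System.out.println(equivalent("&lt;", "&gt;"));
  }
}
```

Comparing decoded text rather than raw strings is what lets the test ignore which escape form a serializer happens to choose.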
Added: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractSocialMarkupHtmlParserTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractSocialMarkupHtmlParserTest.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractSocialMarkupHtmlParserTest.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/AbstractSocialMarkupHtmlParserTest.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,198 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.shindig.gadgets.parse;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import com.google.common.collect.ImmutableSet;
+import com.google.common.collect.Lists;
+
+import org.apache.commons.lang.StringUtils;
+import org.apache.shindig.common.xml.DomUtil;
+import org.apache.shindig.gadgets.GadgetException;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
+import org.apache.shindig.gadgets.parse.HtmlSerialization;
+import org.apache.shindig.gadgets.spec.PipelinedData;
+
+import org.junit.Before;
+import org.junit.Test;
+import org.w3c.dom.Attr;
+import org.w3c.dom.DOMException;
+import org.w3c.dom.Document;
+import org.w3c.dom.Element;
+import org.w3c.dom.Node;
+import org.w3c.dom.NodeList;
+
+import java.util.Iterator;
+import java.util.List;
+
+/**
+ * Test for the social markup parser.
+ */
+public abstract class AbstractSocialMarkupHtmlParserTest extends AbstractParsingTestBase {
+ private GadgetHtmlParser parser;
+ private Document document;
+
+ protected abstract GadgetHtmlParser makeParser();
+
+ @Before
+ public void setUp() throws Exception {
+ parser = makeParser();
+
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-socialmarkup.html");
+ document = parser.parseDom(content);
+ }
+
+ @Test
+ public void testSocialData() {
+ // Verify elements are preserved in social data
+ List<Element> scripts = getTags(GadgetHtmlParser.OSML_DATA_TAG);
+ assertEquals(1, scripts.size());
+
+ NodeList viewerRequests = scripts.get(0).getElementsByTagNameNS(
+ PipelinedData.OPENSOCIAL_NAMESPACE, "ViewerRequest");
+ assertEquals(1, viewerRequests.getLength());
+ Element viewerRequest = (Element) viewerRequests.item(0);
+ assertEquals("viewer", viewerRequest.getAttribute("key"));
+ assertEmpty(viewerRequest);
+ }
+
+ @Test
+ public void testSocialTemplate() {
+ // Verify elements and text content are preserved in social templates
+ List<Element> scripts = getTags(GadgetHtmlParser.OSML_TEMPLATE_TAG);
+ assertEquals(1, scripts.size());
+
+ assertEquals("template-id", scripts.get(0).getAttribute("id"));
+ assertEquals("template-name", scripts.get(0).getAttribute("name"));
+ assertEquals("template-tag", scripts.get(0).getAttribute("tag"));
+
+ NodeList boldElements = scripts.get(0).getElementsByTagName("b");
+ assertEquals(1, boldElements.getLength());
+ Element boldElement = (Element) boldElements.item(0);
+ assertEquals("Some ${viewer} content", boldElement.getTextContent());
+
+ NodeList osHtmlElements = scripts.get(0).getElementsByTagNameNS(
+ "http://ns.opensocial.org/2008/markup", "Html");
+ assertEquals(1, osHtmlElements.getLength());
+ }
+
+ @Test
+ public void testSocialTemplateSerialization() {
+ String content = HtmlSerialization.serialize(document);
+ assertTrue("Empty elements not preserved as XML inside template",
+ content.contains("<img/>"));
+ }
+
+ @Test
+ public void testJavascript() {
+ // Verify text content is unmodified in javascript blocks
+ List<Element> scripts = getTags("script");
+
+ // Remove any OpenSocial-specific nodes.
+ Iterator<Element> scriptIt = scripts.iterator();
+ while (scriptIt.hasNext()) {
+ if (isOpenSocialScript(scriptIt.next())) {
+ scriptIt.remove();
+ }
+ }
+
+ assertEquals(1, scripts.size());
+
+ NodeList boldElements = scripts.get(0).getElementsByTagName("b");
+ assertEquals(0, boldElements.getLength());
+
+ String scriptContent = scripts.get(0).getTextContent().trim();
+ assertEquals("<b>Some ${viewer} content</b>", scriptContent);
+ }
+
+ @Test
+ public void testPlainContent() {
+ // Verify text content is preserved in non-script content
+ NodeList spanElements = document.getElementsByTagName("span");
+ assertEquals(1, spanElements.getLength());
+ assertEquals("Some content", spanElements.item(0).getTextContent());
+ }
+
+ @Test
+ public void testCommentOrdering() {
+ NodeList divElements = document.getElementsByTagName("div");
+ assertEquals(1, divElements.getLength());
+ NodeList children = divElements.item(0).getChildNodes();
+ assertEquals(3, children.getLength());
+
+ // Should be comment/text/comment, not comment/comment/text
+ assertEquals(Node.COMMENT_NODE, children.item(0).getNodeType());
+ assertEquals(Node.TEXT_NODE, children.item(1).getNodeType());
+ assertEquals(Node.COMMENT_NODE, children.item(2).getNodeType());
+ }
+
+ @Test
+ public void testInvalid() throws Exception {
+ String content =
+ "<html><div id=\"div_super\" class=\"div_super\" valign:\"middle\"></div></html>";
+ try {
+ parser.parseDom(content);
+ fail("No exception caught on invalid character");
+ } catch (DOMException e) {
+ assertTrue(e.getMessage().contains("INVALID_CHARACTER_ERR"));
+ assertTrue(e.getMessage().contains(
+ "Around ...<div id=\"div_super\" class=\"div_super\"..."));
+ } catch (GadgetException e) {
+ assertEquals(GadgetException.Code.HTML_PARSE_ERROR, e.getCode());
+ }
+ }
+
+ private void assertEmpty(Node n) {
+ if (n.getChildNodes().getLength() != 0) {
+ assertTrue(StringUtils.isEmpty(n.getTextContent()) ||
+ StringUtils.isWhitespace(n.getTextContent()));
+ }
+ }
+
+ private List<Element> getTags(String tagName) {
+ NodeList list = document.getElementsByTagName(tagName);
+ List<Element> elements = Lists.newArrayListWithExpectedSize(list.getLength());
+ for (int i = 0; i < list.getLength(); i++) {
+ elements.add((Element) list.item(i));
+ }
+
+ // Add equivalent <script> elements
+ String scriptType = GadgetHtmlParser.SCRIPT_TYPE_TO_OSML_TAG.inverse().get(tagName);
+ if (scriptType != null) {
+ List<Element> scripts =
+ DomUtil.getElementsByTagNameCaseInsensitive(document, ImmutableSet.of("script"));
+ for (Element script : scripts) {
+ Attr typeAttr = (Attr)script.getAttributes().getNamedItem("type");
+ if (typeAttr != null && scriptType.equalsIgnoreCase(typeAttr.getValue())) {
+ elements.add((Element)script);
+ }
+ }
+ }
+ return elements;
+ }
+
+ private boolean isOpenSocialScript(Element script) {
+ Attr typeAttr = (Attr)script.getAttributes().getNamedItem("type");
+ return (typeAttr != null && typeAttr.getValue() != null &&
+ GadgetHtmlParser.SCRIPT_TYPE_TO_OSML_TAG.containsKey(typeAttr.getValue()));
+ }
+}
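Note on testCommentOrdering above: it expects the children of the <div> to come back as comment/text/comment, in source order. The sketch below shows the same node-type check in isolation. It is not part of this commit. It assumes well-formed input and uses the JDK's built-in XML parser in place of a GadgetHtmlParser implementation, and the class name CommentOrderingSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical illustration of the comment-ordering check; not part of Shindig. */
public class CommentOrderingSketch {
  /** Returns the DOM node types of the root element's children, in document order. */
  public static short[] childTypes(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      NodeList children = doc.getDocumentElement().getChildNodes();
      short[] types = new short[children.getLength()];
      for (int i = 0; i < types.length; i++) {
        types[i] = children.item(i).getNodeType();
      }
      return types;
    } catch (Exception e) {
      // Wrap checked parser exceptions so callers need no throws clause.
      throw new IllegalArgumentException("Could not parse input", e);
    }
  }

  public static void main(String[] args) {
    short[] types = childTypes("<div><!-- first -->Some text<!-- second --></div>");
    boolean ordered = types.length == 3
        && types[0] == Node.COMMENT_NODE
        && types[1] == Node.TEXT_NODE
        && types[2] == Node.COMMENT_NODE;
    System.out.println("comment/text/comment order preserved: " + ordered);
  }
}
```

A parser that buffered comments and emitted them together would yield comment/comment/text instead, which is the regression the test guards against.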
Modified: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializerTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializerTest.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializerTest.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/CompactHtmlSerializerTest.java Thu Dec 17 00:48:29 2009
@@ -18,9 +18,11 @@
*/
package org.apache.shindig.gadgets.parse;
-import org.apache.shindig.gadgets.parse.nekohtml.NekoSimplifiedHtmlParser;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
import com.google.inject.Provider;
+
import org.junit.Before;
import org.junit.Test;
@@ -30,10 +32,11 @@
/**
* Test cases for CompactHtmlSerializer.
*/
-public class CompactHtmlSerializerTest extends AbstractParserAndSerializerTest {
+public abstract class CompactHtmlSerializerTest extends AbstractParsingTestBase {
+
+ protected abstract GadgetHtmlParser makeParser();
- private GadgetHtmlParser full = new NekoSimplifiedHtmlParser(
- new ParseModule.DOMImplementationProvider().get());
+ private GadgetHtmlParser full = makeParser();
@Before
public void setUp() throws Exception {
@@ -45,24 +48,24 @@
}
@Test
- public void testWhitespaceNotCollapsedInSpecialTags() throws Exception {
+ public void whitespaceNotCollapsedInSpecialTags() throws Exception {
String content = loadFile(
- "org/apache/shindig/gadgets/parse/nekohtml/test-with-specialtags.html");
+ "org/apache/shindig/gadgets/parse/test-with-specialtags.html");
String expected = loadFile(
- "org/apache/shindig/gadgets/parse/nekohtml/test-with-specialtags-expected.html");
+ "org/apache/shindig/gadgets/parse/test-with-specialtags-expected.html");
parseAndCompareBalanced(content, expected, full);
}
-
+
@Test
- public void testIeConditionalCommentNotRemoved() throws Exception {
- String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-with-iecond-comments.html");
+ public void ieConditionalCommentNotRemoved() throws Exception {
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-with-iecond-comments.html");
String expected = loadFile(
- "org/apache/shindig/gadgets/parse/nekohtml/test-with-iecond-comments-expected.html");
+ "org/apache/shindig/gadgets/parse/test-with-iecond-comments-expected.html");
parseAndCompareBalanced(content, expected, full);
}
@Test
- public void testSpecialTagsAreRecognized() {
+ public void specialTagsAreRecognized() {
assertSpecialTag("textArea");
assertSpecialTag("scrIpt");
assertSpecialTag("Style");
@@ -78,7 +81,6 @@
CompactHtmlSerializer.isSpecialTag(tagName.toLowerCase()));
}
- @Test
public void testCollapseHtmlWhitespace() throws IOException {
assertCollapsed("abc", "abc");
assertCollapsed("abc ", "abc");
@@ -96,4 +98,4 @@
CompactHtmlSerializer.collapseWhitespace(input, output);
assertEquals(expected, output.toString());
}
-}
\ No newline at end of file
+}
Added: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaCompactHtmlSerializerTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaCompactHtmlSerializerTest.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaCompactHtmlSerializerTest.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaCompactHtmlSerializerTest.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.shindig.gadgets.parse.caja;
+
+import org.apache.shindig.gadgets.parse.CompactHtmlSerializerTest;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
+import org.apache.shindig.gadgets.parse.ParseModule;
+
+public class CajaCompactHtmlSerializerTest extends CompactHtmlSerializerTest {
+
+ @Override
+ protected GadgetHtmlParser makeParser() {
+ return new CajaHtmlParser(new ParseModule.DOMImplementationProvider().get());
+ }
+
+}
Added: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaParserAndSerializerTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaParserAndSerializerTest.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaParserAndSerializerTest.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaParserAndSerializerTest.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.shindig.gadgets.parse.caja;
+
+import org.apache.shindig.gadgets.parse.AbstractParserAndSerializerTest;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
+import org.apache.shindig.gadgets.parse.ParseModule;
+
+public class CajaParserAndSerializerTest extends AbstractParserAndSerializerTest {
+
+ @Override
+ protected GadgetHtmlParser makeParser() {
+ return new CajaHtmlParser(new ParseModule.DOMImplementationProvider().get());
+ }
+
+}
Added: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaSocialMarkupHtmlParserTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaSocialMarkupHtmlParserTest.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaSocialMarkupHtmlParserTest.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/caja/CajaSocialMarkupHtmlParserTest.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.shindig.gadgets.parse.caja;
+
+import org.apache.shindig.gadgets.parse.AbstractSocialMarkupHtmlParserTest;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
+import org.apache.shindig.gadgets.parse.ParseModule;
+
+public class CajaSocialMarkupHtmlParserTest extends AbstractSocialMarkupHtmlParserTest {
+
+ @Override
+ protected GadgetHtmlParser makeParser() {
+ return new CajaHtmlParser(new ParseModule.DOMImplementationProvider().get());
+ }
+
+}
Added: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoCompactHtmlSerializerTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoCompactHtmlSerializerTest.java?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoCompactHtmlSerializerTest.java (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoCompactHtmlSerializerTest.java Thu Dec 17 00:48:29 2009
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.shindig.gadgets.parse.nekohtml;
+
+import org.apache.shindig.gadgets.parse.CompactHtmlSerializerTest;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
+import org.apache.shindig.gadgets.parse.ParseModule;
+
+/**
+ * Compact HTML serializer test using the Neko parser implementation.
+ */
+public class NekoCompactHtmlSerializerTest extends CompactHtmlSerializerTest {
+
+ @Override
+ protected GadgetHtmlParser makeParser() {
+ return new NekoSimplifiedHtmlParser(
+ new ParseModule.DOMImplementationProvider().get());
+ }
+
+}
Modified: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoParserAndSerializeTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoParserAndSerializeTest.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoParserAndSerializeTest.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/NekoParserAndSerializeTest.java Thu Dec 17 00:48:29 2009
@@ -17,7 +17,10 @@
*/
package org.apache.shindig.gadgets.parse.nekohtml;
+import static org.junit.Assert.assertNull;
+
import org.apache.shindig.gadgets.parse.AbstractParserAndSerializerTest;
+import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
import org.apache.shindig.gadgets.parse.ParseModule;
import org.junit.Test;
@@ -25,61 +28,51 @@
* Test behavior of neko based parser and serializers
*/
public class NekoParserAndSerializeTest extends AbstractParserAndSerializerTest {
-
- private NekoSimplifiedHtmlParser simple = new NekoSimplifiedHtmlParser(
+ @Override
+ protected GadgetHtmlParser makeParser() {
+ return new NekoSimplifiedHtmlParser(
new ParseModule.DOMImplementationProvider().get());
-
- @Test
- public void testDocWithDoctype() throws Exception {
- // Note that doctype is properly retained
- String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test.html");
- String expected = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-expected.html");
- parseAndCompareBalanced(content, expected, simple);
}
-
+
+ // Neko-specific tests.
@Test
- public void testDocNoDoctype() throws Exception {
- // Note that no doctype is properly created when none specified
- String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype.html");
- assertNull(simple.parseDom(content).getDoctype());
- }
+ public void scriptPushedToBody() throws Exception {
+ String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript.html");
+ String expected =
+ loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html");
+ parseAndCompareBalanced(content, expected, parser);
+ }
+ // Neko overridden tests (due to Neko quirks)
+ @Override
@Test
- public void testNotADocument() throws Exception {
+ public void notADocument() throws Exception {
// Note that no doctype is injected for fragments
String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-fragment.html");
String expected = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-fragment-expected.html");
- parseAndCompareBalanced(content, expected, simple);
+ parseAndCompareBalanced(content, expected, parser);
}
-
+
+ @Override
@Test
- public void testNotADocument2() throws Exception {
- // Note that no doctype is injected for fragments
- String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-fragment2.html");
- String expected = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-fragment2-expected.html");
- parseAndCompareBalanced(content, expected, simple);
- }
-
- @Test
- public void testNoBody() throws Exception {
+ public void noBody() throws Exception {
// Note that no doctype is injected for fragments
String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-headnobody.html");
String expected = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-headnobody-expected.html");
- parseAndCompareBalanced(content, expected, simple);
+ parseAndCompareBalanced(content, expected, parser);
}
+ // Overridden because of comment vs. script ordering. Neko stuffs script into head, but
+ // postprocessing moves it back down into body, *above* the comment element. This is
+ // semantically meaningless (to HTML), so we create a new test to accommodate it.
+ @Override
@Test
- public void testAmpersand() throws Exception {
- // Note that no doctype is injected for fragments
- String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-with-ampersands.html");
- String expected = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-with-ampersands-expected.html");
- parseAndCompareBalanced(content, expected, simple);
- }
-
- @Test
- public void testScriptPushedToBody() throws Exception {
- String content = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript.html");
- String expected = loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html");
- parseAndCompareBalanced(content, expected, simple);
+ public void docNoDoctype() throws Exception {
+ // Note that no doctype is created when none is specified
+ String content = loadFile("org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html");
+ String expected =
+ loadFile("org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype-expected.html");
+ assertNull(parser.parseDom(content).getDoctype());
+ parseAndCompareBalanced(content, expected, parser);
}
}
Modified: incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/SocialMarkupHtmlParserTest.java
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/SocialMarkupHtmlParserTest.java?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/SocialMarkupHtmlParserTest.java (original)
+++ incubator/shindig/trunk/java/gadgets/src/test/java/org/apache/shindig/gadgets/parse/nekohtml/SocialMarkupHtmlParserTest.java Thu Dec 17 00:48:29 2009
@@ -18,141 +18,16 @@
*/
package org.apache.shindig.gadgets.parse.nekohtml;
-import org.apache.commons.io.IOUtils;
-import org.apache.commons.lang.StringUtils;
+import org.apache.shindig.gadgets.parse.AbstractSocialMarkupHtmlParserTest;
import org.apache.shindig.gadgets.parse.GadgetHtmlParser;
-import org.apache.shindig.gadgets.parse.HtmlSerialization;
import org.apache.shindig.gadgets.parse.ParseModule;
-import org.apache.shindig.gadgets.spec.PipelinedData;
-
-import com.google.common.collect.Lists;
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import org.junit.Before;
-import org.junit.Test;
-import org.w3c.dom.DOMException;
-import org.w3c.dom.Document;
-import org.w3c.dom.Element;
-import org.w3c.dom.Node;
-import org.w3c.dom.NodeList;
-
-import java.util.List;
/**
* Test for the social markup parser.
*/
-public class SocialMarkupHtmlParserTest {
- private GadgetHtmlParser parser;
- private Document document;
-
- @Before
- public void setUp() throws Exception {
- parser = new NekoSimplifiedHtmlParser(new ParseModule.DOMImplementationProvider().get());
-
- String content = IOUtils.toString(this.getClass().getClassLoader().
- getResourceAsStream("org/apache/shindig/gadgets/parse/nekohtml/test-socialmarkup.html"));
- document = parser.parseDom(content);
- }
-
- @Test
- public void testSocialData() {
- // Verify elements are preserved in social data
- List<Element> scripts = getTags(GadgetHtmlParser.OSML_DATA_TAG);
- assertEquals(1, scripts.size());
-
- NodeList viewerRequests = scripts.get(0).getElementsByTagNameNS(
- PipelinedData.OPENSOCIAL_NAMESPACE, "ViewerRequest");
- assertEquals(1, viewerRequests.getLength());
- Element viewerRequest = (Element) viewerRequests.item(0);
- assertEquals("viewer", viewerRequest.getAttribute("key"));
- assertEmpty(viewerRequest);
- }
-
- @Test
- public void testSocialTemplate() {
- // Verify elements and text content are preserved in social templates
- List<Element> scripts = getTags(GadgetHtmlParser.OSML_TEMPLATE_TAG);
- assertEquals(1, scripts.size());
-
- NodeList boldElements = scripts.get(0).getElementsByTagName("b");
- assertEquals(1, boldElements.getLength());
- Element boldElement = (Element) boldElements.item(0);
- assertEquals("Some ${viewer} content", boldElement.getTextContent());
-
- NodeList osHtmlElements = scripts.get(0).getElementsByTagNameNS(
- "http://ns.opensocial.org/2008/markup", "Html");
- assertEquals(1, osHtmlElements.getLength());
- }
-
- @Test
- public void testSocialTemplateSerialization() {
- String content = HtmlSerialization.serialize(document);
- assertTrue("Empty elements not preserved as XML inside template",
- content.contains("<img/>"));
- }
-
- @Test
- public void testJavascript() {
- // Verify text content is unmodified in javascript blocks
- List<Element> scripts = getTags("script");
- assertEquals(1, scripts.size());
-
- NodeList boldElements = scripts.get(0).getElementsByTagName("b");
- assertEquals(0, boldElements.getLength());
-
- String scriptContent = scripts.get(0).getTextContent().trim();
- assertEquals("<b>Some ${viewer} content</b>", scriptContent);
- }
-
- @Test
- public void testPlainContent() {
- // Verify text content is preserved in non-script content
- NodeList spanElements = document.getElementsByTagName("span");
- assertEquals(1, spanElements.getLength());
- assertEquals("Some content", spanElements.item(0).getTextContent());
- }
-
- @Test
- public void testCommentOrdering() {
- NodeList divElements = document.getElementsByTagName("div");
- assertEquals(1, divElements.getLength());
- NodeList children = divElements.item(0).getChildNodes();
- assertEquals(3, children.getLength());
-
- // Should be comment/text/comment, not comment/comment/text
- assertEquals(Node.COMMENT_NODE, children.item(0).getNodeType());
- assertEquals(Node.TEXT_NODE, children.item(1).getNodeType());
- assertEquals(Node.COMMENT_NODE, children.item(2).getNodeType());
- }
-
- @Test
- public void testInvalid() throws Exception {
- String content = "<html><div id=\"div_super\" class=\"div_super\" valign:\"middle\"></div></html>";
- try {
- parser.parseDom(content);
- fail("No exception caught");
- } catch (DOMException e) {
- assertTrue(e.getMessage().contains("INVALID_CHARACTER_ERR"));
- assertTrue(e.getMessage().contains(
- "Around ...<div id=\"div_super\" class=\"div_super\"..."));
- }
- }
-
- private void assertEmpty(Node n) {
- if (n.getChildNodes().getLength() != 0) {
- assertTrue(StringUtils.isEmpty(n.getTextContent()) ||
- StringUtils.isWhitespace(n.getTextContent()));
- }
- }
-
- private List<Element> getTags(String tagName) {
- NodeList list = document.getElementsByTagName(tagName);
- List<Element> elements = Lists.newArrayListWithExpectedSize(list.getLength());
- for (int i = 0; i < list.getLength(); i++) {
- elements.add((Element) list.item(i));
- }
- return elements;
+public class SocialMarkupHtmlParserTest extends AbstractSocialMarkupHtmlParserTest {
+ @Override
+ protected GadgetHtmlParser makeParser() {
+ return new NekoSimplifiedHtmlParser(new ParseModule.DOMImplementationProvider().get());
}
}
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-fulldocnodoctype-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,8 @@
+<html>
+ <head><style>CSS</style></head>
+ <body>
+ <script>function foo(){}</script>
+ <!-- This is a full doc with no doctype -->
+ <div id="mydiv">DIV</div>
+ </body>
+</html>
\ No newline at end of file
Modified: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html?rev=891496&r1=891495&r2=891496&view=diff
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html (original)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/nekohtml/test-leadingscript-expected.html Thu Dec 17 00:48:29 2009
@@ -3,5 +3,9 @@
<link rel="linkrel">
-</head><body><script>foo1();</script><script>foo2();</script><script>foo3();</script><div id="mydiv">mycontent</div>
+</head><body>
+<script>foo1();</script>
+<script>foo2();</script>
+<script>foo3();</script>
+<div id="mydiv">mycontent</div>
</body></html>
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,23 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html><head id="head">
+ <link href="http://www.example.org/css.css" rel="stylesheet" type="text/css">
+ <title>An example</title>
+</head><body>
+ <!-- Some comment -->
+ <script type="text/javascript">document.write("&&&")</script>
+ <script src="http://www.example.org/1.js" type="text/javascript"></script>
+ <div>
+ <table><TBODY><tr><td>a cell</td></tr></TBODY></table>
+ </div>
+ <p>Lorem ipsum</p>
+ <a href="/test.html" title="">link</a>
+ <form action="/test/submit">
+ <div>
+ <input type="hidden" value="something">
+ <input type="text">
+ </div>
+ <div><-- An unbalanced tag we dont care about -->
+ <p>Some entities &#x27;"</p>
+ <p>Not a real entity &fake;</p>
+ </div></form>
+</body></html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,2 @@
+<html><head>
+<style type="text/css"> A { font : bold; }</style></head><body><script>document.write("dont add to head or else")</script></body></html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,2 @@
+<script>document.write("dont add to head or else")</script>
+<style type="text/css"> A { font : bold; }</style>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,2 @@
+<html><head><style type="text/css"> A { background-color : #7f7f7f; } </style>
+</head><body><div>A div</div></body></html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fragment2.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,2 @@
+<style type="text/css"> A { background-color : #7f7f7f; } </style>
+<div>A div</div>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,7 @@
+<html>
+ <head><style>CSS</style><script>function foo(){}</script></head>
+ <body>
+ <!-- This is a full doc with no doctype -->
+ <div id="mydiv">DIV</div>
+ </body>
+</html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-fulldocnodoctype.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,7 @@
+<html>
+ <head><style>CSS</style><script>function foo(){}</script></head>
+ <body>
+ <!-- This is a full doc with no doctype -->
+ <div id="mydiv">DIV</div>
+ </body>
+</html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,5 @@
+<html><head><style type="text/css"> A { font : bold; } </style></head><body>
+ <!-- A head tag but no body tag is not good -->
+<script>document.write("dont add to head or else")</script>
+
+</body></html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-headnobody.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,5 @@
+<head>
+ <!-- A head tag but no body tag is not good -->
+</head>
+<script>document.write("dont add to head or else")</script>
+<style type="text/css"> A { font : bold; } </style>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-socialmarkup.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-socialmarkup.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-socialmarkup.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-socialmarkup.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,19 @@
+<link rel="foo"></link>
+
+<script type="text/os-data" xmlns:os="http://ns.opensocial.org/2008/markup">
+ <os:ViewerRequest key="viewer"/>
+</script>
+
+<script id="template-id" name="template-name" type="text/os-template" tag="template-tag" xmlns:os="http://ns.opensocial.org/2008/markup">
+ <b>Some ${viewer} content</b>
+ <img/>
+ <os:Html/>
+</script>
+
+<script type="text/javascript">
+ <b>Some ${viewer} content</b>
+</script>
+
+<span>Some content</span>
+
+<div><!-- foo -->bar<!-- baz --></div>
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,8 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html><head id="head">
+ <link href="http://www.example.org/css.css" rel="stylesheet" type="text/css">
+ <title>An example</title>
+</head><body>
+ <!-- Some comment -->
+ <span title="&lt;">content</span>
+</body></html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-ampersands.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,11 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head id="head">
+ <link href="http://www.example.org/css.css" rel="stylesheet" type="text/css">
+ <title>An example</title>
+</head>
+<body>
+ <!-- Some comment -->
+ <span title="&lt;">content</span>
+</body>
+</html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,4 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html><head id="head"><link href="http://www.example.org/css.css" rel="stylesheet" type="text/css"><title>An example</title></head><body><!--[if IE 5]>
+ <p>Welcome to Internet Explorer 5.</p>
+ <![endif]--><!--[if IE]><p>You are using Internet Explorer.</p><![endif]--><!--[if !IE]><p>You are not using Internet Explorer.</p><![endif]--><!--[if IE 7]><p>Welcome to Internet Explorer 7!</p><![endif]--><!--[if !(IE 7)]><p>You are not using version 7.</p><![endif]--><!--[if gte IE 7]><p>You are using IE 7 or greater.</p><![endif]--><!--[if (IE 5)]><p>You are using IE 5 (any version).</p><![endif]--><!--[if (gte IE 5.5)&(lt IE 7)]><p>You are using IE 5.5 or IE 6.</p><![endif]--><!--[if lt IE 5.5]><p>Please upgrade your version of Internet Explorer.</p><![endif]--><!--[if true]>You are using an <em>uplevel</em> browser.<![endif]--><!--[if false]>You are using a <em>downlevel</em> browser.<![endif]--><!--[if true]><![if IE 7]><p>This nested comment is displayed in IE 7.</p><![endif]><![endif]--></body></html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-iecond-comments.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,30 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head id="head">
+ <link href="http://www.example.org/css.css" rel="stylesheet" type="text/css">
+ <title>An example</title>
+</head>
+<body>
+ <!--[if IE 5]>
+ <p>Welcome to Internet Explorer 5.</p>
+ <![endif]-->
+
+ <!--[if IE]><p>You are using Internet Explorer.</p><![endif]-->
+ <!--[if !IE]><p>You are not using Internet Explorer.</p><![endif]-->
+
+ <!--[if IE 7]><p>Welcome to Internet Explorer 7!</p><![endif]-->
+ <!--[if !(IE 7)]><p>You are not using version 7.</p><![endif]-->
+
+ <!--[if gte IE 7]><p>You are using IE 7 or greater.</p><![endif]-->
+ <!--[if (IE 5)]><p>You are using IE 5 (any version).</p><![endif]-->
+ <!--[if (gte IE 5.5)&(lt IE 7)]><p>You are using IE 5.5 or IE 6.</p><![endif]-->
+ <!--[if lt IE 5.5]><p>Please upgrade your version of Internet Explorer.</p><![endif]-->
+
+ <!--[if true]>You are using an <em>uplevel</em> browser.<![endif]-->
+ <!--[if false]>You are using a <em>downlevel</em> browser.<![endif]-->
+
+ <!--[if true]><![if IE 7]><p>This nested comment is displayed in IE 7.</p><![endif]><![endif]-->
+
+ <!-- this standard comment should be removed -->
+</body>
+</html>
\ No newline at end of file
Added: incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-specialtags-expected.html
URL: http://svn.apache.org/viewvc/incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-specialtags-expected.html?rev=891496&view=auto
==============================================================================
--- incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-specialtags-expected.html (added)
+++ incubator/shindig/trunk/java/gadgets/src/test/resources/org/apache/shindig/gadgets/parse/test-with-specialtags-expected.html Thu Dec 17 00:48:29 2009
@@ -0,0 +1,33 @@
+<html><head><title>An example</title><style type="text/css">
+ <!--
+ #mymap #header {
+ background: #FF9700;
+ clear: both;
+ padding: 2px 0 1px;
+ position: relative;
+ width: 640px;
+ }
+
+ -->
+</style></head><body><script type="text/javascript">document.write("&&&")</script><script src="http://www.example.org/1.js" type="text/javascript"></script><script>
+ // scripts with no old comment hack should be preserved.
+ function a1() {
+ var v1 = 0;
+ alert(" this whitespace should be preserved.");
+ }
+</script><div><table><TBODY><tr><td>a cell</td></tr></TBODY></table></div><script type="text/javascript">
+ <!--
+ // script with old comment hack should be preserved.
+ function MM_goToURL() {
+ var i, args = MM_goToURL.arguments;
+ document.MM_returnValue = false;
+ for (i = 0; i < (args.length - 1); i += 2) eval(args[i] + ".location='" + args[i + 1] + "'");
+ }
+ //-->
+</script><p>Lorem ipsum</p><a href="/test.html" title="">link</a><pre>
+ This is a preformatted block of text,
+ and whitespaces should be preserved.
+ </pre><form action="/test/submit"><div><input type="hidden" value="something"><input type="text"><textarea>
+ This is a preformatted block of text,
+ and whitespaces should be preserved too.
+ </textarea></div></form></body></html>
\ No newline at end of file