You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/11/04 13:25:25 UTC

[GitHub] [superset] villebro commented on a diff in pull request #21895: feat: Improves SafeMarkdown HTML sanitization

villebro commented on code in PR #21895:
URL: https://github.com/apache/superset/pull/21895#discussion_r1014008269


##########
superset-frontend/packages/superset-ui-core/src/components/SafeMarkdown.tsx:
##########
@@ -16,38 +16,44 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-
-import React from 'react';
-import ReactMarkdown, { MarkdownAbstractSyntaxTree } from 'react-markdown';
-// @ts-ignore no types available
-import htmlParser from 'react-markdown/plugins/html-parser';
-
+import React, { useMemo } from 'react';
+import ReactMarkdown from 'react-markdown';
+import rehypeSanitize, { defaultSchema } from 'rehype-sanitize';
+import rehypeRaw from 'rehype-raw';
+import { merge } from 'lodash';
 import { FeatureFlag, isFeatureEnabled } from '../utils';
 
 interface SafeMarkdownProps {
   source: string;
+  htmlSanitization?: boolean;
+  htmlSchemaOverrides?: typeof defaultSchema;
 }
 
-function isSafeMarkup(node: MarkdownAbstractSyntaxTree) {
-  return node.type === 'html' && node.value
-    ? !/(href|src)="(javascript|vbscript|file):.*"/gim.test(node.value)
-    : true;
-}
+function SafeMarkdown({
+  source,
+  htmlSanitization = true,
+  htmlSchemaOverrides = {},
+}: SafeMarkdownProps) {

Review Comment:
   It'd be nice to have RTL tests that test something along the following lines:
   - the default use case where source is rendering correctly
   - sanitization is disabled -> how do things change
   - a schema override before and after
   



##########
superset-frontend/packages/superset-ui-core/src/components/SafeMarkdown.tsx:
##########
@@ -16,38 +16,44 @@
  * specific language governing permissions and limitations
  * under the License.
  */
-
-import React from 'react';
-import ReactMarkdown, { MarkdownAbstractSyntaxTree } from 'react-markdown';
-// @ts-ignore no types available
-import htmlParser from 'react-markdown/plugins/html-parser';
-
+import React, { useMemo } from 'react';
+import ReactMarkdown from 'react-markdown';
+import rehypeSanitize, { defaultSchema } from 'rehype-sanitize';
+import rehypeRaw from 'rehype-raw';
+import { merge } from 'lodash';
 import { FeatureFlag, isFeatureEnabled } from '../utils';
 
 interface SafeMarkdownProps {
   source: string;
+  htmlSanitization?: boolean;
+  htmlSchemaOverrides?: typeof defaultSchema;
 }
 
-function isSafeMarkup(node: MarkdownAbstractSyntaxTree) {
-  return node.type === 'html' && node.value
-    ? !/(href|src)="(javascript|vbscript|file):.*"/gim.test(node.value)
-    : true;
-}
+function SafeMarkdown({
+  source,
+  htmlSanitization = true,
+  htmlSchemaOverrides = {},
+}: SafeMarkdownProps) {
+  const displayHtml = isFeatureEnabled(FeatureFlag.DISPLAY_MARKDOWN_HTML);
+  const escapeHtml = isFeatureEnabled(FeatureFlag.ESCAPE_MARKDOWN_HTML);
+
+  const rehypePlugins = useMemo(() => {
+    const rehypePlugins: any = [];
+    if (displayHtml && !escapeHtml) {
+      rehypePlugins.push(rehypeRaw);
+      if (htmlSanitization) {
+        const schema = merge(defaultSchema, htmlSchemaOverrides);
+        rehypePlugins.push([rehypeSanitize, schema]);
+      }
+    }
+    return rehypePlugins;
+  }, [displayHtml, escapeHtml, htmlSanitization, htmlSchemaOverrides]);

Review Comment:
   Kind of unrelated comment, but just thinking out loud: as the feature flags here are constant (they're populated from bootstrap data at load time), `displayHtml` and `escapeHtml` are both constants, hence won't really change and are kind of redundant in the dep array. But it would be cool to make something like
   ```typescript
   const displayHtml = useFeatureFlag(FeatureFlag.DISPLAY_MARKDOWN_HTML);
   ```
   which could change dynamically, in which case they would also be more relevant in the dep array.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org