消除不受信任的HTML (来防止XSS攻击)_html/css_WEB-ITnose
问题
在做网站的时候,经常会提供用户评论的功能。有些不怀好意的用户,会搞一些脚本到评论内容中,而这些脚本可能会破坏整个页面的行为,更严重的是获取一些机要信息,此时需要清理该HTML,以避免跨站脚本cross-site scripting攻击(XSS)。
方法
使用jsoup HTML Cleaner 方法进行清除,但需要指定一个可配置的 Whitelist。
String unsafe = "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";String safe = Jsoup.clean(unsafe, Whitelist.basic());// now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>
说明
XSS又叫CSS (Cross Site Script) ,跨站脚本攻击。它指的是恶意攻击者往Web页面里插入恶意html代码,当用户浏览该页之时,嵌入其中Web里面的html代码会被执行,从而达到恶意攻击用户的特殊目的。XSS属于被动式的攻击,因为其被动且不好利用,所以许多人常忽略其危害性。所以我们经常只让用户输入纯文本的内容,但这样用户体验就比较差了。
一个更好的解决方法就是使用一个富文本编辑器WYSIWYG如CKEditor 和 TinyMCE。这些可以输出HTML并能够让用户可视化编辑。虽然他们可以在客户端进行校验,但是这样还不够安全,需要在服务器端进行校验并清除有害的HTML代码,这样才能确保输入到你网站的HTML是安全的。否则,攻击者能够绕过客户端的Javascript验证,并注入不安全的HMTL直接进入您的网站。
jsoup的whitelist清理器能够在服务器端对用户输入的HTML进行过滤,只输出一些安全的标签和属性。
jsoup提供了一系列的Whitelist基本配置,能够满足大多数要求;但如有必要,也可以进行修改,不过要小心。
这个cleaner非常好用不仅可以避免XSS攻击,还可以限制用户可以输入的标签范围。
参见
参阅XSS cheat sheet ,有一个例子可以了解为什么不能使用正则表达式,而采用安全的whitelist parser-based清理器才是正确的选择。
参阅Cleaner ,了解如何返回一个 Document 对象,而不是字符串
参阅Whitelist,了解如何创建一个自定义的whitelist
nofollow 链接属性了解

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics



The article discusses the HTML <progress> element, its purpose, styling, and differences from the <meter> element. The main focus is on using <progress> for task completion and <meter> for stati

The article discusses the HTML <datalist> element, which enhances forms by providing autocomplete suggestions, improving user experience and reducing errors.Character count: 159

HTML is suitable for beginners because it is simple and easy to learn and can quickly see results. 1) The learning curve of HTML is smooth and easy to get started. 2) Just master the basic tags to start creating web pages. 3) High flexibility and can be used in combination with CSS and JavaScript. 4) Rich learning resources and modern tools support the learning process.

The article discusses the HTML <meter> element, used for displaying scalar or fractional values within a range, and its common applications in web development. It differentiates <meter> from <progress> and ex

The article discusses the <iframe> tag's purpose in embedding external content into webpages, its common uses, security risks, and alternatives like object tags and APIs.

The article discusses the viewport meta tag, essential for responsive web design on mobile devices. It explains how proper use ensures optimal content scaling and user interaction, while misuse can lead to design and accessibility issues.

HTML defines the web structure, CSS is responsible for style and layout, and JavaScript gives dynamic interaction. The three perform their duties in web development and jointly build a colorful website.

AnexampleofastartingtaginHTMLis,whichbeginsaparagraph.StartingtagsareessentialinHTMLastheyinitiateelements,definetheirtypes,andarecrucialforstructuringwebpagesandconstructingtheDOM.
