Suppose I have an object of string keys and string values, and I want to write them as CSS custom properties into some server-generated HTML. How can I do this safely?
What I mean by security is
To keep it simple, I will restrict the keys to characters in the [a-zA-Z0-9_-] class.
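A minimal sketch of that key restriction, assuming the hypothetical helper names isSafeKey and filterKeys (neither appears in my actual code). The pattern is anchored at both ends, because an unanchored test would accept any key that merely contains one allowed character.

```javascript
// Hypothetical helper: accept only keys made entirely of [a-zA-Z0-9_-].
// The ^ and $ anchors plus the + quantifier reject empty keys and keys
// containing any character outside the class.
const SAFE_KEY = /^[a-zA-Z0-9_-]+$/;

function isSafeKey(key) {
  return typeof key === "string" && SAFE_KEY.test(key);
}

// Hypothetical helper: drop every pair whose key fails the check.
function filterKeys(obj) {
  return Object.fromEntries(
    Object.entries(obj).filter(([key]) => isSafeKey(key))
  );
}
```

With this in place, a key such as "a}b" never reaches the stylesheet, so only the values remain to be dealt with.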
From reading the CSS spec and some personal testing, I think you can get most of the way there by passing each value through the following steps:
1. Make sure every {, ( or [ outside a string has a matching closing bracket. If not, discard this key-value pair.
2. Use \3C to escape all instances of <, and use \3E to escape all instances of >.
3. Use \3B to escape all instances of ;.
I came up with the above steps based on this CSS syntax specification.
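The steps above could be sketched roughly as follows. This is an illustration under assumptions, not a vetted sanitizer: sanitizeValue is a hypothetical helper name, and the sketch adds three things the steps do not mention. Each escape is written with a trailing space, because a CSS hex escape swallows following hex digits (without the space, `<a` would become `\3Ca`, which is a different code point). A value ending in a lone backslash is discarded, since it would escape the `;` that the template appends. A value containing `/*` outside a string is discarded, since an unclosed comment would swallow the rest of the stylesheet.

```javascript
const CLOSERS = new Map([["{", "}"], ["(", ")"], ["[", "]"]]);
const ESCAPES = new Map([["<", "\\3C "], [">", "\\3E "], [";", "\\3B "]]);

// Hypothetical helper: returns the sanitized value, or null to discard the pair.
function sanitizeValue(value) {
  const stack = [];   // closing brackets still expected, innermost last
  let quote = null;   // quote character of the string we are inside, if any
  let out = "";

  for (let i = 0; i < value.length; i++) {
    const c = value[i];

    if (c === "\\") {
      // A trailing backslash would escape the ";" appended by the template.
      if (i + 1 >= value.length) return null;
      const next = value[++i];
      // "\<" means a literal "<", so the hex escape is an equivalent spelling.
      out += ESCAPES.has(next) ? ESCAPES.get(next) : c + next;
      continue;
    }

    if (quote !== null) {
      // Inside a string brackets do not count; a raw newline is invalid there.
      if (c === "\n") return null;
      if (c === quote) quote = null;
    } else if (c === '"' || c === "'") {
      quote = c;
    } else if (CLOSERS.has(c)) {
      stack.push(CLOSERS.get(c));
    } else if (c === "}" || c === ")" || c === "]") {
      if (stack.pop() !== c) return null;   // stray or mismatched closer
    } else if (c === "/" && value[i + 1] === "*") {
      return null;                          // extra precaution: no comments
    }

    out += ESCAPES.has(c) ? ESCAPES.get(c) : c;
  }

  // Unterminated string or unclosed bracket: discard.
  if (quote !== null || stack.length > 0) return null;
  return out;
}

function preprocessPairs(obj) {
  const result = {};
  for (const [key, value] of Object.entries(obj)) {
    const clean = typeof value === "string" ? sanitizeValue(value) : null;
    if (clean !== null) result[key] = clean;
  }
  return result;
}
```

Under this sketch a value such as `red; } </style>` is discarded because of the stray `}`, while `a </style>` survives with both angle brackets escaped, so no raw `<` reaches the HTML.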
For context, these properties may be used by user-defined styles that we insert elsewhere, but the same object is also used as template data, so it may contain a mix of strings intended as content and strings intended as CSS variables. I feel like the algorithm above strikes a good balance: it is very simple, yet it does not risk throwing away too many key-value pairs that might be useful in CSS (even allowing for future additions to CSS). Still, I want to make sure I'm not missing something.
Here's some JS code showing what I'm trying to achieve. obj is the object in question, and preprocessPairs is a function that takes the object and preprocesses it, removing or reformatting the values as described in the steps above.
// Builds a <style> element that exposes each pair as a --theme-* custom property.
function generateThemePropertiesTag(obj) {
  obj = preprocessPairs(obj);
  return `<style>
:root {
${Object.entries(obj).map(([key, value]) => {
  return `--theme-${key}: ${value};`
}).join("\n")}
}
</style>`
}
So when given an object like this:
{ "color": "#D3A", "title": "The quick brown fox" }
I want the CSS to look like this:
:root {
  --theme-color: #D3A;
  --theme-title: The quick brown fox;
}
Although --theme-title is a pretty useless custom variable when used in CSS, it doesn't actually break the stylesheet because CSS ignores properties it doesn't understand.
We can probably get by with regular expressions and a few other algorithms, without relying on a specific language; hopefully that is what you need.
Since you have declared that the object keys are restricted to [a-zA-Z0-9_-], only the values need to be parsed somehow.
Value patterns
So we can break the values down into categories and see what we come across (the patterns may be slightly simplified for clarity):
1. '.*' (string surrounded by apostrophes; greedy)
2. ".*" (string enclosed in double quotes; greedy)
3. [+-]?\d+(\.\d+)?(%|[A-Za-z]+)? (integer or decimal, optionally a percentage or with a unit)
4. #[0-9A-Fa-f]{3,6} (color)
5. [A-Za-z0-9_-]+ (keywords, named colors, "ease-in", etc.)
6. ([\w-]+)\([^)]+\) (functions such as url(), calc(), etc.)
First filter
I can imagine you could do some filtering before trying to identify these patterns. Maybe trim the value string first. As you mentioned, < and > can be escaped at the beginning of the preprocessPairs() function, as they do not appear in any of the patterns above. If you don't want unescaped semicolons appearing anywhere, you can escape them as well.
Recognizing patterns
We can then try to identify these patterns within the values, and for each pattern we may need to run the filtering again. We expect the patterns to be separated by one or more whitespace characters.
It should be fine to support multiline strings as well, which means allowing an escaped newline inside a string.
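Here is a rough sketch of that recognition step, adapted from the patterns listed above. The names TOKEN_PATTERNS and classifyValue are hypothetical, the string patterns are made non-greedy so that two strings in one value stay separate tokens, and a lookahead enforces that every token is followed by whitespace or the end of the value. Nested parentheses inside functions are not handled here.

```javascript
// Each pattern is compiled as sticky ("y") so it only matches at the current
// position, and must be followed by whitespace or the end of the value.
const TOKEN_PATTERNS = [
  ["string", /'(?:[^'\\\n]|\\[\s\S])*'/],   // allows escaped newlines
  ["string", /"(?:[^"\\\n]|\\[\s\S])*"/],
  ["number", /[+-]?\d+(?:\.\d+)?(?:%|[A-Za-z]+)?/],
  ["color", /#[0-9A-Fa-f]{3,6}/],
  ["function", /[\w-]+\([^)]+\)/],          // no nested parentheses
  ["keyword", /[A-Za-z0-9_-]+/],
].map(([kind, re]) => [kind, new RegExp(re.source + "(?=\\s|$)", "y")]);

// Hypothetical helper: returns a list of { kind, text } tokens,
// or null if some part of the value matches none of the patterns.
function classifyValue(value) {
  const tokens = [];
  let pos = 0;
  while (pos < value.length) {
    if (/\s/.test(value[pos])) {   // skip the separating whitespace
      pos++;
      continue;
    }
    let matched = false;
    for (const [kind, re] of TOKEN_PATTERNS) {
      re.lastIndex = pos;
      const m = re.exec(value);
      if (m) {
        tokens.push({ kind, text: m[0] });
        pos = re.lastIndex;        // continue right after the token
        matched = true;
        break;
      }
    }
    if (!matched) return null;
  }
  return tokens;
}
```

A value like `1px solid red` comes back as one number and two keywords, while `red; }` returns null because `red;` is not followed by whitespace and `;` fits no pattern. Each recognized token can then get its own, stricter filtering.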
Contexts
We need to keep in mind that there are at least two contexts to filter for: HTML and CSS. When we include styles in a <style> element, the input must be safe in HTML and it must also be valid CSS. Fortunately, you are not putting the CSS into an element's style attribute, so this is slightly easier.
Filtering based on value pattern
Points 1-5 are fairly simple, and most values will already be covered by the simple filtering and trimming described earlier. With some additions (I don't know what the performance impact would be), it could even do extra checks for correct units, keywords, etc.
Compared to the other points, I think the bigger challenge is point 6. You might decide to simply disallow url() in these custom styles, which leaves you with checking the input to the remaining functions: for example, you might want to escape semicolons there, or maybe even run the inside of the function through the patterns again with a small tweak, for example for calc().
In conclusion
Overall, that is my take. With a few tweaks to these regular expressions, this should complement what you are already doing and give you as much flexibility as possible in the CSS you accept, while saving you from having to adjust your code every time a CSS feature changes.
Example
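As one possible illustration of point 6, here is a minimal sketch of a per-function check. Everything in it is an assumption made for the sake of the example: the helper name checkFunction, the block list containing only url, and the set of characters allowed inside calc(). Function names are compared in lower case because CSS function names are case-insensitive.

```javascript
// Hypothetical policy: function names that are rejected outright.
const BLOCKED_FUNCTIONS = new Set(["url"]);

// name(args), where args contain no parentheses (no nesting in this sketch).
const FUNCTION_CALL = /^([\w-]+)\(([^()]*)\)$/;

// Assumed whitelist for calc(): digits, unit letters, whitespace,
// a dot, a percent sign and the four arithmetic operators.
const CALC_ARGS = /^[\w\s.%+\-*\/]+$/;

// Returns true if the token looks like an acceptable function call.
function checkFunction(token) {
  const m = FUNCTION_CALL.exec(token.trim());
  if (!m) return false;

  const name = m[1].toLowerCase();
  const args = m[2];

  if (BLOCKED_FUNCTIONS.has(name)) return false;
  if (name === "calc") return CALC_ARGS.test(args);

  // Any other function: refuse characters that could end the declaration
  // or the surrounding <style> element.
  return !/[;{}<>]/.test(args);
}
```

With these assumptions, `calc(1px + 2px)` and `rgb(0, 128, 255)` pass, while `URL(foo)` and `calc(1px; color: red)` are rejected. Supporting nested calls would need a real parser or a recursive check rather than a single regular expression.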
Please comment, discuss, criticize, and let me know if I forgot to touch on a topic that is of particular interest to you.