Example #1: Function expression identifier leaks into an enclosing scope
var f = function g(){};
typeof g; // "function"
Remember how I mentioned that the identifier of a named function expression is not available in an enclosing scope? Well, JScript doesn't agree with the specs on this one - g in the above example resolves to a function object. This is the most widely observed discrepancy. It's dangerous in that it inadvertently pollutes an enclosing scope - a scope that might as well be a global one - with an extra identifier. Such pollution can, of course, be a source of hard-to-track bugs.
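For contrast, here is a minimal sketch of what a spec-conforming engine (i.e. a non-JScript one) is expected to do with the same construct. The variable names insideType and outsideType are only illustrative, and the check assumes no global g has been defined elsewhere:

```javascript
// Sketch: scoping of a named function expression's identifier
// in a spec-conforming engine (not JScript).
var f = function g() {
  // Inside the function body, `g` refers to the function itself.
  return typeof g;
};

var insideType = f();        // "function"
var outsideType = typeof g;  // "undefined" in conforming engines
```

In JScript, the second check would report "function" instead, which is exactly the leak described above.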
Example #2: Named function expression is treated as BOTH - function declaration AND function expression
typeof g; // "function"
var f = function g(){};
As I explained before, function declarations are parsed before any other expressions in a particular execution context. The above example demonstrates how JScript actually treats named function expressions as function declarations. You can see that it parses g before the “actual declaration” takes place.
This brings us to the next example:
Example #3: Named function expression creates TWO DISTINCT function objects!
var f = function g(){};
f === g; // false
f.expando = 'foo';
g.expando; // undefined
This is where things get interesting. Or rather - completely nuts. Here we see the dangers of having to deal with two distinct objects - augmenting one of them obviously does not modify the other. This could be quite troublesome if you decided to employ, say, a caching mechanism and store something in a property of f, then tried accessing it as a property of g, thinking that it is the same object you're working with.
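A minimal sketch of how the caching scenario can stay safe: read and write the cache only through the variable the function was assigned to, never through the function's own name. The names fib and fibonacci are illustrative and do not come from the examples above:

```javascript
// Sketch: a memoized function that touches its cache (and recurses)
// only through the outer variable `fib`. The name `fibonacci` is
// there purely so debuggers have something descriptive to show.
var fib = function fibonacci(n) {
  var cache = fib.cache;            // always go through `fib`
  if (n in cache) {
    return cache[n];
  }
  return (cache[n] = n < 2 ? n : fib(n - 1) + fib(n - 2));
};
fib.cache = {};

var tenth = fib(10); // 55
```

Because nothing ever refers to fibonacci directly, it does not matter whether an engine creates one function object or two.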
Let's look at something a bit more complex.
Example #4: Function declarations are parsed sequentially and are not affected by conditional blocks
var f = function g() {
  return 1;
};
if (false) {
  f = function g(){
    return 2;
  };
}
g(); // 2
An example like this could cause even harder-to-track bugs. What happens here is actually quite simple. First, g is parsed as a function declaration, and since declarations in JScript are independent of conditional blocks, g is declared as the function from the “dead” if branch - function g(){ return 2 }. Then all of the “regular” expressions are evaluated, and f is assigned another, newly created function object. The “dead” if branch is never entered when evaluating expressions, so f keeps referencing the first function - function g(){ return 1 }. It should be clear by now that if you're not careful enough and call g from within f, you'll end up calling a completely unrelated g function object.
You might be wondering how all this mess with different function objects compares to arguments.callee. Does callee reference f or g? Let's take a look:
var f = function g(){
  return [
    arguments.callee == f,
    arguments.callee == g
  ];
};
f(); // [true, false]
As you can see, arguments.callee references the same object as the f identifier. This is actually good news, as you will see later on.
Looking at JScript deficiencies, it becomes pretty clear what exactly we need to avoid. First, we need to be aware of the leaking identifier (so that it doesn't pollute the enclosing scope). Second, we should never reference the identifier used as a function name; the troublesome identifier is g from the previous examples. Notice how many ambiguities could have been avoided if we were to forget about g's existence. Always referencing the function via f or arguments.callee is the key here. If you use a named expression, think of that name as something that's only being used for debugging purposes. And finally, a bonus point is to always clean up the extraneous function created erroneously during NFE declaration.
I think the last point needs a bit of an explanation:
JScript memory management
Being familiar with JScript discrepancies, we can now see a potential problem with memory consumption when using these buggy constructs. Let's look at a simple example:
var f = (function(){
  if (true) {
    return function g(){};
  }
  return function g(){};
})();
We know that the function returned from within this anonymous invocation - the one that has the g identifier - is assigned to the outer f. We also know that named function expressions produce a superfluous function object, and that this object is not the same as the returned function. The memory issue here is caused by this extraneous g function being literally “trapped” in the closure of the returned function. This happens because the inner function is declared in the same scope as that pesky g one. Unless we explicitly break the reference to the g function, it will keep consuming memory.
var f = (function(){
  var f, g;
  if (true) {
    f = function g(){};
  }
  else {
    f = function g(){};
  }
  // null `g`, so that it doesn't reference extraneous function any longer
  g = null;
  return f;
})();
Note that we explicitly declare g as well, so that the g = null assignment wouldn't create a global g variable in conforming clients (i.e. non-JScript ones). By nulling the reference to g, we allow the garbage collector to wipe off the implicitly created function object that g refers to.
When taking care of the JScript NFE memory leak, I decided to run a simple series of tests to confirm that nulling g actually does free memory.
Test
The test was simple. It would create 10000 functions via named function expressions and store them in an array. I would then wait for about a minute and check how high the memory consumption was. After that I would null out the reference and repeat the procedure. Here's the test case I used:
function createFn(){
  return (function(){
    var f;
    if (true) {
      f = function F(){
        return 'standard';
      }
    }
    else if (false) {
      f = function F(){
        return 'alternative';
      }
    }
    else {
      f = function F(){
        return 'fallback';
      }
    }
    // var F = null;
    return f;
  })();
}
var arr = [ ];
for (var i=0; i<10000; i++) {
  arr[i] = createFn();
}
Results as seen in Process Explorer on Windows XP SP2 were:
IE6:
without `null`: 7.6K -> 20.3K
with `null`: 7.6K -> 18K
IE7:
without `null`: 14K -> 29.7K
with `null`: 14K -> 27K
The results somewhat confirmed my assumptions - explicitly nulling the superfluous reference did free memory, but the difference in consumption was relatively insignificant. For 10000 function objects, there would be a ~3MB difference. This is definitely something that should be kept in mind when designing large-scale applications, applications that will run for a long time, or applications that run on devices with limited memory (such as mobile devices). For any small script, the difference probably doesn't matter.
You might think that it's all finally over, but we are not quite there yet :) There's a tiny little detail that I'd like to mention, and that detail is Safari 2.x.
An NFE bug that is even less widely known is present in older versions of Safari, namely the Safari 2.x series. I've seen some claims on the web that Safari 2.x does not support NFE at all. This is not true. Safari does support it, but has bugs in its implementation, which you will see shortly.
When encountering a function expression in a certain context, Safari 2.x fails to parse the program entirely. It doesn't throw any errors (such as SyntaxError ones). It simply bails out:
(function f(){})(); // <== NFE
alert(1); // this line is never reached, since previous expression fails the entire program
After fiddling with various test cases, I came to the conclusion that Safari 2.x fails to parse named function expressions if those are not part of assignment expressions. Some examples of assignment expressions are:
// part of variable declaration
var f = 1;
// part of simple assignment
f = 2, g = 3;
// part of return statement
(function(){
  return (f = 2);
})();
This means that putting named function expression into an assignment makes Safari “happy”:
(function f(){}); // fails
var f = function f(){}; // works successfully
(function(){
  return function f(){}; // fails
})();
(function(){
  return (f = function f(){}); // works successfully
})();
setTimeout(function f(){ }, 100); // fails
It also means that we can't use such a common pattern as returning a named function expression without an assignment:
// Instead of this non-Safari-2.x-compatible syntax:
(function(){
  if (featureTest) {
    return function f(){};
  }
  return function f(){};
})();
// we should use this slightly more verbose alternative:
(function(){
  var f;
  if (featureTest) {
    f = function f(){};
  }
  else {
    f = function f(){};
  }
  return f;
})();
// or another variation of it:
(function(){
  var f;
  if (featureTest) {
    return (f = function f(){});
  }
  return (f = function f(){});
})();
/*
  Unfortunately, by doing so, we introduce an extra reference to a function
  which gets trapped in a closure of returning function. To prevent extra memory usage,
  we can assign all named function expressions to one single variable.
*/
var __temp;
(function(){
  if (featureTest) {
    return (__temp = function f(){});
  }
  return (__temp = function f(){});
})();
...
(function(){
  if (featureTest2) {
    return (__temp = function g(){});
  }
  return (__temp = function g(){});
})();
/*
  Note that subsequent assignments destroy previous references,
  preventing any excessive memory usage.
*/
If Safari 2.x compatibility is important, we need to make sure “incompatible” constructs do not even appear in the source. This is of course quite irritating, but is definitely possible to achieve, especially when knowing the root of the problem.
It's also worth mentioning that declaring a function as an NFE in Safari 2.x exhibits another minor glitch, where the function representation does not contain the function identifier:
var f = function g(){};
// Notice how function representation is lacking `g` identifier
String(g); // function () { }
This is not really a big deal. As I have already mentioned before, function decompilation is something that should not be relied upon anyway.
Solution
var fn = (function(){
  // declare a variable to assign function object to
  var f;
  // conditionally create a named function
  // and assign its reference to `f`
  if (true) {
    f = function F(){ }
  }
  else if (false) {
    f = function F(){ }
  }
  else {
    f = function F(){ }
  }
  // Assign `null` to a variable corresponding to a function name
  // This marks the function object (referred to by that identifier)
  // available for garbage collection
  var F = null;
  // return a conditionally defined function
  return f;
})();
Finally, here's how we would apply this “technique” in real life, when writing something like a cross-browser addEvent function:
// 1) enclose declaration with a separate scope
var addEvent = (function(){
  var docEl = document.documentElement;
  // 2) declare a variable to assign function to
  var fn;
  if (docEl.addEventListener) {
    // 3) make sure to give function a descriptive identifier
    fn = function addEvent(element, eventName, callback) {
      element.addEventListener(eventName, callback, false);
    }
  }
  else if (docEl.attachEvent) {
    fn = function addEvent(element, eventName, callback) {
      element.attachEvent('on' + eventName, callback);
    }
  }
  else {
    fn = function addEvent(element, eventName, callback) {
      element['on' + eventName] = callback;
    }
  }
  // 4) clean up `addEvent` function created by JScript
  // make sure to either prepend assignment with `var`,
  // or declare `addEvent` at the top of the function
  var addEvent = null;
  // 5) finally return function referenced by `fn`
  return fn;
})();
Alternative Solutions
It's worth mentioning that there actually exist alternative ways of having descriptive names in call stacks - ways that don't require one to use named function expressions. First of all, it is often possible to define a function via a declaration, rather than via an expression. This option is only viable when you don't need to create more than one function:
var hasClassName = (function(){
  // define some private variables
  var cache = { };
  // use function declaration
  function hasClassName(element, className) {
    var _className = '(?:^|\\s+)' + className + '(?:\\s+|$)';
    var re = cache[_className] || (cache[_className] = new RegExp(_className));
    return re.test(element.className);
  }
  // return function
  return hasClassName;
})();
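A quick usage sketch of this declaration-based variant. To keep it runnable outside a browser, it assumes a plain object with a className property can stand in for a DOM element; the fakeElement, hasBar and hasBa names are made up for the illustration:

```javascript
var hasClassName = (function(){
  // private cache of compiled regular expressions
  var cache = { };
  // function declaration, so the name shows up in call stacks
  function hasClassName(element, className) {
    var _className = '(?:^|\\s+)' + className + '(?:\\s+|$)';
    var re = cache[_className] || (cache[_className] = new RegExp(_className));
    return re.test(element.className);
  }
  return hasClassName;
})();

// `fakeElement` is a hypothetical stand-in for a real DOM element
var fakeElement = { className: 'foo bar' };
var hasBar = hasClassName(fakeElement, 'bar'); // true
var hasBa = hasClassName(fakeElement, 'ba');   // false, partial names don't match
```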
This obviously wouldn't work when forking function definitions. Nevertheless, there's an interesting pattern that I first saw used by Tobie Langel. It works by defining all functions upfront using function declarations, but giving them slightly different identifiers:
var addEvent = (function(){
  var docEl = document.documentElement;
  function addEventListener(){
    /* ... */
  }
  function attachEvent(){
    /* ... */
  }
  function addEventAsProperty(){
    /* ... */
  }
  if (typeof docEl.addEventListener != 'undefined') {
    return addEventListener;
  }
  else if (typeof docEl.attachEvent != 'undefined') {
    return attachEvent;
  }
  return addEventAsProperty;
})();
While it's an elegant approach, it has its own drawbacks. First, by using different identifiers, you lose naming consistency. Whether that's a good or a bad thing is not very clear. Some might prefer to have identical names, while others wouldn't mind varying ones; after all, different names can often “speak” about the implementation used. For example, seeing “attachEvent” in a debugger would let you know that it is an attachEvent-based implementation of addEvent. On the other hand, an implementation-related name might not be meaningful at all. If you're providing an API and name “inner” functions in such a way, the user of the API could easily get lost in all of these implementation details.
A solution to this problem might be to employ a different naming convention. Just be careful not to introduce extra verbosity. Some alternatives that come to mind are:
`addEvent`, `altAddEvent` and `fallbackAddEvent`
// or
`addEvent`, `addEvent2`, `addEvent3`
// or
`addEvent_addEventListener`, `addEvent_attachEvent`, `addEvent_asProperty`
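As a rough sketch, the last of these conventions could be combined with the declaration-based pattern like this. The docEl object below is a stand-in (an assumption made so the snippet runs outside a browser); in a page it would be document.documentElement, and fakeElement is likewise hypothetical:

```javascript
// stand-in for document.documentElement, pretending to be an
// environment that only supports attachEvent
var docEl = { attachEvent: function(){} };

var addEvent = (function(){
  function addEvent_addEventListener(element, eventName, callback) {
    element.addEventListener(eventName, callback, false);
  }
  function addEvent_attachEvent(element, eventName, callback) {
    element.attachEvent('on' + eventName, callback);
  }
  function addEvent_asProperty(element, eventName, callback) {
    element['on' + eventName] = callback;
  }
  if (typeof docEl.addEventListener != 'undefined') {
    return addEvent_addEventListener;
  }
  else if (typeof docEl.attachEvent != 'undefined') {
    return addEvent_attachEvent;
  }
  return addEvent_asProperty;
})();

// hypothetical element that records what it was asked to attach
var attached = [];
var fakeElement = {
  attachEvent: function(name, callback) { attached.push(name); }
};
addEvent(fakeElement, 'click', function(){});
```

Each name keeps the public addEvent prefix, so a call stack still reads as "addEvent" while hinting at which branch was chosen.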
Another minor issue with this pattern is increased memory consumption. By defining all of the function variations upfront, you implicitly create N-1 unused functions. As you can see, if attachEvent is found in document.documentElement, then neither addEventListener nor addEventAsProperty is ever really used. Yet they already consume memory; memory which is never deallocated for the same reason as with JScript's buggy named expressions - both functions are “trapped” in the closure of the returned one.
This increased consumption is of course hardly an issue. If a library such as Prototype.js were to use this pattern, there would be no more than 100-200 extra function objects created. As long as functions are not created in such a way repeatedly (at runtime) but only once (at load time), you probably shouldn't worry about it.