A detailed look at performance improvements in .NET

黄舟
Published: 2017-03-09

.NET 4.6 brings several CLR features related to performance. Some of them take effect automatically, while others, such as SIMD and async local storage, require changes to the way applications are written.

SIMD

The Mono team has long been proud of its support for SIMD, the single instruction, multiple data feature. SIMD is a CPU instruction set that can perform the same operation on as many as 8 values at the same time. With the release of .NET CLR 4.6, Windows developers can finally use this feature as well.

To see the effect of SIMD, consider this example. Suppose you need to add two arrays element-wise, c[i] = a[i] + b[i], to produce a third array. Using SIMD, you can write the loop as follows:

for (int i = 0; i < size; i += Vector<int>.Count)
{
    // Load Vector<int>.Count elements from each array, add them in a
    // single SIMD operation, and store the result.
    Vector<int> v = new Vector<int>(A, i) + new Vector<int>(B, i);
    v.CopyTo(C, i);
}

Note how this loop increments by Vector<int>.Count, which may be 4 or 8 depending on the CPU. The .NET JIT compiler generates code that adds the arrays in batches of 4 or 8 accordingly.
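
The loop above can be wrapped into a small self-contained sketch. This version assumes `System.Numerics` from .NET 4.6 and adds a scalar fallback for array lengths that are not a multiple of the vector width (the original snippet silently assumes they are):

```csharp
using System;
using System.Numerics;

class VectorAddDemo
{
    // Element-wise addition: c[i] = a[i] + b[i].
    public static void Add(int[] a, int[] b, int[] c)
    {
        int i = 0;
        // SIMD loop: process Vector<int>.Count elements per iteration.
        for (; i <= a.Length - Vector<int>.Count; i += Vector<int>.Count)
        {
            Vector<int> v = new Vector<int>(a, i) + new Vector<int>(b, i);
            v.CopyTo(c, i);
        }
        // Scalar fallback for any remaining elements.
        for (; i < a.Length; i++)
            c[i] = a[i] + b[i];
    }

    static void Main()
    {
        int[] a = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        int[] b = { 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 };
        int[] c = new int[a.Length];
        Add(a, b, c);
        Console.WriteLine(string.Join(",", c)); // 11,22,33,44,55,66,77,88,99,110
    }
}
```

Note that SIMD acceleration only kicks in when the code runs under RyuJIT on hardware that supports it; the same code still runs correctly, just scalar, elsewhere.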

This approach can feel a bit cumbersome, so Microsoft also provides a series of helper types, including:

  • Matrix3x2 structure

  • Matrix4x4 structure

  • Plane structure

  • Quaternion structure

  • Vector class

  • Vector<T> structure

  • Vector2 structure

  • Vector3 structure

  • Vector4 structure
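
A quick sketch of how two of these helper types are used. The values here are arbitrary illustrations; the point is that operations such as dot products and matrix transforms compile down to SIMD instructions on capable hardware:

```csharp
using System;
using System.Numerics;

class HelperTypesDemo
{
    static void Main()
    {
        // Vector3 arithmetic is SIMD-accelerated where the hardware allows.
        var v = new Vector3(1f, 2f, 3f);
        var w = new Vector3(4f, 5f, 6f);
        Console.WriteLine(Vector3.Dot(v, w)); // 1*4 + 2*5 + 3*6 = 32

        // Matrix4x4 provides the usual 3D transforms.
        var translate = Matrix4x4.CreateTranslation(10f, 0f, 0f);
        var moved = Vector3.Transform(v, translate); // (11, 2, 3)
        Console.WriteLine(moved);
    }
}
```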

Assembly unloading

Most developers are probably unaware of this: .NET often loads the same assembly twice. This happens when .NET first loads the IL version of an assembly and subsequently loads the NGEN version (i.e. the precompiled version) of the same assembly. This is a serious waste of physical memory, especially for large 32-bit applications such as Visual Studio.

In .NET 4.6, once the CLR loads the NGEN version of an assembly, it automatically frees the memory occupied by the corresponding IL version.

Garbage collection

Earlier we discussed the garbage-collection latency modes introduced in .NET 4.0. Although these are far more reliable than simply letting the GC stop everything for a period of time, they are still not adequate for many GC scenarios.

In .NET 4.6, you can temporarily suspend the garbage collector in a more precise way. The new TryStartNoGCRegion method lets you specify how much memory is needed in the small-object and large-object heaps.

If there is insufficient memory, the runtime will either return false or block until enough memory has been freed by a GC; you control which behavior you get by passing a flag to TryStartNoGCRegion. If you successfully enter a no-GC region (no collection will occur while it is active), you must call the EndNoGCRegion method when you are finished.
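
A minimal sketch of the pattern, using the GC.TryStartNoGCRegion overload that takes a total budget in bytes (the 16 MB figure is an arbitrary illustration):

```csharp
using System;

class NoGCRegionDemo
{
    static void Main()
    {
        // Ask the runtime to reserve 16 MB up front. If this succeeds, no
        // garbage collection will occur until EndNoGCRegion is called or
        // the budget is exhausted.
        if (GC.TryStartNoGCRegion(16 * 1024 * 1024))
        {
            try
            {
                // Latency-critical work goes here; allocations must stay
                // within the reserved budget.
                var buffer = new byte[1024];
                Console.WriteLine("Allocated {0} bytes with GC suspended.",
                                  buffer.Length);
            }
            finally
            {
                GC.EndNoGCRegion();
            }
        }
        else
        {
            Console.WriteLine("Could not reserve the requested budget.");
        }
    }
}
```

Be aware that EndNoGCRegion throws an InvalidOperationException if the region has already ended, for example because allocations exceeded the budget and forced a collection.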

The official documentation does not say whether this method is thread-safe, but given how the GC works, you should avoid having two threads try to change the GC state at the same time.

Another improvement concerns how the GC handles pinned objects (i.e. objects that cannot be moved once allocated). Although the documentation is somewhat vague on this point, pinning an object usually pins its neighboring objects as well. Rich Lander wrote in his article:

The GC handles pinned objects in a more optimized way, so it can compact the memory around pinned objects more effectively. For large-scale applications that use a large number of pins, this change will greatly improve performance.

The GC is also smarter about how it uses memory in the earlier generations. Rich continues:

The way Generation 1 objects are promoted to Generation 2 has also been improved to use memory more efficiently. Before allocating new memory for a generation, the GC first tries to use available space. It also uses a new algorithm when placing objects into free-space regions, so the size of the newly allocated space matches the object size more closely than before.

Asynchronous local storage

The last improvement is not directly related to performance, but careful use of it can still yield optimizations. Before asynchronous APIs became popular, developers could use thread-local storage (TLS) to cache information. TLS acts like a global object scoped to a particular thread, which means you can access and cache contextual information directly, without having to pass a context object around explicitly.

With async/await, thread-local storage becomes useless, because each await may resume on a different thread. And even if you manage to avoid that, other code may run on your thread and interfere with the information in TLS.

The new version of .NET introduces async local storage (ALS) to solve this problem. ALS is semantically equivalent to thread-local storage, but it flows across await points. It is exposed through the AsyncLocal<T> generic class, which internally uses the CallContext object to store the data.
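
A short sketch of the behavior described above. Each logical async operation sees its own value, even though the continuations after Task.Delay may resume on different thread-pool threads (the "request id" naming is illustrative):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class AsyncLocalDemo
{
    // The value flows with the async control flow, even when a
    // continuation resumes on a different thread-pool thread.
    static readonly AsyncLocal<string> RequestId = new AsyncLocal<string>();

    static async Task HandleAsync(string id)
    {
        RequestId.Value = id;
        await Task.Delay(10);  // may resume on another thread
        // Still sees the value set by this logical operation.
        Console.WriteLine("{0}: {1}", id, RequestId.Value);
    }

    static void Main()
    {
        // Both operations run concurrently, yet neither overwrites
        // the other's value.
        Task.WaitAll(HandleAsync("A"), HandleAsync("B"));
    }
}
```

With a plain ThreadLocal<string> instead, the value read after the await would be unreliable, since the continuation is not guaranteed to run on the thread that set it.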


Source: php.cn