This chapter covers coding and design principles not discussed elsewhere in this book. It walks through a number of .NET scenarios: some do little harm, some cause obvious problems, and the effect of the rest depends on how you use them. If the principles in this chapter had to be summarized in one sentence, it would be:
Deep performance optimization often breaks the abstractions in your code.
This means that to get higher performance, you need to understand the implementation details of the code at every level. Many of those details are described in this chapter.
Instances of classes are allocated on the heap and accessed through references (pointers). Passing these objects around is cheap because only the pointer is copied (4 or 8 bytes). However, every object also carries a fixed overhead: 8 bytes on 32-bit systems or 16 bytes on 64-bit systems. This overhead consists of the pointer to the method table and a sync block field that is used for several purposes. If you inspect an object with no fields in a debugger, though, you will see that it occupies 12 or 24 bytes (32-bit or 64-bit systems). This is because .NET aligns objects in memory, so these are the effective minimum object sizes.
A struct has none of this overhead; its memory usage is simply the sum of the sizes of its fields. If a struct is a local variable declared inside a method, it is allocated on the stack. If a struct is declared as part of a class, its memory is part of that class's memory layout (and is therefore allocated on the heap). When you pass a struct to a method, however, its bytes are copied. Because a struct on the stack is not on the heap, allocating it never triggers a garbage collection.
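The copy-on-pass behavior described above can be seen in a minimal sketch. The type and method names here (PointStruct, PointClass, CopyDemo) are hypothetical and chosen only for illustration.

```csharp
using System;

public struct PointStruct
{
    public int X;
    public int Y;
}

public class PointClass
{
    public int X;
    public int Y;
}

public static class CopyDemo
{
    // Receives a copy of the caller's struct; the change is lost on return.
    public static void MoveStruct(PointStruct p) { p.X = 100; }

    // Receives a copy of the reference; the change reaches the shared heap object.
    public static void MoveClass(PointClass p) { p.X = 100; }

    public static void Main()
    {
        var s = new PointStruct { X = 1, Y = 2 };
        var c = new PointClass { X = 1, Y = 2 };

        MoveStruct(s);
        MoveClass(c);

        Console.WriteLine(s.X); // 1: the caller's struct is untouched
        Console.WriteLine(c.X); // 100: the heap object was modified
    }
}
```

The struct call copies all of the struct's bytes, which is why the size of a struct matters when it is passed around frequently.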
So there is a trade-off here. You can find various recommendations for the maximum size of a struct, but I will not give an exact number. In most cases you should keep structs small, especially if they are passed around frequently, so that copying them does not become a significant cost. The only certain thing is that you need to measure in your own application scenario.
In some cases the difference in efficiency is quite large. The per-object overhead may not look like much, but compare an array of objects with an array of structs. On a 32-bit system, assume that each element contains 16 bytes of data and that the array holds 1,000,000 elements.
Space used by the object array:
8-byte array overhead +
(4-byte pointer × 1,000,000) +
((8-byte header + 16-byte data) × 1,000,000)
= 28 MB
Space used by the struct array:
8-byte array overhead +
(16-byte data × 1,000,000)
= 16 MB
If you use a 64-bit system, the object array uses 40MB, while the structure array is still 16MB.
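The arithmetic above can be reproduced with a small sketch. The helper names (ArrayFootprint, ObjectArrayBytes, StructArrayBytes) are hypothetical, and the 16-byte array overhead used for the 64-bit case is an assumption, since the text does not state it; it is negligible next to the per-element cost either way.

```csharp
using System;

public static class ArrayFootprint
{
    // Bytes used by an array of references plus the objects they point to.
    public static long ObjectArrayBytes(long count, int dataSize, int pointerSize,
                                        int objectHeader, int arrayOverhead)
    {
        return arrayOverhead + pointerSize * count + (objectHeader + dataSize) * count;
    }

    // Bytes used by an array that stores the structs inline.
    public static long StructArrayBytes(long count, int dataSize, int arrayOverhead)
    {
        return arrayOverhead + dataSize * count;
    }

    public static void Main()
    {
        const long count = 1000000;
        const int dataSize = 16;

        // 32-bit figures from the text: 4-byte pointer, 8-byte header, 8-byte array overhead.
        Console.WriteLine(ObjectArrayBytes(count, dataSize, 4, 8, 8));   // 28000008
        Console.WriteLine(StructArrayBytes(count, dataSize, 8));         // 16000008

        // 64-bit: 8-byte pointer, 16-byte header; the array overhead of 16 is an assumption.
        Console.WriteLine(ObjectArrayBytes(count, dataSize, 8, 16, 16)); // 40000016
        Console.WriteLine(StructArrayBytes(count, dataSize, 16));        // 16000016
    }
}
```

The struct array's size depends only on the data, which is why it stays at roughly 16 MB on both platforms.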
As you can see, the struct array needs far less memory for the same amount of data. In addition, every element of the object array is a separate heap object, so the more elements there are, the more pressure there is on the GC.
Beyond space, there is also the question of CPU efficiency. The CPU has several levels of cache. The caches closest to the processor are the smallest but also the fastest, and they work best with data that is stored sequentially.
In a struct array, the values are laid out contiguously in memory. Accessing an element is simple: once the correct position is computed, the value is right there. This makes a huge difference when iterating over large arrays. If the value is already in the CPU cache, accessing it is an order of magnitude faster than fetching it from RAM.
To access an element of an object array, you first have to read the object's reference from the array and then follow it to the object on the heap. Iterating over an object array therefore makes the data pointer jump around the heap and forces the CPU cache to be refilled frequently, wasting many opportunities to use data that is already cached.
In many cases, improving the locality of data in memory so that the CPU spends less time on memory access is one of the main reasons to use structs, and it can improve performance significantly.
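One rough way to observe the locality effect is to time the same loop over both kinds of array. The sketch below uses hypothetical types (SampleStruct, SampleClass) and a hypothetical LocalityDemo class. It is not a rigorous benchmark: the objects here are allocated back to back and may sit close together on the heap, so the measured gap can be smaller than in a long-running application whose heap is fragmented.

```csharp
using System;
using System.Diagnostics;

public struct SampleStruct { public long A; public long B; }
public class SampleClass { public long A; public long B; }

public static class LocalityDemo
{
    public static long SumStructs(SampleStruct[] items)
    {
        long total = 0;
        for (int i = 0; i < items.Length; i++)
            total += items[i].A; // value is read in place from the array
        return total;
    }

    public static long SumObjects(SampleClass[] items)
    {
        long total = 0;
        for (int i = 0; i < items.Length; i++)
            total += items[i].A; // reference is read first, then followed to the heap
        return total;
    }

    public static void Main()
    {
        const int count = 1000000;
        var structs = new SampleStruct[count];
        var objects = new SampleClass[count];
        for (int i = 0; i < count; i++)
        {
            structs[i].A = i; // array elements are variables, so this writes in place
            objects[i] = new SampleClass { A = i };
        }

        var sw = Stopwatch.StartNew();
        long structSum = SumStructs(structs);
        sw.Stop();
        Console.WriteLine("struct array: " + sw.Elapsed.TotalMilliseconds + " ms, sum " + structSum);

        sw.Restart();
        long objectSum = SumObjects(objects);
        sw.Stop();
        Console.WriteLine("object array: " + sw.Elapsed.TotalMilliseconds + " ms, sum " + objectSum);
    }
}
```

Run it in a release build and repeat the measurement several times before drawing conclusions; the timings depend heavily on the machine and the runtime.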
Because structs are always copied by value, you need to be careful when coding, or you will create some interesting bugs. For example, the following code does not compile:
struct Point
{
    public int x;
    public int y;
}

public static void Main()
{
    List<Point> points = new List<Point>();
    points.Add(new Point() { x = 1, y = 2 });
    points[0].x = 3; // compile error: points[0] is a copy, not a variable
}
The problem is the last line, which tries to modify a field of the Point stored in the list. This is not possible because points[0] returns a copy of the original value. The correct way to modify the value is:
Point p = points[0]; // copy the element out of the list
p.x = 3;             // modify the copy
points[0] = p;       // write the copy back
However, you can adopt a stricter coding policy: make your structs immutable. Once a struct is created, its values should never change. This takes the compilation problem above off the table and simplifies the rules for using structs.
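As a sketch of what an immutable struct might look like, the example below uses the readonly struct modifier (available from C# 7.2). The names ImmutablePoint, WithX and ImmutableDemo are hypothetical; the original text does not prescribe a particular shape.

```csharp
using System;
using System.Collections.Generic;

// readonly struct makes the compiler reject any field mutation after construction.
public readonly struct ImmutablePoint
{
    public int X { get; }
    public int Y { get; }

    public ImmutablePoint(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Returns a new value instead of changing this one.
    public ImmutablePoint WithX(int x)
    {
        return new ImmutablePoint(x, Y);
    }
}

public static class ImmutableDemo
{
    public static void Main()
    {
        var points = new List<ImmutablePoint> { new ImmutablePoint(1, 2) };

        // Replace the whole element; there is no way to edit a copy by accident.
        points[0] = points[0].WithX(3);

        Console.WriteLine(points[0].X + "," + points[0].Y); // 3,2
    }
}
```

With this design, every change is an explicit replacement of the stored value, so the "modified a temporary copy" mistake cannot be written in the first place.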
I mentioned earlier that structs should be kept small to avoid spending a lot of time copying them, but large structs are occasionally useful. Consider an object that tracks the details of a business process and needs to store a large number of timestamps:
class Order
{
    public DateTime ReceivedTime { get; set; }
    public DateTime AcknowledgeTime { get; set; }
    public DateTime ProcessBeginTime { get; set; }
    public DateTime WarehouseReceiveTime { get; set; }
    public DateTime WarehouseRunnerReceiveTime { get; set; }
    public DateTime WarehouseRunnerCompletionTime { get; set; }
    public DateTime PackingBeginTime { get; set; }
    public DateTime PackingEndTime { get; set; }
    public DateTime LabelPrintTime { get; set; }
    public DateTime CarrierNotifyTime { get; set; }
    public DateTime ProcessEndTime { get; set; }
    public DateTime EmailSentToCustomerTime { get; set; }
    public DateTime CarrerPickupTime { get; set; }
    // lots of other data ...
}
To simplify the code, we can move the time data into its own substructure, so that it is accessed through the Order object like this:
Order order = new Order();
order.Times.ReceivedTime = DateTime.UtcNow;
We can put all of that data into its own class:
class OrderTimes
{
    public DateTime ReceivedTime { get; set; }
    public DateTime AcknowledgeTime { get; set; }
    public DateTime ProcessBeginTime { get; set; }
    public DateTime WarehouseReceiveTime { get; set; }
    public DateTime WarehouseRunnerReceiveTime { get; set; }
    public DateTime WarehouseRunnerCompletionTime { get; set; }
    public DateTime PackingBeginTime { get; set; }
    public DateTime PackingEndTime { get; set; }
    public DateTime LabelPrintTime { get; set; }
    public DateTime CarrierNotifyTime { get; set; }
    public DateTime ProcessEndTime { get; set; }
    public DateTime EmailSentToCustomerTime { get; set; }
    public DateTime CarrerPickupTime { get; set; }
}

class Order
{
    public OrderTimes Times;
}
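However, this introduces an additional 12 or 24 bytes of overhead for every Order object. If you need to pass the OrderTimes object as a whole to various methods, this may make some sense, but why not simply pass the Order object itself? If you have thousands of Order objects alive at the same time, the extra objects and the references to them can cause more garbage collections.
Instead, change OrderTimes to a struct. Accessing the individual properties of the OrderTimes struct through the Order object (for example, order.Times.ReceivedTime) does not result in a copy of the struct (.NET optimizes this access). The OrderTimes struct then essentially becomes part of the memory layout of the Order class, almost as if there were no substructure at all, and you get better-looking code as well.
This technique does violate the principle of immutable structs, but the trick is to treat the fields of the OrderTimes struct as if they were fields of the Order object. You do not need to pass the OrderTimes struct around as an entity in its own right; it is purely a way of organizing the code.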
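A trimmed-down sketch of the struct version is shown below. Only two of the timestamps are kept, the OrderDemo class is hypothetical, and the date value is an arbitrary example. Note that Times is declared as a field, as in the class version above; if it were a property returning the struct, the assignment would be rejected by the compiler for the same reason as the points[0].x example.

```csharp
using System;

public struct OrderTimes
{
    public DateTime ReceivedTime { get; set; }
    public DateTime AcknowledgeTime { get; set; }
    // remaining timestamps omitted for brevity
}

public class Order
{
    // A field, not a property: order.Times.X = ... writes directly into the Order object.
    public OrderTimes Times;
}

public static class OrderDemo
{
    public static void Main()
    {
        var order = new Order();

        // No separate OrderTimes object exists; this writes into the Order's own memory.
        order.Times.ReceivedTime = new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc);

        Console.WriteLine(order.Times.ReceivedTime.Year); // 2020
    }
}
```

Each Order is still a single heap allocation, so the grouping costs nothing beyond the size of the timestamps themselves.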