Table of Contents
C++ Performance Analysis (1), Performance Analysis
C++ Performance Analysis (1)
Internal sorting The properties of the algorithm can be analyzed
Completed in c language: 1 Huffman code/decoder 2 The performance of the internal sorting algorithm can be analyzed
Home Backend Development PHP Tutorial C++ performance analysis (1), performance analysis_PHP tutorial

C++ performance analysis (1), performance analysis_PHP tutorial

Jul 13, 2016 am 10:20 AM

C++ Performance Analysis (1), Performance Analysis

C++ Performance Analysis (1)

Performance problems cannot be solved only with "technology". It is often a comprehensive problem of architecture, testing, assumptions, etc. However, for an engineer, you must start small and solve some "obvious" small problems. Otherwise, small things will add up to big things, and a dam thousands of miles away will collapse in an ant nest.

Why does the performance of C++ always rank behind C (see the latest test results from websites such as http://benchmarksgame.alioth.debian.org/u32/performance.php?test=binarytrees)? I think this is due to 3 reasons:

1) The C++ compiler used for testing does not use the latest optimization technology

2) The added value of C++ is not taken into account in testing

3) The "subtlety" of C++ at the application level (please refer to my other blogs about C++) often discourages ordinary programmers from choosing "textbook use cases", resulting in some side effects not being eliminated at the application level.

I remember that more than 10 years ago, when I was developing at Microsoft, I consulted Stan Lippman, the author of the earliest C++ compiler (then Microsoft VC++ architect), about a series of C++ performance problems for our group. In his With help, we used technologies such as inline and RVO in key places, which completely solved the performance problem and found several major errors in VC++. I realize that most of the performance problems of C++ lie in our shallow understanding of C++, and most of them are not difficult to solve.

Let’s use an example to compare and see how some subtle details affect program performance.

struct intPair

{

int ip1;

int ip2;

                                                  

intPair(int i1, int i2) : ip1(i1), ip2(i2) {}

intPair(int i1) : ip1(i1), ip2(i1) {}

};

// Calc sum (usinh value semantic)

Int

Sum1(intPair p)

{

return p.ip1 + p.ip2;

}

// Calc sum (usinh ref semantic)

int

Sum2(intPair &p)

{

return p.ip1 + p.ip2;

}

// Calc sum (usinh const ref semantic)

Int

Sum3(const intPair& p)

{

return p.ip1 + p.ip2;

}

The above simple

struct has three Sum functions, which do exactly the same thing, but is the performance the same? We use the following program to test:

double Sum(int t, int loop)

{

using namespace std;

if (t ==

1)

                    {

clock_t begin = clock();

int x =0;

for(int i = 0; i < loop; ++i)

                                                                             

                                                            

                                                                                                             

clock_t end = clock();

return double(end - begin) / CLOCKS_PER_SEC;

         }

else if (t ==

2

)

                    {

                        clock_t begin = clock();

                        int x =0;

                        intPair p(1,2);

                        for(int i = 0; i < loop; ++i)

                        {

                                    x += Sum1(p);

                        }

                        clock_t end = clock();

                        return double(end - begin) / CLOCKS_PER_SEC;

             }

             else if (t == 3)

             {

                        clock_t begin = clock();

                        int x =0;

                        intPair p(1,2);

                        for(int i = 0; i < loop; ++i)

                        {

                                    x += Sum2(p);

                        }

                        clock_t end = clock();

                        return double(end - begin) / CLOCKS_PER_SEC;                    

            }

             else if (t == 4)

             {

                        clock_t begin = clock();

                        int x =0;

                        intPair p(1,2);

                        for(int i = 0; i < loop; ++i)

                        {

                                    x += Sum3(p);

                        }

                        clock_t end = clock();

                        return double(end - begin) / CLOCKS_PER_SEC;                                            

             }

             else if (t == 5)

             {

                        clock_t begin = clock();

                        int x =0;

                        for(int i = 0; i < loop; ++i)

                        {

                                    x += Sum3(10);

                        }

                        clock_t end = clock();

                        return double(end - begin) / CLOCKS_PER_SEC;                                            

             }

             return 0;

}

我们用了5个案列,对Sum1和Sum3 风别用了两种调用方式,对Sum2用了一种调用方式。我们测试了10万次调用:

double sec = Sum(1, 100000);

printf("Sum1 (use  ctor) time: %f n", sec);

sec = Sum(2, 100000);

printf("Sum1 (use no c'tor) time: %f n", sec);

sec = Sum(3, 100000);

printf("Sum2 time: %f n", sec);

sec = Sum(4, 100000);

printf("Sum3 without conversion time: %f n", sec);

sec = Sum(5, 100000);

printf("Sum3 with conversion time: %f n", sec);

 

我们在VisualStidio 2010 中测试,结果是:

 

用例1    18ms 

用例2    9ms

用例3    6ms

用例4    7ms

用例5    12ms

也就是说:用例1和5最慢,其他基本没有差别。

细心的读者不难看出,

1)用例5的性能问题,是因为Sum3用了C++的implicit conversion ,将整数自动转化成intPair 的临时变量。这是一个应用层面的问题,如果我们不得不将整数作这个转换,也就不得不付出这个性能上的代价。

2)用例1的问题和5类似,都是因为不得不每次创建临时变量。当然,可以强迫constructor inline 来使得临时变量的生成成本降低。

3)用例2用了在函数调用前了编译自生的copy constructor,不过因为 intPair object 很小,影响可以忽略不计了。

4)用例3性能是稳定的,但是它用了“间接”方式(详情请看我关于reference的博克),所以产生的指令比用例2多两条。但对性能的影响不大,估计和Intel的L1,L2 缓存有关。

    *注意到OOP函数如果仅仅对 this 的成员存取数据,一般可以充分利用缓存,除非 object 过大。 

5)用例4 和用例3生成代码完全一样,应该没有差别。const 只是编译时有用,生成的代码与const 与否无关。

性能问题的话题太多,本文只是蜻蜓点水,但是已经触及了C++的两个最大的性能隐患:

  a) 临时变量

b) Implicit conversion (silent conversion)

2014-6-20 Seattle


Internal sorting The properties of the algorithm can be analyzed

I don’t have time to write a program for you, but I need to give you an idea and it will flash to my mind. First, copy the examples from the book on the three kinds of sorting and write them into three functions. In each function, take the system time once at the beginning and end and finally subtract it. The overhead time is printed out

Completed in c language: 1 Huffman code/decoder 2 The performance of the internal sorting algorithm can be analyzed

I modified the online program and integrated it. Take a look
#include
#include
#include
#define M 50
#define MAX 100000;

typedef struct
{
int weight;//Node weight
int parent,lchild,rchild;
}HTNODE,*HUFFMANTREE;

typedef char** HUFFMANCODE;//Dynamically allocate array to store Huffman coding table

typedef struct
{
int key; /* Keywords*/
}RecordNode; /*Type of sorting node*/

typedef struct
{
RecordNode *record;
int n; /*Size of sorting object* /
}SortObject; //Sequence to be sorted

HUFFMANTREE huffmantree(int n,int weight[])//Construct Huffman tree
{
int m1,m2,k;
int i,j,x1,x2;
HUFFMANTREE ht;
ht=(HUFFMANTREE)malloc((2*n)*sizeof(HTNODE));
for(i=1;i< ;(2*n);i++)//Initialize the data of each node in the Huffman tree. The assignment without initial value is 0
{
ht[i].parent=ht[i].lchild =ht[i].rchild=0;
if(i<=n)
ht[i].weight=weight[i];
else
ht[i].weight=0 ;
}
for(i=1;i{
m1=m2= MAX;
x1=x2=0;
for(j=1;j<(n+i);j++)
{
if((ht[j].weight{
m2=m1;
x2=x1;
m1=ht[j].weight;
x1=j;
}
else if((ht[j].weight{
m2=ht[j].weight;
x2=j;
}
}
k=n+i;
ht[x1].parent=ht[x2].parent=k;
ht[k].weight =m1+m2;
ht[k].lchild=x1;
ht[k].rchild=x2;
}
return ht;
}

void huffmancoding(int n,HUFFMANCODE hc,HUFFMANTREE ht,char str[])
{
int i,start,child,father;
char *cd;
hc=(HUFFMANCODE)malloc(( n+1)*sizeof(char*));//Allocate the head pointer of n character encoding
cd=(char*)malloc(n*sizeof(char));//Allocate the working space for encoding
cd[n-1]='\0';//Encoding terminator
f...The rest of the text>>

www.bkjia.comtruehttp: //www.bkjia.com/PHPjc/867005.htmlTechArticleC++ performance analysis (1), performance analysis C++ performance analysis (1) Performance problems cannot be solved only with technology Yes, it is often a comprehensive problem of architecture, testing, assumptions, etc. But...
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

R.E.P.O. Energy Crystals Explained and What They Do (Yellow Crystal)
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. Best Graphic Settings
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
R.E.P.O. How to Fix Audio if You Can't Hear Anyone
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
WWE 2K25: How To Unlock Everything In MyRise
4 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Working with Flash Session Data in Laravel Working with Flash Session Data in Laravel Mar 12, 2025 pm 05:08 PM

Laravel simplifies handling temporary session data using its intuitive flash methods. This is perfect for displaying brief messages, alerts, or notifications within your application. Data persists only for the subsequent request by default: $request-

cURL in PHP: How to Use the PHP cURL Extension in REST APIs cURL in PHP: How to Use the PHP cURL Extension in REST APIs Mar 14, 2025 am 11:42 AM

The PHP Client URL (cURL) extension is a powerful tool for developers, enabling seamless interaction with remote servers and REST APIs. By leveraging libcurl, a well-respected multi-protocol file transfer library, PHP cURL facilitates efficient execution of various network protocols, including HTTP, HTTPS, and FTP. This extension offers granular control over HTTP requests, supports multiple concurrent operations, and provides built-in security features.

Simplified HTTP Response Mocking in Laravel Tests Simplified HTTP Response Mocking in Laravel Tests Mar 12, 2025 pm 05:09 PM

Laravel provides concise HTTP response simulation syntax, simplifying HTTP interaction testing. This approach significantly reduces code redundancy while making your test simulation more intuitive. The basic implementation provides a variety of response type shortcuts: use Illuminate\Support\Facades\Http; Http::fake([ 'google.com' => 'Hello World', 'github.com' => ['foo' => 'bar'], 'forge.laravel.com' =>

12 Best PHP Chat Scripts on CodeCanyon 12 Best PHP Chat Scripts on CodeCanyon Mar 13, 2025 pm 12:08 PM

Do you want to provide real-time, instant solutions to your customers' most pressing problems? Live chat lets you have real-time conversations with customers and resolve their problems instantly. It allows you to provide faster service to your custom

Explain the concept of late static binding in PHP. Explain the concept of late static binding in PHP. Mar 21, 2025 pm 01:33 PM

Article discusses late static binding (LSB) in PHP, introduced in PHP 5.3, allowing runtime resolution of static method calls for more flexible inheritance.Main issue: LSB vs. traditional polymorphism; LSB's practical applications and potential perfo

Framework Security Features: Protecting against vulnerabilities. Framework Security Features: Protecting against vulnerabilities. Mar 28, 2025 pm 05:11 PM

Article discusses essential security features in frameworks to protect against vulnerabilities, including input validation, authentication, and regular updates.

Customizing/Extending Frameworks: How to add custom functionality. Customizing/Extending Frameworks: How to add custom functionality. Mar 28, 2025 pm 05:12 PM

The article discusses adding custom functionality to frameworks, focusing on understanding architecture, identifying extension points, and best practices for integration and debugging.

See all articles