Code optimization

In computer programming, optimization is the practice of reducing the execution time of a function, the space occupied by the data and the program, or the energy consumption.

The first rule of optimization is that it should take place only once the program works and meets its functional specifications. Experience shows that applying low-level optimizations to the code before these two conditions are met is generally a waste of time and harms both the clarity of the code and the correct operation of the program:

“Premature optimization is the root of all evil.” Donald Knuth, quoting Dijkstra

However, this truncated quotation is very often misinterpreted. The full version reads:

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” Donald Knuth

The original quotation makes it very clear that this rule applies only to local, low-level optimizations (rewriting in assembly, loop unrolling, etc.) and not to high-level optimizations concerning the choice of algorithms or the architecture of a project. On the contrary, the more a project grows, the more difficult and expensive (in time, effort and budget) these high-level optimizations become, to the point of being impossible.

Most recent compilers automatically perform a number of optimizations that would be tedious to carry out by hand and that would make the source code less readable.

Local manual optimization can prove necessary in very specific cases. Measurements show, however, that on RISC machines, which have a large number of registers and where efficiency requires grouping identical instructions to benefit from pipelining, the optimizer of a C compiler often produces more efficient code than an experienced programmer would write in assembly (which was never the case on CISC machines). This code is also much easier to maintain, because the C statements remain in an order dictated solely by the readability of the code and not by the peculiarities of the machine: with current optimizers, the machine instructions associated with a given statement are no longer necessarily contiguous, for reasons of execution efficiency. This makes the generated assembly code particularly hard to decipher.

Optimization in practice

First approach

To begin optimizing, one must know how to measure the speed of the code. This requires choosing a metric, preferably a simple and measurable one. It may be, for example, the processing time for a specific data set, the number of frames displayed per second, or the number of requests handled per minute.

Once the metric has been chosen, one must measure the time spent in each part of the program. It is not unusual for 80% to 90% of the time to be spent executing 10% of the code (the critical loops). The figures vary with the size and complexity of the project. These 10% of the code must be located so that optimization effort pays off as much as possible. This localization step can be carried out with specialized code-instrumentation tools called profilers, which count the number of times each function is executed and the corresponding processor cycles during execution.

One then repeats the following loop on the most resource-hungry section, as many times as necessary:

  • optimize part of the code
  • measure the performance gain
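The measurement step of this loop can be sketched in C as follows. This is only a minimal sketch under stated assumptions: the workload sum_of_squares and the helper measure_seconds are hypothetical names chosen for illustration, and the standard clock() function is used, which reports processor time rather than wall-clock time. A real project would more likely rely on a profiler.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical workload standing in for the code section under study. */
unsigned long sum_of_squares(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long i = 1; i <= n; ++i)
        total += i * i;
    return total;
}

/* Returns the processor time, in seconds, used by `runs` calls of the
 * workload. The last result is stored through `result` (if not NULL) so
 * that the computation is actually used. Note that an optimizing compiler
 * may still simplify the loop, which is itself worth measuring. */
double measure_seconds(unsigned long n, int runs, unsigned long *result)
{
    assert(runs > 0);

    unsigned long last = 0;
    clock_t start = clock();
    for (int k = 0; k < runs; ++k)
        last = sum_of_squares(n);
    clock_t end = clock();

    if (result != NULL)
        *result = last;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling measure_seconds before and after each change, with the same n and runs, gives two figures to compare. Checking that the stored result is unchanged guards against an optimization that silently breaks the code.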

Second approach

A program can be optimized at several levels:
  • at the algorithmic level, by choosing an algorithm of lower complexity (in the mathematical sense) and suitable data structures,
  • at the level of the development language, by ordering the instructions as well as possible and by using the available libraries,
  • by locally using a low-level language, which can be C or, for the most critical needs, assembly language.
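As an illustration of the algorithmic level, the sketch below contrasts two ways of looking up a value in a sorted array. The function names find_linear and find_binary are hypothetical and chosen for this example only. The first needs a number of comparisons proportional to the array size, the second a number proportional to its logarithm.

```c
#include <stddef.h>

/* Linear search: up to `count` comparisons.
 * Returns the index of `key`, or -1 if it is absent. */
long find_linear(const int *values, size_t count, int key)
{
    for (size_t i = 0; i < count; ++i)
        if (values[i] == key)
            return (long)i;
    return -1;
}

/* Binary search on an array sorted in ascending order:
 * about log2(count) comparisons.
 * Returns the index of `key`, or -1 if it is absent. */
long find_binary(const int *sorted, size_t count, int key)
{
    size_t low = 0, high = count;   /* search the half-open range [low, high) */
    while (low < high) {
        size_t mid = low + (high - low) / 2;
        if (sorted[mid] == key)
            return (long)mid;
        if (sorted[mid] < key)
            low = mid + 1;
        else
            high = mid;
    }
    return -1;
}
```

On a large sorted array, no instruction-level tuning of find_linear is likely to match the gain obtained simply by switching to find_binary, which is the point of starting at the algorithmic level.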

One moves on to the next level of optimization only once the possibilities of the current level have been exhausted. Using a low-level language for an entire project for reasons of speed is one of the most common and most costly mistakes an industrial project can make.

Code optimization is regarded by many amateur developers as a somewhat magical art and, for this reason, as one of the most exciting parts of programming. This leads them to believe that a good programmer is someone who optimizes the program from the start. Experience shows, however, that optimization cannot compensate for a poor initial design. It is in the design that the developer's experience matters most. Moreover, in a growing majority of cases, the “good programmer” is less the one who writes clever code (the optimizer will generally take care of that better than they can) than the one who writes readable, easily maintained code.

A good knowledge of data structures and algorithms (even without going into the deeper theoretical considerations of algorithmic complexity) proves far more fruitful than knowledge of an assembly language. Once the most suitable algorithm has been chosen, the most effective optimizations can be obtained as follows:

  • write the critical code in a high-level language (such as Scheme or Common Lisp),
  • apply successive mathematical transformations that preserve the specification of the program while reducing its resource consumption,
  • translate the transformed code into a low-level language (C).
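A transformation that preserves the specification can be illustrated with a deliberately simple case, written here directly in C for consistency with the other examples rather than in a high-level language. The functions sum_loop and sum_closed are hypothetical names. Both compute 1 + 2 + ... + n, but the second replaces the loop with the closed form n(n+1)/2.

```c
/* Direct transcription of the specification: n additions. */
unsigned long long sum_loop(unsigned long long n)
{
    unsigned long long total = 0;
    for (unsigned long long i = 1; i <= n; ++i)
        total += i;
    return total;
}

/* Same specification after an algebraic transformation: constant time.
 * Assumes n is small enough that n * (n + 1) does not overflow. */
unsigned long long sum_closed(unsigned long long n)
{
    return n * (n + 1) / 2;
}
```

The two functions return the same value for every n within the stated range, so the second can replace the first without changing what the program is specified to do.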

In practice, the performance of current machines means that applications involving a lot of slow input/output can dispense with these three steps and be written directly in a language such as Haskell. The well-known application nget, which systematically harvests the images posted to Usenet newsgroups, was thus first implemented in Haskell. The C version was merely a translation, which does not appear to perform better for this type of application. An application limited mainly by the CPU and memory speed, on the other hand, can gain enormously from being written in a language such as C or C++.

Automatic optimization

Compilers are often able to perform local optimizations that no developer would think of at first.

For the C language, this can concern:

  • local variables and registers,
  • functions that are not emitted as separate functions in the generated assembly,
  • switch statements, which are compiled optimally.

However, one can greatly help the compiler by declaring variables with the keywords const and/or restrict whenever possible. Otherwise, the compiler cannot know whether a memory area is accessible through other references, and will disable some optimizations (a phenomenon known as memory aliasing).
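The effect of restrict can be sketched as follows. The two functions are hypothetical and differ only in their declarations. In the first, the compiler has to assume that writing to dst might change the value pointed to by factor. In the second, the programmer promises that the three pointers refer to distinct memory areas, which is an assumption the caller must honour.

```c
#include <stddef.h>

/* Without restrict: dst, src and factor might overlap, so *factor may
 * have to be reloaded from memory on every iteration. */
void scale_may_alias(float *dst, const float *src,
                     const float *factor, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i] * *factor;
}

/* With restrict (C99): the caller guarantees that the pointers do not
 * alias, so the compiler is free to keep *factor in a register and to
 * vectorize the loop. Calling it with overlapping areas is undefined. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    const float *restrict factor, size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i] * *factor;
}
```

Whether the generated code actually differs depends on the compiler and the optimization level. Comparing the assembly output of the two functions is the most direct way to check.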

Examples

Using local variables to avoid memory aliasing

The following C++ code will generally be poorly optimized by the compiler, because it often cannot tell whether the loop body modifies the iteration count: a pointer or a reference could modify it.

    void MyClass::DoSomething() const
    {
        for (int i = 0; i < m_nbrElements; ++i)
        {
            void *ptr = GetSomePtr();
            ….
        }
    }

In this version, it is stated clearly that the iteration count is fixed in advance and will never be modified, which allows the compiler to perform more aggressive optimizations:

    void MyClass::DoSomething()
    {
        const int nbrElements = m_nbrElements;
        for (int i = 0; i < nbrElements; ++i)
        {
            ….
        }
    }

A peculiarity of binary: the shift

One of the very first optimizations concerned division and multiplication by a power of 2.

Current computing rests on binary, since its basic element is the transistor (and historically, before that, the relay), which allows only two different values.

Left-shift and right-shift operations were therefore logically implemented in machine language.

Indeed, in binary, shifting a number one position to the left multiplies it by 2.

Thus, 2 (10 in binary) shifted left by 1 bit gives 4 (100 in binary).
5 (101 in binary) shifted left by 2 bits gives 20 (10100 in binary): 5*2^2 = 20.
The same holds for division, by shifting the bits to the right.
100 (1100100 in binary) shifted right by 3 bits thus gives 100/2^3 = 12.5, truncated to 12 (1100 in binary) because we are working on integers.

Division (apart from this case and pathological cases) is an expensive instruction in machine time, and is moreover still not available on the great majority of RISC-type processors.
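The shift examples above can be checked with a short C sketch. The helper names times_pow2 and div_pow2 are hypothetical. Unsigned operands are assumed here, because right-shifting a negative signed value is implementation-defined in C and does not round the same way as integer division.

```c
/* Multiplies value by 2^bits using a left shift.
 * Assumes bits is smaller than the width of unsigned int and that
 * the high-order bits shifted out are zero (no overflow). */
unsigned int times_pow2(unsigned int value, unsigned int bits)
{
    return value << bits;
}

/* Divides value by 2^bits using a right shift; the fractional part
 * is discarded. Assumes bits is smaller than the width of unsigned int. */
unsigned int div_pow2(unsigned int value, unsigned int bits)
{
    return value >> bits;
}
```

With these definitions, times_pow2(5, 2) yields 20 and div_pow2(100, 3) yields 12, matching the figures given above. Current compilers generally perform this substitution themselves when the divisor or multiplier is a constant power of 2, so writing value * 4 or value / 8 usually keeps the source clearer at no cost.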

The inline keyword in C

The following C code:

    inline int F(int a, int b)
    {
        return a * b;
    }

    int G(int a)
    {
        switch (a) {
        case 10:
            return F(a, a);
        case 11:
        case 12:
            return F(a - 2, a);
        case 1200:
            return F(a - 2, a);
        default:
            return F(a, a);
        }
    }

Compiling with gcc -O4 -S gives:

            .file   "opt.c"
            .text
            .p2align 4,,15
    .globl G
            .type   G, @function
    G:
            pushl   %ebp
            movl    %esp, %ebp
            movl    8(%ebp), %edx
            cmpl    $12, %edx
            jg      .L14
            leal    -2(%edx), %eax
            cmpl    $11, %edx
            jge     .L15
            movl    $100, %eax
            cmpl    $10, %edx
    .L17:
            je      .L2
            movl    %edx, %eax
    .L15:
            imull   %edx, %eax
    .L2:
            popl    %ebp
            ret
            .p2align 4,,7
    .L14:
            movl    $1437600, %eax
            cmpl    $1200, %edx
            jmp     .L17
            .size   G, .-G
            .section        .note.GNU-stack,"",@progbits
            .ident  "GCC: (GNU) 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)"

For easier comprehension, this could be translated into the following C code:

    int G(int a)
    {
        int eax, b;
        if (a > 12)             /* case a == 1200 */
            goto L14;
        eax = a - 2;
        if (a >= 11)            /* case a == 11 or a == 12 */
            goto L15;
        eax = 100;              /* = 10 * 10 */
        b = 10;
    L17:
        if (a == b)             /* case a == 10 */
            goto L2;
                                /* case "default" */
        eax = a;
    L15:
        eax = eax * a;
    L2:
        return eax;
    L14:
        eax = 1437600;          /* = 1200 * (1200 - 2) */
        b = 1200;
        goto L17;
    }

One can notice, for example, that the function “F” was not generated as such: its code was integrated directly into the function “G”. The “inline” keyword asks the compiler for this type of optimization in C; it is a hint rather than an obligation.

See also

External links

  • The Fallacy of Premature Optimization by Randall Hyde, explaining Donald Knuth's quotation and its misinterpretation
  • Premature Optimization by Charles Cook, on the same quotation from Donald Knuth.