Knapsack problem

The knapsack problem, also written KP (for Knapsack Problem), is a combinatorial optimization problem. It models a situation analogous to filling a knapsack that cannot carry more than a certain weight with all or part of a given set of objects, each of which has a weight and a value. The objects placed in the knapsack must maximize the total value without exceeding the maximum weight.

History

In research

The knapsack problem is one of Richard Karp's 21 NP-complete problems, presented in his 1972 article. It has been studied intensively since the middle of the 20th century, and references to it can be found as early as 1897, in an article by George Ballard Mathews. The problem is extremely simple to state, but solving it is more complex. Existing algorithms can solve practical instances of considerable size. Nevertheless, the particular structure of the problem, and the fact that it arises as a subproblem of other, more general problems, make it a favoured subject of research.

Complexity and cryptography

This problem underlies the first asymmetric (or “public-key”) encryption algorithm, presented by Martin Hellman, Ralph Merkle and Whitfield Diffie at Stanford University in 1976. However, even though the idea derives from the knapsack problem, RSA is regarded as the first true asymmetric encryption algorithm.

The NP-hard version of this problem has been used in cryptographic primitives and protocols, such as the Merkle-Hellman cryptosystem or the Chor-Rivest cryptosystem. Their advantage over asymmetric cryptosystems based on the difficulty of factoring is their speed of encryption and decryption. However, the algorithm of Hellman, Merkle and Diffie is vulnerable to algorithmic “backdoors”, which means that it is “broken”, i.e. cryptanalyzed. The knapsack problem is a classic example of a misconception about the link between NP-completeness and cryptography: NP-completeness is a statement about worst-case instances, whereas a cryptosystem needs the instances it actually generates to be hard. A revised version of the algorithm, using an iterated knapsack problem, was then presented, only to be broken just as quickly. All asymmetric encryption algorithms based on the knapsack problem have been broken to date, the most recent being that of Chor-Rivest.

Other areas of application

It is also used to model the following situations, sometimes as a subproblem:
  • in portfolio management support systems: to balance selectivity and diversification in order to find the best trade-off between return and risk for capital invested in several financial assets (shares, etc.);
  • in the loading of a ship or an aircraft: all the luggage for the destination must be carried without overloading;
  • in cutting materials: to minimize offcuts when cutting sections of various lengths from iron bars.

Another reason to take an interest in this problem is that it appears in certain uses of column generation methods (for the “bin packing” problem, for example).

Anecdotally, and justifying the problem's name, a hiker faces it when preparing a trip: the backpack has a limited capacity, so a choice must be made between taking, for example, two cans and a fifty-centilitre water bottle, or only one can and a one-litre water bottle.

Mathematical statement

The problem can be stated in mathematical terms. The objects are numbered by the index i, ranging from 1 to n. The numbers w_i and p_i represent the weight and the value of object i, respectively. The capacity of the knapsack is denoted W.

There are many ways to fill the knapsack. To describe one of them, it must be stated for each object whether it is taken or not. A binary encoding can be used: the state of the i-th object is x_i=1 if the object is put in the knapsack, or x_i=0 if it is left out. A way of filling the knapsack is thus completely described by a vector, called the content vector, or simply the content: X = (x_1, x_2, …, x_n). The weight and the value associated with this filling can then be expressed as functions of the content vector.

For a given content X, the total value contained in the knapsack is naturally:

z(X) = \sum_{\{i,\, x_i=1\}} p_i = \sum_{i=1}^n x_i p_i
Similarly, the sum of the weights of the selected objects is:
w(X) = \sum_{i=1}^n x_i w_i

The problem can then be restated as the search for a content vector X = (x_1, x_2, \dots, x_n) (with components equal to 0 or 1) that maximizes the total value function z(X) under the constraint:

w(X) = \sum_{i=1}^n x_i w_i \le W (1)
That is, the sum of the weights of the selected objects does not exceed the capacity of the knapsack.

In general, the following constraints are added in order to avoid degenerate cases:

  • \sum_{i=1}^n w_i > W: not all the objects can be taken;
  • p_i > 0, \forall i \in \{1, \dots, n\}: every object brings a profit;
  • w_i > 0, \forall i \in \{1, \dots, n\}: every object consumes resources.

Terminology:

  • z(X) is called the objective function;
  • any vector X satisfying constraint (1) is said to be feasible;
  • if X is feasible and the value of z(X) is maximal among the feasible vectors, then X is said to be optimal.
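
The definitions above can be illustrated by a minimal Python sketch. The function names and the three-object instance are assumptions made up for this example; only the symbols z(X), w(X) and W come from the text.

```python
from typing import Sequence


def total_value(x: Sequence[int], p: Sequence[int]) -> int:
    """Objective function z(X): sum of the values of the selected objects."""
    return sum(xi * pi for xi, pi in zip(x, p))


def total_weight(x: Sequence[int], w: Sequence[int]) -> int:
    """w(X): sum of the weights of the selected objects."""
    return sum(xi * wi for xi, wi in zip(x, w))


def is_feasible(x: Sequence[int], w: Sequence[int], capacity: int) -> bool:
    """True when X satisfies constraint (1), i.e. w(X) <= W."""
    return total_weight(x, w) <= capacity


# Illustrative instance (hypothetical data): values, weights, capacity.
p = [10, 7, 4]
w = [5, 4, 3]
W = 8
x = [1, 0, 1]  # content vector: objects 1 and 3 are taken

print(total_value(x, p), total_weight(x, w), is_feasible(x, w, W))
# prints: 14 8 True
```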

NP-completeness

The knapsack problem can be put in decision form by replacing the maximization with the following question: given an integer k, is there an assignment of the x_i for which \sum_{i=1}^n p_i x_i \ge k while satisfying the constraint? The “decision” and “optimization” versions of the problem are linked: if a polynomial algorithm solves the decision version, then the maximum value of the optimization problem can be found in polynomial time by applying this algorithm repeatedly with increasing values of k (a binary search over k keeps the number of calls polynomial in the size of the input). Conversely, if an algorithm finds the optimal value of the optimization problem in polynomial time, then the decision problem can be solved in polynomial time by comparing the value of the solution returned by this algorithm with k. The two versions of the problem are therefore of similar difficulty.
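
The link between the two versions can be sketched as follows. No polynomial decision algorithm is known, so the function `decide` below is only a brute-force stand-in for such a hypothetical oracle; the names are assumptions, the values p_i are assumed to be non-negative integers, and the search over k is done by bisection rather than by plain increments.

```python
from itertools import product
from typing import Sequence


def decide(p: Sequence[int], w: Sequence[int], capacity: int, k: int) -> bool:
    """Decision version: is there a feasible X with z(X) >= k?

    Brute-force stand-in for a hypothetical polynomial oracle.
    """
    for x in product((0, 1), repeat=len(p)):
        weight = sum(xi * wi for xi, wi in zip(x, w))
        value = sum(xi * pi for xi, pi in zip(x, p))
        if weight <= capacity and value >= k:
            return True
    return False


def optimum_via_decision(p: Sequence[int], w: Sequence[int], capacity: int) -> int:
    """Largest k for which the decision problem answers yes."""
    lo, hi = 0, sum(p)  # k = 0 is always reachable with the empty knapsack
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if decide(p, w, capacity, mid):
            lo = mid       # some content reaches mid: the optimum is >= mid
        else:
            hi = mid - 1   # no content reaches mid: the optimum is < mid
    return lo


print(optimum_via_decision([10, 7, 4], [5, 4, 3], 8))  # prints: 14
```

The bisection calls the oracle about log2(p_1 + … + p_n) times, which is polynomial in the length of the input.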

In its decision form, the problem is NP-complete, which implies that no general method is known for constructing an optimal solution in polynomial time; the naive approach is the systematic examination of all possible solutions. The optimization problem is NP-hard: solving it is at least as difficult as solving the decision problem, and no polynomial algorithm is known which, given a solution, can tell whether it is optimal (which would amount to saying that no solution exists with a larger k, and hence to solving the NP-complete decision problem).

Systematic exploration

This systematic examination can be carried out using a binary exploration tree such as the one shown opposite (the triangles represent subtrees).

The tree is read from the top down to the bottom of the triangles (the leaves of the tree). Each box corresponds to a single possible path. Following the labels along the edges of the tree, each path corresponds to a sequence of values for x_1, x_2, …, x_n forming a content vector. The total value and the total weight of the corresponding content can then be written in each box. All that remains is to eliminate the boxes that do not satisfy the constraint and to choose, among those that remain, the one (or one of those) giving the greatest value of the objective function.

Each time an object is added to the list of available objects, a level is added to the binary exploration tree and the number of boxes is multiplied by 2. Exploring the tree and filling in the boxes therefore has a cost that grows exponentially with the number n of objects.
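
The systematic exploration can be sketched in Python as an enumeration of the 2^n content vectors, which are the leaves of the tree. This is a minimal illustration, assuming a function name and an instance chosen for the example; it is usable only for small n.

```python
from itertools import product
from typing import Sequence, Tuple


def brute_force(
    p: Sequence[int], w: Sequence[int], capacity: int
) -> Tuple[int, Tuple[int, ...]]:
    """Examine every content vector and keep the best feasible one."""
    best_value = 0
    best_x = tuple(0 for _ in p)  # the empty knapsack is always feasible
    for x in product((0, 1), repeat=len(p)):  # 2**n candidate vectors
        weight = sum(xi * wi for xi, wi in zip(x, w))
        if weight > capacity:
            continue  # eliminate the contents that violate the constraint
        value = sum(xi * pi for xi, pi in zip(x, p))
        if value > best_value:
            best_value, best_x = value, x
    return best_value, best_x


print(brute_force([10, 7, 4], [5, 4, 3], 8))  # prints: (14, (1, 0, 1))
```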

Proof of NP-completeness

See also: Complexity theory

This proof of NP-completeness was presented by Michail G. Lagoudakis, drawing on an article by Richard Karp and an article by J.E. Savage.

Approximate solution

See also: Approximation algorithm

As with most NP-complete problems, it can be worthwhile to find solutions that are feasible but not optimal, preferably with a guarantee on the gap between the value of the solution found and the value of the optimal solution.

The following terminology is adopted:

  • the efficiency of an object is the ratio of its value to its weight. The greater the value of the object relative to what it consumes, the more attractive the object is.

Greedy algorithm

The simplest algorithm is a greedy algorithm. The idea is to add the most efficient objects first, until the knapsack is full:

    sort the objects in decreasing order of efficiency
    w_conso := 0
    for i from 1 to n
      if w[i] + w_conso <= W then
        x[i] := 1
        w_conso := w_conso + w[i]
      else
        x[i] := 0
      end if
    end for
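
A runnable Python version of this pseudocode might look as follows. It is a sketch that assumes strictly positive weights (so that the efficiency p_i / w_i is defined) and returns the content vector in the original numbering of the objects; the function name is chosen for the example.

```python
from typing import List, Sequence


def greedy(p: Sequence[float], w: Sequence[float], capacity: float) -> List[int]:
    """Greedy filling: most efficient objects first, while they still fit."""
    # Indices sorted by decreasing efficiency (value / weight).
    order = sorted(range(len(p)), key=lambda i: p[i] / w[i], reverse=True)
    x = [0] * len(p)
    used = 0  # weight already consumed (w_conso in the pseudocode)
    for i in order:
        if used + w[i] <= capacity:
            x[i] = 1
            used += w[i]
    return x


print(greedy([10, 7, 4], [5, 4, 3], 8))  # prints: [1, 0, 1]
```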

Analysis of the greedy algorithm

Let z^* denote the value of the optimal solutions.

The solution X returned by the greedy algorithm can be arbitrarily bad. Consider, for example, that there are only two objects to place in the knapsack. The first has a profit of 2 and a weight of 1; the second has a profit and a weight both equal to W. The first object is the most efficient: it is selected first and prevents the second from being taken, giving a solution of value 2 whereas the optimal solution is worth W (for W > 2). There are therefore instances of the problem for which the ratio between the solution found and the optimal solution, here 2/W, is as close to zero as desired.

There are other approximation algorithms for the knapsack problem that yield a solution guaranteed to be within a distance k, or within a ratio \epsilon, of the quality of the optimal solution. That is, the solution X found satisfies z^* - z(X) \le k or \frac{z(X)}{z^*} \ge 1 - \epsilon. The complexity of these algorithms generally depends on the inverse of the expected quality, for example O(n^{\frac{1}{\epsilon}}) or O(n^2 + \frac{1}{\epsilon^2}). The running times can be very large.

Metaheuristics

Metaheuristic methods such as genetic algorithms or optimization based on ant colony algorithms make it possible to obtain a reasonable approximation without consuming too many resources.

Genetic algorithm

Genetic algorithms are often used for difficult optimization problems such as the knapsack problem. They are relatively easy to implement and quickly yield a satisfactory solution even when the problem is large.

A population of individuals is generated, whose chromosomes each encode a solution of the problem. The representation of an individual is binary, since each object is either kept in or left out of the knapsack. The number of bits in the genome of each individual equals the number of available objects.

Optimization follows the usual principles of genetic algorithms. The individuals are evaluated, then the best are selected for reproduction. Depending on the variant chosen, the reproduction operators can be more or less complex (crossover), and mutations can also occur (replacing a 0 by a 1 or vice versa). The best individual can also be copied into the next generation (elitism). After a certain number of generations, the population tends towards an optimum, or even the exact solution.
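
The principles above can be sketched in Python. Everything specific here is an assumption made for illustration: the function name, the parameter values, selection by keeping the better half, one-point crossover, and a fitness of -1 for contents that exceed the capacity. The sketch requires at least two objects and a population of at least four individuals.

```python
import random
from typing import List, Sequence, Tuple


def genetic_knapsack(
    p: Sequence[int], w: Sequence[int], capacity: int,
    pop_size: int = 30, generations: int = 100,
    mutation_rate: float = 0.05, seed: int = 0,
) -> Tuple[List[int], int]:
    rng = random.Random(seed)
    n = len(p)

    def fitness(x: List[int]) -> int:
        if sum(xi * wi for xi, wi in zip(x, w)) > capacity:
            return -1  # infeasible contents rank below the empty knapsack
        return sum(xi * pi for xi, pi in zip(x, p))

    # Random binary genomes, plus the empty knapsack so that at least
    # one feasible individual always exists.
    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size - 1)]
    population.append([0] * n)

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # selection: better half
        children = [population[0][:]]          # elitism: copy the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)        # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children

    best = max(population, key=fitness)
    return best, fitness(best)


best, value = genetic_knapsack([10, 7, 4], [5, 4, 3], 8)
print(best, value)
```

Because of elitism and the initial empty individual, the returned content is always feasible; its value is not guaranteed to be optimal.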

Ant colony algorithms

This concept has been used to solve the multidimensional knapsack problem, in which several constraints must be satisfied. The first algorithms were based on the idea of the greedy algorithm: the ants select the most attractive objects step by step. This selection can vary but is always based on the pheromone trails deposited by the ants, which condition later choices. Among the proposed solutions are depositing pheromone on the best objects, depositing it on pairs of objects inserted one after the other in the solution, or adding pheromone on pairs of objects regardless of their order of insertion.

A survey carried out by Tunisian and French researchers showed that the algorithm that leaves trails on pairs of successively selected objects is less effective than the variants that focus on a single object or on arbitrary pairs. Improvements remain possible, however, since these algorithms could be combined with other metaheuristics in order to get closer to the optimal solution.
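
As a rough illustration of the variant that deposits pheromone on individual objects, the following Python sketch treats the ordinary single-constraint problem rather than the multidimensional one. It is not taken from the cited work: the function name, the parameters, the evaporation rule and the fixed deposit of 1.0 are all assumptions, and strictly positive values and weights are assumed.

```python
import random
from typing import List, Sequence, Tuple


def ant_colony_knapsack(
    p: Sequence[int], w: Sequence[int], capacity: int,
    n_ants: int = 20, n_iterations: int = 50,
    evaporation: float = 0.1, seed: int = 0,
) -> Tuple[List[int], int]:
    rng = random.Random(seed)
    n = len(p)
    pheromone = [1.0] * n  # one trail per object
    best_x, best_value = [0] * n, 0

    for _ in range(n_iterations):
        for _ in range(n_ants):
            x, remaining = [0] * n, capacity
            candidates = [i for i in range(n) if w[i] <= remaining]
            while candidates:
                # Attractiveness: pheromone trail times efficiency.
                scores = [pheromone[i] * p[i] / w[i] for i in candidates]
                i = rng.choices(candidates, weights=scores)[0]
                x[i] = 1
                remaining -= w[i]
                candidates = [j for j in candidates
                              if j != i and w[j] <= remaining]
            value = sum(xi * pi for xi, pi in zip(x, p))
            if value > best_value:
                best_x, best_value = x, value
        # Evaporation, then deposit on the objects of the best content so far.
        pheromone = [(1 - evaporation) * t for t in pheromone]
        for i in range(n):
            if best_x[i]:
                pheromone[i] += 1.0

    return best_x, best_value


print(ant_colony_knapsack([10, 7, 4], [5, 4, 3], 8))
```

Each ant only ever adds objects that still fit, so every content built is feasible by construction.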

Exact solution

The knapsack problem, in its classic version, has been studied in depth. Many methods therefore exist today to solve it. Most of them are improved versions of one of the following methods.

Dynamic programming

See also: Dynamic programming

The knapsack problem has the optimal substructure property, i.e. the optimal solution of the problem with i variables can be built from the problem with i-1 variables. This property makes it possible to solve it by dynamic programming.

Let KP(i, c) denote the problem reduced to i variables and capacity c. The idea is as follows:

Given a variable i and a capacity c, the optimal solutions of KP(i, c) are either:

  • optimal solutions of the problem with i-1 variables and the same capacity c (KP(i-1, c)), to which x_i=0 is added;
  • optimal solutions of the problem with i-1 variables and capacity c - w_i (KP(i-1, c - w_i)), to which x_i=1 is added.

The knapsack problem with zero variables (KP(0, *)) has an optimal solution of value zero.

A table T is then built, containing the value of the optimal solutions of every problem KP(i, c), as follows:

    for c from 0 to W do
      T[0,c] := 0
    end for
    for i from 1 to n do
      for c from 0 to W do
        if c >= w[i] then
          T[i,c] := max(T[i-1,c], T[i-1,c-w[i]] + p[i])
        else
          T[i,c] := T[i-1,c]
        end if
      end for
    end for

Once the table is built, it suffices to start from cell T[n,W] and deduce the state of the objects by moving back up to a cell T[0,*].
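
The construction of the table and the walk back through it can be sketched in Python. The sketch assumes non-negative integer weights and an integer capacity; the function name is chosen for the example, and Python lists are indexed from 0, so object i of the text is stored at position i-1.

```python
from typing import List, Sequence, Tuple


def knapsack_dp(
    p: Sequence[int], w: Sequence[int], capacity: int
) -> Tuple[int, List[int]]:
    n = len(p)
    # T[i][c] = optimal value of KP(i, c); row 0 is the problem with
    # zero variables, whose optimal value is 0 for every capacity.
    T = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, pi = w[i - 1], p[i - 1]
        for c in range(capacity + 1):
            T[i][c] = T[i - 1][c]  # case x_i = 0
            if c >= wi:            # case x_i = 1, when the object fits
                T[i][c] = max(T[i][c], T[i - 1][c - wi] + pi)

    # Walk back from T[n][W] to row 0 to recover the x_i.
    x, c = [0] * n, capacity
    for i in range(n, 0, -1):
        if T[i][c] != T[i - 1][c]:  # the value changed: object i was taken
            x[i - 1] = 1
            c -= w[i - 1]
    return T[n][capacity], x


print(knapsack_dp([10, 7, 4], [5, 4, 3], 8))  # prints: (14, [1, 0, 1])
```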

This algorithm has time and space complexity O(nW). However, memory consumption can be reduced to O(n + W), and even to O(W) if only the value of the optimal solution is needed. It has two advantages:

  1. speed;
  2. no need to sort the variables.
and one disadvantage:
  1. high memory consumption (hence unsuitable for large problems).

This approach is due to Garfinkel and Nemhauser (1972).
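
The reduction to O(W) memory, when only the optimal value is wanted, can be sketched by keeping a single row of the table and scanning the capacities in decreasing order so that each object is counted at most once. The function name is an assumption; integer weights and capacity are assumed.

```python
from typing import Sequence


def knapsack_value(p: Sequence[int], w: Sequence[int], capacity: int) -> int:
    """Optimal value only, using a single row of capacity + 1 cells."""
    best = [0] * (capacity + 1)  # best[c] = optimal value within capacity c
    for pi, wi in zip(p, w):
        # Decreasing c: best[c - wi] still refers to the previous object row.
        for c in range(capacity, wi - 1, -1):
            best[c] = max(best[c], best[c - wi] + pi)
    return best[capacity]


print(knapsack_value([10, 7, 4], [5, 4, 3], 8))  # prints: 14
```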

Branch and bound

Like any combinatorial problem, the knapsack problem can be solved using a branch and bound procedure (in French, procédure par séparation et évaluation, PSE). The evaluation function of a node often consists in solving the problem in continuous variables, that is, its linear relaxation. The implementation proposed by Martello and Toth (1990) has become a reference. It is characterized by:
  • an improved evaluation of the nodes;
  • a local search when the last variable added to the knapsack led to a failure;
  • the considerable intellectual effort required to understand its source code.

The advantage of this method is its low memory consumption.
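
A basic depth-first branch and bound, with nodes evaluated by the continuous relaxation, can be sketched in Python. This is a plain textbook-style illustration and not the Martello and Toth implementation; the function names are assumptions, and strictly positive weights are assumed.

```python
from typing import List, Sequence, Tuple


def branch_and_bound(
    p: Sequence[int], w: Sequence[int], capacity: int
) -> Tuple[int, List[int]]:
    n = len(p)
    # Objects sorted by decreasing efficiency, as the relaxation requires.
    order = sorted(range(n), key=lambda i: p[i] / w[i], reverse=True)
    ps = [p[i] for i in order]
    ws = [w[i] for i in order]

    def upper_bound(k: int, value: float, remaining: float) -> float:
        """Continuous relaxation on objects k..n-1: fill greedily and
        take a fraction of the first object that does not fit."""
        for j in range(k, n):
            if ws[j] <= remaining:
                remaining -= ws[j]
                value += ps[j]
            else:
                return value + ps[j] * remaining / ws[j]
        return value

    best_value = 0
    best_taken: List[int] = []

    def explore(k: int, value: int, remaining: int, taken: List[int]) -> None:
        nonlocal best_value, best_taken
        if value > best_value:
            best_value, best_taken = value, taken[:]
        # Prune when the node cannot strictly improve on the incumbent.
        if k == n or upper_bound(k, value, remaining) <= best_value:
            return
        if ws[k] <= remaining:  # branch x_k = 1
            taken.append(order[k])
            explore(k + 1, value + ps[k], remaining - ws[k], taken)
            taken.pop()
        explore(k + 1, value, remaining, taken)  # branch x_k = 0

    explore(0, 0, capacity, [])
    x = [0] * n
    for i in best_taken:
        x[i] = 1
    return best_value, x


print(branch_and_bound([10, 7, 4], [5, 4, 3], 8))  # prints: (14, [1, 0, 1])
```

Only the current path and the incumbent are stored, which reflects the low memory consumption mentioned above.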

Hybrid approaches

The hybrid approach is not really a new solution method. It simply consists in combining the two preceding methods in order to draw on the advantages of both. Typically, a branch and bound procedure (PSE) is applied down to a search depth at which the subproblem is judged small enough to be solved by dynamic programming.

The precursors of this approach are Plateau and Elkihel (1985), followed by Martello and Toth (1990).
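
The hybrid idea can be illustrated by a deliberately simplified Python sketch, which is not the method of the authors cited. The first `depth` objects are branched on by plain enumeration with a feasibility check only (a real implementation would also bound the nodes), and the remaining objects are handled by dynamic programming, computed once for every residual capacity. The function name and the `depth` parameter are assumptions; integer weights and capacity are assumed.

```python
from typing import Sequence


def hybrid_knapsack(
    p: Sequence[int], w: Sequence[int], capacity: int, depth: int = 2
) -> int:
    n = len(p)
    depth = min(depth, n)

    # Dynamic programming on the objects depth..n-1:
    # tail[c] = best value obtainable from them within capacity c.
    tail = [0] * (capacity + 1)
    for pi, wi in zip(p[depth:], w[depth:]):
        for c in range(capacity, wi - 1, -1):
            tail[c] = max(tail[c], tail[c - wi] + pi)

    best = 0

    def explore(k: int, value: int, remaining: int) -> None:
        nonlocal best
        if k == depth:
            # The subproblem is handed over to the dynamic programming table.
            best = max(best, value + tail[remaining])
            return
        if w[k] <= remaining:                    # branch x_k = 1
            explore(k + 1, value + p[k], remaining - w[k])
        explore(k + 1, value, remaining)         # branch x_k = 0

    explore(0, 0, capacity)
    return best


print(hybrid_knapsack([10, 7, 4], [5, 4, 3], 8, depth=1))  # prints: 14
```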
