Optimisation for Artificial Intelligence, a 4-day course
« Previous | Home ↑ | Next »
This short interlude will bring notions of complexity, in order to explain why and how some problems are really more difficult to address than others. There is a fair share of theoritical content in this domain, and the point here is not to explain the proofs to all claims.
Here, there is no analytical solution to the problem, which means the naive approach would enumerate all possibilities. But we really hope we can do better because the search space is huge. Besides, finding a solution, even suboptimal, can be complex.
We usually consider as correct all computer programs or algorithms (like the simplex, or a Branch & Bound) which address a mathematical problem and return a correct solution. Implicitely, we expect that such a program does return a solution for any instance, and that it handles situations when there is no solution as well.
However, when using a computer, we are usually limited by two parameters:
Before we go further, let’s recall the city placement problem in a discrete version. We could say that since we are going to print out our map, we can consider a 1000 × 1000 pixel grid and determine which pixel would match each city best: that would be 1’000’000 possibilities per city, let’s write it $10^6$.
If we have 2 or 3 cities, we can consider evaluating all possibilities to find the optimum. But for a problem with 36 cities, that is a total of about $10^{6×36}=10^{216}$ combinations to evaluate before finding the minimum. Even with a computer evaluating each combination in one nanosecond (which is actually not reasonable today), we would still need around $10^{207}$ seconds to find the optimum.
The universe is only 14 billion years old and we expect a solution within a lifetime… or actually, more within few seconds for tasks looking simple.
So we need a theory to try to predict:
In practice, when talking about (space or) time complexity, we use an asymptotic notation with a big $O$ notation: $O(n)$ for linear problems, $O(n^2)$ for quadratic problem, $O(2^n)$ for exponential problems, etc.
We will also talk about:
In practice, when implementing a sorting algorithm, we still try to go for optimisations which would bring a better complexity in most cases. For example, the sorting algorithm in Python is called Tim sort, named after Tim Peters, for extra nice properties it ensures. But the most important is to ensure that even the worst case scenario is not out of control.
If the big $O$ notation of a problem remains polynomial, we keep the problem within the polynomial category, even with crazy degrees and constants of polynoms. We consider these problems as easy, tractable and we usually find a way to make them run in reasonable time.
Many problems are known to be polynomial:
Linear programming is a polynomial problem, but keep in mind the simplex algorithm is not: there are a particular category of LP problems where it becomes exponential; however, it remains polynomial in most situations. Other methods (like the interior-points) are however polynomial.
For some problem, the best known algorithm is exponential:
but we do not know whether these problems really are exponential.
NP problems are a particular class of problems:
So basically, all polynomial problems are NP problems.
There is a special subset of NP problems called NP-complete problems. These problems are “at least as difficult as any problem in NP”.
We can put in this category many well known problems, such as MILP and CP, just by transforming any instance of these problems in another NP-complete problem (we name this transformation a reduction)
« Previous | Home ↑ | Next »