
Monday, July 2, 2018

Growth of Functions

3 Growth of Functions
3.1 Asymptotic notation
3.2 Standard notations and common functions

3 Growth of Functions
The order of growth of the running time of an algorithm, defined in Chapter 2, gives a simple characterization of the algorithm's efficiency and also allows us to compare the relative performance of alternative algorithms.

When we look at input sizes large enough to make only the order of growth of the running time relevant, we are studying the asymptotic efficiency of algorithms. That is, we are concerned with how the running time of an algorithm increases with the size of the input in the limit, as the size of the input increases without bound. Usually, an algorithm that is asymptotically more efficient will be the best choice for all but very small inputs.

This chapter gives several standard methods for simplifying the asymptotic analysis of algorithms. The next section begins by defining several types of "asymptotic notation," of which we have already seen an example in Θ-notation. We then present several notational conventions used throughout this book, and finally we review the behavior of functions that commonly arise in the analysis of algorithms.

3.1 Asymptotic notation
The notations we use to describe the asymptotic running time of an algorithm are defined in terms of functions whose domains are the set of natural numbers N = {0, 1, 2, ... }. Such notations are convenient for describing the worst-case running-time function T(n), which usually is defined only on integer input sizes.

This section defines the basic asymptotic notations and also introduces some common abuses.

Asymptotic notation, functions, and running times
We will use asymptotic notation primarily to describe the running times of algorithms, as when we wrote that insertion sort's worst-case running time is Θ(n^2). Asymptotic notation actually applies to functions, however. Recall that we characterized insertion sort's worst-case running time as an^2 + bn + c, for some constants a, b, and c. By writing that insertion sort's running time is Θ(n^2), we abstracted away some details of this function. Because asymptotic notation applies to functions, what we were writing as Θ(n^2) was the function an^2 + bn + c, which in that case happened to characterize the worst-case running time of insertion sort.

In this book, the functions to which we apply asymptotic notation will usually characterize the running times of algorithms. But asymptotic notation can apply to functions that characterize some other aspect of algorithms (the amount of space they use, for example), or even to functions that have nothing whatsoever to do with algorithms.

Even when we use asymptotic notation to apply to the running time of an algorithm, we need to understand which running time we mean. Sometimes we are interested in the worst-case running time. Often, however, we wish to characterize the running time no matter what the input. In other words, we often wish to make a blanket statement that covers all inputs, not just the worst case. We shall see asymptotic notations that are well suited to characterizing running times no matter what the input.

Θ-notation
In Chapter 2, we found that the worst-case running time of insertion sort is T(n) = Θ(n^2). Let us define what this notation means. For a given function g(n), we denote by Θ(g(n)) the set of functions

Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 <= c1g(n) <= f(n) <= c2g(n) for all n >= n0 }.

Because Θ(g(n)) is a set, we could write "f(n) in Θ(g(n))". Instead, we will usually write "f(n) = Θ(g(n))" to express the same notion. You might be confused because we abuse equality in this way, but we shall see later in this section that doing so has its advantages.

Figure 3.1(a) gives an intuitive picture of functions f(n) and g(n), where f(n) = Θ(g(n)). For all values of n at and to the right of n0, the value of f(n) lies at or above c1g(n) and at or below c2g(n). In other words, for all n >= n0, the function f(n) is equal to g(n) to within a constant factor. We say that g(n) is an asymptotically tight bound for f(n).

The definition of Θ(g(n)) requires that every member f(n) in Θ(g(n)) be asymptotically nonnegative, that is, that f(n) be nonnegative whenever n is sufficiently large. (An asymptotically positive function is one that is positive for all sufficiently large n.) Consequently, the function g(n) itself must be asymptotically nonnegative, or else the set Θ(g(n)) is empty. We shall therefore assume that every function used within Θ-notation is asymptotically nonnegative. This assumption holds for the other asymptotic notations defined in this chapter as well.

-- Intuitively, the lower-order terms of an asymptotically positive function can be ignored in determining asymptotically tight bounds because they are insignificant for large n. When n is large, even a tiny fraction of the highest-order term suffices to dominate the lower-order terms. Thus, setting c1 to a value that is slightly smaller than the coefficient of the highest-order term and setting c2 to a value that is slightly larger permits the inequalities in the definition of Θ-notation to be satisfied. The coefficient of the highest-order term can likewise be ignored, since it only changes c1 and c2 by a constant factor equal to the coefficient. -- We shall often use the notation Θ(1) to mean either a constant or a constant function with respect to some variable.
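
As a rough numerical sanity check (not a proof), the Python sketch below picks the illustrative function f(n) = 3n^2 - 10n + 7 and the looser constants c1 = a/2 and c2 = 2a, then scans for a cutoff n0 past which the sandwich holds; the specific function, constants, and scanning window are assumptions chosen only for illustration.

# Sketch: numerically find a candidate n0 with c1*n^2 <= f(n) <= c2*n^2 for n >= n0.
a, b, c = 3, -10, 7                 # illustrative coefficients, a > 0

def f(n):
    return a * n**2 + b * n + c

c1, c2 = a / 2, 2 * a               # constants below/above the leading coefficient

n0 = next(n for n in range(1, 10_000)
          if all(c1 * m**2 <= f(m) <= c2 * m**2 for m in range(n, n + 100)))
print("candidate n0:", n0)          # prints 6 for this choice of constants
for n in (n0, 10 * n0, 1000 * n0):
    assert c1 * n**2 <= f(n) <= c2 * n**2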

O-notation
The Θ-notation asymptotically bounds a function from above and below. When we have only an asymptotic upper bound, we use O-notation. For a given function g(n), we denote by O(g(n)) (pronounced "big-oh of g of n" or sometimes just "oh of g of n") the set of functions

O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= f(n) <= cg(n) for all n >= n0 }.

We use O-notation to give an upper bound on a function to within a constant factor. Figure 3.1(b) shows the intuition behind O-notation. For all values n at and to the right of n0, the value of function f(n) is on or below cg(n).

We write f(n) = O(g(n)) to indicate that a function f(n) is a member of the set O(g(n)). Note that f(n) = Θ(g(n)) implies f(n) = O(g(n)), since Θ-notation is a stronger notion than O-notation. Written set-theoretically, we have Θ(g(n)) ⊆ O(g(n)).

If you have seen O-notation before, you might find it strange that we should write, for example, n = O(n^2). In the literature, we sometimes find O-notation informally describing asymptotically tight bounds, that is, what we have defined using Θ-notation. In this book, however, when we write f(n) = O(g(n)), we are merely claiming that some constant multiple of g(n) is an asymptotic upper bound on f(n), with no claim about how tight an upper bound it is. Distinguishing asymptotic upper bounds from asymptotically tight bounds is standard in the algorithms literature.

Using O-notation, we can often describe the running time of an algorithm merely by inspecting the algorithm's overall structure.

Since O-notation describes an upper bound, when we use it to bound the worst-case running time of an algorithm, we have a bound on the running time of the algorithm on every input - the blanket statement we discussed earlier. Thus the O(n^2) bound on the worst-case running time of insertion sort also applies to its running time on every input. The Θ(n^2) bound on the worst-case running time of insertion sort, however, does not imply a Θ(n^2) bound on the running time of insertion sort on every input. For example, we saw in Chapter 2 that when the input is already sorted, insertion sort runs in Θ(n) time.

Technically, it is an abuse to say that the running time of insertion sort is O(n^2), since for a given n, the actual running time varies, depending on the particular input of size n. When we say "the running time is O(n^2)," we mean that there is a function f(n) that is O(n^2) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value f(n). Equivalently, we mean that the worst-case running time is O(n^2).

Ω-notation
Just as O-notation provides an asymptotic upper bound on a function, Ω-notation provides an asymptotic lower bound. For a given function g(n), we denote by Ω(g(n)) (pronounced "big-omega of g of n" or sometimes just "omega of g of n") the set of functions

Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= cg(n) <= f(n) for all n >= n0 }.

Figure 3.1(c) shows the intuition behind Ω-notation. For all values n at or to the right of n0, the value of f(n) is on or above cg(n).

From the definitions of the asymptotic notations we have seen thus far, it is easy to prove the following important theorem (see Exercise 3.1-5).

Theorem 3.1
For any two functions f(n) and g(n), we have f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).

As an example of the application of this theorem, our proof that an^2 + bn + c = Θ(n^2) for any constants a, b, and c, where a > 0, immediately implies that an^2 + bn + c = Ω(n^2) and an^2 + bn + c = O(n^2). In practice, rather than using Theorem 3.1 to obtain asymptotic upper and lower bounds from asymptotically tight bounds, as we did for this example, we usually use it to prove asymptotically tight bounds from asymptotic upper and lower bounds.

When we say that the running time (no modifier) of an algorithm is Ω(g(n)), we mean that no matter what particular input of size n is chosen for each value of n, the running time on that input is at least a constant times g(n), for sufficiently large n. Equivalently, we are giving a lower bound on the best-case running time of an algorithm. For example, the best-case running time of insertion sort is Ω(n), which implies that the running time of insertion sort is Ω(n).

The running time of insertion sort therefore belongs to both Ω(n) and O(n^2), since it falls anywhere between a linear function of n and a quadratic function of n. Moreover, these bounds are asymptotically as tight as possible: for instance, the running time of insertion sort is not Ω(n^2), since there exists an input for which insertion sort runs in Θ(n) time (e.g., when the input is already sorted). It is not contradictory, however, to say that the worst-case running time of insertion sort is Ω(n^2), since there exists an input that causes the algorithm to take Ω(n^2) time.

Asymptotic notation in equations and inequalities
We have already seen how asymptotic notation can be used within mathematical formulas. For example, in introducing O-notation, we wrote "n = O(n^2)." We might also write 2n^2 + 3n + 1 = 2n^2 + Θ(n). How do we interpret such formulas?

When the asymptotic notation stands alone (that is, not within a larger formula) on the right-hand side of an equation (or inequality), as in n = O(n^2), we have already defined the equal sign to mean set membership: n in O(n^2). In general, however, when asymptotic notation appears in a formula, we interpret it as standing for some anonymous function that we do not care to name. For example, the formula 2n^2 + 3n + 1 = 2n^2 + Θ(n) means that 2n^2 + 3n + 1 = 2n^2 + f(n), where f(n) is some function in the set Θ(n).

Using asymptotic notation in this manner can help eliminate inessential detail and clutter in an equation.

In some cases, asymptotic notation appears on the left-hand side of an equation, as in

2n^2 + Θ(n) = Θ(n^2).

We interpret such equations using the following rule: No matter how the anonymous functions are chosen on the left of the equal sign, there is a way to choose the anonymous functions on the right of the equal sign to make the equation valid. Thus, our example means that for any function f(n) in Θ(n), there is some function g(n) in Θ(n^2) such that 2n^2 + f(n) = g(n) for all n. In other words, the right-hand side of an equation provides a coarser level of detail than the left-hand side.

We can chain together a number of such relationships, as in

2n^2 + 3n + 1 = 2n^2 + Θ(n)
              = Θ(n^2).

We can interpret each equation separately by the rules above.

o-notation
The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound 2n^2 = O(n^2) is asymptotically tight, but the bound 2n = O(n^2) is not. We use o-notation to denote an upper bound that is not asymptotically tight. We formally define o(g(n)) ("little-oh of g of n") as the set

o(g(n)) = { f(n) : for any positive constant c > 0, there exists a constant n0 > 0 such that 0 <= f(n) < cg(n) for all n >= n0 }.

For example, 2n = o(n^2), but 2n^2 != o(n^2).

The definitions of O-notation and o-notation are similar. The main difference is that in f(n) = O(g(n)), the bound 0 <= f(n) <= cg(n) holds for some constant c > 0, but in f(n) = o(g(n)), the bound 0 <= f(n) < cg(n) holds for all constants c > 0. Intuitively, in o-notation the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is,

lim_{n→∞} f(n)/g(n) = 0.

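As a quick numerical illustration (not part of the text), the Python sketch below prints the ratios 2n/n^2 and 2n^2/n^2 for growing n; the particular sample values of n are arbitrary.

# Sketch: the ratio f(n)/g(n) shrinks toward 0 exactly when f(n) = o(g(n)).
for n in (10, 10**3, 10**6, 10**9):
    print(n, (2 * n) / n**2, (2 * n**2) / n**2)
# First ratio tends to 0 (so 2n = o(n^2)); second stays at 2 (so 2n^2 != o(n^2)).
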
ω-notation
By analogy, ω-notation is to Ω-notation as o-notation is to O-notation. We use ω-notation to denote a lower bound that is not asymptotically tight. The relation f(n) = ω(g(n)) implies that

lim_{n→∞} f(n)/g(n) = ∞,

if the limit exists.

3.2 Standard notations and common functions
This section reviews some standard mathematical functions and notations and explores the relationships among them. It also illustrates the use of the asymptotic notations.

Monotonicity
A function f(n) is monotonically increasing if  m <= n implies f(m) <= f(n). Similarly, it is monotonically decreasing if m <= n  implies f(m) >= f(n). A function f(n) is strictly increasing if m < n implies f(m) < f(n) and strictly decreasing if m < n implies f(m) > f(n).

Floors and ceilings 
For any real number x, we denote the greatest integer less than or equal to x by ⎣x⎦ (read "the floor of x") and the least integer greater than or equal to x by ⎡x⎤ (read "the ceiling of x"). For all real x,

x - 1 < ⎣x⎦ <= x <= ⎡x⎤ < x + 1.

For any integer n,

⎡n/2⎤ +  ⎣n/2⎦ = n,

and for any real number x >= 0 and integers a, b > 0,

⎡⎡x/a⎤ / b⎤ = ⎡x/(ab)⎤,
⎣⎣x/a⎦ / b⎦ = ⎣x/(ab)⎦,
⎡a/b⎤ <= (a + (b - 1)) / b,
⎣a/b⎦ >= (a - (b - 1)) / b.

The floor function f(x) = ⎣x⎦ is monotonically increasing, as is the ceiling function f(x) = ⎡x⎤.
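
For readers who want to experiment, here is a small Python spot-check of the identities above; the use of exact rationals (fractions.Fraction) and random test values is an implementation choice for illustration, not something from the text.

import math
import random
from fractions import Fraction

# Sketch: spot-check the floor/ceiling identities with exact rationals.
for _ in range(1000):
    n = random.randint(-100, 100)
    assert math.ceil(n / 2) + math.floor(n / 2) == n
    x = Fraction(random.randint(0, 10**6), random.randint(1, 100))   # real x >= 0
    a, b = random.randint(1, 20), random.randint(1, 20)
    assert math.ceil(math.ceil(x / a) / b) == math.ceil(x / (a * b))
    assert math.floor(math.floor(x / a) / b) == math.floor(x / (a * b))
    assert math.ceil(Fraction(a, b)) <= (a + (b - 1)) / b
    assert math.floor(Fraction(a, b)) >= (a - (b - 1)) / b
print("all floor/ceiling identities held")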

Modular arithmetic
For any integer a and any positive integer n, the value a mod n is the remainder (or residue) of the quotient a/n:

a mod n = a - n⎣a/n⎦.

It follows that

0 <= a mod n < n.

If (a mod n) = (b mod n), we write a ≡ b (mod n) and say that a is equivalent to b, modulo n. In other words, a ≡ b (mod n) if a and b have the same remainder when divided by n. Equivalently, a ≡ b (mod n) if and only if n is a divisor of b - a.
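
As a minimal sketch (the concrete values of a, b, and n are arbitrary), Python's % operator can be used to check the definition and the equivalence criterion above.

import math

# Sketch: Python's % agrees with a mod n = a - n*floor(a/n) for n > 0,
# and a ≡ b (mod n) exactly when n divides b - a.
n = 7
for a in (-17, -1, 0, 5, 17, 100):
    assert a % n == a - n * math.floor(a / n)
    assert 0 <= a % n < n
a, b = 23, 2
assert a % n == b % n and (b - a) % n == 0      # 23 ≡ 2 (mod 7)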

Polynomials
Given a nonnegative integer d, a polynomial in n of degree d is a function p(n) of the form

p(n) = a_d n^d + a_(d-1) n^(d-1) + ... + a_1 n + a_0,

where the constants a_0, a_1, ..., a_d are the coefficients of the polynomial and a_d != 0. A polynomial is asymptotically positive if and only if a_d > 0. For an asymptotically positive polynomial p(n) of degree d, we have p(n) = Θ(n^d). For any real constant a >= 0, the function n^a is monotonically increasing, and for any real constant a <= 0, the function n^a is monotonically decreasing. We say that a function f(n) is polynomially bounded if f(n) = O(n^k) for some constant k.
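
For a concrete illustration, the hypothetical polynomial p(n) = 4n^3 - 2n^2 + n - 5 below has degree d = 3; the ratio p(n)/n^3 settling near the leading coefficient 4 is consistent with p(n) = Θ(n^3).

# Sketch: the ratio p(n)/n^d approaches the leading coefficient a_d.
def p(n):
    return 4 * n**3 - 2 * n**2 + n - 5

for n in (10, 100, 10_000, 1_000_000):
    print(n, p(n) / n**3)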

Exponentials
... Thus, any exponential function with a base strictly greater than 1 grows faster than any polynomial function.
...
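
A quick numerical sketch of this claim (the choices b = 10 and a = 1.1 are arbitrary): the ratio n^b / a^n eventually collapses toward 0.

# Sketch: n^b / a^n tends to 0 for any base a > 1 and any constant b.
b, a = 10, 1.1
for n in (10, 100, 1_000, 2_000):
    print(n, n**b / a**n)
# The ratio grows at first but eventually collapses toward 0.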

Logarithms
We shall use the following notations:

lg(n) = log_2(n) (binary logarithm)
ln(n) = log_e(n) (natural logarithm)
lg^k(n) = (lg(n))^k (exponentiation)
lg lg(n) = lg(lg(n)) (composition)

.. so that lg n + k will mean (lg n) + k and not lg(n + k).

.. Computer scientists find 2 to be the most natural base for logarithms because so many algorithms and data structures involve splitting a problem into two parts.

We say that a function f(n) is polylogarithmically bounded if f(n) = O(lg^k(n)) for some constant k. We can relate the growth of polynomials and polylogarithms by substituting lg n for n and 2^a for a in equation (3.10), yielding

lim_{n→∞} lg^b(n) / (2^a)^(lg n) = lim_{n→∞} lg^b(n) / n^a = 0.

From this limit, we can conclude that

lg^b(n) = o(n^a)

for any constant a > 0. Thus, any positive polynomial function grows faster than any polylogarithmic function.

Factorials
The notation n! (read "n factorial") is defined for integers n >= 0 as

n! = 1                if n = 0,
n! = n * (n - 1)!     if n > 0.

Thus, n! = 1 * 2 * 3 * ... * n.

A weak upper bound on the factorial function is n! <= n^n, since each of the n terms in the factorial product is at most n. Stirling's approximation,

n! = root(2πn) (n/e)^n (1 + Θ(1/n)),

where e is the base of the natural logarithm, gives us a tighter upper bound, and a lower bound as well. As Exercise 3.2-3 asks you to prove,

n! = o(n^n),
n! = ω(2^n),
lg(n!) = Θ(n lg n),

where Stirling's approximation is helpful in proving equation (3.19).
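
As an informal check of Stirling's approximation (not a proof), the Python sketch below shows the ratio of n! to the leading term root(2πn)(n/e)^n approaching 1, consistent with the 1 + Θ(1/n) factor; the sample values of n are arbitrary.

import math

# Sketch: n! divided by sqrt(2*pi*n)*(n/e)^n approaches 1 as n grows.
for n in (1, 5, 10, 50, 100):
    leading_term = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
    print(n, math.factorial(n) / leading_term)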

Functional iteration
We use the notation f^(i)(n) to denote the function f(n) iteratively applied i times to an initial value of n. Formally, let f(n) be a function over the reals. For nonnegative integers i, we recursively define

f^(i)(n) = n                 if i = 0,
f^(i)(n) = f(f^(i-1)(n))     if i > 0.

The iterated logarithm function
We use the notation lg^(*)n (read "log star of n") to denote the iterated logarithm, defined as follows. Let lg^(i)n be as defined above, with f(n) = lg n. Because the logarithm of a nonpositive number is undefined, lg^(i)n is defined only if lg^(i-1)n > 0. Be sure to distinguish lg^(i)n (the logarithm function applied i times in succession, starting with argument n) from lg^i n (the logarithm of n raised to the ith power). Then we define the iterated logarithm function as

lg^(*)n = min { i >= 0 : lg^(i)n <= 1 }.
The iterated logarithm is a very slowly growing function:

lg^* 2 = 1,
lg^* 4 = 2,
lg^* 16 = 3,
lg^* 65536 = 4,
lg^* (2^65536) = 5.

Since the number of atoms in the observable universe is estimated to be about 10^80, which is much less than 2^65536, we rarely encounter an input size n such that lg^(*)n > 5.
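
A minimal Python sketch of lg^(*), computed directly from the definition above (the function name lg_star is my own):

import math

# Sketch: lg*(n) = min { i >= 0 : lg^(i)(n) <= 1 }, computed by repeated lg.
def lg_star(n):
    i = 0
    while n > 1:
        n = math.log2(n)        # math.log2 also accepts very large Python ints
        i += 1
    return i

print([lg_star(n) for n in (2, 4, 16, 65536, 2**65536)])   # [1, 2, 3, 4, 5]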

Fibonacci numbers
We define the Fibonacci numbers by the following recurrence:

F_0 = 0,
F_1 = 1,
F_i = F_(i-1) + F_(i-2) for i >= 2.

Thus, each Fibonacci number is the sum of the two previous ones, yielding the sequence

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ....

Fibonacci numbers are related to the golden ratio Φ and to its conjugate hat(Φ), which are the two roots of the equation

x^2 = x + 1

and are given by the following formulas (see Exercise 3.2-6):

Φ = (1 + root(5)) / 2 = 1.61803...
hat(Φ) = (1 - root(5)) / 2 = -0.61803...

Specifically, we have

F_i = (Φ^i - hat(Φ)^i) / root(5),

which we can prove by induction (Exercise 3.2-7). Since |hat(Φ)| < 1, we have

|hat(Φ)^i| / root(5) < 1/root(5) < 1/2,

which implies that

F_i = floor(Φ^i/ root(5) + 1/2),

which is to say that the ith Fibonacci number F_i is equal to Φ^i/root(5) rounded to the nearest integer. Thus, Fibonacci numbers grow exponentially.
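
As a small sanity check (not from the text), the Python sketch below compares the recurrence with the rounded closed form; the helper name fib and the sample range are arbitrary choices.

import math

# Sketch: the recurrence and the closed form agree, and F_i is Φ^i / root(5)
# rounded to the nearest integer.
phi = (1 + math.sqrt(5)) / 2

def fib(i):
    a, b = 0, 1                 # F_0, F_1
    for _ in range(i):
        a, b = b, a + b
    return a

for i in range(20):
    assert fib(i) == math.floor(phi**i / math.sqrt(5) + 1 / 2)
print([fib(i) for i in range(11)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]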
