
# A course in probability theory chung pdf

Thursday, May 16, 2019

A Course in Probability Theory, by Kai Lai Chung. Kai Lai Chung is a Professor Emeritus at Stanford University. Professor Chung has made important contributions to probability theory.

A Course in Probability Theory, Third Edition. Kai Lai Chung, Stanford University. Academic Press, A Harcourt Science and Technology Company.


Quick testing grounds are provided, but for major battlefields one must await Chapters 7 and 8. Chapter 7 initiates what has been called the "central problem" of classical probability theory. Time has marched on and the center of the stage has shifted, but this topic remains without doubt a crowning achievement. The random walks in Chapter 8 illustrate the way probability theory transforms other parts of mathematics. It does so by introducing the trajectories of a process, thereby turning what was static into a dynamic structure.

The same revolution is now going on in potential theory by the injection of the theory of Markov processes. In Chapter 9 we return to fundamentals and strike out in major new directions.

While Markov processes can be barely introduced in the limited space, martingales have become an indispensable tool for any serious study of contemporary work and are discussed here at length.

The fact that these topics are placed at the end rather than the beginning of the book, where they might very well be, testifies to my belief that the student of mathematics is better advised to learn something old before plunging into the new.

A short course may be built around Chapters 2, 3, 4, selections from Chapters 5, 6, and the first one or two sections of Chapter 9. For a richer fare, substantial portions of the last three chapters should be given without skipping any one of them. In a class with solid background, Chapters 1, 2, and 4 need not be covered in detail. At the opposite end, Chapter 2 may be filled in with proofs that are readily available in standard texts.

It is my hope that this book may also be useful to mature mathematicians as a gentle but not so meager introduction to genuine probability theory. Often they stop just before things become interesting! Such a reader may begin with Chapter 3, go at once to Chapter 5 with a few glances at Chapter 4, skim through Chapter 6, and take up the remaining chapters seriously to get a real feeling for the subject.

Several cases of exclusion and inclusion merit special comment. I chose to construct only a sequence of independent random variables in Section 3. I chose to postpone a discussion of conditioning until quite late, in order to follow it up at once with varied and worthwhile applications. With a little reshuffling Section 9.

I chose not to include a fuller treatment of infinitely divisible laws, for two reasons: I took pains to spell out a peripheral discussion of the logarithm of characteristic function to combat the errors committed on this score by numerous existing books.

## Chung, K. L., A Course in Probability Theory (3rd ed., Academic Press, 2001)

Finally, and this is mentioned here only in response to a query by Doob, I chose to present the brutal Theorem 5. There are perhaps some new things in this book, but in general I have not striven to appear original or merely different, having at heart the interests of the novice rather than the connoisseur.

In my opinion the slightly decadent fashion of conciseness has been overwrought, particularly in the writing of textbooks. The only valid argument I have heard for an excessively terse style is that it may encourage the reader to think for himself. Such an effect can be achieved equally well, for anyone who wishes it, by simply omitting every other sentence in the unabridged version.

This book contains about exercises consisting mostly of special cases and examples, second thoughts and alternative arguments, natural extensions, and some novel departures.


With a few obvious exceptions they are neither profound nor trivial, and hints and comments are appended to many of them. If they tend to be somewhat inbred, at least they are relevant to the text and should help in its digestion. Some of these are needed in the book, but in any case the reader's study of the text will be more complete after he has tried at least those problems.

Over a span of nearly twenty years I have taught a course at approximately the level of this book a number of times. The penultimate draft of the manuscript was tried out in a class given at Stanford University. Because of an anachronism that allowed only two quarters to the course (as if probability could also blossom faster in the California climate!), I had to omit the second halves of Chapters 8 and 9 but otherwise kept fairly closely to the text as presented here.

The second half of Chapter 9 was covered in a subsequent course called "stochastic processes." Among those in the class who cooperated in this manner and who corrected mistakes and suggested improvements are: Jack E. Clark, B. Curtis Eaves, Susan D. Horn, Alan T. Huckleberry, Thomas M. Liggett, and Roy E. Welsch, to whom I owe sincere thanks. The manuscript was also read by J. Doob and Benton Jamison, both of whom contributed a great deal to the final revision. They have also used part of the manuscript in their classes.

Aside from these personal acknowledgments, the book owes of course to a large number of authors of original papers, treatises, and textbooks. I have restricted bibliographical references to the major sources while adding many more names among the exercises. Some oversight is perhaps inevitable; however, inconsequential or irrelevant "name-dropping" is deliberately avoided, with two or three exceptions which should prove the rule.

It is a pleasure to thank Rosemarie Stampfel and Gail Lemmond for their superb job in typing the manuscript.

This chapter serves as a convenient bridge from elementary analysis to probability theory, upon which the beginner may pause to review his mathematical background and test his mental agility. Some of the methods as well as results in this chapter are also useful in the theory of stochastic processes. In this book we shall follow the fashionable usage of the words "positive", "negative", "increasing", "decreasing" in their loose interpretation.

For example, "x is positive" means "$x \ge 0$". By a "function" we mean in this chapter a real finite-valued one unless otherwise specified. Thus for an increasing function $f$ and any two real numbers $x_1$ and $x_2$,

$$x_1 < x_2 \implies f(x_1) \le f(x_2). \tag{1}$$

We begin by reviewing some properties of such a function.

In general, we say that the function $f$ has a jump at $x$ iff the two limits in (2) both exist but are unequal. The value of $f$ at $x$ itself, viz. As a consequence of (i) and (ii), we have the next result. It is worthwhile to observe that points of jump may have a finite point of accumulation and that such a point of accumulation need not be a point of jump itself.

Thus, the set of points of jump is not necessarily a closed set.

Let $x_0$ be an arbitrary real number, and define a function $f$ as follows: Before we discuss the next example, let us introduce a notation that will be used throughout the book.

We shall call the function $\delta_t$ the point mass at $t$. Example 2. Thanks to the uniform convergence (why?) This proves that the function $f$ has jumps at all the rational points and nowhere else.


We now show that the condition of countability is indispensable. By "countable" we mean always "finite (possibly empty) or countably infinite". We shall prove this by a topological argument of some general applicability.

In Exercise 3 after this section another proof based on an equally useful counting argument will be indicated. Thus we may associate with the set of points of jump in the domain of f a certain collection of pairwise disjoint open intervals in the range of f. Now any such collection is necessarily a countable one, since each interval contains a rational number, so that the collection of intervals is in one-to-one correspondence with a certain subset of the rational numbers and the latter is countable.

Therefore the set of discontinuities is also countable, since it is in one-to-one correspondence with the set of intervals associated with it. Then $f_1$ and $f_2$ have the same points of jump of the same size, and they coincide except possibly at some of these points of jump. To see this, let $x$ be an arbitrary point and let t. Such sequences exist since $D$ is dense. The first assertion in (v) follows from this equation and (ii).

Furthermore if $f_1$ is continuous at $x$, then so is $f_2$ by what has just been proved, and we 1. How can $f_1$ and $f_2$ differ at all? It will turn out in Chapter 2, see in particular Exercise 21 of Sec. The third modification is found to be convenient in Fourier analysis, but either one of the first two is more suitable for probability theory.

We have a free choice between them and we shall choose the second, namely, right continuity. Let us recall that an arbitrary function $g$ is said to be right continuous at $x$ iff $\lim_{t \downarrow x} g(t)$ exists and equals $g(x)$. To prove the assertion (vi) we must show that $\forall x$: For then: It is easy to see that it is increasing if $f$ is so. Even if $f$ is defined in a larger domain, we may still speak of these properties "on $D$" by considering the "restriction of $f$ to $D$".

This is a generalization of (vi). Since $\epsilon$ is arbitrary, it follows that $\tilde{f}$ is right continuous at $x_0$, as was to be shown.

Spell this out. Suppose that $f$ is increasing and that there exist real numbers $A$ and $B$ such that $\forall x$: Hence prove (iv), first for bounded $f$ and then in general.

It should constitute the "jumping part" of F, and if it is subtracted out from F, the remainder should be positive, contain no more jumps, and so be continuous. These plausible statements will now be proved - they are easy enough but not really trivial.

Theorem 1. Let

$$F_d(x) = \sum_j b_j \delta_{a_j}(x), \qquad F_c(x) = F(x) - F_d(x);$$

then $F_c$ is positive, increasing, and continuous. F and so $F_c$ is indeed positive. Next, $F_d$ is right continuous since each $\delta_{a_j}$ is and the series defining $F_d$ converges uniformly in $x$; the same argument yields cf.

Example 2 of Sec. Now this evaluation holds also if $F_d$ is replaced by $F$ according to the definition of $a_j$ and $b_j$, hence we obtain for each $x$: This shows that $F_c$ is left continuous; since it is also right continuous, being the difference of two such functions, it is continuous. O in Theorem 1. We may now summarize the two theorems above as follows. Every d. Such a decomposition is unique. Prove that the sum I: Give another proof of the continuity of $F_c$ in Theorem 1.
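The decomposition just summarized can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical d.f. `F` whose jump points and jump sizes (`JUMPS`) are known in advance; the names `F_d` and `F_c` mirror the jumping part and the continuous remainder, and none of the numbers come from the text.

```python
from fractions import Fraction

# Hypothetical example: jump locations a_j and jump sizes b_j (assumed, not from the text).
JUMPS = {Fraction(1, 4): Fraction(3, 10), Fraction(3, 4): Fraction(1, 5)}


def uniform_cdf(x):
    """The d.f. of the uniform law on [0, 1]."""
    return min(max(Fraction(x), Fraction(0)), Fraction(1))


def F(x):
    """A right-continuous d.f. with jumps at 1/4 and 3/4 plus a continuous piece."""
    x = Fraction(x)
    jump_part = sum(b for a, b in JUMPS.items() if a <= x)
    return Fraction(1, 2) * uniform_cdf(x) + jump_part


def F_d(x):
    """Jumping part: sum of b_j * delta_{a_j}(x), where delta_a(x) = 1 iff x >= a."""
    x = Fraction(x)
    return sum((b for a, b in JUMPS.items() if a <= x), Fraction(0))


def F_c(x):
    """Remainder F - F_d, which should be increasing and continuous."""
    return F(x) - F_d(x)
```

In this example `F_c` works out to half the uniform d.f. on [0, 1], so subtracting the jumps leaves an increasing function with no jump at 1/4 or 3/4.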

A plausible verbal definition of a discrete d. But suppose that the set of points of jump is "discrete" in the Euclidean topology, then the definition is valid apart from our convention of right continuity. Exercise 2 in Sec. Define its purely discontinuous and continuous parts and prove the corresponding decomposition theorem.

A point x is said to belong to the support of the d. The set of all such x is called the support of F. Show that each point of jump belongs to the support, and that each isolated point of the support is a point of jump.

Give an example of a discrete d. Prove that the support of any d. Throughout the book this measure will be denoted by $m$; "almost everywhere" on the real line without qualification will refer to it and be abbreviated to "a.e."

The class of such functions will be denoted by $L^1(a, b)$, and $L^1(-\infty, \infty)$ is abbreviated to $L^1$. It follows from a well-known proposition see, e. In particular, if $F$ is a d. Conversely, given any $f$ in $L^1$ satisfying the conditions in (2), the function $F$ defined by (3) $\forall x$: A function $F$ is called singular iff it is not identically zero and $F'$ exists and equals zero a.e.

Then the following assertions are true. Any positive function $f$ that is equal to $F'$ a. $F_{ac}$ is called the absolutely continuous part, $F_s$ the singular part of $F$. Note that the previous $F_d$ is part of $F_s$ as defined here. It is clear that $F_{ac}$ is increasing and $F_{ac}$: Hence $F_s$ is also increasing and $F_s$ We are now in a position to announce the following result, which is a refinement of Theorem 1.

F can be written as the convex combination of a discrete, a singular continuous, and an absolutely continuous d. Prove Theorem 1. If the support of a d. The converse is false. Suppose that F is a d. Prove that a discrete distribution is singular. Exercise 13 of Sec. Prove that a singular function as defined here is Lebesgue measurable but need not be of bounded variation even locally.

Such a function is continuous except on a set of Lebesgue measure zero; use the completeness of the Lebesgue measure. For this purpose let us recall the construction of the Cantor ternary set see, e.

Let these removed ones, in order of position from left to right, be denoted by $J_{n,k}$. Now for each $n$ and $k$, This definition is consistent since two intervals, $J_{n,k}$ and $J_{n',k'}$. The value of $F$ is constant on each $J_{n,k}$ and is strictly greater on any other $J_{n',k'}$ situated to the right of $J_{n,k}$. This $F$ is a continuous d. Thus $F$ is singular. Alternatively, it is clear that none of the points in $D$ is in the support of $F$, hence the latter is contained in $C$ and of measure 0, so that $F$ is singular by Exercise 3 above, [In Exercise 13 of Sec.
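For readers who want to experiment with the Cantor d.f. described above, here is a minimal sketch that evaluates it from the ternary digits of $x$. The function name `cantor_cdf` and the truncation parameter `depth` are assumptions made for this illustration; the digit-reading rule (ternary digits 0 and 2 become binary digits 0 and 1, and the first digit 1 signals a removed middle third) is the standard description of the Cantor function rather than a quotation from the text.

```python
from fractions import Fraction


def cantor_cdf(x, depth=40):
    """Approximate the Cantor d.f. F at x by reading ternary digits of x.

    Each ternary digit 0 or 2 contributes a binary digit 0 or 1; the first
    ternary digit 1 means x lies in a removed interval, where F is constant.
    """
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value = Fraction(0)
    weight = Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:
            return value + weight   # inside (or at the edge of) a removed interval
        value += weight * (digit // 2)
        weight /= 2
    return value                # truncated after `depth` digits
```

For example, `cantor_cdf(Fraction(1, 3))` and `cantor_cdf(Fraction(1, 2))` both give 1/2, reflecting that the function is constant across the first removed interval.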

Prove that the support of $F$ is exactly $C$. It is well known that any point $x$ in $C$ has a ternary expansion without the digit 1: Calculate $\int_0^1 x\,dF(x)$, $\int_0^1 x^2\,dF(x)$, $\int_0^1 e^{itx}\,dF(x)$.

This can be done directly or by using Exercise 10; for a third method see Exercise 9 of Sec. Extend the function $F$ on [0, 1] trivially to $(-\infty, \infty)$. Thus we have a singular d. Consider $F$ on [0, 1]. Modify its inverse $F^{-1}$ suitably to make it single-valued in [0, 1]. Show that $F^{-1}$ so modified is a discrete d. Such a problem becomes easier when the corresponding measure is considered; see Sec.

The Cantor d. F is a good building block of "pathological" examples. For example, let H be the inverse of the homeomorphic map of [0,1] onto itself: Hence deduce: Some of the usual operations and relations between sets, together with the usual notation, are given below.

WEE, Empty set: A nonempty collection SIt of subsets of Q may have certain "closure properties". Let us list some of those used below; note that j is always an index for a countable set and that commas as well as semicolons are used to denote "conjunctions" of premises. Ej E Lid, E jEd, I Ej Ed. It follows from simple set algebra that under i: Also, ii implies iv and iii implies v by induction.

It is trivial that (viii) implies (ii) and (vi); (ix) implies (iii) and (vii). A nonempty collection $\mathscr{F}$ of subsets of $\Omega$ is called a field iff (i) and (ii) hold. It is called a monotone class (M.C.) iff (vi) and (vii) hold. It is called a Borel field (B.F.) iff (i) and (viii) hold. Theorem 2. A field is a B.F. if and only if it is also an M.C. The "only if" part is trivial; to prove the "if" part we show that (iv) and (vi) imply (viii). The collection $\mathscr{J}$ of all subsets of $\Omega$ is a B.F.

If A is any index set and if for every a E A,:: This minimal B. In particular if ;-? Let d1: Since a B. To prove: Hence by Theorem 2. We shall show that it is closed under intersection and complementation. The proof is complete. The theorem above is one of a type called monotone class theorems. They are among the most useful tools of measure theory, and serve to extend certain relations which are easily verified for a special class of sets or functions to a larger class.

Many versions of such theorems are known; see Exercises 10, 11, and 12 below. The best way to define the symmetric difference is through indicators of sets as follows: All properties of $\triangle$ follow easily from this definition, some of which are rather tedious to verify otherwise.

As examples: If $\Omega$ has exactly $n$ points, then $\mathscr{J}$ has $2^n$ members. The B. If $\Omega$ is countable, then $\mathscr{J}$ is generated by the singletons, and conversely. All countable subsets of $\Omega$ and their complements form a B.F. The intersection of any collection of B. The union of a countable collection of B. Fa, ex EA. Prove that if each? Let ,'7" be a B. Consider the class of all sets with the asserted property and show that it is a B. Let Q be a class of subsets of $\Omega$ having the closure property (iii); let. Then sf contains the B.
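Two of the statements above are easy to check by brute force on a small finite space: that the collection of all subsets of an $n$-point set has $2^n$ members, and that the singletons generate it. The sketch below is only an illustration; `power_set` and `generated_field` are hypothetical helper names, and on a finite space closing under complement and pairwise union already yields a B.F.

```python
from itertools import combinations


def power_set(omega):
    """All subsets of a finite set omega, as frozensets."""
    items = list(omega)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


def generated_field(omega, generators):
    """Smallest field containing the generators: close under complement and union."""
    omega = frozenset(omega)
    field = {frozenset(), omega} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        for a in list(field):
            for b in list(field):
                for s in (omega - a, a | b):
                    if s not in field:
                        field.add(s)
                        changed = True
    return field
```

Calling `generated_field({1, 2, 3, 4}, [{1}])` yields only four sets (the empty set, `{1}`, its complement, and the whole space), while passing all four singletons as generators recovers the full power set.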

This is Dynkin's form of a monotone class theorem which is expedient for certain applications. The proof proceeds as in Theorem 2. Each positive Borel measurable function is the limit of an increasing sequence of simple finitely-valued functions. An or a separable metric space containing all the open sets and closed sets. Prove that f: Show that the minimal such class is a field.

A probability measure g'J. The abbreviation "p. These axioms imply the following consequences, where all sets are members of: The following proposition 1 Ell ,J.. It is a particular case of the monotone property x above, which may be deduced from it or proved in the same way as indicated below. The axioms of finite additivity and of continuity together are equivalent to the axiom of countable additivity.

Let Ell ,J We have the obvious identity: Hence 1 is true. For a later application Theorem 3. Then (ii) holds whenever Uk Ek E If. The triple $(\Omega, \mathscr{F}, \mathscr{P})$ is called a probability space (triple); $\Omega$ alone is called the sample space, and $\omega$ is then a sample point. C $\Omega$, then the trace of the B. It is easy to see that this is a B. Example 1. Let $\Omega$ be a countable set: Clearly axioms (i), (ii), and (iii) are satisfied. Conversely, let any such is? The entire first volume of Feller's well-known book with all its rich content is based on just such spaces.
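Example 1 describes a probability measure on a countable sample space. A minimal sketch of that idea, assuming the measure is given by point weights $p_j \ge 0$ summing to 1 and that $P(E)$ is the sum of the weights of the points in $E$, is shown below for a finite space. The fair-die weights and the names `WEIGHTS` and `prob` are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical weights p_j on a finite sample space (a fair die is assumed here).
WEIGHTS = {face: Fraction(1, 6) for face in range(1, 7)}


def prob(event, weights=WEIGHTS):
    """P(E) = sum of the weights p_j over the sample points in E."""
    return sum((weights[w] for w in set(event) if w in weights), Fraction(0))
```

With these weights, `prob({2, 4, 6})` is 1/2, and additivity over disjoint events can be checked directly, for instance `prob({1, 2}) + prob({5}) == prob({1, 2, 5})`.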

It is easily seen that? IJ and m may be defined as before. Example 3. The Euclidean B. A set in. Jel will be called a linear Borel set when there is no danger of ambiguity. However, the Borel-Lebesgue measure m on 0? Jeo, En t 0? For any countably infinite set Q, the collection of its finite subsets and their complements forms a field: Let Q be the space of natural numbers.

Hence ' is not a field. In the preceding example show that for each real number a in [0, I] there is an E in such that: Give an example of E that is not in r'-. Prove the nonexistence of a p. Hence criticize a sentence such as; "Choose an integer at random". Prove that the trace of a B.

Prove that the trace of Q, ;: T is a probability space, if? Now let t T be such that t Such a set is called thick in Q,. This procedure is called the adjunction of t T, fP. But it is not generated by all the singletons of ,gzl nor by any finite collection of subsets of. For the last two kinds of sets see, e. T be the B. Determine the most general p. Instead of requiring that the E j 's be pairwise disjoint, we may make the broader assumption that each of them intersects only a finite number in the collection.

Carry through the rest of the problem. The question of probability measures on:: There is in fact a one-toone correspondence between the set functions on the one hand, and the point functions on the other.

Both points of view are useful in probability theory. We establish first the easier half of this correspondence. Each p. Furthermore, let $D$ be any dense subset of $\mathscr{R}^1$, then the correspondence is already determined by that in (4) restricted to $x \in D$, or by any of the four relations in (5) when $a$ and $b$ are both restricted to $D$. Let us write $\forall x \in \mathscr{R}^1$: We shall show that $F$ is a d.f. First of all, $F$ is increasing by property (viii) of the measure.

The relations in (5) follow easily from the following complement to (4): Since Ix" t , x , we have by (ix): To prove the last sentence in the theorem we show first that (4) restricted to $x \in D$ implies (4) unrestricted. For this purpose we note that $\mu((-\infty, x])$, as well as $F(x)$, is right continuous as a function of $x$, as shown in (6). Hence the two members of the equation in (4), being both right continuous functions of $x$ and coinciding on a dense set, must coincide everywhere.
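The correspondence between a measure and its d.f. can be made concrete for intervals. The sketch below assumes the usual four relations for a right-continuous d.f., namely $\mu((a,b]) = F(b) - F(a)$, $\mu((a,b)) = F(b-) - F(a)$, $\mu([a,b)) = F(b-) - F(a-)$ and $\mu([a,b]) = F(b) - F(a-)$. The particular `F` (an atom of mass 1/2 at 0 plus a uniform part on (0, 1]) and the hand-written left limit `F_left` are hypothetical choices for illustration only.

```python
from fractions import Fraction


def F(x):
    """Hypothetical d.f.: mass 1/2 at 0, the remaining 1/2 uniform on (0, 1]."""
    x = Fraction(x)
    if x < 0:
        return Fraction(0)
    return Fraction(1, 2) + Fraction(1, 2) * min(x, Fraction(1))


def F_left(x):
    """Left limit F(x-), written out by hand for this particular F."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    return Fraction(1, 2) + Fraction(1, 2) * min(x, Fraction(1))


def measure(a, b, closed_left=False, closed_right=True):
    """Measure of the interval from a to b (a <= b), read off from F and F(x-)."""
    lower = F_left(a) if closed_left else F(a)
    upper = F(b) if closed_right else F_left(b)
    return upper - lower
```

The default arguments give the half-open interval $(a, b]$; for instance `measure(-1, 0)` picks up the atom at 0, whereas `measure(-1, 0, closed_right=False)` does not.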

Now 2. Hence 4 follows. Incidentally, the correspondence 4 "justifies" our previous assumption that F be right continuous, but what if we have assumed it to be left continuous? Now we proceed to the second-half of the correspondence. Each d.

F determines a p. This is the classical theory of Lebesgue-Stieltjes measure; see, e. However, we shall sketch the basic ideas as an important review.

The d. Such a function is seen to be countably additive on its domain of definition. (What does this mean?) Now we proceed to extend its domain of definition while preserving this additivity.

If $S$ is a countable union of such intervals which are disjoint: Next, we notice that any open interval $(a, b)$ is in the extended domain (why?). Now it is well known that any open set U in!? But even with the class of open and closed sets we are still far from the B. Although it has been shown to be possible to proceed this way by transfinite induction, this is a rather difficult task.

There is a more efficient way to reach the goal via the notions of outer and inner measures as follows. For any subset S of. C closed, CCS u. It is clear that u. Equality does not in general hold, but when it does, we call $S$ "measurable" with respect to $F$. In this case the common value will be denoted by $\mu(S)$. This new definition requires us at once to check that it agrees with the old one for all the sets for which $\mu$ has already been defined.

The next task is to prove that: Details of these proofs are to be found in the references given above. To finish: II with this property. It may be larger than:?: II, indeed it is see below , but this causes no harm, for the restriction of fL to gjl is a p. Let us mention that the introduction of both the outer and inner measures is useful for approximations. There is an alternative way of defining measurability through the use of the outer measure alone and based on Caratheodory's criterion.

It should also be remarked that the construction described above for:

As a consequence of Theorem 3. Such a reduction is frequently useful when there are technical difficulties in the abstract treatment. We end this section with a discussion of "moments". The moments about the mean are called central moments. That of order 2 is particularly important and is called the variance, var X ; its positive square root the standard deviation.

We note the inequality $\sigma^2(X) \le E(X^2)$. It is a special case of the next inequality, of which we will sketch a proof. Jensen's inequality. If $\varphi$ is a convex function on $\mathscr{R}^1$, and $X$ and $\varphi(X)$ are integrable r.v.'s, then $\varphi(E(X)) \le E(\varphi(X))$.

Convexity means: Let then $X$ take the value $y_j$ with probability $\lambda_j$, 1: Then we have by definition 1. Finally, we prove a famous inequality that is almost trivial but very useful. Chebyshev inequality. We have by the mean value theorem: Another proof of Prove that If 0: Thus if $X$ is a positive r. Jo 0 They are said to be pairwise independent iff every two of them are independent. Note that 1 implies that the r. In terms of the p. Finally, we may introduce the n-dimensional distribution function corresponding to J.
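The Jensen and Chebyshev inequalities mentioned above can be checked exactly on a small discrete law. The sketch below is a hedged illustration only: it assumes the familiar special forms $\varphi(E(X)) \le E(\varphi(X))$ with $\varphi(x) = x^2$, and $P\{|X - E(X)| \ge u\} \le \operatorname{var}(X)/u^2$; the fair-die law `LAW` and the helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical discrete r.v.: the face shown by a fair die.
LAW = {k: Fraction(1, 6) for k in range(1, 7)}


def expect(g, law=LAW):
    """E{g(X)} for a discrete law given as {value: probability}."""
    return sum(p * g(x) for x, p in law.items())


mean = expect(lambda x: x)
variance = expect(lambda x: (x - mean) ** 2)


def tail(u, law=LAW):
    """P{|X - E(X)| >= u}."""
    return sum((p for x, p in law.items() if abs(x - mean) >= u), Fraction(0))


# Jensen with the convex function phi(x) = x**2: phi(E X) <= E phi(X).
jensen_gap = expect(lambda x: x ** 2) - mean ** 2

# Chebyshev in its variance form: P{|X - E X| >= u} <= var(X) / u**2.
chebyshev_ok = all(tail(u) <= variance / u ** 2 for u in (1, 2, Fraction(5, 2)))
```

Here the Jensen gap for $\varphi(x) = x^2$ is exactly the variance, which ties the inequality back to the remark about $\sigma^2(X)$.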

Let A j E 9 31, then! By Theorem 3. The proof of the next theorem is similar and is left as an exercise. Let 1 are independent. We give two proofs in detail of this important result to illustrate the methods.

First proof. Suppose first that the two r. Since X and Y are independent, we have for every j and k: Thus 5 is true in this case. Now let X and Y be arbitrary positive r. Then, according to the discussion at the beginning of Sec.

Here we have. For the general case, we use (2) and (3) of Sec. This again can be seen directly or as a consequence of Theorem 3. Hence we have, under our finiteness hypothesis: The first proof is completed.
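The first step of the proof above, where both r.v.'s are discrete, can be replayed on a small example. This is a sketch under stated assumptions: the two laws `LAW_X` and `LAW_Y` are hypothetical, and independence is imposed by defining the joint law as the product of the marginal probabilities, after which $E(XY) = E(X)E(Y)$ is verified exactly.

```python
from fractions import Fraction
from itertools import product

# Hypothetical laws of two independent discrete r.v.'s (values and probabilities assumed).
LAW_X = {0: Fraction(1, 2), 1: Fraction(1, 2)}
LAW_Y = {1: Fraction(1, 6), 2: Fraction(1, 3), 3: Fraction(1, 2)}

# Independence: P{X = x, Y = y} = P{X = x} * P{Y = y} for every pair of values.
JOINT = {(x, y): px * py
         for (x, px), (y, py) in product(LAW_X.items(), LAW_Y.items())}

e_x = sum(x * p for x, p in LAW_X.items())
e_y = sum(y * p for y, p in LAW_Y.items())
e_xy = sum(x * y * p for (x, y), p in JOINT.items())
```

Using exact fractions avoids any floating-point doubt about whether the two sides agree.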

Second proof. Consider the random vector $(X, Y)$ and let the p. Then we have by Theorem 3. Observe that we are using here a very simple form of Fubini's theorem (see below). Indeed, the second proof appears to be so much shorter only because we are relying on the theory of "product measure" J. This is another illustration of the method of reduction mentioned in connection with the proof of (17) in Sec. A rigorous proof of this fact may

Do independent random variables exist?

Here we can take the cue from the intuitive background of probability theory which not only has given rise historically to this branch of mathematical discipline, but remains a source of inspiration, inculcating a way of thinking peculiar to the discipline. It may be said that no one could have learned the subject properly without acquiring some feeling for the intuitive content of the concept of stochastic independence, and through it, certain degrees of dependence.

Briefly then: If an unbiased coin is tossed and the two possible outcomes are recorded as 0 and 1, this is an r. Repeated tossing will produce a sequence of outcomes. If now a die is cast, the outcome may be similarly represented by an r. Next we may draw a card from a pack or a ball from an urn, or take a measurement of a physical quantity sampled from a given population, or make an observation of some fortuitous natural phenomenon, the outcomes in the last two cases being r.

Now it is very easy to conceive of undertaking these various trials under conditions such that their respective outcomes do not appreciably affect each other; indeed it would take more imagination to conceive the opposite! In this circumstance, idealized, the trials are carried out "independently of one another" and the corresponding r.

We have thus "constructed" sets of independent r. Can such a construction be made rigorous? We begin by an easy special case. Let n The product B. Recall that Example 1 of Sec. Since $\Omega^n$ is also a countable set, we may define a p. This p. It is trivial to verify that this is indeed a p. Furthermore, it has the following product property, extending its definition (7): Now let $X_j$ be an r.
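The easy special case above, a product of countable spaces, can be sketched for two finite factors. The code below assumes a fair coin and a fair die as the factor spaces (hypothetical choices), builds the product p.m. by multiplying point weights, and checks by exhaustive enumeration that the coordinate r.v.'s satisfy $P\{X_1 \in B_1, X_2 \in B_2\} = P\{X_1 \in B_1\}\,P\{X_2 \in B_2\}$ for all subsets $B_1$, $B_2$.

```python
from fractions import Fraction
from itertools import chain, combinations, product

COIN = {0: Fraction(1, 2), 1: Fraction(1, 2)}      # assumed fair coin
DIE = {k: Fraction(1, 6) for k in range(1, 7)}     # assumed fair die

# Product p.m. on the finite product space: P{(w1, w2)} = P1{w1} * P2{w2}.
PRODUCT = {(w1, w2): COIN[w1] * DIE[w2] for w1, w2 in product(COIN, DIE)}


def subsets(points):
    """All subsets of a finite collection of points, as tuples."""
    points = list(points)
    return chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1))


def prob(event):
    """P(E) for a collection E of sample points (w1, w2)."""
    return sum((PRODUCT[w] for w in event), Fraction(0))


def coordinates_independent():
    """Check the product property for every pair of events B1, B2."""
    for b1 in subsets(COIN):
        for b2 in subsets(DIE):
            joint = prob(product(b1, b2))
            left = sum((COIN[w] for w in b1), Fraction(0))
            right = sum((DIE[w] for w in b2), Fraction(0))
            if joint != left * right:
                return False
    return True
```

The enumeration covers 4 x 64 pairs of events, which is small enough to run instantly; larger spaces would need a less naive check.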

The reader may recall the term "independent variables" used in calculus, particularly for integration in several variables. The two usages have some accidental rapport. The point of Example 2 is that there is a ready-made product measure there. The situation is somewhat more complicated than in Example 1, just as Example 3 in Sec. Indeed, the required construction is exactly that of the corresponding Lebesgue-Stieltjes measure in n dimensions.

This will be subsumed in the next theorem. Assuming that it has been accomplished, then sets of n independent r. Can we construct r. The simplest case will now be described and we shall return to it in the next chapter.

For the sake of definiteness, let us agree that only expansions with infinitely many digits "1" are used. Now each digit E j of x is a function of x taking the values 0 and 1 on two Borel sets. Hence they are r.
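The digits example above can be checked by counting dyadic intervals. The sketch assumes Lebesgue measure on (0, 1] and the stated convention of expansions with infinitely many digits 1, under which the first $n$ binary digits are constant on each interval $(k/2^n, (k+1)/2^n]$ and equal the $n$-bit digits of $k$. The function name `digit_event_measure` and its `constraints` argument are hypothetical.

```python
from fractions import Fraction


def digit_event_measure(constraints, n):
    """Lebesgue measure of {x in (0, 1]: eps_j(x) = v for every (j, v) in constraints}.

    `constraints` maps digit positions j (1 <= j <= n) to required values 0 or 1.
    The first n digits on (k / 2**n, (k + 1) / 2**n] are the n-bit digits of k.
    """
    total = Fraction(0)
    for k in range(2 ** n):
        bits = [(k >> (n - j)) & 1 for j in range(1, n + 1)]  # eps_1, ..., eps_n
        if all(bits[j - 1] == v for j, v in constraints.items()):
            total += Fraction(1, 2 ** n)
    return total
```

For instance, the event that the first digit is 1 and the third digit is 0 has measure 1/4, the product of the two single-digit measures 1/2 and 1/2, which is what independence of the digits requires.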

This example seems extremely special, but easy extensions are at hand (see Exercises 13, 14, and 15 below). We are now ready to state and prove the fundamental existence theorem of product measures. Let a finite or infinite sequence of p. There exists a probability space Q,. Without loss of generality we may suppose that the given sequence is infinite. For each n, let Q n ,: XII with Jlll as its p. Exercise 3 of Sec. Let the collection of subsets of $\Omega$, each of which is the union of a finite number of disjoint finite-product sets, be.: We shall take the df in the theorem to be the B.

This;if is called the product B. First, for each finite product set such as the E given in 11 we set X: Next, if E c. In order to verify countable additivity it is sufficient to verify the axiom of continuity, by the remark after Theorem 2.

Note that each set E in: To simplify the notation and with no real loss of generality (why?) To see this we begin with the equation (14) This is trivial if $C_n$ is a finite-product set by definition (12), and follows by addition for any $C_n$ in d1: E and so forth by induction.

The next theorem, which is a generalization of the extension theorem discussed in Theorem 2. This extension is called the product measure of the sequence k? There exists a unique p. The uniqueness is proved in Theorem 2. Returning to Theorem 3. Suppose that f is measurable with respect to Q2 is measurable with respect to! The reader should readily recognize the particular cases of the theorem with one or both of the factor measures discrete, so that the corresponding integral reduces to a sum.

Thus it includes the familiar theorem on evaluating a double series by repeated summation. We close this section by stating a generalization of Theorem 3. This result is a particular case of Kolmogorov's extension theorem, which is valid for an arbitrary family of r.

In terms of d.f. For each m 2': For a proof of this first fundamental theorem in the theory of stochastic processes, see Kolmogorov. Show that two r. If $X_1$ and $X_2$ are independent r. Find an example of n r. Fields or B.F.'s. C0 of any family are said to be independent iff any collection of events, one from each g;, forms a set of independent events. Prove, however, that the conditions (1) and (2) are equivalent. Use Theorem 2. $X$ is independent of itself if and only if it is constant with probability one. Can $X$ and $f(X)$ be independent where $f \in \mathscr{B}^1$? Find the d.f. What is the precise relation? Modify Example 4 so that to each $x$ in [0, 1] there corresponds a sequence of independent and identically distributed r. Jj EjEk ' 3.

A typical application of Fubini's theorem is as follows. Here and hereafter the term "convergence" will be used to mean convergence to a finite limit. Thus it makes sense to say: The limit is then a finite-valued r. The sequence of r. This type of trivial consideration makes it possible, when dealing with a countable set of r. The following characterization of convergence a. Theorem 4. Suppose there is convergence a. For m Conversely, suppose (2) holds, then we see above that the set A E U:: A weaker concept of convergence is of basic importance in probability theory.

Strictly speaking, the definition applies when all $X_n$ and $X$ are finite-valued. But we may extend it to r. Since (2') clearly implies (5), we have the immediate consequence below. Convergence a. Sometimes we have to deal with questions of convergence when no limit is in evidence. For convergence a. For convergence in pr. It can be shown Exercise 6 of Sec. If $X_n$ converges to 0 in $L^p$, then it converges to 0 in pr.

Hence there is no loss of generality in assuming X = 0. This proves the first assertion. If now |X_n| The general result is as follows. To show that p. The reader should not be misled by these somewhat "artificial" examples to think that such counterexamples are rare or abnormal. Natural examples abound in more advanced theory, such as that of stochastic processes, but they are not as simple as those discussed above. Here are some culled from later chapters, tersely stated.

This kind of example can be formulated for any recurrent process such as a Brownian motion. The same holds for the martingale that consists in tossing a fair coin and doubling the stakes until the first head see Sec.
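The betting scheme just mentioned can be made concrete with a short computation. This is only a sketch under stated assumptions: the initial stake of 1 unit and the function names `doubling_outcome` and `loss_before_win` are hypothetical choices made for illustration, and the code shows the bookkeeping of the scheme rather than any theorem from the text.

```python
def doubling_outcome(first_head):
    """Net gain if the first head occurs on toss number `first_head` (1-based).

    Assumed rules: stake 1 on the first toss, lose the stake on each tail and
    double it, win the current stake on the first head, then stop.
    """
    stake, net = 1, 0
    for toss in range(1, first_head + 1):
        if toss == first_head:
            net += stake      # first head: win the current stake and stop
        else:
            net -= stake      # tail: lose the stake, then double it
            stake *= 2
    return net

def loss_before_win(first_head):
    """Total amount lost on the tails preceding the first head."""
    return sum(2 ** k for k in range(first_head - 1))

# Whatever the toss on which the first head appears, the final net gain is 1,
# while the accumulated loss just before the win grows like 2^(k-1) - 1.
for k in range(1, 11):
    assert doubling_outcome(k) == 1
    assert loss_before_win(k) == 2 ** (k - 1) - 1
```

The final gain is the same on every path on which a head eventually appears, while the intermediate losses are unbounded across paths.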

Finally, we mention another kind of convergence which is basic in functional analysis, but confine ourselves to L^1. Clearly convergence in L^1 defined above implies weak convergence; hence the former is sometimes referred to as "strong".

On the other hand, Example 2 above shows that convergence a. Let f be a bounded uniformly continuous function in R^1. Let f be a continuous function on R^1. The result is false if [HINT: The extended-valued r. Prove that X is bounded in pr. if and only if it is finite a. The sequence of extended-valued r. Instead of the P in Theorem 4. Prove that these are metrics and that convergence in pr. is equivalent to convergence according to either metric.

Convergence in pr. for arbitrary r. In a space of uniformly bounded r. Unlike convergence in pr. If X_n Prove that for any bounded r. These notions can be defined for subsets of an arbitrary space Ω. The main properties of these sets will be given in the following two propositions. A point belongs to lim inf_n E_n if and only if it belongs to all terms of the sequence from a certain term on.
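For a sequence of sets that repeats with a fixed period, both limits can be computed directly from the characterization above: a point lies in every term from some index on exactly when it lies in every set of one period, and it lies in infinitely many terms exactly when it lies in at least one set of the period. The sketch below assumes such a periodic sequence; the function name `lim_sup_inf` and the example sets are hypothetical.

```python
def lim_sup_inf(period):
    """Return (lim sup, lim inf) of the sequence E_1, ..., E_k, E_1, ..., E_k, ...

    `period` is a non-empty list of sets that is assumed to repeat forever.
    Every tail union then equals the union over one period, and every tail
    intersection equals the intersection over one period.
    """
    sets = [set(e) for e in period]
    lim_sup = set().union(*sets)          # points in infinitely many terms
    lim_inf = set.intersection(*sets)     # points in all terms from some index on
    return lim_sup, lim_inf

# Alternating sequence: {0, 1}, {1, 2}, {0, 1}, {1, 2}, ...
upper, lower = lim_sup_inf([{0, 1}, {1, 2}])
assert lower <= upper                     # lim inf is always contained in lim sup
```

In this example the point 1 belongs to every term, whereas 0 and 2 each belong to infinitely many terms but also miss infinitely many.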

Conversely, if ω belongs to F_m, then ω ∈ F_m for every m. Were ω to belong to only a finite number of the E_n's there would be an m such that ω ∉ E_n for n ≥ m, so that ω ∉ F_m. In more intuitive language: The advantage of such a notation is better shown if we consider, for example, the events "|X_n| Using the notation in the preceding proof, it is clear that F_m decreases as m increases. Hence by the monotone property of p. By Boole's inequality for p. As an illustration of the convenience of the new notions, we may restate Theorem 4.

The intuitive content of condition (5) below is the point being stressed here. Briefly stated: Then we have by (4): Here the index involved in "i. that belongs to infinitely many B_n's.

For any sequence of r. Choose n_k so that cf. Theorem 4. Use Theorem 4. Consider all pairs of rational numbers a. This is equivalent to: Theorems 4. The first is more useful since the events there may be completely arbitrary.

The second has an extension to pairwise independent r. It is a useful technique in probability theory. What has been said so far is true for arbitrary E_n's. Now the hypothesis in (6) may be written as Using Chebyshev's inequality, we have where σ^2(J) denotes the variance of J.

This is an example of a "zero-or-one" law to be discussed though it is not included in any of the general results there. Let E_n be the event that a real number in [0, 1] has its n-ary expansion begin with 0. Prove that the probability of convergence of a sequence of independent r.

lim sup |X_n| Strengthen Theorem 4. Use (11) above. Is it true that lim_n The answer is no from trivial examples. According to our definition of a r. This example can be easily ramified; e. We leave this to the reader but proceed to give the appropriate definitions which take into account the two kinds of troubles discussed above. We shall see that it is unique below. Let μ_n and μ be s. The following propositions are equivalent. Finally, suppose (iii) is true so that (1) holds.

The theorem is proved. As an immediate consequence, the vague limit is unique. Another consequence is: The case of strict probability measures will now be treated. Then (i), (ii), and (iii) in the preceding theorem are equivalent to the following "uniform" strengthening of (i).

Then there exists an integer. Recall that the latter is sequentially compact, which means: Given any sequence of numbers in the set, there is a subsequence which converges, and the limit is also a number in the set. This is the fundamental Bolzano-Weierstrass theorem. The set of all s. It is often referred to as "Helly's extraction or selection principle". Given any sequence of s. Here it is convenient to consider the subdistribution function s. F_n defined as follows: To see this let r_j be given. To F corresponds a unique s. as in Theorem 2. Now the relation (8) yields, upon taking differences: and the theorem is proved.

The reader should be able to confirm the truth of the following proposition about real numbers. In particular a bounded sequence such that every convergent subsequence has the same limit is convergent to this limit. The next theorem generalizes this result to vague convergence of s.

It is not contained in the preceding proposition but can be reduced to it if we use the properties of vague convergence; see also Exercise 9 below. If every vaguely convergent subsequence of the sequence of s. μ, then μ To prove the theorem by contraposition, suppose μ_n does not converge vaguely to μ. Then by Theorem. such that μ_n(a, b) does not converge to μ By Theorem 4. μ by hypothesis of the theorem. Hence again by Theorem 4.

Perhaps the most logical approach to vague convergence is as follows. The definition given before implies this, of course, but prove the converse. Prove that if (1) is true, then there exists a dense set D', such that μ Can a sequence of absolutely continuous p. Can a sequence of discrete p. If a sequence of p.

This is due to Pólya. Rényi. Use (7) of Sec. Prove a convergence theorem in metric space that will include both Theorem 4. Use Exercise 9 of Sec. This has to do with classes of continuous functions on. It is well known that C_0 is the closure of C_K with respect to uniform convergence. An arbitrary function f defined on an arbitrary space is said to have support in a subset S of the space iff it vanishes outside S.

This lemma becomes obvious as soon as its geometric meaning is grasped. But let us remark that the lemma is also a particular case of the Stone-Weierstrass theorem see, e.

Such a sledgehammer approach has its merit, as other kinds of approximation soon to be needed can also be subsumed under the same theorem. Indeed, the discussion in this section is meant in part to introduce some modern terminology to the relevant applications in probability theory. We can now state the following alternative criterion for vague convergence. Hence by the linearity of integrals it is also true when f is any D-valued step function.

The second term converges to zero as n → ∞ because f is a D-valued step function. Conversely, suppose (2) is true for f ∈ C_K. Let A be the set of atoms of μ as in the proof of Theorem 4.

This must then be the same for every vaguely convergent subsequence, according to the hypothesis of the corollary. The vague limit of every such sequence is therefore uniquely determined (why?) A similar estimate holds with μ replacing μ_n above, by (7). Now the argument leading from (3) to (2) finishes the proof of (6) in the same way. Theorems 4. This is the sense of Example 2 in Sec. It is sometimes demanded that such a limit be a p.

The following criterion is not deep, but applicable. Let a family of p. In order that every sequence of them contains a subsequence which converges vaguely to a p. Suppose (11) holds.

We show that μ is a p. Let J be a continuity interval of μ which contains the I in Then J ⊂ I_n for all sufficiently large n, so that. The preceding theorem can be stated as follows: The word "relatively" purports that the limit need not belong to the family; the word "compact" is an abbreviation of "sequentially vaguely convergent to a strict p. The new definition of vague convergence in Theorem 4. There is no substitute for "intervals" in such a space but the classes C_K, C_0 and C_B are readily available.

We will illustrate the general approach by indicating one more result in this direction. Usually f is allowed to be extended-valued; but to avoid complications we will deal with bounded functions only and denote by L and V respectively the classes of bounded lower semicontinuous and bounded upper semicontinuous functions.

This gives the first inequality in A sequence of r. More briefly stated, convergence in pr. Since f is bounded the convergence holds also in L^1 by Theorem 4. For instance, 4.

This is in contrast to the true convergence concepts discussed before; cf. Exercises 3 and 6 of Sec. But if X_n and Y_n are independent, then the preceding assertion is indeed true as a property of the convergence of convolutions of distributions (see Chapter 6).

However, in the simple situation of the next theorem no independence assumption is needed. The result is useful in dealing with limit distributions in the presence of nuisance terms. Exercise 4 below. Let μ_n and μ Show that the conclusion in (2) need not hold if (a) f is bounded and Borel measurable and all μ are absolutely continuous, or (b) f is continuous except at one point and every μ_n is absolutely continuous. To find even sharper counterexamples would not be too easy, in view of Exercise 10 of Sec. μ when the μ_n's are s. Then for each finite continuity interval I we have ∫ f dμ be as in Exercise 1. If the f_n's are bounded continuous functions converging uniformly to f, then ∫ f_n dμ Give an example to show that convergence in dist. However, show that convergence to the unit mass δ_a does imply that in pr.

Let the r. Prove the Corollary to Theorem 4. If the r. Derive another proof of Theorem 4. The Lévy distance of two s. Prove that this is indeed a metric in the space of s. Find two sequences of p.

μ, where μ If the [HINT: In general one may proceed by contradiction using an f that oscillates at infinity. Let F_n and F be d.f. Define G_n and G as in Exercise 4 of Sec. Do this first when F_n and F are continuous and strictly increasing. Indeed, we have seen in Example 2 of Sec. It is useful to have conditions to ensure the convergence of moments when X_n converges a. We begin with a standard theorem in this direction from classical analysis. The next result should be compared with Theorem 4.