Encyclopædia Britannica, Ninth Edition/Probability

THE mathematical theory of probability is a science which aims at reducing to calculation, where possible, the amount of credence due to propositions or statements, or to the occurrence of events, future or past, more especially as contingent or dependent upon other propositions or events the probability of which is known. Any statement or (supposed) fact commands a certain amount of credence, varying from zero, which means conviction of its falsity, to absolute certainty, denoted by unity. An even chance, or the probability of an event which is as likely as not to happen, is represented by the fraction 1/2. It is to be observed that 1/2 will be the probability of an event about which we have no knowledge whatever, because if we can see that it is more likely to happen than not, or less likely than not, we must be in possession of some information respecting it. It has been proposed to form a sort of thermometrical scale to which to refer the strength of the conviction we have in any given case. Thus if the twenty-six letters of the alphabet have been shaken together in a bag, and one letter be drawn, we feel a very feeble expectation that A has been the one taken. If two letters be drawn, we have still very little confidence that A is one of them; if three be drawn, it is somewhat stronger; and so on, till at last, if twenty-six be drawn, we are certain of the event, that is, of A having been taken.

Probability, which necessarily implies uncertainty, is a consequence of our ignorance. To an omniscient Being there can be none. Why, for instance, if we throw up a shilling, are we uncertain whether it will turn up head or tail? Because the shilling passes, in the interval, through a series of states which our knowledge is unable to predict or to follow. If we knew the exact position and state of motion of the coin as it leaves our hand, the exact value of the final impulse it receives, the laws of its motion as affected by the resistance of the air and gravity, and finally the nature of the ground at the exact spot where it falls, and the laws regulating the collision between the two substances, we could predict as certainly the result of the toss as we can which letter of the alphabet will be drawn after twenty-five have been taken and examined.

The probability, or amount of conviction accorded to any fact or statement, is thus essentially subjective, and varies with the degree of knowledge of the mind to which the fact is presented (it is often indeed also influenced by passion and prejudice, which act powerfully in warping the judgment), so that, as Laplace observes, it is affected partly by our ignorance, partly by our knowledge. Thus, if the question were put, Is lead heavier than silver? some persons would think it is, but would not be surprised if they were wrong; others would say it is lighter; while to a worker in metals probability would be superseded by certainty. Again, to take Laplace's illustration, there are three urns A, B, C, one of which contains black balls, the other two white balls; a ball is drawn from the urn C, and we want to know the probability that it shall be black. If we do not know which of the urns contains the black balls, there is only one favourable chance out of three, and the probability is said to be 1/3. But if a person knows that the urn A contains white balls, to him the uncertainty is confined to the urns B and C, and therefore the probability of the same event is 1/2. Finally, to one who had found that A and B both contained white balls, the probability is converted into certainty. In common language, an event is usually said to be likely or probable if it is more likely to happen than not, or when, in mathematical language, its probability exceeds 1/2; and it is said to be improbable or unlikely when its probability is less than 1/2. Not that this sense is always adhered to; for, in such a phrase as "It is likely to thunder to-day," we do not mean that it is more likely than not, but that in our opinion the chance of thunder is greater than usual; again, "Such a horse is likely to win the Derby" simply means that he has the best chance, though according to the betting that chance may be only 1/6. Such unsteady and elliptical employment of words has of course to be abandoned and replaced by strict definition, at least mentally, when they are made the subjects of mathematical analysis. Certainty, or absolute conviction, also, as generally understood, is different from the mathematical sense of the word certainty. It is very difficult and often impossible, as is pointed out in the celebrated Grammar of Assent, to draw out the grounds on which the human mind in each case yields that conviction, or assent, which, according to Newman, admits of no degrees, and either is entire or is not at all.[1] If, when walking on the beach, we find the letters "Constantinople" traced on the sand, we should feel, not a strong impression, but absolute certainty, that they were characters not drawn at random, but by one acquainted with the word so spelt. Again, we are certain of our own death as a future event; we are certain, too, that Great Britain is an island; yet in all such cases it would be very difficult, even for a practised intellect, to present in logical form the evidence which nevertheless has compelled the mind in each instance to concede the point.[2] Mathematical certainty, which means that the contrary proposition is inconceivable, is thus different, though not perhaps as regards the force of the mental conviction, from moral or practical certainty. It is questionable whether the former kind of certainty is not entirely hypothetical, and whether it is ever attainable in any of the affairs or events of the real world around us.
The truth of no conclusion can rise above that of the premises, of no theorem above that of the data. That two and two make four is an incontrovertible truth ; but before applying even it to a concrete instance we have to be assured that there were really two in each constituent group; and we can hardly have mathematical certainty of this, as the strange freaks of memory, the tricks of conjurors, &c., have often made apparent.

There is no more remarkable feature in the mathematical theory of probability than the manner in which it has been found to harmonize with, and justify, the conclusions to which mankind have been led, not by reasoning, but by instinct and experience, both of the individual and of the race. At the same time it has corrected, extended, and invested them with a definiteness and precision of which these crude, though sound, appreciations of common sense were till then devoid. Even in cases where the theoretical result appears to differ from the common-sense view, it often happens that the latter may, though perhaps unknown to the mind itself, have taken account of circumstances in the case omitted in the data of the theoretical problem. Thus, it may be that a person accords a lower degree of credence to a fact attested by two or more independent witnesses than theory warrants, the reason being that he has unconsciously recognized the possibility of collusion, which had not been presented among the data. Again, it appears from the rules for the credibility of testimony that the probability of a fact may be diminished by its being attested by a new witness, viz., in the case where his credibility is less than 1/2. This is certainly at variance with our natural impression, which is that our previous conviction of any fact is clearly not weakened, however little it be intensified, by any fresh evidence, however suspicious, as to its truth. But on reflexion we see that it is a practical absurdity to suppose the credibility of any witness less than 1/2, that is, that he speaks falsehood oftener than truth; for all men tell the truth probably nine times out of ten, and only deviate from it when their passions or interests are concerned. Even where his interests are at stake, no man has any preference for a lie, as such, above the truth; so that his testimony to a fact will at worst leave the antecedent probability exactly what it was.

A celebrated instance of the confirmation and completion by theory of the ordinary view is afforded by what is known as James Bernoulli's theorem. If we know the odds in favour of an event to be three to two, as for instance that of drawing a white ball from a bag containing three white and two black, we should certainly judge that if we make five trials we are more likely to draw white three times and black twice than any other combination. Still, however, we should feel that this was very uncertain; instead of three white, we might draw white 0, 1, 2, 4, or 5 times. But if we make, say, one thousand trials, we should feel confident that, although the numbers of white and black might not be in the proportion of three to two, they would be very nearly in that proportion. And the more the trials are multiplied, the more closely would this proportion be found to obtain. This is the principle upon which we are continually judging of the probability of events from what is observed in a certain number of cases.[3] Thus if, out of ten particular infants, six are found to live to the age of twenty, we judge, but with a very low amount of conviction, that nearly six-tenths of the whole number born live to twenty. But if, out of 1,000,000 cases, we find that 600,000 live to be twenty, we should feel certain that the same proportion would be found to hold almost exactly were it possible to test the whole number of cases, say in England during the 19th century. In fact we may say, considering how seldom we know a priori the probability of any event, that the knowledge we have of such probability in any case is entirely derived from this principle, viz., that the proportion which holds in a large number of trials will be found to hold in the total number, even when this may be infinite, the deviation or error being less and less as the trials are multiplied.

Such no doubt is the verdict of the common sense of mankind, and it is not easy to say upon what considerations it is based, if it be not the effect of the unconscious habit which all men acquire of weighing chances and probabilities, in the state of ignorance and uncertainty in which human life is passed. It is extremely interesting to see the results of the unerring methods of mathematical analysis when applied to the same problem. It is a very difficult one, and James Bernoulli tells us he reflected upon it for twenty years. His methods, extended by De Moivre and Laplace, fully confirm the conclusions of rough common sense; but they have done much more. They enable us to estimate exactly how far we can rely on the proportion of cases in a large number of trials truly representing the proportion out of the total number, that is, the real probability of the event. Thus Bernoulli proves that if, as in the case above mentioned, the real probability of an event is 3/5, the odds are 1000 to 1 that, in 25,550 trials, the event shall occur not more than 15,841 times and not less than 14,819 times, that is, that the deviation from 15,330, or 3/5 of the whole, shall not exceed 1/50 of the whole number of trials.
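In modern notation these figures are easily checked; the sketch below (no part of the original article) uses the De Moivre-Laplace normal approximation, which arts. 8-11 below derive, and shows that the odds are in fact far better than the 1000 to 1 which Bernoulli's deliberately conservative bound guarantees.

```python
# Sketch (not part of the original article): checking the figures above
# with the De Moivre-Laplace normal approximation derived in arts. 8-11.
from math import erf, sqrt

p, q, N = 3/5, 2/5, 25550
r = 511                        # 15841 - 15330 = 15330 - 14819 = N/50

# P(|successes - pN| <= r) is approximately erf(r / sqrt(2*N*p*q))
prob = erf(r / sqrt(2 * N * p * q))
print(prob)   # ~ 1 - 7e-11: far better than the 1000 to 1 which
              # Bernoulli's conservative bound guarantees
```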

The history of the theory of probability, from the celebrated question as to the equitable division of the stakes between two players on their game being interrupted, proposed to Pascal by the Chevalier de Méré in 1654, embracing, as it does, contributions from almost all the great names of Europe during the period, down to Laplace and Poisson, is elaborately and admirably given by Mr Todhunter in his History of the subject, now a classical work. It was not indeed to be anticipated that a new science which took its rise in games of chance, and which had long to encounter an obloquy, hardly yet extinct, due to the prevailing idea that its only end was to facilitate and encourage the calculations of gamblers, could ever have attained its present status: that its aid should be called for in every department of natural science, to assist in discovery, which it has repeatedly done (even in pure mathematics), to minimize the unavoidable errors of observation, and to detect the presence of causes as revealed by observed events. Nor are commercial and other practical interests of life less indebted to it:[4] wherever the future has to be forecasted, risk to be provided against, or the true lessons to be deduced from statistics, it corrects for us the rough conjectures of common sense, and decides which course is really, according to the lights of which we are in possession, the wisest for us to pursue. It is sui generis and unique as an application of mathematics, the only one, apparently, lying quite outside the field of physical science. De Moivre has remarked that, "some of the problems about chance having a great appearance of simplicity, the mind is easily drawn into a belief that their solution may be attained by the mere strength of natural good sense"; and it is with surprise we find that they involve in many cases the most subtle and difficult mathematical questions. The subject has been found to tax to the utmost the resources of analysis and the powers of invention of those who have had to deal with the new cases and combinations which it has presented. Great, however, as are the strictly mathematical difficulties, they cannot be said to be the principal. Especially in the practical applications, to detach the problem from its surroundings in rerum natura, discarding what is non-essential; rightly to estimate the extent of our knowledge respecting it, neither tacitly assuming as known what is not known, nor tacitly overlooking some datum, perhaps from its very obviousness; to make sure that events we are taking as independent are not really connected, or probably so: such are the preliminaries necessary before the question is put in the scientific form to which calculation can be applied, and failing which the result of the mathematician will be but an ignoratio elenchi, a correct answer, but to a different question.

From its earliest beginnings, a notable feature in our subject has been the strange and insidious manner in which errors creep in, often misleading the most acute minds, as in the case of D'Alembert, and the difficulty of detecting them, even when one is assured of their presence by the evident incorrectness of the result. This is probably in many cases occasioned by the poverty of language obliging us to use one term in the same context for different things, thus introducing the fallacy of ambiguous middle; e.g., the same word "probability" referring to the same event may sometimes mean its probability before a certain occurrence, sometimes after; thus the chance of a horse winning the Derby is different after the Two Thousand from what it was before. Again, it may mean the probability of the event according to one source of information, as distinguished from its probability taking everything into account; for instance, an astronomer thinks he can notice in a newly-discovered planet a rotation from east to west; the probability that this is the case is of course that of his observations in like cases turning out correct, if we had no other source of information; but the actual probability is less, because we know that at least the vast majority of the planets and satellites revolve from west to east. It is easy to see that such employment of terms in the same context must prove a fruitful source of fallacies; and yet, without wearisome repetitions, it cannot always be avoided. But, apart from mere logical errors, the main stumbling-block is no doubt the uncertainty as to the limits of our knowledge in each case, or (though this may seem a contradiction in terms) the difficulty of knowing what we do know; and we certainly err as often in forgetting or ignoring what we do know as in assuming what we do not. It is a not uncommon popular delusion to suppose that if a coin has turned up head, say five times running, or the red has won five times at roulette, the same event is likely to occur a sixth time; and it arises from overlooking (perhaps from the imagination being struck by the singularity of the occurrence) the a priori knowledge we possess, that the chance at any trial is an even one (supposing all perfectly fair); the mind thus unconsciously regards the event simply as one that has recurred five times, and therefore judges, correctly, that it is very likely to occur once more. Thus if we are given a bag containing a number of balls, and we proceed to draw them one by one, and the first five drawn are white, the odds are 6 to 1 that the next will be white; the slight information afforded by the five trials is thus of great importance, and strongly influences the probabilities of the future, when it is all we have to guide us, but is absolutely valueless, and without influence on the future, when we have a priori certain information. The lightest air will move a ship which is adrift, but has simply no effect on one securely moored.

It is not to be supposed that the results arrived at when the calculus of probabilities is applied to most practical questions are anything more than approximations; but the same may be said of almost all such applications of abstract science. Partly from ignorance of the real state of the case, partly from the extreme intricacy of the calculations requisite if all the conditions which we do or might know are introduced, we are obliged to substitute, in fact, for the actual problem a simpler one approximately representing it. Thus, in mechanical questions, assumptions such as that the centre of gravity of an actual sphere is at its centre, that the friction of the rails on a railway is constant at different spots or at different times, or that in the rolling of a heavy body no depression is produced by its weight in the supporting substance, are instances of the convenient fictions which simplify the real question, while they prevent us accepting the result as more than something near the truth. So in probability, the chance of life of an individual is taken from the general tables (unless reasons to the contrary are very palpable), although, if his past history, his mode of life, the longevity of his family, &c., were duly weighed, the general value ought to be modified in his case; again, in attempting to estimate the value of the verdict of a jury, whether unanimous or by a majority, each man is supposed to give his honest opinion, feeling and prejudice, or pressure from his fellow-jurors, being left out of the account. Again, the value of an expectation to an individual is taken to be measured by the sum divided by his present fortune, though it is clearly affected by other circumstances, as the number of his family, the nature of his business, &c. An event has been found to occur on an average once a year during a long period: it is not difficult to show that the chance of its happening in a particular year is 1 − e^{−1}, or 2 to 1 nearly. But, on examining the record, we observe it has never failed to occur during three years running. This fact increases the above chance; but to introduce it into the calculation at once renders the question a very difficult one. Even in games of chance we are obliged to judge of the relative skill of two players by the result of a few games; now one may not have been in his usual health, &c., or may have designedly not played his best; when he did win he may have done so by superior play, or merely by good luck; again, even in so simple a case as pitch and toss, the coin may, in the concrete, not be quite symmetrical, and the odds of head or tail not quite even.

Not much has been added to our subject since the close of Laplace's career. The history of science records more than one parallel to this abatement of activity. When such a genius has departed, the field of his labours seems exhausted for the time, and little is left to be gleaned by his successors. It is to be regretted that so little remains to us of the inner working of such gifted minds, and of the clue by which each of their discoveries was reached. The didactic and synthetic form in which these are presented to the world retains but faint traces of the skilful inductions, the keen and delicate perception of fitness and analogy, and the power of imagination (though such a term may possibly excite a smile when applied to such dry subjects) which have doubtless guided such a master as Laplace or Newton in shaping out each great design, only the minor details of which have remained over, to be supplied by the less cunning hand of commentator and disciple.

We proceed to enumerate the principal divisions of the theory of probability and its applications. Under each we will endeavour to give at least one or two of the more remarkable and suggestive questions which belong to it, especially such as admit of simplification or improvement in the received solutions ; in such an article as the present we are debarred from attempting even an outline of the whole. We will suppose the general fundamental principles to be already known to the reader, as they are to be now found in several elementary works, such as Todhunter’s Algebra, Whitworth’s Choice and Chance, &c.

Many of the most important results are given under the apparently trifling form of the chances in drawing balls from an urn, &c., or seem to relate to games of chance, as dice or cards, but are in reality of far wider application, this form being adopted as the most definite and lucid manner of presenting the chances of events occurring under circumstances which may be assimilated, more or less closely, to such cases.

I. DETERMINATION OF THE PROBABILITIES OF COMPOUND EVENTS, WHEN THE PROBABILITIES OF THE SIMPLE EVENTS ON WHICH THEY DEPEND ARE KNOWN.

1. Under this head come a very large and diversified range of questions; a very few of the most important are all that we can give. One great class relates to the fulfilment of given conditions in repeated trials as to the same event, knowing the probability of what will happen in each trial.

2. Let there be an event which must turn out in one of two ways, W and B (as in drawing a ball from an urn containing white and black balls only); let the respective probabilities for each trial be p, q, so that p + q = 1. Let two trials be made: the four possible cases which may arise are WW, WB, BW, BB. The probability of the first is $p^2$, of the second pq, of the third pq, of the fourth $q^2$. Thus the probability of a white and a black ball being drawn in an assigned order is pq; but that of a white and a black in any order is 2pq. Suppose now n trials to be made. The probability of W every time is $p^n$; that of B once and W (n−1) times in an assigned order is $p^{n-1}q$, but if the order is indifferent it is $np^{n-1}q$; that of B occurring twice only is $p^{n-2}q^2$ if the order is given, but $\frac{n(n-1)}{1\cdot2}p^{n-2}q^2$ in any order; and so on. We have then this result: in the binomial expansion

$$(p+q)^n = p^n + np^{n-1}q + \frac{n(n-1)}{1\cdot2}p^{n-2}q^2 + \dots = 1 \qquad (1),$$

the terms in their order give the probabilities of the event W happening n times; of W (n−1) times and B once; of W (n−2) times and B twice; and so on, the sum of the whole giving 1, that is, certainty.

3. As an example, let A and B be two players whose respective chances of winning one game are p and q; to find the probability of A winning m games before B wins n games, the play terminating when either of these events has occurred. The chance of A winning the first m games is $p^m$. The chance of his winning in the first m+1 games is $mp^{m-1}q\cdot p = mp^mq$; for he must have won m−1 games out of the first m, and then win the (m+1)th; otherwise we should be including the first case. Again, the chance of A winning in the first m+2 games is, in like manner, $\frac{(m+1)m}{1\cdot2}p^{m-1}q^2\cdot p = \frac{(m+1)m}{1\cdot2}p^mq^2$; and so on. Now the match must be decided at latest by the (m+n−1)th game; for, if A fails to win m games by that time, B must have won n. Hence the chance of A winning the match is

$$p^m\left\{1 + mq + \frac{m(m+1)}{1\cdot2}q^2 + \dots + \frac{m(m+1)\cdots(m+n-2)}{1\cdot2\cdots(n-1)}q^{n-1}\right\}.$$

Thus, if A's skill be double that of B, the chance that A wins four games before B wins two is 112/243; that of B winning is 131/243. If A and B agree to leave off playing before the match is decided, the stakes ought clearly to be divided between them in proportion to their respective probabilities of winning, as given above, putting for m and n the numbers of games required to be won, at any given point of the match, by A and B respectively. This was one of the questions proposed to Pascal by the Chevalier de Méré in the year 1654.
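In modern terms this is the "problem of points"; a minimal Python sketch (not part of the original) checking the figure just quoted:

```python
# Sketch (not part of the original): the series of art. 3 for the
# problem of points, checked against the quoted figure 112/243.
from fractions import Fraction
from math import comb

def win_match(p, m, n):
    """Chance that A wins m games before B wins n games:
    p^m * sum_{k=0}^{n-1} C(m+k-1, k) q^k."""
    q = 1 - p
    return p**m * sum(comb(m + k - 1, k) * q**k for k in range(n))

p = Fraction(2, 3)              # A's skill double that of B
print(win_match(p, 4, 2))       # 112/243
print(1 - win_match(p, 4, 2))   # 131/243, the chance of B
```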
4. In the expansion (1) it may be asked which combination of the events W, B is most likely to occur in the n trials. As the ratio of the 2d term to the 1st is $\frac{n}{1}\frac{q}{p}$, of the 3d to the 2d $\frac{n-1}{2}\frac{q}{p}$, and generally of the (r+1)th to the rth $\frac{n-r+1}{r}\frac{q}{p}$, the terms increase so long as this ratio exceeds unity, that is, so long as r < (n+1)q. We conclude that, if r is the next integer below (n+1)q, the (r+1)th term is the greatest; that is, it is most likely that the event W occurs n−r times and B r times. If (n+1)q should be an integer r, B is as likely to occur r−1 times as r times, and either is more probable than any other number. Thus, in twelve throws of a die, the ace is more likely to turn up twice than any other number; while in eleven throws it is as likely to turn up once only as twice. It is important to remark that, if the number of trials n be very large, we may treat qn and pn as whole numbers, and conclude that the event W is more likely to happen pn times and B qn times than in any other proportion.

5. Among the many questions which relate to the occurrence of different combinations in successive trials as to the same event, one is as to the chances for a succession, or run, of the same result several times. Let us consider the very simple case: in n throws of a coin, what is the chance that head occurs (at least) twice running? This will be an instance of the aid afforded by the calculus of finite differences in questions on probability. Let $u_r$ = the number of cases of r throws of a coin in which head turns up twice running, the whole number of cases being of course $2^r$. Now if we consider the value of $u_{n+3}$, it includes $2u_{n+2}$, because the (n+3)th throw may turn up two ways; but it includes also those cases in which head turns up in the last two throws, tail in the preceding one, and no run of two heads occurs in the n preceding ones. The number of these cases is $2^n - u_n$. We have therefore the equation

$$u_{n+3} = 2u_{n+2} + 2^n - u_n \qquad (2).$$

If E be an operator such that $Eu_r = u_{r+1}$, equation (2) is $(E^3 - 2E^2 + 1)u_n = 2^n$; or $(E-1)(E^2-E-1)u_n = 2^n$; so that, if we put α, β for the roots of the equation $E^2 - E - 1 = 0$,

$$u_n = 2^n + A + B\alpha^n + C\beta^n \qquad (3),$$

since $u_n = 2^n$ is a particular solution of (2), A, B, C being three undetermined constants. Now in two throws there is one case where head turns up twice, and in three throws there are three cases; hence we have

$$u_1 = 0 = 2 + A + B\alpha + C\beta,\quad u_2 = 1 = 4 + A + B\alpha^2 + C\beta^2,\quad u_3 = 3 = 8 + A + B\alpha^3 + C\beta^3;$$

and, remembering that $\alpha^2 = \alpha+1$, $\beta^2 = \beta+1$, $\alpha+\beta = 1$, $\alpha\beta = -1$, we shall easily find from these A = 0, B + C = −1, Bα + Cβ = −2, whence $B = -\frac{\alpha^2}{\alpha-\beta}$, $C = \frac{\beta^2}{\alpha-\beta}$; so that

$$u_n = 2^n - \frac{\alpha^{n+2}-\beta^{n+2}}{\alpha-\beta},\qquad \alpha,\ \beta = \frac{1\pm\sqrt5}{2} \qquad (4);$$

expanding by the binomial theorem and reducing,

$$\frac{\alpha^{n+2}-\beta^{n+2}}{\alpha-\beta} = \frac{1}{2^{n+1}}\left\{(n+2) + 5\,\frac{(n+2)(n+1)n}{1\cdot2\cdot3} + 5^2\,\frac{(n+2)(n+1)n(n-1)(n-2)}{1\cdot2\cdot3\cdot4\cdot5} + \dots\right\} \qquad (5);$$

dividing by the total number of cases $2^n$, we have for the probability of head turning up at least twice running in n throws

$$\frac{u_n}{2^n} = 1 - \frac{1}{2^{2n+1}}\left\{(n+2) + 5\binom{n+2}{3} + 5^2\binom{n+2}{5} + \dots\right\} \qquad (6).$$

Another method of obtaining the same result is to consider the number of cases in which head never occurs twice running; let $u_n$ now denote this number, so that $2^n - u_n$ is the number of cases when head occurs at least twice successively. Consider the value of $u_{n+2}$: if the last or (n+2)th throw be tail, $u_{n+2}$ includes all the cases ($u_{n+1}$) of the n+1 preceding throws which gave no succession of heads; and if the last be head, the last but one must be tail, and these two may be preceded by any one of the $u_n$ favourable cases for the first n throws. Consequently

$$u_{n+2} = u_{n+1} + u_n.$$

If α, β, as before, are the roots of the quadratic $E^2 - E - 1 = 0$, this equation gives $u_n = A\alpha^n + B\beta^n$. Here A and B are easily found from the conditions $u_1 = 2$, $u_2 = 3$; viz., $A = \frac{\alpha^2}{\alpha-\beta}$, $B = -\frac{\beta^2}{\alpha-\beta}$; whence

$$u_n = \frac{\alpha^{n+2}-\beta^{n+2}}{\alpha-\beta},$$

as in eq. (5). The probability that head never turns up twice running is found by dividing this by $2^n$, the whole number of cases. This probability of course becomes smaller and smaller as the number of trials n is increased.

6. Let us consider the chance of a run of three heads or tails during n throws; that of a run of two heads or tails is evidently $1 - \frac{2}{2^n}$, as there are but two cases out of the $2^n$ which are alternately head and tail. Let $u_r$ be the number of cases, during r throws, which give at least one succession of three heads or three tails. Consider the value of $u_{n+3}$; it includes $2u_{n+2}$, as the last throw may be head or tail; but, besides these, every case of the first n throws which contains no run of three gives rise to one new case of the n+3 having a run of three: thus, if the nth throw be head, the last four may be HTTT, or THHH if the nth be tail. Hence

$$u_{n+3} = 2u_{n+2} + 2^n - u_n,$$

the same equation of differences as (2). Its solution is equation (3), in which, if we determine the constants by the conditions $u_1 = 0$, $u_2 = 0$, $u_3 = 2$, and divide by $2^n$, we find for the probability of a run of three of either event during n trials

$$\frac{u_n}{2^n} = 1 - \frac{1}{2^{2n-1}}\left\{(n+1) + 5\binom{n+1}{3} + 5^2\binom{n+1}{5} + \dots\right\} \qquad (7).$$

Comparing this result with (6), we find that the chance of a run of two heads in n trials is equal to the chance of a run of three, of either heads or tails, in n+1 trials.

7. If an event may turn out on each trial in a+b ways, of which a are favourable and b unfavourable (thus a card may be drawn from a pack in fifty-two ways, twelve of which give court cards), and if we consider the probability that during n trials there shall occur a run of at least p favourable results, it is not difficult to see that ($u_r$ denoting the number of ways this may occur in r trials)

$$u_{n+p+1} = (a+b)u_{n+p} + ba^p\{(a+b)^n - u_n\},$$

as $u_{n+p+1}$ includes, besides $(a+b)u_{n+p}$, those cases in which the last p trials are favourable, the one before unfavourable, and the n preceding containing no such run as stated. We will not enter on Laplace's solution of this equation, or rather of one equivalent to it, especially as the result is not a simple one (see Todhunter, p. 185).
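The recurrence of art. 5 is easily checked by machine; the sketch below (not part of the original) compares it with direct enumeration, the complement $2^n - u_n$ being the Fibonacci number $F_{n+2}$, as the closed form (4) shows.

```python
# Sketch (not part of the original): the recurrence of art. 5 against
# direct enumeration of all coin sequences.
from itertools import product

def runs_by_recurrence(n_max):
    u = {1: 0, 2: 1, 3: 3}
    for n in range(1, n_max - 2):
        u[n + 3] = 2 * u[n + 2] + 2**n - u[n]   # equation (2)
    return u

def runs_by_enumeration(n):     # sequences of n throws containing 'HH'
    return sum('HH' in ''.join(s) for s in product('HT', repeat=n))

u = runs_by_recurrence(12)
assert all(u[n] == runs_by_enumeration(n) for n in range(1, 13))
print(u[12], 2**12 - u[12])     # 3719 cases, and 377 = F_14 without a run
```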
8. Let the probability of an event happening in one trial be p, that of its failing q; we have seen (art. 4) that, if a large number N of trials be made, the event is most likely to happen pN times and fail qN times. The chance of this occurring is, however, extremely small, though greater than that in favour of any other proportion. We propose now to examine the probability that the proportion of successes shall not deviate from its most probable value by more than a given limit; that is, in fact, to find the probability that in N trials the number of times in which the event happens shall lie between the two limits pN ± r. Let m = pN, n = qN, which are taken to be integers. The probability of the event happening m times is the greatest term T of the expansion (1), viz.,

$$T = \frac{(m+n)!}{m!\,n!}\,p^mq^n.$$

The calculation of this would be impracticable when N, m, n are large numbers; but Stirling's theorem gives us

$$1\cdot2\cdot3\cdots x = x^{x+\frac12}e^{-x}\sqrt{2\pi},$$

very nearly, when x is large; and by substituting in the preceding value of T, and reducing, we easily find

$$T = \sqrt{\frac{N}{2\pi mn}} \qquad (8).$$

Now the terms of the expansion (1) on either side of T are

$$\dots + \frac{n(n-1)}{(m+1)(m+2)}\frac{p^2}{q^2}\,T + \frac{n}{m+1}\frac{p}{q}\,T + T + \frac{m}{n+1}\frac{q}{p}\,T + \frac{m(m-1)}{(n+1)(n+2)}\frac{q^2}{p^2}\,T + \dots \qquad (9).$$

But if x is much greater than a, $x - a = xe^{-\frac{a}{x}}$ nearly; so that

$$n(n-1)(n-2)\dots(s\text{ terms}) = n^s e^{-\frac{1+2+\dots+(s-1)}{n}} = n^s e^{-\frac{s^2}{2n}},$$

and likewise $(m+1)(m+2)\dots(s\text{ terms}) = m^s e^{\frac{s^2}{2m}}$. Hence, since np = mq, the sth term before T in (9) is

$$\frac{n^s e^{-\frac{s^2}{2n}}}{m^s e^{\frac{s^2}{2m}}}\cdot\frac{p^s}{q^s}\,T = Te^{-\frac{s^2}{2}\left(\frac1m+\frac1n\right)} = Te^{-\frac{s^2(m+n)}{2mn}};$$

and the sth term after T has, to the same order, the same value. Now the probability that the event shall happen a number of times comprised between m+r and m−r is the sum of the terms in (9) from the rth term before T to the rth term after T. (N.B., though r may be large, it is supposed small as compared with N, m, or n.) Putting then for shortness

$$a^2 = \frac{m+n}{2mn} \qquad (10),$$

we have for the required probability

$$P_r = 2\left(\tfrac12T + Te^{-a^2} + Te^{-2^2a^2} + Te^{-3^2a^2} + \dots + Te^{-r^2a^2}\right).$$

If we now consider the curve whose equation is $y = Te^{-x^2}$, and take the series of its ordinates corresponding to x = 0, a, 2a, 3a, ..., ra, where a is very small, these ordinates are the terms of the above sum; and if A be the area of the curve from x = 0 to x = ra, then A = a(half the sum of the first and last ordinates, together with the sum of the intermediate ordinates), so that $P_r = \frac{2A}{a} + Te^{-r^2a^2}$; and, since by (8) and (10) $\frac{T}{a} = \frac{1}{\sqrt\pi}$,

$$P_r = \frac{2}{\sqrt\pi}\int_0^{ra}e^{-x^2}dx + Te^{-r^2a^2} \qquad (11).$$

9. We refer to the integral calculus for the methods of computing the celebrated integral $\int e^{-x^2}dx$, and will give here a short table of its values.

Table of the values of the integral $I = \frac{2}{\sqrt\pi}\int_0^T e^{-x^2}dx$.

      T       I          T      I          T      I          T      I
    0.00   0.00000      0.2   0.22270     1.3   0.93401     2.4   0.99931
    0.01   0.01128      0.3   0.32863     1.4   0.95229     2.5   0.99959
    0.02   0.02256      0.4   0.42839     1.5   0.96611     2.6   0.99976
    0.03   0.03384      0.5   0.52050     1.6   0.97635     2.7   0.99986
    0.04   0.04511      0.6   0.60386     1.7   0.98379     2.8   0.99992
    0.05   0.05637      0.7   0.67780     1.8   0.98909     2.9   0.99996
    0.06   0.06762      0.8   0.74210     1.9   0.99279     3.0   0.99998
    0.07   0.07886      0.9   0.79691     2.0   0.99532      ∞    1.00000
    0.08   0.09008      1.0   0.84270     2.1   0.99702
    0.09   0.10128      1.1   0.88020     2.2   0.99814
    0.10   0.11246      1.2   0.91031     2.3   0.99886
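The tabulated integral is what is now called the error function; in Python it is math.erf, which reproduces the table and, anticipating the coin example worked in art. 10 below, formula (11). A sketch (not part of the original):

```python
# Sketch (not part of the original): the table of art. 9 is math.erf,
# and formula (11) gives the coin example of art. 10.
from math import erf, exp, pi, sqrt

print(erf(0.4769))    # ~0.5000, the "even chance" entry
print(erf(0.5))       # 0.52050, as tabulated

m = n = 100                               # 200 tosses of a coin
a = sqrt((m + n) / (2 * m * n))           # = 1/10, formula (10)
T = sqrt((m + n) / (2 * pi * m * n))      # greatest term, formula (8)
r = 5
print(erf(r * a) + T * exp(-(r * a)**2))  # ~0.56: heads between 95 and 105
```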
When the value of I is 0.5, or 1/2, T = 0.4769.

10. The second term in formula (11) expresses the probability that the number of occurrences of the event shall be exactly m+r or m−r, or more correctly the mean of these two probabilities. It may be neglected when the number of trials N is very great and the deviation r is not a very small number. We see from the foregoing table that when ra = 3 it becomes practically a certainty that the number of occurrences will fall between the limits m ± r. Thus, suppose a shilling is tossed 200 times in succession; here a = 1/10. If therefore r = 30, it may be called a certainty that head will turn up more than 70 and less than 130 times. In the same case suppose we wish to find the limits m ± r such that it is an even chance that the number of heads shall fall between them: if the second term of (11) be neglected, we see from the table that ra = 0.48, and therefore r = 4.8; so that the probability that the number of heads shall fall between 95 and 105 is

$$\frac{2}{\sqrt\pi}\int_0^{\frac12}e^{-x^2}dx + \frac{1}{10\sqrt\pi}e^{-\frac14} = 0.52 + 0.04 = 0.57\ \text{nearly},$$

rather more than an even chance.

11. Neglecting the second term of (11), we see that $P_r$ depends solely on the value of ra, or that of $r/\sqrt N$ (for $a^2 = 1/2Npq$); so that, if the number of trials N be increased, the value of r, to give the same probability, increases as the square root of N: thus, if in N trials it is practically certain (taking ra = 3) that the number of occurrences lies between pN ± r, then, if the number of trials be doubled, it will be certain that the occurrences will lie between $2pN \pm r\sqrt2$. In all cases, if N be given, r can be determined so that there is a probability amounting to certainty that the ratio of the number of occurrences to the whole number of trials shall lie between the limits p ± r/N. Now if N be increased, r increases as $\sqrt N$; so that these limits are $p \pm \frac{G}{\sqrt N}$, G being a constant. Hence it is always possible to increase the number of trials till it becomes a certainty that the proportion of occurrences of the event will differ from p, its probability on a single trial, by a quantity less than any assignable. This is the celebrated theorem given by James Bernoulli in the Ars Conjectandi. (See Todhunter's History, p. 71.)

12. We will give here a graphical representation (fig. 1), taken from M. Quetelet's Lettres sur la théorie des probabilités, of the facilities of the different numbers of successes which may occur in 1000 trials as to any event which is equally likely to happen as not in each trial, as in 1000 tosses of a coin, or 1000 drawings from an urn containing one white and one black ball, replacing the ball each time, or again in drawing 1000 balls together from an urn containing a great number of black and white in equal proportion. As p = q = 1/2, we find from formula (8) that the chance of exactly half the entire number drawn, viz., 500, being white is

$$T = \frac{1}{\sqrt{500\pi}} = 0.02523;$$

and the chance for any number 500 ± s is found by multiplying T by $e^{-\frac{s^2}{500}}$. If then we take the central ordinate to represent T on any scale, and arrange along the horizontal line AB the different numbers of white balls which may occur, and erect opposite each number an ordinate representing the probability of that number, we have a graphical diagram of the relative possibilities of all possible proportions of black and white in the result. We see from it that all values of the number of white balls drawn less than 450, or greater than 550, may be considered impossible, the probabilities for them being excessively small.

[Fig. 1. The curve of facility of the number of white balls in 1000 drawings; the central ordinate, at 500, represents T, and the curve sinks to insensible values beyond 450 and 550.]

The probability of the number of white balls falling between any two assigned limits, as 490 and 520, is found by measuring the area of the figure comprised between the two ordinates opposite those numbers, and dividing the result by the total area.

II. PROBABILITY OF FUTURE EVENTS DEDUCED FROM EXPERIENCE.

13. In our ignorance of the causes which influence future events, the cases are rare in which we know a priori the chance, or "facility," of the occurrence of any given event, as we do, for instance, that of a coin turning up head when tossed. In other cases we have to judge of the chances of its happening from experience alone. We could not say what is the chance that snow will fall in the month of March next from our knowledge of meteorology, but have to go back to the recorded facts. In walking down a certain street at 5 o'clock on three different days, I have twice met a certain individual, and wish to estimate from these data the likelihood of again meeting him under the same circumstances, in ignorance of the real state of things, viz., that he lives in that street, and returns from his business at that hour. Such is nearly the position in which we stand as to the probabilities of the future in the majority of cases. We have to judge, then, from certain recorded facts, of the probability of the causes which have occasioned them, and thence to deduce the probabilities of future events occurring under the operation of the same causes. The term "cause" is not here used in its metaphysical sense, but as simply equivalent to "antecedent state of things."

Let us suppose two urns, A containing two white balls, B containing one white and one black ball, and that a person not knowing which is which has drawn a white ball from one; to find the probability that this is the urn A. This is in fact to find, supposing a great number of such drawings to be made, what proportion of them have come from the urn A. If a great number N of drawings are made indiscriminately from both urns, 1/2 N come from the urn A and are all white; 1/4 N white come from the urn B, and 1/4 N black. The drawing actually made is either one of the 1/2 N white from A, or of the 1/4 N white from B. As it is equally likely to have been any one of these, the chance that it came from A is 1/2 N ÷ 3/4 N, or 2/3. Suppose there had been two urns A and three urns B, and a white ball has been drawn from one of the five; as in a great number N of drawings 2/5 N come from A and are white, 3/5 N from B and 1/2 of them are white, the chance that it came from one of the urns A is 2/5 N ÷ (2/5 + 3/10)N = 4/7.

In general suppose an event to have occurred which must have been preceded by one of several causes, and let the antecedent probabilities of the causes be $P_1, P_2, P_3, \dots$, and let $p_1$ be the probability that when the first cause exists the event will follow, $p_2$ the same probability when the second cause exists, and so on; to find, after the event has occurred, the probabilities of the several causes or hypotheses. Let a great number N of trials be made; out of these the number in which the first cause exists is $P_1N$, and out of this number the cases in which the event follows are $p_1P_1N$; in like manner the cases in which the second cause exists and the event follows are $p_2P_2N$; and so on. As the event has happened, the actual case is one out of the number

$$p_1P_1N + p_2P_2N + p_3P_3N + \dots;$$

and, as the number in which the first cause was present is $p_1P_1N$, the a posteriori probability of that cause is

$$\pi_1 = \frac{p_1P_1}{p_1P_1 + p_2P_2 + p_3P_3 + \dots} \qquad (12).$$

So likewise for the other causes, the sum of these a posteriori probabilities being $\pi_1 + \pi_2 + \pi_3 + \dots = 1$.

Supposing the event to have occurred as above, we now see how the probability as to the future, viz., whether the event will happen or fail in a fresh trial, is affected by it. If the first cause exists, the chance that it will happen is $p_1$; hence the chance of its happening from the first cause is $p_1\pi_1$; so likewise for the second, third, &c. Hence the probability of succeeding on a second trial is

$$p_1\pi_1 + p_2\pi_2 + p_3\pi_3 + \dots \qquad (13).$$
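Formulas (12) and (13) are what is now called Bayes' rule together with the posterior predictive probability; a minimal sketch (not part of the original), applied to the two-urn example of art. 13:

```python
# Sketch (not part of the original): formulas (12) and (13) on the
# two-urn example (urn A two white, urn B one white and one black,
# a white ball having been drawn).
from fractions import Fraction as F

def posteriors(P, p):
    """Formula (12): a posteriori probabilities of the causes."""
    total = sum(Pi * pi for Pi, pi in zip(P, p))
    return [Pi * pi / total for Pi, pi in zip(P, p)]

def next_trial(P, p):
    """Formula (13): chance that the event happens on a fresh trial."""
    return sum(pi * w for pi, w in zip(p, posteriors(P, p)))

P = [F(1, 2), F(1, 2)]    # antecedent probabilities of the two causes
p = [F(1), F(1, 2)]       # chance of drawing white under each cause
print(posteriors(P, p))   # [2/3, 1/3], as in the text
print(next_trial(P, p))   # 5/6: chance the next draw (ball replaced) is white
```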
14. To give a simple example: suppose an urn to contain three balls, which are white or black; one is drawn and found to be white. It is replaced in the urn and a fresh drawing made; find the chance that the ball drawn is white. There are three hypotheses, which are taken to be equally probable a priori, viz., the urn contains three white, two white, or one white, that of none white being now impossible. The probability after the event of the first is by (12)

$$1 \div \left(1 + \tfrac23 + \tfrac13\right) = \tfrac12;$$

that of the second is 1/3, that of the third 1/6. Hence by (13) the chance of the new drawing giving a white ball is

$$1\cdot\tfrac12 + \tfrac23\cdot\tfrac13 + \tfrac13\cdot\tfrac16 = \tfrac79.$$
15. The calculations required in the application of formulas (12) and (13) are often tedious, and such questions may often be solved in a simpler manner. Let us consider the following: an urn contains n balls, each either black or white; a ball is drawn and replaced; if this has been done r times, and in every case a white ball has appeared, to find the chance that the (r+1)th drawing will give a white ball. If r drawings are made successively from an urn containing n balls, always replacing the ball drawn, the number of different ways this may be done is clearly $n^r$. If there be n+1 such urns, one with 0 white balls, one with 1 white, one with 2 white, &c., the last with n white, the whole number of ways in which r drawings can be made from some one of them is $(n+1)n^r$. Now the number of ways in which r drawings, all white, can be made from the first is 0, from the second $1^r$, from the third $2^r$, from the fourth $3^r$, and so on; so that the whole number of ways in which r drawings of a white ball can be made from the n+1 urns is $1^r + 2^r + 3^r + \dots + n^r$. Hence the chance that, if r drawings are made from an urn containing n black or white balls in an unknown proportion, all shall be white is

$$p_r = \frac{1^r + 2^r + 3^r + \dots + n^r}{(n+1)n^r};$$

for all we know of the contents of such an urn is that they are equally likely to be those of any one of the n+1 urns above. If now a great number N of trials of r drawings be made from such urns, the number of cases where all are white is $p_rN$. If r+1 drawings are made, the number of cases where all are white is $p_{r+1}N$; that is, out of the $p_rN$ cases where the first r drawings are white there are $p_{r+1}N$ where the (r+1)th is also white; so that the probability sought in the question is

$$\frac{p_{r+1}}{p_r} = \frac{1}{n}\cdot\frac{1^{r+1} + 2^{r+1} + 3^{r+1} + \dots + n^{r+1}}{1^r + 2^r + 3^r + \dots + n^r}.$$

16. Let us consider the same question when the ball is not replaced. First suppose the n balls arranged in a row from A to B as below, the white on the left, the black on the right, the arrow marking the point of separation; this point is unknown (as it would be to a blind man), and is equally likely to be in any of its n+1 possible positions.

        1 2
    A o o o o o o o o o o o B
                ↑

Now if two balls, 1 and 2, are selected at random, the chance that both are white is the chance of the arrow falling in the division 2B of the row.
But this chance is the same as that of a third ball 3 (different from 1 and 2), chosen at random, falling in 2B, which chance is 1/3, because it is equally probable that 1, 2, or 3 shall be the last in order. It is easy to see that these chances are the same if we reflect that, the ball 3 being equally likely to fall in A1, 12, or 2B, the number of possible positions for the arrow in each division always exceeds by 1 the number of positions for 3; therefore, as 3 is equally likely to fall in any of the three divisions, so is the arrow. The chance that two balls drawn at random shall both be white is thus 1/3; in the same way that for three balls is 1/4, and so on. Hence the chance that r balls drawn shall all be white is

$$p_r = \frac{1}{r+1};$$

the same chance for r+1 balls is $p_{r+1} = \frac{1}{r+2}$; thus, as in a large number N of trials the number of cases where the first r drawn are white is $p_rN$, and the number where the first r+1 are white is $p_{r+1}N$, we have the result: if r balls are drawn and all prove to be white, the chance that the next drawn shall also be white is

$$\frac{p_{r+1}}{p_r} = \frac{r+1}{r+2}.$$

This result is thus independent of n, the whole number of balls. It applies to repeated trials as to any event, provided we have really no a priori knowledge as to the chance of success or failure on one trial, so that all values for this chance are equally likely before the trial or trials. Thus, if we see a stranger hit a mark four times running, the chance he does so again is 5/6; or, if a person, knowing nothing of the water where he is fishing, draws up a fish each time in four casts of his line, the same is the chance of his succeeding a fifth time.

It may be asked why the above reasoning does not apply to the case of the chance of a coin which has turned up head r times doing so once more. The reason is that the antecedent probabilities of the different hypotheses are not equal. Thus, let a shilling have turned up head once; to find the chance of its doing so a second time. In formula (12) three hypotheses may be made as to a double throw: (1) two heads, (2) a head and a tail, (3) two tails; but the probabilities of these are respectively 1/4, 1/2, 1/4; therefore by (12) the probability of the first is 1/4 ÷ (1/4 + 1/4) = 1/2; that of the second is also 1/2; and by (13) the probability of succeeding on a second trial is 1/2, because, if hypothesis 2 is the true one, the second trial must fail.

In cases where we know, or rather think we know, the facility as to a single trial, if the result of a number of trials gives a large difference in the proportion of successes to failures from what we should anticipate, this will afford an appreciable presumption that our assumption as to the facility was erroneous, as indeed common sense indicates. If a coin turns up head twenty times running, we should say the two faces are probably not alike, or that it was not thrown fairly. We shall see later on, when we come to treat of the combination of separate probabilities as to the same event, the method of dealing with such cases (see art. 39).
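The rule of succession just derived can be verified by brute force; the following sketch (not part of the original) averages over the n+1 equally likely urns and confirms that the answer, 5/6 for r = 4, is independent of n:

```python
# Sketch (not part of the original): brute-force check of the rule of
# succession of art. 16 (drawing without replacement).
from fractions import Fraction as F
from math import comb

def next_white(n, r):
    """P(the (r+1)th ball is white | the first r drawn are all white)."""
    all_r  = sum(F(comb(w, r),     comb(n, r))     for w in range(n + 1))
    all_r1 = sum(F(comb(w, r + 1), comb(n, r + 1)) for w in range(n + 1))
    return all_r1 / all_r

for n in (10, 50, 200):
    print(next_white(n, 4))   # 5/6 for every n, as the text asserts
```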
We will give another example which may be easily solved by means of (12), or by the simpler process below. There are n horses in a race, about which I have no knowledge except that one of the horses, A, is black; as to the result of the race I have only the information that a black horse has certainly won: to find the chance that this was A, supposing the proportion of black among racehorses in general to be p, i.e., the probability that any given horse is black is p. Suppose a large number N of trials made as to such a case. A wins in N/n of these. Another horse B wins in N/n; out of these B is black in pN/n. Likewise for C; and so on. Hence the actual case which has occurred is one out of the number

$$\frac{N}{n} + (n-1)p\frac{N}{n};$$

and, as of these the cases in which A wins are N/n, the required chance that A has won is

$$\frac{1}{1 + (n-1)p}.$$

17. We now proceed to consider the important theorem of Bayes (see Todhunter, p. 294; Laplace, Théorie analytique des probabilités, chap. 6), the object of which is to deduce from the experience of a given number of trials, as to an event which must happen or fail on each trial, the information thus afforded as to the real facility of the event in any one trial, which facility is identical with the proportion of successes out of an infinite number of trials, were it possible to make them. Thus we find in the Carlisle Table of Mortality that of 5642 persons aged thirty, 1245 died before reaching fifty; it becomes then a question how far we can rely on the real facility of the event, that is, the proportion of mankind aged thirty who die before fifty, not differing from the ratio 1245/5642 by more than given limits of excess or defect. Again, it may be asked, if 5642 (or any other number of) fresh trials be made, what is the probability that the number of deaths shall not differ from 1245 by more than a given deviation?

The question is equivalent to the following: an urn contains a very great number of black and white balls, the proportion of each being unknown; if, on drawing m+n balls, m are found white and n black, to find the probability that the proportion of the numbers in the urn of each colour lies between given limits. The question will not be altered if we suppose all the balls ranged in a line AB (fig. 2), the white ones on the left, the black on the right, the point X where they meet being unknown, and all positions for it in AB being a priori equally probable. Then, m+n points having been chosen at random in AB, m are found to fall on AX, n on XB. That is, all we know of X is that it is the (m+1)th in order, beginning from A, of m+n+1 points chosen at random in AB.

[Fig. 2. The line AB, the m+n random points, and the point X separating the white from the black.]

If we put AB = 1, AX = x, the number of cases when the point X falls on the element dx is measured by

$$\frac{(m+n)!}{m!\,n!}\,x^m(1-x)^n\,dx;$$

since for a specified set of m points, out of the m+n, falling on AX, the measure would be $x^m(1-x)^n dx$, and the number of such sets is $\frac{(m+n)!}{m!\,n!}$. Now the whole number of cases is given by integrating this differential from 0 to 1; and the number in which X falls between given distances α, β from A is found by integrating from α to β. Hence the probability that the ratio of the white balls in the urn to the whole number lies between any two given limits α, β is

$$P = \frac{\displaystyle\int_\alpha^\beta x^m(1-x)^n dx}{\displaystyle\int_0^1 x^m(1-x)^n dx} \qquad (14).$$

The curve of frequency for the point X after the event (that is, the curve whose ordinate at any point of AB is proportional to the frequency or density of the positions of X in the immediate vicinity of that point) is $y = x^m(1-x)^n$; the maximum ordinate KV occurs at a point K dividing AB in the ratio m : n, the ratio of the total numbers of white and black balls being thus more likely to be that of the numbers of each actually drawn than any other.
Let us suppose, for instance, that three white and two black have been drawn; to find the chance that the proportion of white balls is between 2/5 and 4/5 of the whole, that is, that it differs by less than 1/5 from 3/5, its most natural value. Here

$$P = \frac{\displaystyle\int_{2/5}^{4/5}x^3(1-x)^2dx}{\displaystyle\int_0^1 x^3(1-x)^2dx} = \frac{2256}{3125} = \frac{18}{25}\ \text{nearly}.$$

18. An event has happened m times and failed n times in m+n trials; to find the probability that in p+q further trials it shall happen p times and fail q times, that is, that, p+q more points being taken at random in AB, p shall fall on AX and q on XB. The whole number of cases is measured by

$$\frac{(m+n)!}{m!\,n!}\int_0^1 x^m(1-x)^n dx;$$

and the number of favourable cases, for any particular set of p points out of the p+q additional ones falling on AX, is measured by

$$\frac{(m+n)!}{m!\,n!}\int_0^1 x^{m+p}(1-x)^{n+q}dx;$$

because, the number of cases as to the m+n points being, when X falls on the element dx, $\frac{(m+n)!}{m!\,n!}x^m(1-x)^n dx$, each of these affords $x^p(1-x)^q$ cases in which the p new points fall on AX and the q others on XB. Now, the number of different sets of p points being $\frac{(p+q)!}{p!\,q!}$, the required probability is

$$\frac{(p+q)!}{p!\,q!}\cdot\frac{\displaystyle\int_0^1 x^{m+p}(1-x)^{n+q}dx}{\displaystyle\int_0^1 x^m(1-x)^n dx} \qquad (15);$$

or, by means of the known values of these definite integrals,

$$\frac{(p+q)!}{p!\,q!}\cdot\frac{(m+p)!\,(n+q)!}{m!\,n!}\cdot\frac{(m+n+1)!}{(m+n+p+q+1)!} \qquad (16).$$

For instance, the chance that in one more trial the event shall happen is $\frac{m+1}{m+n+2}$. This is easy to verify, as the line AB has been divided into m+n+2 sections by the m+n+1 points taken on it (including X). Now if one more trial is made, i.e., one more point taken at random, it is equally likely to fall in any section; and m+1 sections are favourable.
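Formula (16) in a few lines of Python (a sketch, not part of the original), with the closing verification of art. 18 and the rule of succession as checks:

```python
# Sketch (not part of the original): formula (16), the chance of p
# successes and q failures in p+q new trials after m successes and n
# failures.
from fractions import Fraction as F
from math import comb, factorial

def rule16(m, n, p, q):
    num = comb(p + q, p) * factorial(m + p) * factorial(n + q) * factorial(m + n + 1)
    den = factorial(m) * factorial(n) * factorial(m + n + p + q + 1)
    return F(num, den)

print(rule16(3, 2, 1, 0))    # 4/7 = (m+1)/(m+n+2)
print(rule16(4, 0, 1, 0))    # 5/6, the rule of succession again
print(sum(rule16(3, 2, p, 3 - p) for p in range(4)))   # 1, as it must be
```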

19. When the number of trials m+n in art. 17 is large, the probability is considerable that the facility of the event on a single trial will not differ from its most natural value, viz., $\frac{m}{m+n}$, by more than a very small deviation. To make this apparent, we shall have to modify formula (14), which gives, for the chance that this facility lies between the limits α and β (substituting for the denominator its known value),

$$P = \frac{(m+n+1)!}{m!\,n!}\int_\alpha^\beta x^m(1-x)^n dx \qquad (17).$$

To find now the probability that the facility lies between the limits $\frac{m}{m+n} \pm \delta$, where δ is small, put $x = \frac{m}{m+n} + z$, so that the integration extends from z = −δ to z = +δ. Now if z is small, and we put $u = (a+z)^m$, where $a = \frac{m}{m+n}$,

$$\log u = m\log a + \frac{mz}{a} - \frac{mz^2}{2a^2},$$

correct as far as the square of z; and similarly for $(b-z)^n$, where $b = \frac{n}{m+n}$. Hence the two factors under the sign of integration combine into

$$x^m(1-x)^n = \frac{m^mn^n}{(m+n)^{m+n}}\,e^{-\frac{(m+n)^3}{2mn}z^2},$$

the terms of the first order in z cancelling; so that

$$P = \frac{(m+n+1)!}{m!\,n!}\cdot\frac{m^mn^n}{(m+n)^{m+n}}\int_{-\delta}^{+\delta}e^{-\frac{(m+n)^3}{2mn}z^2}dz \qquad (18).$$

Now, since by Stirling's theorem $m! = m^{m+\frac12}e^{-m}\sqrt{2\pi}$, &c., the constant coefficient here becomes

$$\frac{(m+n+1)(m+n)^{m+n+\frac12}e^{-(m+n)}\sqrt{2\pi}\cdot m^mn^n}{m^{m+\frac12}e^{-m}\sqrt{2\pi}\cdot n^{n+\frac12}e^{-n}\sqrt{2\pi}\cdot(m+n)^{m+n}} = \sqrt{\frac{(m+n)^3}{2\pi mn}} \qquad (19),$$

taking m+n+1 = m+n. If we now substitute

$$t = z\sqrt{\frac{(m+n)^3}{2mn}},\qquad \tau = \delta\sqrt{\frac{(m+n)^3}{2mn}} \qquad (20),$$

we find finally

$$P = \frac{2}{\sqrt\pi}\int_0^\tau e^{-t^2}dt \qquad (21)$$

for the approximate value of the probability that the real facility of the event lies between the limits $\frac{m}{m+n} \pm \delta$.

Thus, if out of 10,000 trials the event has happened 5000 times, the probability that the real facility shall lie between 99/200 and 101/200 of the whole, that is, differ from 1/2 by less than 1/200, will be P = 0.678, or about 2/3; for we find from (20)

$$\tau = \frac{1}{200}\sqrt{\frac{(10^4)^3}{2\cdot5000\cdot5000}} = \frac{1}{\sqrt2} = 0.707\ \text{nearly};$$

and, referring to the table in art. 9, we find the above value for the integral (21). We must refer to the sixth chapter of Laplace for the investigation of how far the number of successes in a given number of fresh trials may be expected to deviate from the natural proportion, viz., that of the observed cases, as also for several closely allied questions, with important applications to statistics.

III. ON EXPECTATION.

20. The value of a given chance of obtaining a given sum of money is the chance multiplied by that sum; for in a great number of trials this would give the sum actually realized. The same may be said as to loss. Thus if it is 2 to 1 that a horse will win a race, it is considered a fair wager to lay £20 to £10 on the result; for the value of the expected gain is 2/3 of 10, and that of the expected loss 1/3 of 20, which are equal. Thus, if the probabilities for and against an event are p, q, and I arrange in any way to gain a sum a if it happens and lose a sum b if it fails, then if pa = qb I shall neither gain nor lose in the long run; but if the ratio a : b be less than this, my expectation of loss exceeds that of gain; or, in other words, I must lose in the long run.

The above definition is what is called the mathematical expectation; but it clearly is not a proper measure of the advantage or loss to the individual; for a poor man would undoubtedly prefer £500 down to the chance of £1000 if a certain coin turns up head. The importance of a sum of money to an individual, or its moral value, as it has been called, depends on many circumstances which it is impossible to take into account; but, roughly and generally, there is no doubt that Daniel Bernoulli's hypothesis, viz., that this importance is measured by the sum divided by the fortune of the individual, is a true and natural one. Thus, generally speaking, £5 is the same to a man with £1000 as £50 to one with £10,000; and it may be observed that this principle is very generally acted on, in taxation, &c. (This rule must be understood to hold only when the sum is very small, or rather infinitesimal, strictly speaking; it would lead to absurdities if it were used for large increments, though Buffon has done so; see Todhunter, p. 345. Thus, to a man possessing £100, it is of the same importance to receive a gift of £100 as two separate gifts of £50; but the rule would give as the measure of the importance of the first 100/100 = 1, while in the other case it would give 50/100 + 50/150 = 5/6. The real measure of the importance of an increment when not small is a matter for calculation, as shown in the text.)

21. To estimate, according to this hypothesis, the advantage or moral value of his whole fortune to the individual, or his moral fortune, as Laplace calls it, in contradistinction to his physical fortune, let x = his physical fortune, y = his moral fortune; then, if the former receive a small increment dx, we have, from Daniel Bernoulli's principle, dy = k dx/x, whence

$$y = k\log\frac{x}{h} \qquad (22),$$

k, h being two constants. x and y are always positive, and x > h; for every man must possess some fortune, or its equivalent, in order to live.
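Formula (22) in use (a sketch, not part of the original; the fortune of 600 and the subsistence level h = 100 are invented for illustration): it reproduces art. 20's observation that a poor man should prefer £500 down to an even chance of £1000, although the mathematical expectations of the two offers are equal.

```python
# Sketch (not part of the original; the figures are invented): Daniel
# Bernoulli's moral fortune, formula (22), applied to the poor man of
# art. 20.
from math import log

def moral(x, k=1.0, h=100.0):     # formula (22): y = k log(x/h)
    return k * log(x / h)

fortune = 600.0
certain = moral(fortune + 500)
gamble = 0.5 * moral(fortune + 1000) + 0.5 * moral(fortune)
print(certain, gamble)            # 2.398 > 2.282: the certainty is better
```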
Now his moral expectation from the first chance that is, the increment of his moral fortune into the chance is a + a , a ) 7 , . . , , g = lSir ( =pKiog(a + a)-pKiog( . Hence his whole moral expectation is 2 1 This rule must be understood to hold only when the sum is very small, or rather infinitesimal, strictly speaking. It would lead to absurdities if it were used for large increments (though Buffon has done so ; see Todhunter, p. 345). Thus, to a man possessing 100, it is of the sume importance to receive a gift of 100 as two separate gifts of 30 ; but this rule would give as the measure of the importance of the first |8R=1; while in the other case, it would give $&+ jiyiv^f . The real measure of the importance of an increment when not small is a matter for calculation, as shown in the text. 2 It is important to remark that we should be wrong in thus adding the expectations if the events were not mutually exclusive. For the mathematical expectations it is not so. and, if Y stands for his moral fortune including this expectation, that is, k log ~ + E, we have

$$Y = kp\log(a+\alpha) + kq\log(a+\beta) + \dots - k\log h\qquad(23).$$

Let X be the physical fortune corresponding to this moral one; then, by (22),

$$Y = k\log X - k\log h.$$

Hence
$$X = (a+\alpha)^p (a+\beta)^q (a+\gamma)^r \dots\qquad(24);$$
and X − a will be the actual or physical increase of fortune which is of the same value to him as his expectation, and which he may reasonably accept in lieu of it. The mathematical value of the same expectation is
$$p\alpha + q\beta + r\gamma + \dots\qquad(25).$$

23. Several results follow from (24). Thus, if the sums α, β, γ ... are very small, it is easy to see that the moral expectation coincides with the mathematical; for
$$X = a\left(1+\frac{\alpha}{a}\right)^p\left(1+\frac{\beta}{a}\right)^q\dots = a\left(1+\frac{p\alpha}{a}+\frac{q\beta}{a}+\dots\right) = a + p\alpha + q\beta + \dots,$$
neglecting squares of the small fractions.

24. We may show also that it is disadvantageous to play at even a fair game of chance (unless the stakes are very small, in which case the last article applies). Thus, suppose a man whose fortune is a plays at a game where his chance of winning a sum α is p, and his chance of losing a sum β is q = 1 − p. If the game is fair, $p\alpha = q\beta$, whence $p = \frac{\beta}{\alpha+\beta}$, $q = \frac{\alpha}{\alpha+\beta}$. Now by (24) the physical fortune which is equivalent to his prospects after the game is
$$X = (a+\alpha)^p(a-\beta)^q = \left\{(a+\alpha)^\beta(a-\beta)^\alpha\right\}^{\frac{1}{\alpha+\beta}}.$$
Now the geometrical mean of several quantities is less than the arithmetical,[5] so that, taking β quantities a + α and α quantities a − β,
$$\left\{(a+\alpha)^\beta(a-\beta)^\alpha\right\}^{\frac{1}{\alpha+\beta}} < \frac{\beta(a+\alpha)+\alpha(a-\beta)}{\alpha+\beta} = a,$$
or X < a; so that he must expect morally to lose by the game.

25. The advantage of insurance against risks may be seen by the following instance. A merchant, whose fortune is represented by 1, will realize a sum ε if a certain vessel arrives safely. Let the probability of this be p. To make up exactly for the risk run by the insurance company, he should pay them a sum (1−p)ε. If he does, his moral fortune becomes
$$k\log\frac{1+p\varepsilon}{h};$$
while, if he does not insure, it will be, by (23),
$$kp\log\frac{1+\varepsilon}{h} + k(1-p)\log\frac{1}{h}.$$
Now the first of these exceeds the second, so that he gains by insuring on these terms; because
$$1+p\varepsilon > (1+\varepsilon)^p;$$
for, putting $p = \frac{m}{m+n}$,
$$\left(1+\frac{m\varepsilon}{m+n}\right)^{m+n} > (1+\varepsilon)^m,$$
because (see note, art. 24), if $m(1+\varepsilon)+n$ is divided into m + n equal parts, their product is greater than that of m factors each equal to 1 + ε and n factors each equal to 1. The merchant will still gain by paying, over and above what covers the risk of the company, a sum σ, at most, which satisfies
$$\log(1 - \sigma + p\varepsilon) = p\log(1+\varepsilon).$$
By paying any sum not exceeding this value, he still gains, while the insurance office also makes a profit, which is really a certainty when it has a large business; so that, as Laplace remarks, this example explains how such an office renders a real service to the public, while making a profit for itself. In this it differs from a gambling establishment, in which case the public must lose, in any sense of the term.
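A short numerical illustration of arts. 23 to 25 may be useful; it is a sketch only, assuming Daniel Bernoulli's logarithmic rule and formula (24) as reconstructed above, with figures and names of our own choosing.

```python
# Illustration of arts. 23-25 under Bernoulli's log-utility hypothesis.
from math import log

def equivalent_fortune(a, outcomes):
    """Physical fortune X equivalent to (chance, gain) prospects,
    by formula (24): X = product of (a + gain)^chance."""
    X = 1.0
    for p, gain in outcomes:
        X *= (a + gain) ** p
    return X

a = 1000.0
# A fair game (art. 24): win 100 or lose 100, each with chance 1/2.
X = equivalent_fortune(a, [(0.5, 100.0), (0.5, -100.0)])
print(X - a)   # about -5.0: even a fair game is morally a loss

# Insurance (art. 25): cargo worth 200 arrives with chance p = 0.9.
p, eps = 0.9, 200.0
uninsured = equivalent_fortune(a, [(p, eps), (1 - p, 0.0)])
insured = a + p * eps          # receives eps, pays fair premium (1-p)*eps
print(insured - uninsured)     # positive: insuring at fair terms gains
```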

It may be shown that it is better to expose one's fortune in separate sums to risks independent of each other than to expose the whole to the same danger.[6] Suppose a merchant, having a fortune a, has besides a sum ε which he must receive if a ship arrives in safety. By (24) the value in money of his present fortune is

$$X = (a+\varepsilon)^p a^q,$$

where p = the chance of the ship arriving, and q = 1 − p. Now suppose he risks the same sum in two equal portions, in two ships. We cannot apply (23), as the events are not mutually exclusive; but we see that, if both ships arrive, the chance of this being p², he realizes the whole sum ε; if one only arrives, the chance being 2pq, he receives ½ε; if both are lost, the chance being q², he loses all. Thus by (24) he is now worth a sum
$$X' = (a+\varepsilon)^{p^2}\left(a+\tfrac{1}{2}\varepsilon\right)^{2pq}a^{q^2}.$$
Now this sum is greater than the former; for
$$(a+\varepsilon)^{p^2-p}\left(a+\tfrac{1}{2}\varepsilon\right)^{2pq}a^{q^2-q} > 1;$$
that is, since $p^2-p = -pq$ and $q^2-q = -pq$,
$$\left(a+\tfrac{1}{2}\varepsilon\right)^2 > (a+\varepsilon)\,a,$$
as is obviously true. Now suppose he risks the sum ε in three separate ventures. His fortune will be
$$X'' = (a+\varepsilon)^{p^3}\left(a+\tfrac{2}{3}\varepsilon\right)^{3p^2q}\left(a+\tfrac{1}{3}\varepsilon\right)^{3pq^2}a^{q^3};$$
and we have to show that this is worth more than when there were two. If we put a outside each bracket, and put $\delta = \frac{\varepsilon}{3a}$, we have to prove
$$(1+3\delta)^{p^3}(1+2\delta)^{3p^2q}(1+\delta)^{3pq^2} > (1+3\delta)^{p^2}\left(1+\tfrac{3}{2}\delta\right)^{2pq},$$
which may be verified by ordinary algebra. Laplace shows (ch. x.) that the gain continues to increase by subdivision of the risk; it could no doubt be shown throughout by ordinary algebra. He shows further that the moral advantage tends to become equal to the mathematical. This may be done more easily thus. The expression is, when ε is divided into r equal parts,

$$X = (a+\varepsilon)^{p^r}\left(a+\frac{r-1}{r}\varepsilon\right)^{rp^{r-1}q}\left(a+\frac{r-2}{r}\varepsilon\right)^{\frac{r(r-1)}{1\cdot 2}p^{r-2}q^2}\dots\ a^{q^r},$$

and we have to find the limit towards which this tends as r becomes infinitely great.

Put $z = a + \varepsilon$; the general factor of X is then
$$\left(z - \frac{s\varepsilon}{r}\right)^{T_s},$$
where $T_s$ is the term of the binomial expansion of $(p+q)^r$ containing $q^s$. Now in that expansion the greatest term is the (qr+1)th, viz.,
$$T = \frac{r(r-1)\dots(rp+1)}{1\cdot 2\cdot 3\dots rq}\,p^{rp}q^{rq}.$$
The factor in X corresponding to this is
$$\left(z-\frac{qr\varepsilon}{r}\right)^{T} = U^{T},\quad\text{if we put } U = z - q\varepsilon.$$
The factors towards the beginning and end of the product may all be taken as 1, because the terms of the binomial increase rapidly in value from either end when r = ∞, and we shall have the true limit for X by taking an indefinitely great number of factors on either side of U, which number, however, may be infinitely less than r. Comparing the factor at s places before or after the greatest with U itself, and making use of the result of art. 8, that the terms of the binomial on either side of the greatest are given by
$$T_s = T e^{-\frac{s^2}{2pqr}},$$
it is found, on summing the exponents, that the logarithm of the ratio of X to U becomes infinitesimal when r = ∞. The limit therefore towards which X tends is
$$X = U = z - q\varepsilon = a + p\varepsilon,$$
that is, the mathematical value of the fortune. The very important applications of probability to annuities and insurance are to be found in the articles on those subjects, to which therefore we refer the reader.
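The limit just proved is easy to watch numerically. The sketch below assumes only formula (24) and the binomial chances of k ships out of r arriving; the figures are our own.

```python
# With Bernoulli's rule (24), subdividing a risked sum eps into r equal
# independent ventures drives the morally equivalent fortune X toward
# the mathematical value a + p*eps.
from math import comb, log, exp

def moral_value(a, eps, p, r):
    """X = product over k of (a + k*eps/r)^P(k of r ships arrive)."""
    q = 1 - p
    logX = sum(comb(r, k) * p**k * q**(r - k) * log(a + k * eps / r)
               for k in range(r + 1))
    return exp(logX)

a, eps, p = 1000.0, 600.0, 0.9
for r in (1, 2, 3, 10, 100):
    print(r, moral_value(a, eps, p, r))   # rises toward a + p*eps = 1540
```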
IV. PROBABILITY OF TESTIMONY.

26. We have here to treat of the probability of events attested by several witnesses of known credibility, or which have several different probabilities in their favour, derived from different independent sources of information of any kind, of known values.[1] A witness may fail in two ways: he may be intentionally dishonest, or he may be mistaken; his evidence may be false, either because he wishes to deceive, or because he is deceived himself. However, we will not here take separate account of these two sources of error, but simply consider the probability of the truth of a statement made by a witness, which will be a true measure of the value of his evidence. To estimate this probability in any given case is not an easy matter; but if we could examine a large number of statements made by a certain person, and find in how many of them he was right, the ratio of these numbers would give the probability that any statement of his, taken at random, whether past or future, is a true one.

27. Suppose a witness, whose credibility is p, states that a fact occurred or did not occur, or that an event turned out in one way, when only two ways are possible. If nothing was known a priori as to the probability of the fact, or if its real facility was ½, it is clear that the probability that it did occur is p. For if a great number N of trials were made (either really as to the event, if its facility is known to be ½, as in tossing a coin; or as to it and other cases resembling it as to our ignorance of the real facility, if such is the state of things), in ½N the event happens, and out of these the witness asserts in ½pN cases that it did happen. Now, out of the whole number, he asserts in ½N cases that it happened, as there is no reason for his affirming oftener than he denies (or, it may be said, he affirms in ½pN cases where it did happen, and in ½(1−p)N cases where it did not). Hence, dividing the whole number of cases when it happens and he affirms it by the whole number of cases where he affirms it, we find $\frac{1}{2}pN \div \frac{1}{2}N = p$. We have entered at length on the proof of what is almost self-evident (perhaps indeed included in the definition) in this case, because the same method will succeed in other cases which are not so easily to be discerned.

28. Let us now consider the same question when the a priori probability of the fact or event is known. Suppose a bag contains n balls, one white and the rest black, and the same witness says he has seen the white ball drawn; what is the chance that it was drawn? A great number N of trials being made, the number in which the white ball is drawn is n⁻¹N, and out of these he states it in n⁻¹pN cases. Out of the remaining (1−n⁻¹)N cases where a black ball was drawn, he says (untruly) that in (1−p)(1−n⁻¹)N cases it was white. Now, dividing the number of favourable cases, viz., those where he says it is white and it is so, by the whole number of cases, viz., those where he says it is white, we have for the probability required
$$\varpi = \frac{n^{-1}p}{n^{-1}p + (1-n^{-1})(1-p)} = \frac{p}{p + (n-1)(1-p)}\qquad(26).$$
This holds for any event whose a priori probability is n⁻¹. If n be very large, this probability will be very small, unless p is nearly = 1; and, indeed, if we go back to the common-sense view, it is clear we should hesitate to believe a man who said he had drawn the white ball from a bag containing 10,000 balls, all but it being black. It may be observed that if n = 2, ϖ = p, as in art. 27. We have thus a scientific explanation of the universal tendency rather to reject the evidence of a witness than to accept the truth of a fact attested by him, when it is in itself of an extraordinary or very improbable nature.

29. Two independent witnesses, A and B, both state a fact, or that an event turned out in a particular way (only two ways being possible); to find the probability of the truth of the statement. Supposing nothing is known a priori as to the event in question, let a great number N of trials be made as to such events; the number of successes will be ½N; out of these the witness A affirms the success in ½pN cases; out of these the witness B affirms it too, in ½pp′N cases.[2] Out of the ½N failures A affirms a success in ½(1−p)N cases; and out of these B also affirms one in ½(1−p)(1−p′)N cases. Hence, dividing the favourable cases by the whole number, the probability sought is
$$\varpi = \frac{pp'}{pp' + (1-p)(1-p')}\qquad(27),$$
where p, p′ are the credibilities of the two witnesses. This very important result also holds if p be the probability of the event derived from any source, and p′ the credibility of one witness, as in art. 28; or if p and p′ be any independent probabilities, derived from any sources, as to one event.

[1] The question now before us is quite different from that of the chance of an event happening or having happened which may happen in different ways, in which case we add the separate probabilities. Thus if there are but two horses in a race, of equal merit and belonging to one owner, his chance of winning is ½ + ½ = 1. But suppose I only know that one of the two is his, and, besides, some one whose credibility is ½ tells me he has won the race; here I have two separate probabilities of ½ each for the same event; but it would clearly be wrong to add them together.

[2] Here we are assuming the independence of the witnesses. If B, for instance, were disposed to follow A's statements, or to dissent from them, he would affirm the success here in more or fewer than ½pp′N cases.
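Formula (26) can be checked by direct trial under the model of art. 28, a witness who tells the truth with probability p and otherwise reverses it; the sketch and its names are ours.

```python
# Monte Carlo check of formula (26) for the white-ball witness.
import random

def trial(n, p):
    drawn_white = (random.randrange(n) == 0)
    says_white = drawn_white if random.random() < p else not drawn_white
    return drawn_white, says_white

n, p, N = 100, 0.9, 200_000
hits = total = 0
for _ in range(N):
    white, said = trial(n, p)
    if said:
        total += 1
        hits += white
print(hits / total)                    # simulated chance he is right
print(p / (p + (n - 1) * (1 - p)))     # formula (26): about 0.083
```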
30. We give another method of establishing the formula (27). Referring to art. 13, the observed event is the concurrent evidence of A and B that a statement is true. There are two hypotheses, that it is true or that it is false. Antecedent to B's evidence the probabilities of these hypotheses are p and 1 − p (art. 27), as A has said that it is true. The observed event now is that B says the same. On the first hypothesis, the probability that he will say this is p′; on the second, it is 1 − p′. Hence by formula (12) the probability a posteriori of the first hypothesis, viz., that the joint statement is true, is, as before,
$$\frac{pp'}{pp' + (1-p)(1-p')}.$$

31. If a third witness, whose credibility is p″, concurs with the two former, we shall have to combine p″ with ϖ in formula (27); hence the probability ϖ′ of the statement when made by three witnesses is
$$\varpi' = \frac{pp'p''}{pp'p'' + (1-p)(1-p')(1-p'')}\qquad(28);$$
and so on for any number. As an example, let us find how many witnesses to a fact, the odds against which are 1,000,000,000,000 to 1, would be required to make it an even chance that the fact did occur, supposing the credibility of each witness to be p = 9/10. Let x be the number.

The fact being attested by x such witnesses, and its a priori probability being $10^{-12}$, the resulting probability is, by the principle of art. 29,
$$\varpi = \frac{10^{-12}\left(\frac{9}{10}\right)^x}{10^{-12}\left(\frac{9}{10}\right)^x + \left(\frac{1}{10}\right)^x};$$
and for this to be ½ we must have $9^x = 10^{12}$, or
$$x = \frac{12}{\log 9} = 12.57\dots;$$
so that thirteen such witnesses would render the chance more than an even one.

32. Let us now consider an event which may turn out in more than two ways, and let each way be equally probable a priori, and suppose a witness whose credibility is p states that it turned out in a certain way; what is the chance that it did so? Thus if a die has been thrown, and he states that ace turned up; or if tickets in a lottery are numbered 1, 2, 3, &c., and he states that 1 was drawn; to find the chance that he is right. Take the case of the die, and suppose a great number N of throws. In ⅙N the ace turns up, and he says so in ⅙pN cases. In ⅙N the two turns up, and he is wrong in ⅙(1−p)N cases out of these; but he says ace in only ⅕ of these, as there is no reason why he should give it more or less often than any of the five wrong numbers. In the same way for the other throws; so that the whole number of cases where he says ace turned up is
$$\tfrac{1}{6}pN + 5\cdot\tfrac{1}{5}\cdot\tfrac{1}{6}(1-p)N = \tfrac{1}{6}N;$$
and, the number, out of these, when it actually turned up being ⅙pN, we find the chance it did turn up is p, the credibility of the witness. In any such case, this result will hold. We might indeed safely have argued that when the die is thrown a great number of times, any witness, whatever his veracity, will quote each face as often as any other, as there is no reason for one to turn up oftener than another, nor for him to affirm, rightly or wrongly, one rather than another; so that he will say ace in ⅙N of the throws, while he says ace in ⅙pN out of the ⅙N cases where it does turn up. This result compared with art. 28 affords an apparent paradox. If a large number of tickets are marked 1, 0, 0, 0, 0, 0, ... and a witness states that 1 has been drawn from the bag, we see from art. 28 that the chance he is right is very small; whereas if the tickets were marked 1, 2, 3, 4, 5, 6, ... and he states that 1 has been drawn, the chance he is right is p, his own credibility. However, we must remember that in the first case he is limited to two statements, 1 and 0, and he makes the first, which is very improbable in itself; whereas in the other case, the assertion he makes is in itself as probable as any other he can make, e.g., that 2 was the ticket drawn, and therefore our expectation of its truth depends on his own credibility only.

33. Suppose now that two witnesses A, B both assert that the event has turned out in a certain way, there being, as in art. 32, n equally probable ways. Both, for instance, say that in a lottery numbered 1, 2, 3, 4, 5, ... No. 1 has been drawn. A large number N of drawings being made, 1 is drawn in n⁻¹N cases; out of these A says 1 in n⁻¹pN cases, and out of these B also says 1 in n⁻¹pp′N. No. 2 is drawn in n⁻¹N cases; here A is wrong in n⁻¹(1−p)N, but says 1 in only (n−1)⁻¹n⁻¹(1−p)N; and B will also say 1 in (1−p′)(n−1)⁻¹ of these; that is, both agree that 1 has been drawn in
$$\frac{(1-p)(1-p')}{n(n-1)^2}\,N$$
cases. So likewise if No. 3 has been drawn, and so on; hence, when No. 1 has not been drawn, they both say that it has in
$$\frac{(1-p)(1-p')}{n(n-1)}\,N$$
cases. Hence the number of cases where they are right divided by the whole number of cases where they make the statement, that is, the probability that No. 1 has been drawn, is
$$\varpi = \frac{pp'}{pp' + \dfrac{(1-p)(1-p')}{n-1}}\qquad(29).$$
If n be a large number the chance that they have named the ticket drawn is nearly certainty. Thus, if two independent witnesses both select the same man out of a large number, as the one they have seen commit a crime, the presumption is very strong against him.
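Formula (29) is likewise easy to test by trial; the following sketch adopts the reporting model of art. 33 (a mistaken witness names any one of the n − 1 wrong tickets indifferently), with parameters of our own.

```python
# Monte Carlo check of formula (29): two independent witnesses of
# credibilities p1, p2 both name ticket No. 1.
import random

def says_one(drawn, p, n):
    if random.random() < p:
        return drawn == 1                    # truthful report
    wrong = random.choice([k for k in range(1, n + 1) if k != drawn])
    return wrong == 1                        # errs uniformly

n, p1, p2, N = 10, 0.8, 0.7, 300_000
hits = total = 0
for _ in range(N):
    drawn = random.randint(1, n)
    if says_one(drawn, p1, n) and says_one(drawn, p2, n):
        total += 1
        hits += (drawn == 1)
print(hits / total)                                          # simulated
print(p1 * p2 / (p1 * p2 + (1 - p1) * (1 - p2) / (n - 1)))   # (29): ~0.988
```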
Of course, for the case to come under formula (29), it is supposed that some one of the number must be guilty.

34. In the same case, when the event may turn out in n ways not equally probable, as in a race between n horses A, B, C, ... whose chances of winning are a, b, c, ..., so that a + b + c + ... = 1, if one witness whose credibility is p states that A has won, it is easily shown by the same reasoning as in art. 33 that the probability A has really won is
$$\frac{ap}{ap + (1-a)\dfrac{1-p}{n-1}}\qquad(30);$$
and if two witnesses say so, it is
$$\frac{app'}{app' + (1-a)\dfrac{(1-p)(1-p')}{(n-1)^2}}\qquad(31).$$
It is easily shown in formula (30) that if p > n⁻¹ the probability is increased by the testimony beyond a, its antecedent value. Thus, suppose there are ten horses in a race, and that one of them, A, has a chance ⅓ of winning, and that just after the race I learn that a black horse has won, black being A's colour; now, if I know that ⅕ of racehorses in general are black, this gives me a new chance (see art. 16) that A has won; for the information is certainly right if A has won, and right with probability ⅕ if he has not; therefore the chance of the event is now
$$\frac{\frac{1}{3}}{\frac{1}{3} + \frac{2}{3}\cdot\frac{1}{5}} = \frac{5}{7}.$$

35. To illustrate the effect of discordant testimony, in art. 29 let A have asserted that the fact occurred, and let B deny it. It is easy to see that 1 − p′ is to be put for p′, so that the probability that it did occur is
$$\frac{p(1-p')}{p(1-p') + (1-p)p'}\qquad(32);$$
if there had been an a priori probability a in favour of the fact this would have been
$$\frac{ap(1-p')}{ap(1-p') + (1-a)(1-p)p'}\qquad(33).$$
Thus if the credit of both witnesses were the same, p = p′, and we find from (33) ϖ = a, so that the evidence has not altered the likelihood of the event.

36. Where the event may turn out in n equally probable ways, as in art. 33, and the witness A asserts one to have occurred, say the ticket marked 1 to have been drawn, while the witness B asserts another, say the ticket marked 2; to find the chance that No. 1 was drawn. By the same reasoning as in art. 33 we find for the chance
$$\varpi = \frac{p(1-p')}{p(1-p') + (1-p)p' + \dfrac{(n-2)(1-p)(1-p')}{n-1}}\qquad(34).$$

This result will also follow if we consider B's evidence as testimony in favour of No. 1 of the value (1−p′)(n−1)⁻¹. When the number of tickets n is very great, (34) gives
$$\varpi = \frac{p - pp'}{1 - pp'}.$$

37. As remarked in art. 26, the methods we have given for determining the probability of testimony apply to cases where the evidence is derived from other sources. Thus, suppose it has been found that a certain symptom (A) indicates the presence of a certain disease in three cases out of four: there is a probability ¾ that any patient exhibiting the symptom has the disease. This, however, must be considered in conjunction with the a priori probability of the presence of the disease, if we wish to know the value of the evidence deduced from the symptom being observed. For instance, if we knew that ¾ of the whole population had the disease, the evidence would have no value, and the credibility of the symptom per se would be ½, telling us nothing either way. For if a be the a priori probability, ϖ that after the evidence, p the credibility of the evidence, we have found
$$\varpi = \frac{ap}{ap + (1-a)(1-p)};$$
so that, if ϖ = a, p = ½. If ϖ and a are given, the credibility p of the evidence is deduced from this equation, viz.,
$$p = \frac{\varpi(1-a)}{a + \varpi - 2a\varpi}.$$

38. Suppose now the probability of the disease when the symptom A occurs is ϖ (that is, it is observed that the disease exists in ϖN cases out of a large number N where the symptom is found), and likewise the same probability when another independent symptom B occurs is ϖ′. What is the probability of the disease where both symptoms occur? Let a be the a priori probability of the disease in all the cases; then the value of the evidence of B is, as explained above,
$$p' = \frac{\varpi'(1-a)}{a + \varpi' - 2a\varpi'};$$
and this has to be combined with ϖ, which is the probability of the disease after A is observed. We find the probability Π required to be, by (27),
$$\Pi = \frac{\varpi p'}{\varpi p' + (1-\varpi)(1-p')};$$
whence
$$\Pi = \frac{\varpi\varpi'(1-a)}{\varpi\varpi'(1-a) + a(1-\varpi)(1-\varpi')}\qquad(35).[1]$$
Thus, if the a priori probability of the disease in all the patients was 1/10, and 3 out of 4 have the disease where A is observed, and also 3 out of 4 where B is observed, the chance that the disease exists when both symptoms are present is 81/82. This question illustrates the exceeding delicacy and care required in reasoning on probabilities. If we had combined the two given probabilities in the usual way, without considering the a priori value (as would be correct if this were quite unknown, or = ½), we should have had
$$\Pi = \frac{\varpi\varpi'}{\varpi\varpi' + (1-\varpi)(1-\varpi')}.$$
The fallacy of so doing will appear if we consider a large population, and a very uncommon disease, and that the latter is observed to exist in half the cases where the symptom A occurs, and also in half for the symptom B; this formula would give ½ for the chance when both are present. This is clearly absurd; for, both the disease and the symptoms being by hypothesis extremely rare, and the symptoms being independent, that is, having no connexion with each other, it is next to impossible that any one individual of the ½N(A), calling N(A) the number who have the symptom A, who have not the disease should also be comprised in the ½N(B) who have not the disease, because these ½N(A), ½N(B) are very small numbers (relatively) taken indiscriminately from the whole population who are free from the disease. It is different for the ½N(A), ½N(B) cases who have the disease; these cases all come out of the very small number N(D) who have the disease; therefore several individuals will probably be common to both; hence, if both symptoms coexist, it is highly probable that the case is one of the disease. We find from (35) the true probability to be in the present case
$$\Pi = \frac{\tfrac{1}{4}(1-a)}{\tfrac{1}{4}(1-a) + \tfrac{1}{4}a} = 1-a;$$
so that, if only 1 in 1000 have the disease, the chance is 999 to 1, instead of an even one.

[1] Or thus: let N = the whole population and n, n′ the numbers who show the symptoms A and B respectively, all these numbers being large. Now aN = the whole number who have the disease; ϖn, ϖ′n′ the numbers out of n, n′ who have it. Now ϖn, ϖ′n′ are both comprised in aN; and, out of ϖ′n′, the number also included in ϖn is the same fraction of ϖn that ϖ′n′ is of aN; that is, the number who have both symptoms and the disease is
$$\frac{\varpi n\cdot\varpi' n'}{aN};$$
and those who have both symptoms and have not the disease is
$$\frac{(1-\varpi)n\cdot(1-\varpi')n'}{(1-a)N};$$
so that, if both symptoms are present, the odds that it is a case of the disease are as $\varpi\varpi'(1-a)$ to $a(1-\varpi)(1-\varpi')$.
39. If a coin thrown m times has turned up head every time, the chance derived from this experience alone that the real facility for head exceeds ½ is, by formula (14),
$$\frac{\displaystyle\int_{\frac{1}{2}}^{1}x^m\,dx}{\displaystyle\int_{0}^{1}x^m\,dx} = 1 - \frac{1}{2^{m+1}}.$$
But there is here a very strong a priori presumption that the facility is ½; suppose then that there is a very small a priori probability φ that either in the coin itself or the way it is thrown there is something more favourable to head than to tail; after the new evidence the probability of this will be, combining by (27),
$$\frac{\left(1-2^{-(m+1)}\right)\varphi}{\left(1-2^{-(m+1)}\right)\varphi + 2^{-(m+1)}(1-\varphi)}.$$
Thus if there is an a priori probability φ = 1/1000, and if the coin has turned up head 5 times and never tail, the probability that the facility for head exceeds that for tail becomes
$$\frac{\frac{63}{1000}}{\frac{62}{1000} + \frac{1000}{1000}} = \frac{63}{1062}.$$

40. From art. 19 we see that if a large number of trials m + n be made as to any event, m being favourable, it may be considered certain that the real facility differs from m/(m+n) by a very small fraction at most. If then our a priori idea as to the facility gives it outside the limits derived from formula (21), the evidence from experience will overrule our a priori presumption. Thus, if a shilling thrown up 1000 times gives head 560 times and tail 440, the evidence thus afforded that the throws were not fair is so much stronger than any antecedent conviction we could have to the contrary that we may conclude with certainty that, from some cause or other, head is more likely than tail.

41. Closely allied to the subject of our present section are the applications of the theory of probabilities to the verdicts of juries, the decisions of courts, and the results of elections. Our limits, however, will hardly allow of even a sketch of the methods given by Condorcet, Laplace, and Poisson, as it is not possible to render them intelligible within a short compass. We must therefore refer the reader to Todhunter's History, as well as to the original works of these writers, especially to Poisson's Recherches sur la Probabilité des Jugements.
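Both computations of arts. 39 and 40 can be reproduced directly; a sketch under the reconstructed formulas, names ours.

```python
# Art. 39: posterior chance (by (27)) that a coin favours head after m
# straight heads; art. 40: the limits of art. 19 applied to 560 heads.
from math import erf, sqrt

def bias_posterior(m, phi):
    p = 1 - 2 ** -(m + 1)        # evidence of m heads, formula (14)
    return p * phi / (p * phi + (1 - p) * (1 - phi))

print(bias_posterior(5, 1e-3))   # 63/1062 ~ 0.059

m, n = 560, 440                  # is 1/2 a credible facility?
delta = abs(m / (m + n) - 0.5)
tau = delta * sqrt((m + n) ** 3 / (2 * m * n))
print(1 - erf(tau))              # ~1.5e-4: a facility of 1/2 is all but
                                 # excluded by the observed proportion
```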
42. We will consider here one remarkable question given by Laplace, because the mathematical difficulty may be solved in a simpler way than by deducing it as a case of a general problem given in his chap. ii., or than Todhunter's method (see his p. 545), which depends on Lejeune Dirichlet's theorem in multiple integrals. An event (suppose the death of a certain person) must have proceeded from one of n causes A, B, C, &c., and a tribunal has to pronounce on which is the most probable. Let each member of the tribunal arrange the causes in the order of their probability according to his judgment, after weighing the evidence. To compare the presumption thus afforded by any one judge in favour of a specified cause with that afforded by the other judges, we must assign a value to the probability of the cause derived solely from its being, say, the rth on his list. As he is supposed to be unable to pronounce any closer to the truth than to say (suppose) H is more likely than D, D more likely than L, &c., the probability of any cause will be the average value of all those which that probability can have, given simply that it always occupies the same place on the list of the probabilities arranged in order of magnitude. As the sum of the n probabilities is always 1, the question reduces to this: any whole (such as the number 1) is divided at random into n parts, and the parts are arranged in the order of their magnitude, least, second, third, ... greatest; this is repeated for the same whole a great number of times; required the mean value of the least, of the second, &c., parts, up to that of the greatest.

Let the whole in question be represented by a line AB = a, and let it be divided at random into n parts by taking n − 1 points indiscriminately on it. Let the required mean values be λ₁a, λ₂a, λ₃a, ... λₙa, where λ₁, λ₂, λ₃, ... must be constant fractions. As a great number of positions is taken in AB for each of the n − 1 points, we may take a as representing that number; and the whole number N of cases will be N = a^{n−1}. The sum of the least parts, in every case, will be
$$S_1 = N\lambda_1 a = \lambda_1 a^n.$$
Let a small increment, Bb = δa, be added on to the line AB at the end B; the increase in this sum is
$$\delta S_1 = n\lambda_1 a^{n-1}\delta a.$$
But, in dividing the new line Ab, either the n − 1 points all fall on AB as before, or n − 2 fall on AB and 1 on Bb (the cases where 2 or more fall on Bb are so few we may neglect them). If all fall on AB, the least part is always the same as before except when it is the last, at the end B of the line, and then it is greater than before by δa; as it falls last in n⁻¹ of the whole number of trials, the increase in S₁ is n⁻¹a^{n−1}δa. But if one point of division falls on Bb, the number of new cases introduced is (n−1)a^{n−2}δa; but, the least part being now an infinitesimal, the sum S₁ is not affected. We have therefore
$$n\lambda_1 a^{n-1}\delta a = n^{-1}a^{n-1}\delta a;\qquad\therefore\ \lambda_1 = n^{-2}.$$
To find λ₂, reasoning exactly in the same way, we find that, where one point falls on Bb and n − 2 on AB, as the least part is infinitesimal, the second least part is the least of the n − 1 parts made by the n − 2 points; consequently, if we put λ₁′ for the value of λ₁ when there are n − 1 parts only, instead of n,
$$n\lambda_2 = n^{-1} + (n-1)\lambda_1';\quad\text{but}\quad \lambda_1' = (n-1)^{-2};\qquad\therefore\ n\lambda_2 = n^{-1} + (n-1)^{-1}.$$
In the same way we can show generally that
$$n\lambda_r = n^{-1} + (n-1)^{-1} + \dots + (n-r+1)^{-1},$$
and thus the required mean value of the rth part is
$$\lambda_r a = \frac{a}{n}\left\{\frac{1}{n} + \frac{1}{n-1} + \dots + \frac{1}{n-r+1}\right\}\qquad(36).$$
Thus each judge implicitly assigns the probabilities
$$\frac{1}{n}\cdot\frac{1}{n},\qquad \frac{1}{n}\left(\frac{1}{n}+\frac{1}{n-1}\right),\qquad \frac{1}{n}\left(\frac{1}{n}+\frac{1}{n-1}+\frac{1}{n-2}\right),\ \dots$$
to the causes as they stand on his list, beginning from the lowest.
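The result (36) is easily verified by simulation; a sketch with our own parameters.

```python
# Mean of the r-th smallest of the n parts of a unit broken at
# n-1 uniform random points, against formula (36).
import random

def mean_parts(n, trials=100_000):
    sums = [0.0] * n
    for _ in range(trials):
        cuts = sorted(random.random() for _ in range(n - 1))
        parts = sorted(b - a for a, b in zip([0.0] + cuts, cuts + [1.0]))
        for r, part in enumerate(parts):
            sums[r] += part
    return [s / trials for s in sums]

n = 4
print(mean_parts(n))
print([(1 / n) * sum(1 / (n - k) for k in range(r + 1)) for r in range(n)])
# formula (36) gives [0.0625, 0.1458..., 0.2708..., 0.5208...]
```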

Laplace now says we should add the numbers thus found on the different lists for the cause A, also for B, &c.; and that cause which has the greatest sum is the most probable. This doubtless seemed self-evident to him, but ordinary minds will hardly be convinced of its correctness without proof. Let the lists of two of the judges be, beginning from the lowest,

B, H, R, K, A, ...
C, K, D, H, B, ...

the corresponding probabilities being λ₁, λ₂, λ₃, λ₄, λ₅, .... As the opinions of all the judges are supposed of equal weight, the cause H here is as likely as the cause K; but the probability that H or K was the cause is λ₂ + λ₄. Hence

prob. (H) + prob. (K) = 2 prob. (H) = λ₂ + λ₄; ∴ prob. (H) = ½(λ₂ + λ₄);

that is, the probability of any cause is the mean of its probabilities on the two lists, the circumstance being clearly immaterial whether the same cause K is found opposite to it or not. The same follows for 3 or more lists.

43. Laplace applies the same method to elections. Suppose there are n candidates for an office; each elector is to arrange them in what he believes to be the order of merit; and we have first to find the numerical value of the merit he thus implicitly attributes to each candidate. Fixing on some limit a as the maximum of merit, n arbitrary values less than a are taken and then arranged in order of magnitude, least, second, third, ... greatest; to find the mean value of each. Take a line AB = a, and set off n arbitrary lengths AX, AY, AZ, ... beginning at A; that is, n points are taken at random in AB. Now the mean values of AX, XY, YZ, ... are all equal; for if a new point P be taken at random, it is equally likely to be 1st, 2d, 3d, &c., in order beginning from A, because out of n + 1 points the chance of an assigned one being 1st is (n+1)⁻¹; of its being 2d, (n+1)⁻¹; and so on. But the chance of P being 1st is equal to the mean value of AX divided by AB; of its being 2d, M(XY)÷AB; and so on. Hence the mean value of AX is AB(n+1)⁻¹; that of AY is 2AB(n+1)⁻¹; and so on. Thus the mean merit assigned to the several candidates is
$$\frac{a}{n+1},\quad \frac{2a}{n+1},\quad \frac{3a}{n+1},\ \dots\ \frac{na}{n+1}.$$
Thus the relative merits may be estimated by writing under the names of the candidates the numbers 1, 2, 3, ... n. The same being done by each elector, the probability will be in favour of the candidate who has the greatest sum. Practically it is to be feared that this plan would not succeed, though certainly the most rational and logical one if the conditions are fulfilled; because, as Laplace observes, not only are electors swayed by many considerations independent of the merit of the candidates, but they would often place low down in their list any candidate whom they judged a formidable competitor to the one they preferred, thus giving an unfair advantage to candidates of mediocre merit. There are, however, many cases where such objections would not apply, and therefore where Laplace's method would be certainly the most rational. Thus, suppose a jury or committee or board of examiners have to decide on the relative merit of a number of prize essays, designs for a building, &c.; each member should place them in what he judges to be the order of merit, beginning with the worst, and write over them the numbers 1, 2, 3, 4, &c.; then the relative merit of each essay, &c., would be represented by the sum of the numbers against it in each list.
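The rule just described is, in modern terms, a rank-sum (Borda) count; a minimal sketch follows, the ballots being invented for illustration.

```python
# Laplace's election rule: rank from worst (1) upward and add the ranks.
from collections import defaultdict

def laplace_rank_sum(ballots):
    """Each ballot lists candidates from worst to best."""
    scores = defaultdict(int)
    for ballot in ballots:
        for rank, candidate in enumerate(ballot, start=1):
            scores[candidate] += rank
    return sorted(scores.items(), key=lambda kv: -kv[1])

ballots = [["B", "H", "R", "K", "A"],
           ["C", "K", "D", "H", "B"]]
print(laplace_rank_sum(ballots))   # greatest sum = most probable cause
```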
No doubt there would be cases where a juror would observe a great difference in merit between one essay and the one below it, which difference would not be adequately rendered by an excess of 1 in the number. But even then, as such superiority could not fail to be recognized by the other members of the tribunal, it is not likely that any injustice would result.

44. An argument advanced in support of a proposition differs from the case of testimony in that, if the argument is bad, the previous probability of the conclusion is unaffected. Let p be the a priori probability of the proposition, q the chance that the argument is correct; then, in a large number N of cases, in qN the argument is good, and therefore the proposition is true; and out of the remaining (1−q)N, where the argument is bad, there are p(1−q)N cases where the proposition is nevertheless true. Hence the probability of the conclusion is p + q − pq. Hence any argument, however weak, adds something to the force of preceding arguments.

V. ON MEAN VALUES AND THE THEORY OF ERRORS.

45. The idea of a mean or average among many differing magnitudes of the same kind is one continually employed, and of great value. It gives us in one result, easily pictured to the mind and easily remembered, a general idea of a number of quantities which perhaps we have never seen or observed, and we can thus convey the same idea to others, without giving a long list of the quantities themselves. We could scarcely form any clear conception as to the duration of human life, unless by taking the average, that is, finding the length of life each individual would have if the whole sum of the years attained by each were equally divided among the entire population. How, again, could we so easily form an idea of the climate of Rome or Nice as by learning the mean of the temperatures of each day for a year, or a series of years? Here, again, it will be an important addition to the information to find also the mean summer temperature and the mean in winter, as we thus learn what extremes of heat and cold are to be expected. We may even go further and inquire the diurnal variation of the temperature in summer or in winter; and for this we should know the average of a number of particular cases. It may be said that the whole value of statistics depends on the doctrine of averages. The price of wheat and of other commodities, the increase or decrease of a particular crime, the age of marriage both for men and women, the amount of rain at a given locality, the advance of education, the distribution of wealth, the spread of disease, and numberless other subjects of inquiry are instances where we often see hasty and misleading conclusions drawn from one or two particular cases which happen to make an impression, but where the philosophical method bids us observe the results in a large number, and then to present them as summed up and represented by the average or mean.
46. There is another application of averages of a different nature from the foregoing. Different estimates of the same thing are given by several independent authorities: thus the precise moment of an earthquake is differently stated by correspondents in the papers; different heights are given for a mountain by travellers; or suppose I have myself measured the height of a building a number of times, never obtaining exactly the same result. In all such cases (if we have no reason to attach greater weight to one result than to another) our common sense tells us that the average of all the estimates is more likely to be the truth than any other value. In these cases, as M. Quetelet remarks, there is this important distinction from the preceding, that the mean value represents a thing actually existing; whereas in the others it merely serves to give a kind of general idea of a number of individuals essentially different, though of the same kind. Thus if I take the mean of the heights of 200 houses in a long street, it does not stand for any real entity, but is a mere ideal height, representing as nearly as possible those of the individual houses; whereas, in taking 200 measurements of the same house, their mean is intended to give, and will very nearly give, the actual height of that house.

47. So far it is obvious how to proceed in such cases; but it becomes a most important question in the theory of probabilities to determine how far we can rely on the mean value of the different observations giving us the true magnitude we seek, or rather, as we never can expect it to give exactly that value, to ascertain with what probability we may expect the error not to exceed any assigned limit. Such is the inquiry on which we are about to enter. This investigation is of the more importance because we find what is really the same problem present itself again under circumstances different from what we have been considering. In the measurement of any whole by means of repeated partial measurements, as, for instance, in measuring a distance by means of a chain, the error in the result is the sum of all the partial errors (with their proper signs) incurred at each successive application of the chain. If we would know, then, the amount of confidence we may have in the accuracy of the result, we must determine, as well as we can, the probability of the error, that is, the sum of all the partial errors, not exceeding assigned limits; and to this end we have in the first place to try to determine the law of facility, or frequency, of different values of this sum. The problem only differs from the preceding in that here we seek the facility of the sum of the errors; in the former, of the nth part of that sum. In both these cases we may reasonably and naturally suppose that the error incurred in each observation, or each measurement, follows the same law as to the frequency of its different possible values and as to its limits, as each is made by the same observer, under the same circumstances, though what that law is may be unknown to us. But there is another class of cases where the same problem presents itself. An astronomical observation is made (say) of the zenith distance of a star at a particular instant; the error in this determination is a complex one, caused by an error in the time, an error in the refraction, errors of the instrument, personal error of the observer, and others. The error of the observation is in fact the sum of the partial errors arising from these different sources; now these evidently cannot be taken each to follow the same law, so that we have here a more general problem of the same species, viz., to combine a number of partial errors, each having its own law of facility and limits.
There is every reason to suppose that the error incurred in any single observation or measurement of any kind is generally due to the operation of a large number of independent sources of error; if we adopt this hypothesis, we have the same problem to solve in order to arrive at the law of facility of any single error.

48. We will consider the question as put by Poisson (Recherches, p. 254; see Todhunter, History, p. 561), and will adopt a method which greatly shortens the way to the result. Let x be the error arising from the combination or superposition of a large number of errors e₁, e₂, e₃, ..., each of which by itself is supposed very small; then
$$x = e_1 + e_2 + e_3 + \dots\qquad(37).$$
Each partial error is capable of a number, large or small, of values, all small in themselves; and this number may be quite different for each error e₁, e₂, e₃, .... There may be more positive than negative values, or fewer, for each.[1] If n₁, n₂, n₃, ... be the numbers of values of the several errors, the number of different values of the compound error x will be n₁n₂n₃.... We will suppose it, however, to take an indefinite number of values N, some multiple of the above, so that the n₁n₂n₃... different values are repeated, but all equally often, so as to leave the relative facility of the different values unaltered. We will suppose the same number N of values in every case, whether more or fewer of the partial errors e₁, e₂, e₃, ... are included or not. Let the frequency of an error of magnitude x be called y, and let the equation expressing the frequency be
$$y = f(x)\qquad(38);$$
i.e., ydx = the number of values of x between x and x + dx. The whole number of values is
$$N = \int_{\mu'}^{\mu}f(x)\,dx,$$
where μ, μ′ are the sums of the higher and of the lower limits of all the partial errors. If now a new partial error ε be included with the others, let it have n particular values c, c′, c″, ...; if it had but the one value c, then to every value x of the old compound error would correspond one x′ of the new, such that x + c = x′; and the number of values of the new from x′ to x′ + dx′ is the same as of the old from x to x + dx, that is, f(x)dx, or f(x′−c)dx′. Now the next value c′ gives, besides these, the number f(x′−c′)dx′, and so on. Thus the whole number of values of the new compound error between x′ and x′ + dx′ is
$$\left\{f(x'-c) + f(x'-c') + f(x'-c'') + \dots\right\}dx'.$$
Hence the equation of frequency for the new error is (dropping the accent, and dividing by n, that is, reducing the total number of values from Nn to N, the same as before)
$$y = n^{-1}\left\{f(x-c) + f(x-c') + f(x-c'') + \dots\right\}\qquad(39).$$
Hence
$$y = f(x) - \frac{c+c'+c''+\dots}{n}f'(x) + \frac{c^2+c'^2+c''^2+\dots}{2n}f''(x),$$
neglecting higher powers of c, c′, .... Hence if a new partial error ε, whose mean value is α, and whose mean square is A, be superposed on the compound error (38) resulting from the combination of a large number of partial errors, the equation of frequency for the resulting error is
$$y = f(x) - \alpha f'(x) + \tfrac{1}{2}A f''(x)\qquad(40).$$
It thus appears that each of the small errors only enters the result by its mean value α and its mean square A.
If a second error, whose mean value is α₁ and mean square A₁, were superposed, we should thus have, D standing for d/dx and A being a lower infinitesimal than α so that no other terms are retained,
$$y = \left(1-\alpha_1 D + \tfrac{1}{2}A_1 D^2\right)\left(1-\alpha D + \tfrac{1}{2}A D^2\right)f(x)
= \left\{1-(\alpha+\alpha_1)D + \tfrac{1}{2}\left[\left(A+A_1-\alpha^2-\alpha_1^2\right)+(\alpha+\alpha_1)^2\right]D^2\right\}f(x).$$
Thus any two errors enter the result only in terms of α + α₁ and A + A₁ − α² − α₁²; and, as this holds for any two, it is easy to see that all the partial errors in (37) enter the equation of frequency (38) only in terms of m and h − i, where
$$m = \alpha_1+\alpha_2+\alpha_3+\dots = \text{sum of mean errors},$$
$$h = A_1+A_2+A_3+\dots = \text{sum of mean squares of errors},\qquad(41)$$
$$i = \alpha_1^2+\alpha_2^2+\alpha_3^2+\dots = \text{sum of squares of mean errors}.$$
Thus y = f(x) = φ(x, m, h−i). Let m receive an increment δm; this is equivalent to superposing a new error whose mean value is δm and whose mean square is infinitely smaller (e.g., we may take it to have but the single value δm);
$$\therefore\ \frac{dy}{dm}\delta m = -\frac{dy}{dx}\delta m,\ \text{by (40)};\qquad \frac{dy}{dm} = -\frac{dy}{dx}.$$
Hence y is a function of x − m; so our equation must be of the form
$$y = F(x-m,\ h-i)\qquad(42).$$
Let h receive an increment δh; or conceive a new error whose mean value α = 0, and whose mean square = δh; we have by (40)
$$\frac{dy}{dh} = \frac{1}{2}\frac{d^2y}{dx^2}\qquad(43).$$
Let us now suppose in (37) that all the values of every error are increased in the ratio r; all the values of x are increased in the same ratio; consequently there are the same number of values of x from rx to r(x+dx) as there were before from x to x + dx. This gives
$$F(x-m,\ h-i)\,dx = F\left(rx-rm,\ r^2(h-i)\right)r\,dx,$$
for m is increased in the ratio r, and h and i in the ratio r². Let us write for shortness ξ = x − m, η = h − i, so that
$$y = F(\xi,\eta)\qquad(44);$$
we have
$$F(\xi,\eta) = rF\left(r\xi,\ r^2\eta\right).$$
Let r = 1 + ω, where ω is infinitesimal; expanding, and equating the coefficient of ω to zero,
$$F + \xi\frac{dF}{d\xi} + 2\eta\frac{dF}{d\eta} = 0\qquad(45);$$
and (43), in the new variables, is
$$\frac{dF}{d\eta} = \frac{1}{2}\frac{d^2F}{d\xi^2}\qquad(46).$$
These two equations contain the solution of the problem. Thus (45) gives by integration
$$y = \eta^{-\frac{1}{2}}\psi\left(\xi\eta^{-\frac{1}{2}}\right)\qquad(47).$$
Again, combining (45) and (46),
$$F + \xi\frac{dF}{d\xi} + \eta\frac{d^2F}{d\xi^2} = 0.$$
Substitute for y the value (47), and put $u = \xi\eta^{-\frac{1}{2}}$; we find
$$\psi(u) + u\psi'(u) + \psi''(u) = 0;\qquad\therefore\ u\psi + \psi' = c.$$
Now c = 0; for uψ vanishes with ξ, and, y being always finite, ψ′ vanishes with it also. Hence
$$\frac{\psi'}{\psi} = -u;\qquad \psi = Ce^{-\frac{1}{2}u^2}.$$
Substituting in (47) and restoring the values of ξ, η, we find the form of the function (42) to be
$$y = C(h-i)^{-\frac{1}{2}}e^{-\frac{(x-m)^2}{2(h-i)}}\qquad(48).$$
C is a constant depending on the number N. The probability of the error x falling between x and x + dx is found by dividing ydx by the whole area of the curve (48); i.e.,
$$\frac{1}{\sqrt{2\pi(h-i)}}\,e^{-\frac{(x-m)^2}{2(h-i)}}\,dx\qquad(49).$$

[1] An error may have all its values positive, or all negative. In estimating the instant when a star crosses the meridian we may err in excess or defect; but in estimating that when it emerges from behind the moon, we can only err in excess. We have heard this instance given by Clerk Maxwell.
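The law (48) may be watched emerging numerically: superpose many small errors of assorted laws, some one-sided as in the footnote, and compare the mean and spread of the compound error with m and h − i. A sketch with our own figures.

```python
# The sum of many small independent errors, whatever their separate laws,
# approaches the normal law with mean m and variance h - i.
import random
from statistics import mean, pstdev
from math import sqrt

def one_compound_error(k):
    s = 0.0
    for j in range(k):
        if j % 3 == 0:
            s += random.uniform(-0.01, 0.01)   # symmetric error
        elif j % 3 == 1:
            s += random.uniform(0.0, 0.02)     # all values positive
        else:
            s += random.choice((-0.015, 0.015))
    return s

k, N = 300, 20_000
xs = [one_compound_error(k) for _ in range(N)]
m = (k // 3) * 0.01                            # sum of mean errors
var = (k // 3) * (0.02**2 / 12 + 0.02**2 / 12 + 0.015**2)   # h - i
print(mean(xs), m)                             # both ~1.0
print(pstdev(xs), sqrt(var))                   # agree closely
```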

49. If, instead of eq. (37), we had put
$$x = \gamma_1 e_1 + \gamma_2 e_2 + \gamma_3 e_3 + \dots\qquad(50),$$
where γ₁, γ₂, γ₃, ... are any numerical factors, the formula (49) gives the probability for x, provided h, i, m are taken to be
$$m = \Sigma\gamma\alpha,\qquad h = \Sigma\gamma^2 A,\qquad i = \Sigma\gamma^2\alpha^2\qquad(51),$$
instead of the values in (41).

50. If we take the integral of eq. (49) between any two limits μ, ν, it gives us the probability that the sum x of the errors lies between μ and ν, that is, that the mean of all the errors lies between μr⁻¹ and νr⁻¹, if r is the number of the partial errors in (37). The most likely value of x (that is, for which the frequency is greatest) is of course x = m, and the chance that x does not differ from m by more than δ is found thus. Put $t = (x-m)\{2(h-i)\}^{-\frac{1}{2}}$; ∴ $dx = dt\sqrt{2(h-i)}$. The limits m ± δ for x become ±τ for t, where $\tau = \delta\{2(h-i)\}^{-\frac{1}{2}}$; hence, putting
$$\varpi = \frac{2}{\sqrt{\pi}}\int_0^\tau e^{-t^2}dt\qquad(52),$$
we find ϖ is the probability that the sum x of the errors in (37) lies between the limits $m \pm \tau\sqrt{2(h-i)}$; ϖ is also the probability that the mean of all the errors, xr⁻¹, lies between the limits $mr^{-1} \pm \tau r^{-1}\sqrt{2(h-i)}$.

51. The important result (48), which is the key to the whole theory of errors, contains several particular cases which Laplace gives in his fourth chapter. We may first make one or two remarks on it. (1) h − i is always positive; for in (41)
$$A_1 > \alpha_1^2,\quad A_2 > \alpha_2^2,\ \&c.,$$
because the mean of the squares of n numbers is always greater than the square of the mean. (2) To find the mean value M(x) of the sum x, and the mean value M(x²) of its square, we have
$$M(x) = \frac{\int xy\,dx}{\int y\,dx},\qquad M(x^2) = \frac{\int x^2y\,dx}{\int y\,dx},$$
the limits being ±∞. Hence
$$M(x) = m,\qquad M(x^2) = h - i + m^2.$$
The first is obvious from the fact that to every value m + z for x there corresponds another m − z. Both results also easily follow from common algebra: the case is that of a sum x = e₁ + e₂ + e₃ + ..., where each quantity e₁, e₂, e₃, ... goes through an independent series of values; and it is easily proved that M(x) = M(e₁) + M(e₂) + ..., and that M(x²) exceeds M(x)² by the sum of the quantities M(e²) − M(e)².

52. One particular case of the general problem in art. 48 is when the errors e₁, e₂, e₃, ... in (37) all follow exactly the same law; as, for instance, if e₁, e₂, e₃, ... are the errors committed in observing the same magnitude, under exactly the same circumstances, a great number r of times, and we are asked to find the chance that the sum of the errors, or that their arithmetical mean, shall fall between given limits. Here the law of facility for each error is of course the same, though we may not know what it is. We have then from (41)
$$m = r\alpha_1,\qquad h = rA_1,\qquad i = r\alpha_1^2;$$
so that in eq. (52)
$$\varpi = \frac{2}{\sqrt{\pi}}\int_0^\tau e^{-t^2}dt$$
is the probability that the mean of all the errors shall lie between
$$\alpha_1 \pm \tau\sqrt{\frac{2\left(A_1-\alpha_1^2\right)}{r}}\qquad(53).$$
α₁ here is the mean of all the possible values of the error in this particular observation, which are of course infinite in number; and (53) shows us, what is evident beforehand, that the more the number r of observations is increased the narrower do the limits for the mean error become for a given probability ϖ; so that if, for instance, we take τ = 3, ϖ is very nearly 1, and, r being large, it becomes practically certain that the mean of the actual observations will differ from α₁ by only an infinitesimal deviation.

53. What we have found hitherto would be of very little practical use, because the constants involved suppose the amounts of the errors known, and therefore the true value known of the quantity which is observed or measured. It is, however, precisely this true value which we usually do not know and are trying to find. Let us now suppose a large number r of measurements, which we will call a₁, a₂, a₃, ... aᵣ, made of a magnitude whose true but unknown value is A.

The (unknown) errors of the observations will be
$$e_1 = a_1 - A,\quad e_2 = a_2 - A,\ \dots\ e_r = a_r - A;$$
so that M(e₁) = M(a₁) − A; or the mean of the errors is the error committed in taking the mean of the observations as the value of A. Hence (53) ϖ is the probability that the error committed in taking the mean of the observations as the truth shall lie between
$$\alpha_1 \pm \tau\sqrt{\frac{2\left(A_1-\alpha_1^2\right)}{r}}.$$
Here α₁ is the true mean of the errors of an infinite number of observations, A₁ the mean of their squares. As we have no means of determining α₁ (except that it is nearly equal to the mean of the errors we are dealing with, which would give us no result), we have to limit the generality of the question by assuming that the law of error of the observation gives positive and negative errors with equal facility; if so, α₁ = 0, and we have the probability ϖ that the error lies between $\pm\tau\sqrt{2r^{-1}A_1}$. Here A₁, which is the mean of the squares of all possible values of the error of the observation, will be at least very nearly the mean square of the actual values of the errors, if r is large;
$$A_1 = M\left\{(a_1-A)^2\right\} = M(a_1^2) - (Ma_1)^2 + (Ma_1-A)^2.$$
Rejecting the last term, as the square of a very small quantity,
$$A_1 = M(a_1^2) - (Ma_1)^2,$$
and we have the probability ϖ (in (52)) that the error in taking the mean of the observations as the truth lies between
$$\pm\tau\sqrt{2r^{-1}\left\{M(a_1^2) - (Ma_1)^2\right\}}\qquad(54),$$
a value depending on the mean square, and mean first power, of the observed values. These limits may be put in a different form, rather easier for calculation. If f₁, f₂, f₃, ... fᵣ be the apparent errors, that is, not the real ones, but what they would be on the hypothesis that the mean is the true value, then, putting M = r⁻¹(a₁ + a₂ + ... + aᵣ),
$$f_1 = a_1 - M,\quad f_2 = a_2 - M,\ \dots\ f_r = a_r - M;$$
$$M(f_1^2) = M(a_1^2) - 2M(a_1)\cdot M + M^2 = M(a_1^2) - (Ma_1)^2;$$
so that A₁ = M(f₁²), and (54) may be written
$$\pm\tau\sqrt{2}\,r^{-1}\sqrt{f_1^2+f_2^2+\dots+f_r^2}\qquad(55).$$

54. In the last article we have made no assumption as to the law of frequency of the error in the observation we are considering, except that it gives positive and negative values with equal facility. If, however, we adopt the hypothesis (see art. 47) that every error in practice arises from the joint operation of a number of independent causes, the partial error due to each of which is of very small importance, then the process in art. 48 will apply, and we may conclude that the errors of every series of observations of the same magnitude made in the same circumstances follow the law of frequency in formula (48); and if we suppose, as is universally done, that positive and negative values are equally probable, the law will be
$$y = Ce^{-\frac{x^2}{c^2}},$$
and the probability (49) will be
$$\frac{1}{c\sqrt{\pi}}\,e^{-\frac{x^2}{c^2}}\,dx\qquad(56),$$
where c is a constant, which is sometimes called the modulus of the system. Every error in practice, then, is of the form (56), and is similar to every other. If c be small, the errors have small amplitudes, and the series of observations is accurate. If, as supposed in art. 53, a set of observations have been made, we can determine the modulus c, with an accuracy increasing with the number in the set. For (art. 51) ½c² = the true mean square of all possible values of the error. This we have called A₁ in the last article, and have shown it nearly equal to M(a₁²) − (Ma₁)², or M(f₁²); so that
$$\tfrac{1}{2}c^2 = \text{mean square of obs.} - (\text{mean of obs.})^2 = \text{mean square of apparent errors}.$$

55. Thus, if a set of observations have been made, and c thus determined from them, it is easy to see that
$$\text{Mean error} = c\pi^{-\frac{1}{2}} = .5642c;\qquad \text{Mean square of error} = \tfrac{1}{2}c^2;\qquad \text{Probable error} = .4769c\qquad(57).$$
The mean error means the mean of all the positive or all the negative errors. The probable error is the value which half the errors exceed and half fall short of, so that it is an even chance that the error of any particular observation lies between the limits ±.4769c. Its value is found from the table in art. 9, taking ϖ = ½.
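The whole reduction of arts. 53 to 55 takes a few lines in code; the observations below are invented for illustration.

```python
# Estimate the modulus c from a set of observations, then the mean and
# probable errors of (57).
from math import sqrt, pi

obs = [10.02, 9.98, 10.05, 9.97, 10.01, 9.99, 10.03, 9.96]

M = sum(obs) / len(obs)
apparent = [a - M for a in obs]                   # the f's of art. 53
half_c2 = sum(f * f for f in apparent) / len(obs) # (1/2)c^2 = M(f^2)
c = sqrt(2 * half_c2)

print(M)                 # best value of the magnitude
print(c / sqrt(pi))      # mean error, .5642c
print(0.4769 * c)        # probable error: an even chance per observation
```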
56. We have often to consider the law of error of the sum of several magnitudes, each of which has been determined by a set of observations. Suppose A and B two such magnitudes, and X their sum; to find the law of error in X = A + B. Let the functions of error for A and B be
$$\frac{1}{c\sqrt{\pi}}e^{-\frac{x^2}{c^2}}dx,\qquad \frac{1}{c'\sqrt{\pi}}e^{-\frac{x^2}{c'^2}}dx.$$
In formula (49) let m = 0, i = 0, 2h = c²; then the function for A is the law for the sum of a number of errors (37) the sum of whose mean squares is h = ½c²; likewise that for B is the law for the sum of a number the sum of whose mean squares is ½c′²; and the same formula (49) shows us that the law for the sum of these two series of errors, that is, for the sum of the errors of A and B, is
$$\frac{1}{\sqrt{\pi(c^2+c'^2)}}\,e^{-\frac{x^2}{c^2+c'^2}}dx;$$
that is, the modulus for X, or A + B, is $\sqrt{c^2+c'^2}$. Hence
$$\text{Probable error of X} = .4769\sqrt{c^2+c'^2};\qquad (\text{p.e. of X})^2 = (\text{p.e. of A})^2 + (\text{p.e. of B})^2\qquad(58).$$
So likewise for the mean error. If X were the difference A − B, (58) still holds. If X be the sum of m magnitudes A, B, C, ... instead of two, its probable error is in like manner given by
$$(\text{p.e. X})^2 = (\text{p.e. A})^2 + (\text{p.e. B})^2 + \&c.;$$
and if the function of error be the same for A, B, C, ... alike,
$$(\text{p.e. X})^2 = m(\text{p.e. A})^2.$$
Also the probable error of the mean is the mth part of the above;
$$\therefore\ \text{p.e. of M(A)} = m^{-\frac{1}{2}}(\text{p.e. of A})\qquad(59).$$
Airy gives the following example. The co-latitude of a place is found by observing m times the Z.D. of a star at its upper culmination and n times its Z.D. at its lower culmination; to find the probable error. By (59)
$$\text{p.e. of upper Z.D.} = m^{-\frac{1}{2}}(\text{p.e. of an upper obs.});\qquad \text{p.e. of lower Z.D.} = n^{-\frac{1}{2}}(\text{p.e. of a lower obs.}).$$
Now co-latitude = ½(U.Z.D. + L.Z.D.). Hence by (58)
$$(\text{p.e. co-lat.})^2 = \tfrac{1}{4}m^{-1}(\text{p.e. up. obs.})^2 + \tfrac{1}{4}n^{-1}(\text{p.e. low. obs.})^2.$$
If the upper Z.D. observations are equally good with the lower,
$$\text{p.e. co-lat.} = \tfrac{1}{2}(\text{p.e. an obs.})\sqrt{m^{-1}+n^{-1}}.$$

57. The magnitude to be found is often not observed directly, but another magnitude of which it is some function. Let A = the true but unknown value of a quantity depending on another whose true unknown value is a, by the given function A = f(a); let an observed value for a be v, the corresponding value for A being V, then V = f(v). Let e = the error of v; then the error of V is
$$V - A = f(a+e) - f(a) = ef'(v)\qquad(60),$$
as v is nearly equal to a. Suppose now the same magnitude A is also a given function f₁(a₁) of a second magnitude a₁, which is also observed and found to be v₁; also for a third, and so on; hence, writing C = f′(v), C₁ = f₁′(v₁), &c.,
$$V - A = Ce,\quad V_1 - A = C_1e_1,\ \&c.\qquad(61);$$
and we have to judge of the best value for the unknown quantity whose true value is called A. The arithmetical mean of V, V₁, V₂, ... seems the simplest, but it is not here the most probable; and we shall assume it to be a different mean, viz.,
$$X = \frac{mV + m_1V_1 + m_2V_2 + \dots}{m + m_1 + m_2 + \dots}.$$
(As V, V₁, V₂, ... are very nearly equal, it would be easy to show that any other way of combining them would be equivalent to this.) The factors m, m₁, m₂, ... remain to be determined. From (61) the error of X is
$$X - A = \frac{mCe + m_1C_1e_1 + m_2C_2e_2 + \dots}{m + m_1 + m_2 + \dots}\qquad(62).$$
Let the moduli of the errors e, e₁, e₂, ... be c, c₁, c₂, ... (see art. 56); then (see art. 49) for the modulus of the error X − A we have
$$(\text{mod.})^2 = \frac{m^2C^2c^2 + m_1^2C_1^2c_1^2 + m_2^2C_2^2c_2^2 + \dots}{(m + m_1 + m_2 + \dots)^2}\qquad(63).$$
If the factors m, m₁, m₂, ... are determined so as to make this modulus the least possible, the importance of the error X − A is the least possible. Differentiating with regard to m, and likewise for m₁, and so on, we find
$$mC^2c^2 = m_1C_1^2c_1^2 = m_2C_2^2c_2^2 = \&c.;$$
so that the most accurate mean to take is
$$X = \left(\frac{V}{C^2c^2} + \frac{V_1}{C_1^2c_1^2} + \frac{V_2}{C_2^2c_2^2} + \dots\right)\Big/\left(\frac{1}{C^2c^2} + \frac{1}{C_1^2c_1^2} + \frac{1}{C_2^2c_2^2} + \dots\right)\qquad(64).$$
The modulus of error in this value is, from (63),
$$\frac{1}{(\text{mod.})^2} = \frac{1}{C^2c^2} + \frac{1}{C_1^2c_1^2} + \frac{1}{C_2^2c_2^2} + \dots\qquad(65).$$
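The weighted combination (64), with its modulus (65), is readily expressed in code; the figures are our own.

```python
# Combine independent determinations of one quantity, each with its own
# effective modulus C*c, by formulas (64) and (65).
def combine(values, moduli):
    weights = [1 / m**2 for m in moduli]     # w = 1/(C c)^2
    X = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    mod = (1 / sum(weights)) ** 0.5          # by (65)
    return X, mod

values = [100.4, 99.8, 100.1]
moduli = [0.5, 0.2, 0.8]                     # smaller modulus = better
print(combine(values, moduli))               # ~ (99.89, 0.18)
```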
58. The errors e, e₁, e₂, ... are unknown. We have, as to the best value, the following rule. Let the values of the quantities observed, corresponding to the value X for that sought, be x, x₁, x₂, ..., so that
$$X = f(x) = f_1(x_1) = f_2(x_2) = \dots;$$
then X − A = (x−a)f′(v); and, subtracting,
$$V - X = (v - x)f'(v) = (v-x)C.$$
Here V − X is the apparent error in V, v − x the apparent error of the observation v, taking X, x as the true values. Of course we have also V₁ − X = (v₁−x₁)C₁, &c. If now we were to determine X so as to render the sum of the squares of the apparent errors of the observations, each divided by the square of its modulus, a minimum, that is,
$$\frac{(v-x)^2}{c^2} + \frac{(v_1-x_1)^2}{c_1^2} + \dots = \frac{(V-X)^2}{C^2c^2} + \frac{(V_1-X)^2}{C_1^2c_1^2} + \frac{(V_2-X)^2}{C_2^2c_2^2} + \dots = \text{minimum},$$
we shall find the same value (64) for X. Of course if the modulus is the same for all the observations the sum of the squares simply is to be made a minimum. To take a very simple instance. An observed value of a quantity is P; an observed value of a quantity known to be the square root of the former is Q; what is the most probable value? If X be taken for the quantity, the apparent error of P is P − X; the apparent error of Q is found from
$$(Q-e)^2 = X;\qquad\therefore\ e = \frac{Q^2-X}{2Q};$$
$$\therefore\ (P-X)^2 + \frac{(Q^2-X)^2}{4Q^2} = \text{minimum};\qquad\therefore\ X = \frac{(4P+1)Q^2}{4Q^2+1},$$
the weight of both observations being supposed the same. Again, suppose a circle is divided by a diameter into two semicircles; the whole circumference is measured and found to be L; also the two semicircles are found to be M and N respectively. What is the most probable value of the circumference? If X be taken as the circumference, the apparent error in L is L − X; those of M and N are M − ½X, N − ½X. Hence, if all the measurements are equally good,
$$(L-X)^2 + \left(M-\tfrac{1}{2}X\right)^2 + \left(N-\tfrac{1}{2}X\right)^2 = \text{minimum};\qquad\therefore\ X = \frac{2L+M+N}{3}$$
is the most probable value. The modulus of error of this result is, by (65), found to be
$$(\text{mod.})^2 = \tfrac{2}{3}(\text{mod. of measurements})^2;$$
so that the probable error = (prob. error of a measurement)$\sqrt{\tfrac{2}{3}}$.

59. In the last article we have explained the method of least squares, as applied to determine one unknown element from more than one observation of the element itself or of others with which it is connected by known laws. If several observations of the element itself are made, it is obvious that the method of least squares gives the arithmetical mean of the observations as the best value, thus justifying what common sense seems to indicate. If the observations are not equally good, the best value will be
$$X = \frac{wv + w_1v_1 + w_2v_2 + \dots}{w + w_1 + w_2 + \dots},$$
calling w, w₁, w₂, ... the weights of the different observations, i.e., w = c⁻², w₁ = c₁⁻², w₂ = c₂⁻², &c. It would carry us beyond our assigned limits in this article to attempt to demonstrate and explain the method of least squares when several elements have to be determined from a number of observations exceeding the elements in number. We must therefore refer the reader to the works already named, and also to the following: Gauss, Theoria Combinationis Observationum; Gauss, Theoria Motus; Airy, Theory of Errors of Observation; Leslie Ellis, in Camb. Phil. Trans., vol. viii. The rule in such cases is that the sum of the squares of the apparent errors is to be made a minimum, as in the case of a single element.
To take a very simple example: a substance is weighed, and the weight is found to be W; it is then divided into two portions, whose weights are found to be P and Q. What is the most probable weight of the body? Taking A and B as the weights of the two portions, the apparent errors are P − A, Q − B, and that of the whole is W − A − B; hence

$$(\mathrm{P}-\mathrm{A})^2+(\mathrm{Q}-\mathrm{B})^2+(\mathrm{W}-\mathrm{A}-\mathrm{B})^2=\text{minimum},$$

there being two independent variables A, B. Differentiating with respect to each in turn,

$$2\mathrm{A}+\mathrm{B}=\mathrm{P}+\mathrm{W},\qquad 2\mathrm{B}+\mathrm{A}=\mathrm{Q}+\mathrm{W};$$

$$\therefore\ \mathrm{A}=\tfrac{1}{3}(2\mathrm{P}-\mathrm{Q}+\mathrm{W}),\quad\mathrm{B}=\tfrac{1}{3}(2\mathrm{Q}-\mathrm{P}+\mathrm{W}),\quad\mathrm{A}+\mathrm{B}=\tfrac{1}{3}(\mathrm{P}+\mathrm{Q}+2\mathrm{W}),$$

which are the most probable weights of the whole and of the two parts.

VI. ON LOCAL PROBABILITY.

60. It remains to give a brief account of the methods of determining the probabilities of the fulfilment of given conditions by variable geometrical magnitudes, as well as the mean values of such magnitudes. Recent researches on this subject have led to many very remarkable results; and we may observe that to English mathematicians the credit almost exclusively belongs. It is a new instance, added to not a few which have gone before, of a revival for which we have to thank the eminent men who during the 19th century have enabled the country of Newton to take a place less unworthy of her in the world of mathematical science. At present the investigations on this subject have not gone beyond the theoretical stage; but they should not be undervalued on this account. The history of the theory of probabilities has sufficiently shown that what at first seems merely ingenious and a matter of curiosity may turn out to have valuable applications to practical questions. How little could Pascal, James Bernoulli, and De Moivre have anticipated the future of the science which they were engaged in creating?

61. The great naturalist Buffon was the first who proposed and solved a question of this description. It was the following: a floor is ruled with equidistant parallel lines; a rod, shorter than the distance between each pair, being thrown at random on the floor, to find the chance of its falling on one of the lines.
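Buffon's question invites experiment. The sketch below, a modern illustration of our own, simulates the throws directly, using the result p = 2c/πa established in the next article only as a check; it also anticipates Laplace's suggestion, noticed below, that the count may be read backwards to estimate π. The dimensions are arbitrary choices.

```python
import math
import random

a = 1.0      # half the distance between the parallels (assumed)
c = 0.6      # half the length of the rod, c < a (assumed)
THROWS = 1_000_000

hits = 0
for _ in range(THROWS):
    x = random.uniform(0, a)                  # centre of rod to nearest line
    theta = random.uniform(0, math.pi / 2)    # rod to the perpendicular
    if x < c * math.cos(theta):               # the rod crosses a line
        hits += 1

print("observed chance:", hits / THROWS)
print("theory 2c/(pi a):", 2 * c / (math.pi * a))
# Laplace's suggestion: read the count backwards to estimate pi.
print("pi estimated as 2cN/(aS):", 2 * c * THROWS / (a * hits))
```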
Let x be the distance of the centre of the rod from the nearest line, θ the inclination of the rod to a perpendicular to the parallels, 2a the common distance of the parallels, 2c the length of the rod; then, as all values of x and θ between their extreme limits are equally probable, the whole number of cases will be represented by

$$\int_0^{\frac{\pi}{2}}\!\!\int_0^{a}dx\,d\theta=\tfrac{1}{2}\pi a.$$

Now, if the rod crosses one of the lines, we must have x < c cos θ; so that the favourable cases will be measured by

$$\int_0^{\frac{\pi}{2}}\!\!\int_0^{c\cos\theta}dx\,d\theta=c.$$

Thus the probability required is

$$p=\frac{2c}{\pi a}.$$

Laplace, in solving this question, suggests that by making a great number of trials, and counting the cases where the rod falls on a line, we could determine from this result the value of π. He further considers, for a given value of a, what length 2c should be chosen for the rod so as to give the least chance of error in a given large number N of throws. In art. 8 we have shown that the chance that the number of successes shall lie between pN + r and pN − r is

$$\frac{2}{\sqrt{\pi}}\int_0^{\tau}e^{-t^2}dt,\qquad\text{where}\quad\tau^2=\frac{r^2}{2p(1-p)\mathrm{N}}.$$

For a given probability ϖ, τ is therefore given. We have then a given chance that the number of successes shall differ from its most probable value pN by an error r, which is the least possible fraction of the latter when r/pN, that is, when $\sqrt{p(1-p)\mathrm{N}}/p\mathrm{N}$, or $\sqrt{p^{-1}-1}$, is the least possible; that is, since $p^{-1}-1=\pi a/2c-1$, when c is the greatest possible. Now the greatest value of c is a; the rod therefore should be equal to the distance between the lines. (For, if S be the number of successes, we have an assigned chance ϖ that S lies between pN ± r; that is, the value of π, as estimated from 2cN/aS, lies between (2cN/a)(1/S ± r/S²) nearly; hence the error in π is least when 2cr/S² is least. Now r varies as $\sqrt{p(1-p)\mathrm{N}}$, S as pN nearly, and 2c as p; hence $\sqrt{p(1-p)}/p$ is to be made the least possible, as above.) Laplace's answer is incorrect, though originally given right (see Todhunter, p. 591; also Czuber, p. 90).

62. Questions on local probability and mean values are of course reducible, by the employment of Cartesian or other coordinates, to multiple integrals. Thus any one relating to the position of two variable points, by introducing their coordinates, can be made to depend on quadruple integrals, whether in finding the sum of the values of a given function of the coordinates, with a view to obtaining its mean value, or in finding the number of the favourable cases, when a probability is sought. The intricacy and difficulty to be encountered in dealing with such multiple integrals and their limits is so great that little success could be expected in attacking such questions directly by this method; and most of what has been done in the matter consists in turning the difficulty by various considerations, and arriving at the result by evading or simplifying the integrations. We have a certain analogy here in the variety of contrivances and artifices used in arriving at the values of definite integrals without performing the integrations. We will now select a few of such questions.

63. If a given space S is included within a given space A, the chance of a point P, taken at random on A, falling on S is p = S/A. But if the space S be variable, and M(S) be its mean value,

$$p=\frac{\mathrm{M(S)}}{\mathrm{A}}\qquad(66).$$

For, if we suppose S to have n equally probable values S₁, S₂, S₃, ..., the chance of any one S₁ being taken, and of P falling on S₁, is $p_1=n^{-1}\mathrm{S}_1/\mathrm{A}$; now the whole probability p = p₁ + p₂ + p₃ + ..., which leads at once to the above expression. The chance of two points falling on S is, in the same way,

$$p=\frac{\mathrm{M(S^2)}}{\mathrm{A}^2}\qquad(67),$$

and so on. In such a case, if the probability be known, the mean value follows, and vice versa. Thus we might find the mean value of the nth power of the distance XY between two points taken at random in a line of length l, by considering the chance that, if n more points are so taken, they shall all fall between X and Y. This chance is M(XYⁿ)/lⁿ; for the chance that X shall be one of the two extreme points, out of the whole n + 2, is 2(n + 2)⁻¹; and, if it is, the chance that the other extreme point is Y is (n + 1)⁻¹. Therefore

$$\mathrm{M(XY}^n)=2l^n(n+1)^{-1}(n+2)^{-1}.$$
64. A line l is divided into n segments by n − 1 points taken at random; to find the mean value of the product of the n segments. Let a, b, c, ... be the segments in one particular case. If n new points are taken at random in the line, the chance that one falls on each segment is n! abc.../lⁿ; hence the chance that this occurs, however the line is divided, is

$$p=\frac{n!\,\mathrm{M}(abc\dots)}{l^n}.$$

Now the whole number of different orders in which the whole 2n − 1 points may occur is (2n − 1)!; out of these the number in which one of the first series of n − 1 points falls between every two of the second series of n points is easily found, by the theory of permutations, to be n!(n − 1)!. Hence p = n!(n − 1)!/(2n − 1)!, and the required mean value of the product is

$$\mathrm{M}(abc\dots)=\frac{(n-1)!\,l^n}{(2n-1)!}.$$

65. If M be the mean value of any quantity depending on the positions of two points (e.g., their distance) which are taken, one in a space A, the other in a space B (external to A); and if M′ be the same mean when both points are taken indiscriminately in the whole space A + B; Mₐ, M_b the same mean when both points are taken in A and both in B respectively; then

$$(\mathrm{A}+\mathrm{B})^2\mathrm{M}'=2\mathrm{A}\mathrm{B}\mathrm{M}+\mathrm{A}^2\mathrm{M}_a+\mathrm{B}^2\mathrm{M}_b.$$

If the space A = B, then 4M′ = 2M + Mₐ + M_b; if, also, Mₐ = M_b, then 2M′ = M + Mₐ.

66. The mean distance of a point P within a given area from a fixed straight line (which does not meet the area) is evidently the distance of the centre of gravity G of the area from the line. Thus, if A, B are two fixed points on a line outside the area, the mean value of the area of the triangle APB is the triangle AGB. From this it will follow that, if X, Y, Z are three points taken at random in three given spaces on a plane (such that they cannot all be cut by any one straight line), the mean value of the area of the triangle XYZ is the triangle GG′G″, determined by the three centres of gravity of the spaces.

For example: two points X, Y are taken at random within a triangle ABC; what is the mean area M of the triangle XYC, formed by joining them with one of the angles C? Bisect the triangle by the line CD drawn to the middle point of AB; let M₁ be the mean value when both points fall in the triangle ACD (or both in BCD), and M₂ the value when one falls in ACD and the other in BCD; then 2M = M₁ + M₂. But M₁ = ½M, since for ACD, with the same vertex C, the question is the same as for ABC, while the area is half as great; and M₂ = the triangle GG′C, where G, G′ are the centres of gravity of ACD, BCD, this being a case of the above theorem. Taking the area ABC = 1, it is easily found that GG′C = 2/9; hence

$$2\mathrm{M}=\tfrac{1}{2}\mathrm{M}+\tfrac{2}{9},\qquad\therefore\ \mathrm{M}=\tfrac{4}{27}\,\mathrm{ABC}.$$

Hence the chance that a new point Z falls on the triangle XYC is 4/27; and the chance that three points X, Y, Z taken at random form, with a vertex C, a re-entrant quadrilateral is 3 × 4/27 = 4/9.

67. If M be a mean value depending on the positions of n points falling on a space A; and if this space receive a small increment α, and M′ be the same mean when the n points are taken on A + α, and M₁ the same mean when one point falls on α and the remaining n − 1 on A; then, the sum of all the cases being M′(A + α)ⁿ, and this sum consisting of the cases (1) when all the points are on A, (2) when one is on α and the others on A (we may neglect the cases where two or more fall on α), we have

$$\mathrm{M}'(\mathrm{A}+\alpha)^n=\mathrm{M}\mathrm{A}^n+n\mathrm{M}_1\alpha\mathrm{A}^{n-1};$$

$$\therefore\ (\mathrm{M}'-\mathrm{M})\mathrm{A}=n\alpha(\mathrm{M}_1-\mathrm{M})\qquad(68),$$

as M′ is nearly equal to M. As an example, suppose two points X, Y are taken in a line of length l, to find the mean value M of (XY)ⁿ, as in art. 63. If l receives an increment dl, formula (68) gives

$$l\,\frac{d\mathrm{M}}{dl}=2(\mathrm{M}_1-\mathrm{M}).$$

Now M₁ here is the mean nth power of the distance of a single point taken at random in l from one extremity of l; and this is lⁿ(n + 1)⁻¹ (as is shown by finding the chance of n other points falling on that distance); hence

$$l\,\frac{d\mathrm{M}}{dl}+2\mathrm{M}=\frac{2l^n}{n+1};$$

integrating, M = 2lⁿ(n + 1)⁻¹(n + 2)⁻¹ + Cl⁻², as in art. 63, C being evidently 0.
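Both results lend themselves to a quick numerical check. The following sketch (the length l and the number of segments n are arbitrary choices of ours) estimates the mean product of the n segments of art. 64, and the mean nth power of XY of arts. 63 and 67.

```python
import math
import random

l, n = 1.0, 4          # length of the line and number of segments (assumed)
TRIALS = 500_000

# Art. 64: mean product of the n segments made by n - 1 random points.
total = 0.0
for _ in range(TRIALS):
    cuts = sorted(random.uniform(0, l) for _ in range(n - 1))
    segments = [b - a for a, b in zip([0.0] + cuts, cuts + [l])]
    total += math.prod(segments)
print("mean product:", total / TRIALS)
print("theory (n-1)! l^n / (2n-1)!:",
      math.factorial(n - 1) * l ** n / math.factorial(2 * n - 1))

# Arts. 63 and 67: mean nth power of the distance of two random points.
total = sum(abs(random.uniform(0, l) - random.uniform(0, l)) ** n
            for _ in range(TRIALS))
print("mean (XY)^n:", total / TRIALS)
print("theory 2 l^n / ((n+1)(n+2)):", 2 * l ** n / ((n + 1) * (n + 2)))
```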
68. If p is the probability of a certain condition being satisfied by the n points within A in art. 67, p′ the same probability when they fall on the space A + α, and p₁ the same when one point falls on α and the rest on A, then, since the numbers of favourable cases are respectively p′(A + α)ⁿ, pAⁿ, np₁αA^{n−1}, we find, as before,

$$(p'-p)\mathrm{A}=n\alpha(p_1-p)\qquad(69).$$

Hence if p′ = p then p₁ = p; and this result is often of great value. Thus, if we have to find the chance of three points within a circle forming an acute-angled triangle, then, on adding an infinitesimal concentric ring to the circle, we have evidently p′ = p; hence the required chance is unaltered by assuming one of the three points to be taken on the circumference. Again, in finding the chance that four points within a triangle shall form a convex quadrilateral, if we add to the triangle a small band between the base and a line parallel to it, the chance is clearly unaltered; therefore by (69) we may take one of the points at random on the base of the triangle without altering the probability.

69. Historically, it would seem that the first question given on local probability, since Buffon, was the remarkable four-point problem of Prof. Sylvester. It is, in general, to find the probability that four points taken at random within a given boundary shall form a re-entrant quadrilateral. It is easy to see that this problem is identical with that of finding the mean area of the triangle formed by three points taken at random; for, if M be this mean, and A the given area, the chance of a fourth point falling on the triangle is M/A; and the chance of a re-entrant quadrilateral is four times this, or 4M/A.

Let the four points be taken within a triangle. We may take one of them, W (fig. 3), at random on the base (art. 68), the others X, Y, Z anywhere within the triangle. Now the four lines drawn from the vertex B to the four points are as likely to occur in any specified order as in any other; hence it is an even chance that X, Y, Z all fall on one of the two triangles ABW, CBW, or that two of them fall on one of these triangles and the remaining one on the other. Hence the probability of a re-entrant quadrilateral is ½p₁ + ½p₂, where p₁ = prob. (WXYZ re-entrant) when X, Y, Z are in one of the two triangles, and p₂ the same probability when X is in one triangle, Y in the other, and Z in either. But p₁ = 4/9, by art. 66. To find p₂: the chance of Z falling within the triangle WXY is the mean area of WXY divided by ABC; and, by the principle of art. 66, for any particular position of W, M(WXY) = the triangle WGG′, where G, G′ are the centres of gravity of ABW, CBW; and it is easy to see that WGG′ = 1/9 ABC. Put ABC = 1. Now, if Z falls in CBW, the chance of WXYZ being re-entrant is 2M(IYW), I being the point where XY crosses BW; for Y is as likely to fall in WXZ as Z to fall in WXY; and if Z falls in ABW the chance is, in like manner, 2M(IXW). Thus the whole chance is

$$p_2=2\mathrm{M(IYW+IXW)}=2\mathrm{M(WXY)}=\tfrac{2}{9}.$$

Hence the probability of a re-entrant quadrilateral is

$$\tfrac{1}{2}\cdot\tfrac{4}{9}+\tfrac{1}{2}\cdot\tfrac{2}{9}=\tfrac{1}{3};$$

that of its being convex is ⅔.

70. If three points X, Y, Z are taken at random in a triangle, the mean value of the triangle XYZ is 1/12 of the given triangle. For we have seen that the chance of four points forming a re-entrant figure is 4M/A, where M is the required mean and A the given triangle; and, this chance having been shown to be ⅓, M = 1/12 A.
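The principle (69) can itself be exhibited numerically. In the sketch below (our own modern illustration) we estimate the chance that three points within a circle form an acute-angled triangle, first with all three points taken freely within the circle, then with one of them placed on the circumference; the two estimates should agree, as stated above.

```python
import math
import random

def point_in_disc():
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)

def acute(p, q, r):
    # Acute iff every squared side is less than the sum of the other two.
    d2 = lambda u, v: (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    a, b, c = d2(p, q), d2(q, r), d2(r, p)
    return a < b + c and b < c + a and c < a + b

TRIALS = 400_000
free = sum(acute(point_in_disc(), point_in_disc(), point_in_disc())
           for _ in range(TRIALS))
fixed = 0
for _ in range(TRIALS):
    t = random.uniform(0, 2 * math.pi)
    if acute((math.cos(t), math.sin(t)), point_in_disc(), point_in_disc()):
        fixed += 1

print("all three points inside:       ", free / TRIALS)
print("one point on the circumference:", fixed / TRIALS)
```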
71. Let the three points be taken within a circle of radius a, and let M be the mean value of the triangle formed. Adding a concentric ring α, we have by (68), since M′ : M in the ratio of the areas of the two circles,

$$\alpha\mathrm{M}=3\alpha(\mathrm{M}_1-\mathrm{M}),\qquad\therefore\ \mathrm{M}=\tfrac{3}{4}\mathrm{M}_1,$$

M₁ being the value of M when one of the three points lies on the circumference. Take that point O as fixed; we have then to find the mean value M₁ of the triangle OXY (fig. 4). Taking (ρ, θ), (ρ′, θ′) as polar coordinates of X and Y about the origin O, the angles being measured from the diameter through O, and the area OXY being ½ρρ′ sin(θ − θ′), we have, on integrating for ρ, ρ′ from 0 to r = OH = 2a sin θ and r′ = OK = 2a sin θ′, the chords at the inclinations θ, θ′,

$$\mathrm{M}_1=\frac{1}{9\pi^2a^4}\iint r^3r'^3\sin(\theta-\theta')\,d\theta'\,d\theta=\frac{64a^2}{9\pi^2}\int_0^{\pi}\!\!\int_0^{\theta}\sin^3\theta\,\sin^3\theta'\,\sin(\theta-\theta')\,d\theta'\,d\theta.$$

Professor Sylvester has remarked that this double integral, by means of the theorem

$$\int_0^{a}\!\!\int_0^{x}f(x,y)\,dy\,dx=\int_0^{a}\!\!\int_0^{x}f(a-y,\,a-x)\,dy\,dx,$$

is easily shown to be identical with

$$2\int_0^{\pi}\!\!\int_0^{\theta}\sin^4\theta\,\sin^3\theta'\cos\theta'\,d\theta'\,d\theta=\tfrac{1}{2}\int_0^{\pi}\sin^8\theta\,d\theta=\frac{35\pi}{256}.$$

Hence

$$\mathrm{M}_1=\frac{35a^2}{36\pi},\qquad\mathrm{M}=\tfrac{3}{4}\mathrm{M}_1=\frac{35a^2}{48\pi};$$

and the probability that four points taken at random within a circle shall form a re-entrant figure is

$$p=\frac{4\mathrm{M}}{\pi a^2}=\frac{35}{12\pi^2}.$$

72. Professor Sylvester has remarked that it would be a novel question in the calculus of variations to determine the form of the convex contour which renders the probability a maximum or minimum that four points taken within it shall give a re-entrant quadrilateral. It will not be difficult to show, by means of the principles we have been examining, that the circle is the contour which gives the minimum. For, if p be the probability of a re-entrant figure for four points within a circle of area A, and p′ the same probability when a small addition α of any kind, which still leaves the whole contour convex, is made to the circle, we have by (69)

$$(p'-p)\mathrm{A}=4\alpha(p_1-p),$$

where p₁ is the probability when one point is taken in α, that is, in the limit, when one point is taken on the circumference of the circle. But p₁ = p, as is shown in art. 68; hence p′ − p = 0. Hence any infinitesimal variation of the contour from the circumference of the circle gives δp, the variation of the probability, zero; the same method applies when portions are taken away, instead of being added, provided the contour is left convex. Hence, for the circle, the probability is a maximum or minimum. It will be a minimum; for, in the formula (68) applied to the mean triangle formed by three points, (M′ − M)A = 3α(M₁ − M), and M₁, the mean triangle when one point is in α, is really greater than when that point is on the circumference, though the same in the limit; hence M₁ > 4/3 M; ∴ (M′ − M)A > αM; ∴ M′/(A + α) > M/A. Therefore, if we consider infinitesimals of the second order, the chance of a re-entrant figure is increased by the addition of the space α to the circle. It will be an exercise for the reader to verify the like when the space is subtracted. For an ellipse, being derived by projection from the circle, the probability is the same, and likewise a minimum. It is pretty certain that a triangle will be found to be the contour which gives the probability the greatest. Mr Woolhouse has given (Educ. Times, Dec. 1867) the values of p for several contours, as follows: triangle, ⅓ = .3333; parallelogram, 11/36 = .3056; regular hexagon, 289/972 = .2973; circle, 35/12π² = .2955.
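The value for the circle is easily tested numerically. The following sketch (trial count arbitrary) uses the connexion of art. 69: the chance that a specified one of four random points falls within the triangle of the other three is M/A, so that four times the observed frequency should approach 35/12π² = .2955.

```python
import math
import random

def point_in_disc():
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)

def cross(o, u, v):
    return (u[0]-o[0])*(v[1]-o[1]) - (u[1]-o[1])*(v[0]-o[0])

def inside(w, a, b, c):
    # w strictly inside triangle abc: w on the same side of all three edges.
    s1, s2, s3 = cross(a, b, w), cross(b, c, w), cross(c, a, w)
    return (s1 > 0) == (s2 > 0) == (s3 > 0)

TRIALS = 300_000
hits = sum(inside(point_in_disc(), point_in_disc(),
                  point_in_disc(), point_in_disc())
           for _ in range(TRIALS))
print("simulated 4M/A:", 4 * hits / TRIALS)
print("theory 35/(12 pi^2):", 35 / (12 * math.pi ** 2))
```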

73. Many questions may be made to depend upon the four-point problem. Thus, if two points A, B are taken at random in a given convex area, to find the chance that two others C, D, also taken at random, shall lie on opposite sides of the line AB. Let p be the chance that ABCD is a re-entrant quadrilateral. If it is, the chance that any two of the four, chosen at random, lie on opposite sides of the line joining the two others is ½, for the three lines drawn through the interior point each separate the remaining two points, while the other three joining lines do not; if ABCD is convex, the same chance is ⅓, the two diagonals alone separating the remaining points. Hence the required probability is

$$\tfrac{1}{2}p+\tfrac{1}{3}(1-p)=\tfrac{1}{3}+\tfrac{1}{6}p.$$

Or we might proceed as follows, e.g., in the case of a triangle. The sides of the triangle ABC (fig. 5), produced, divide the whole triangle into seven spaces. Of these, the mean value of each of the three marked α (those lying beyond the vertices of ABC) is the same, viz. the mean value of ABC itself, or 1/12 of the whole triangle, as we have shown. This is easily seen: for instance, if the whole area = 1, the mean value of the space PBQ beyond the vertex B gives the chance that, the fourth point D being taken at random, B shall fall within the triangle ADC; now the mean value of ABC gives the chance that D shall fall within ABC; and these two chances are clearly equal. Hence the mean value of each of the three spaces marked β (those adjacent to the sides of ABC) is ⅓(1 − 4/12) = 2/9. Now that portion of the whole triangle which lies on the opposite side of AB from C consists of the space β adjacent to AB, together with the two spaces α beyond A and beyond B; its mean value is therefore 2/9 + 2/12 = 7/18, and that of the portion on the same side of AB as C is 11/18. Hence the chance of C and D falling on opposite sides of AB is 7/18, agreeing with the general formula, since here p = ⅓.

74. We can give but few of the innumerable questions depending on the position of points in a plane, or in space. Some may be solved without any aid from the integral calculus, by using a few very evident subsidiary principles. As an instance, we will state the following two propositions, and proceed to apply them to one or two questions.

(1) In a triangle ABC, the frequency of any direction for the line CX is the same when X is a point taken at random on the base AB as when X is taken at random in the area of the triangle.

(2) If X (fig. 6) is a point taken at random in the triangle ABb (Bb being infinitesimal), the frequency of the distance AX is the same as that of AZ, Y and Z being two points taken at random in AB, and Z denoting always that one of the two which is nearest to B. For the frequency in each case is proportional to the distance AX or AZ.

Let us apply these to the following question: a point O is taken at random in a triangle ABC (fig. 7); if n more points are taken at random in the triangle, to find the chance that they shall all lie on some one of the three triangles AOB, AOC, BOC. Let CO produced meet AB in D. If C be joined with all the points in question, every joining line is equally likely to be the nearest to CB; hence the chance that all the n points fall on the triangle ACD, that is, that CO is the nearest of the n + 1 lines to CB, is (n + 1)⁻¹. If this is so, we have to find the chance that all of them lie on AOC. Now, as O ranges over the infinitesimal triangle DCd, we may, by principle (2) above, suppose it to be the nearest to D of two points taken at random on CD; and, by principle (1), we may suppose the n points themselves taken at random on CD. All of them lie on AOC if AO is nearer to AD than any of the lines joining A to the n points, that is, if O is the nearest to D of the whole n + 2 points on CD; now any one of the n + 2 is equally likely to be the nearest to D, and O is the nearer to D of the two additional points; the chance is therefore 2(n + 2)⁻¹. Hence the chance that the n points fall on AOC is 2(n + 1)⁻¹(n + 2)⁻¹; and the required chance that they fall on some one of the triangles AOB, AOC, BOC is three times this, or

$$\frac{6}{(n+1)(n+2)}.$$

Again, if O be taken at random in the triangle, and three more points X, Y, Z be also taken at random in it, to find the chance that they shall fall, one on each of the three triangles AOB, AOC, BOC. First, two of the points must fall on one of the triangles ACD, BCD, and the remaining one on the other; say one on BCD and two on ACD: the chance of this is ¼, since CO must then be the third in order, reckoning from CA, of the four lines joining C to the four points. If this is so, the chance that the single point X which is in BCD falls on BOC is ⅔.
For, as above, when O ranges over the triangle DCd, we may take it to be the nearer to D of two points taken at random on CD; the point X in BCD may, by principle (1) applied to the vertex B, be replaced by a point taken at random on CD; and X lies on BOC when it falls nearer to C than O, the chance of which is ⅔, being the chance that, of the three points now on CD, the nearest to D is one of the two belonging to O. Now, if this is so, the frequency of O in DCd becomes that of the nearest to D of three points taken at random on CD; and the chance that, of the two remaining points Y, Z in ACD, one shall fall on AOC and the other on AOD is the chance that O, the nearest to D of three specified points out of five taken at random on CD, shall be the fourth in order from C. It is easy to see that this chance is 3/10. Hence the chance that one point falls on BOC, one on AOC, and the third on AOD is

$$\tfrac{1}{4}\times\tfrac{2}{3}\times\tfrac{3}{10}=\tfrac{1}{20};$$

and it will be the same for the case where the third point falls on BOD. Hence the chance that one point falls on each of the three triangles AOB, AOC, BOC is double of this, or 1/10.
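This result, like the others of this article, may be checked by simulation; the sketch below (the particular triangle chosen is immaterial, the question being unaltered by affine transformation) estimates the chance, which should approach 1/10.

```python
import random

A, B, C = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)

def random_point():
    x, y = random.random(), random.random()
    return (x, y) if x + y <= 1 else (1 - x, 1 - y)

def cross(o, u, v):
    return (u[0]-o[0])*(v[1]-o[1]) - (u[1]-o[1])*(v[0]-o[0])

def inside(w, p, q, r):
    s1, s2, s3 = cross(p, q, w), cross(q, r, w), cross(r, p, w)
    return (s1 > 0) == (s2 > 0) == (s3 > 0)

TRIALS = 300_000
one_in_each = 0
for _ in range(TRIALS):
    o = random_point()
    triangles = ((A, o, B), (B, o, C), (C, o, A))
    occupied = set()
    for _ in range(3):
        x = random_point()
        occupied.update(j for j, t in enumerate(triangles) if inside(x, *t))
    if occupied == {0, 1, 2}:        # exactly one point in each triangle
        one_in_each += 1

print("P(one point on each triangle):", one_in_each / TRIALS, "  theory 1/10")
```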
75. Straight lines falling at random on a plane. If an infinite number of straight lines be drawn at random in a plane, there will be as many parallel to any given direction as to any other, all directions being equally probable; also those having any given direction will be disposed with equal frequency all over the plane. Hence, if a line be determined by the coordinates p, ω, viz. the perpendicular on it from a fixed origin O, and the inclination of that perpendicular to a fixed axis, then, if p, ω be made to vary by equal infinitesimal increments, the series of lines so given will represent the entire series of random straight lines. Thus the number of lines for which p falls between p and p + dp, and ω between ω and ω + dω, will be measured by dp dω; and the integral ∬dp dω, between any limits, measures the number of lines within those limits.

It is easy to show from this that the number of random lines which meet any closed convex contour of length L is measured by L. For, taking the origin O inside the contour, and integrating first for p, from 0 to p₀, the perpendicular on the tangent to the contour, we have ∫p₀dω; taking this through four right angles for ω, we have, by Legendre's theorem on rectification, N being the measure of the number of lines,

$$\mathrm{N}=\int_0^{2\pi}p_0\,d\omega=\mathrm{L}.$$

Thus, if a random line meet a given contour of length L, the chance of its meeting another convex contour, of length l, internal to the former, is p = l/L. If the given contour be not convex, or not closed, N will evidently be the length of an endless string drawn tight around the contour.

(This result also follows by considering that, if an infinite plane be covered by an infinity of lines drawn at random, the number of these which meet a given finite straight line is evidently proportional to its length, and is the same whatever be its position. Hence, if we take double the length of the line as the measure of this number, the number of random lines which cut any element ds of a contour is measured by 2ds; and, as each line meeting a closed convex contour crosses it twice, the number of lines which meet the contour is measured by half the sum, that is, by L, its whole length. It would be possible to rectify any closed curve by means of this principle: suppose it traced on the surface of a circular disk, and the disk thrown a great number of times on a system of parallel lines whose distance asunder equals its diameter; every throw gives one chord crossing the disk at random; and, if we count the number of cases in which the closed curve meets one of the parallels, the ratio of this number to the whole number of trials will be ultimately the ratio of the circumference of the curve to that of the circle.)

76. If a random line meet a closed convex contour, of length L, the chance of its meeting another such contour, external to the former, is

$$p=\frac{\mathrm{X}-\mathrm{Y}}{\mathrm{L}},$$

where X is the length of an endless band enveloping both contours, and crossing between them, and Y that of a band also enveloping both, but not crossing. This may be shown by means of Legendre's integral above; or as follows. Call, for shortness, N(A) the number of lines meeting an area A, and N(A, A′) the number which meet both A and A′. The crossing band X consists of two convex loops, SROQPH enveloping the one contour and S′Q′OR′P′H′ enveloping the other (fig. 8), and the lines meeting these two loops are measured by their whole length X. Then

N(SROQPH) + N(S′Q′OR′P′H′) = N(SROQPH + S′Q′OR′P′H′) + N(SROQPH, S′Q′OR′P′H′),

since in the first member each line meeting both loops is counted twice. But the number of lines meeting the non-convex figure consisting of the two loops together is the same as the number meeting the band Y, and is measured by Y; and the number meeting both loops is identical with the number meeting the given areas Ω, Ω′; hence

X = Y + N(Ω, Ω′).

Thus the number of lines meeting both the given areas is measured by X − Y, and the theorem follows.

77. Two random chords cross a given convex boundary, of length L and area Ω; to find the chance that their intersection falls inside the boundary. Consider the first chord in any position; let C be its length; considering it as a closed area of evanescent breadth, the chance of the second chord meeting it is 2C/L; and the whole chance of the coordinates of the first falling in dp, dω, and of the second chord meeting it in that position, is

$$\frac{2\mathrm{C}}{\mathrm{L}}\cdot\frac{dp\,d\omega}{\iint dp\,d\omega}=\frac{2\mathrm{C}\,dp\,d\omega}{\mathrm{L}^2}.$$

But the whole chance is the sum of these chances for all positions of the first chord;

$$\therefore\ \text{prob.}=2\mathrm{L}^{-2}\iint\mathrm{C}\,dp\,d\omega.$$

Now, for a given value of ω, the value of ∫C dp is evidently the area Ω; then, taking ω from 0 to π,

$$\text{required probability}=\frac{2\pi\Omega}{\mathrm{L}^2}.$$

The mean value of a chord drawn at random across the boundary is

$$\mathrm{M}=\frac{\iint\mathrm{C}\,dp\,d\omega}{\iint dp\,d\omega}=\frac{\pi\Omega}{\mathrm{L}}.$$
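Random lines are easily simulated through the coordinates (p, ω) of art. 75. The sketch below (the inner circle's size and position are our arbitrary choices) checks two of the foregoing results for the unit circle: the chance l/L that a line meeting the outer contour also meets an inner one, and the chance 2πΩ/L² of art. 77, which for a circle is ½.

```python
import math
import random

def random_line():
    # A random line meeting the unit circle: perpendicular direction w,
    # perpendicular length p, as in art. 75.
    return random.uniform(0, 1), random.uniform(0, 2 * math.pi)

def chord_endpoints(p, w):
    h = math.sqrt(1 - p * p)
    nx, ny = math.cos(w), math.sin(w)
    return ((p * nx - h * ny, p * ny + h * nx),
            (p * nx + h * ny, p * ny - h * nx))

TRIALS = 500_000

# Art. 75: chance of also meeting an inner circle, radius r about (qx, qy),
# is the ratio of the perimeters, r here (outer radius 1).
r, qx, qy = 0.35, 0.2, 0.1
hits = 0
for _ in range(TRIALS):
    p, w = random_line()
    if abs(qx * math.cos(w) + qy * math.sin(w) - p) <= r:
        hits += 1
print("P(meets inner contour):", hits / TRIALS, "  theory l/L:", r)

# Art. 77: chance that two random chords intersect inside the circle.
def crossing(s, t):
    def cr(o, u, v):
        return (u[0]-o[0])*(v[1]-o[1]) - (u[1]-o[1])*(v[0]-o[0])
    (a, b), (c, d) = s, t
    return cr(a, b, c) * cr(a, b, d) < 0 and cr(c, d, a) * cr(c, d, b) < 0

hits = sum(crossing(chord_endpoints(*random_line()),
                    chord_endpoints(*random_line()))
           for _ in range(TRIALS))
print("P(chords cross):", hits / TRIALS, "  theory 2*pi*Omega/L^2 = 1/2")
```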
78. A straight band of breadth c being traced on a floor, and a circle of radius r thrown on it at random, to find the mean area of the band which is covered by the circle. (The cases are omitted where the circle falls outside the band; or the floor may be supposed painted with parallel bands, at a distance asunder equal to the diameter, so that the circle must fall on one.) If S be the space covered, the chance of a random point on the circle falling on the band is p = M(S)/πr². This is the same as if the circle were fixed, and the band thrown on it at random. Now let A (fig. 9) be any position of the random point; the favourable cases are those in which HK, the bisector of the band, meets a circle with centre A and radius ½c; and the whole number are those in which HK meets a circle with centre O, the centre of the given circle, and radius r + ½c; hence the probability is

$$p=\frac{2\pi\cdot\frac{1}{2}c}{2\pi(r+\frac{1}{2}c)}=\frac{c}{2r+c}.$$

This is constant for all positions of A; hence, equating these two values of p, the mean value required is

$$\mathrm{M(S)}=\frac{\pi r^2c}{2r+c}.$$

The mean value of the portion of the circumference which falls on the band is the same fraction, c/(2r + c), of the whole circumference.

If any convex area whose surface is Ω and perimeter L be thrown on the band, instead of a circle, the mean area covered is

$$\mathrm{M(S)}=\frac{\pi c\,\Omega}{\mathrm{L}+\pi c}.$$

For, as before, fixing the random point at A, the chance of a random point of Ω falling on the band is p = 2π · ½c/L′ = πc/L′, where L′ is the perimeter of a curve parallel to L, at a normal distance ½c from it, viz. L′ = L + πc; whence

$$\frac{\mathrm{M(S)}}{\Omega}=\frac{\pi c}{\mathrm{L}+\pi c}.$$

79. Buffon's problem may be easily deduced in a similar manner. Thus, if 2r be the length of the rod, a the distance between the parallels, and we conceive a circle (fig. 10) of diameter a, with its centre at the middle of the rod, rigidly attached to the latter (the rod might indeed be anywhere within the circle without altering the question), and thrown with it on the parallels, this circle must always meet one of the parallels; if it be thrown an infinite number of times we shall thus have an infinite number of chords crossing it at random. Their number is measured by πa, the circumference of the circle; and the number which meet the rod is measured by 4r, double its length. Hence the chance that the rod meets one of the parallels is p = 4r/πa, as before.

80. To investigate the probability that the inclination of the line joining any two points in a given convex area Ω shall lie within given limits. We give here a method of reducing this question to calculation, for the sake of an integral to which it leads, and which is not easy to deduce otherwise. First let one of the points, A (fig. 11), be fixed; draw through it a chord PQ = C, at an inclination θ to some fixed line; put AP = r, AQ = r′; then the number of cases for which the direction of the line joining A and B lies between θ and θ + dθ is measured by ½(r² + r′²)dθ. Now let A range over the space between PQ and a parallel chord distant dp from it; the number of cases for which A lies in this space and the direction of AB is from θ to θ + dθ is (first considering A to lie in the element dr dp)

$$\tfrac{1}{2}dp\,d\theta\int_0^{\mathrm{C}}(r^2+r'^2)\,dr=\tfrac{1}{3}\mathrm{C}^3dp\,d\theta,$$

since r′ = C − r. Let p be the perpendicular on C from a given origin O, and let ω be the inclination of p (we may put dω for dθ); C will be a given function of p, ω; and, integrating first for ω constant, the whole number of cases for which ω falls between given limits ω′, ω″ is

$$\tfrac{1}{3}\int_{\omega'}^{\omega''}d\omega\int\mathrm{C}^3dp,$$

the integral ∫C³dp being taken for all positions of C between two tangents to the boundary parallel to PQ. The question is thus reduced to the evaluation of this double integral, which is of course generally difficult enough; we may, however, deduce from it a remarkable result; for, if the integral be extended to all possible positions of C, it gives the whole number of pairs of positions of the points A, B which lie inside the area; but this number is Ω²; hence

$$\iint\mathrm{C}^3dp\,d\omega=3\Omega^2,$$

the integration extending to all possible positions of the chord C, its length being a given function of its coordinates p, ω. (This integral was given by the present writer in the Comptes Rendus, 1869, p. 1469. An analytical proof was given by Serret, Annales scientifiques de l'École Normale, 1869, p. 177.)

Cor. Hence, if L, Ω be the perimeter and area of any closed convex contour, the mean value of the cube of a chord drawn across it at random is 3Ω²/L.
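The corollary is readily checked for the circle, for which 3Ω²/L = 3πa³/2 when the radius is a. A minimal sketch for a = 1:

```python
import math
import random

# Mean cube of a random chord of the unit circle, against 3*Omega^2/L.
TRIALS = 1_000_000
total = 0.0
for _ in range(TRIALS):
    p = random.uniform(0, 1)       # the inclination drops out by symmetry
    total += (2 * math.sqrt(1 - p * p)) ** 3

omega, L = math.pi, 2 * math.pi
print("observed mean cube:", total / TRIALS)
print("theory 3*Omega^2/L:", 3 * omega ** 2 / L)
```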
81. Let there be any two convex boundaries (fig. 12) so related that a tangent at any point V to the inner cuts off a constant segment S from the outer (e.g., two concentric similar ellipses); let the annular area between them be A; from a point X taken at random on this annulus draw the two tangents XA, XB to the inner boundary. We shall find, for the mean value of the arc AB,

$$\mathrm{M(AB)}=\frac{\mathrm{LS}}{\mathrm{A}},$$

L being the whole length of the inner curve ABV.

We will first prove the following lemma: if there be any convex arc AB (fig. 13), and if N₁ be (the measure of) the number of random lines which meet it once, N₂ the number which meet it twice, then

$$2\,\text{arc AB}=\mathrm{N}_1+2\mathrm{N}_2.$$

For draw the chord AB; the number of lines meeting the convex figure so formed is N₁ + N₂ = arc + chord; but N₁ = the number of lines meeting the chord = 2 chord; hence N₂ = arc − chord, and N₁ + 2N₂ = 2 arc AB.

Now fix the point X, and draw the tangents XA, XB. If a random line cross the inner boundary L, let p₁ be the probability that it meets the arc AB once, p₂ that it does so twice; then, by the lemma, 2 arc AB = (p₁ + 2p₂)L. If now the point X range all over the annulus, and p̄₁, p̄₂ be the mean values of p₁, p₂ for all its positions,

$$\frac{2\mathrm{M(AB)}}{\mathrm{L}}=\bar p_1+2\bar p_2.$$

Let IK (fig. 14) be any position of the random line, crossing the inner boundary at I and K; drawing the tangents to the inner curve at I and K, it is easy to see that the line will cut the arc AB twice when X lies in the space common to the two segments cut off from the outer boundary by these tangents (marked a), and once when X lies in either of the spaces belonging to one segment only (marked b, b′); hence, for this position of the line,

$$p_1+2p_2=\frac{b+b'+2a}{\mathrm{A}}=\frac{2\mathrm{S}}{\mathrm{A}},$$

the two segments together making up b + b′ + 2a = 2S. This being constant for all positions of the line,

$$\bar p_1+2\bar p_2=\frac{2\mathrm{S}}{\mathrm{A}};\qquad\therefore\ \mathrm{M(AB)}=\frac{\mathrm{LS}}{\mathrm{A}}.$$

Hence the mean value of the arc is the same fraction of the perimeter of the inner curve that the constant segment S is of the annulus. If L be not related as above to the outer boundary, we have in like manner M(AB)/L = M(S)/A, M(S) being the mean area of the segment cut off by a tangent at a random point of the perimeter L. The above result may be expressed as an integral: if s be the arc AB included by the tangents from any point (x, y) of the annulus,

$$\iint s\,dx\,dy=\mathrm{LS}.$$

It has been shown (Phil. Trans., 1868, p. 191) that, if θ be the angle between the tangents XA, XB, then, the integration extending over the whole space exterior to the inner curve,

$$\iint(\theta-\sin\theta)\,dx\,dy=\tfrac{1}{2}\mathrm{L}^2-\pi\Omega_0,$$

Ω₀ being the area within L. The mean value of the tangent XA or XB may be shown to be

$$\mathrm{M(XA)}=\frac{\mathrm{PS}}{2\mathrm{A}},$$

where P is the perimeter of the locus of the centre of gravity of the segment S.

82. If C be the length of a chord crossing any convex area Ω; Σ, Σ′ the areas of the two segments into which it divides the area; and p, ω the coordinates of C, viz. the perpendicular on C from any fixed pole, and the angle made by p with any fixed axis; then

$$\iint\mathrm{C}^4dp\,d\omega=6\iint\Sigma\Sigma'\,dp\,d\omega,$$

both integrations extending to all possible values of p, ω which give a line meeting the area. This identity will follow from proving that, if ρ be the distance between two points taken at random in the area, the mean value of ρ is

$$\mathrm{M}(\rho)=\Omega^{-2}\iint\Sigma\Sigma'\,dp\,d\omega\qquad(1),$$

and also

$$\mathrm{M}(\rho)=\tfrac{1}{6}\Omega^{-2}\iint\mathrm{C}^4dp\,d\omega\qquad(2).$$

The first follows by considering that, if a random line cross the area, the chance of its passing between the two points is 2L⁻¹M(ρ), L being the perimeter of Ω, since the measure of the lines meeting the segment joining the points is double its length. Again, for any given position of the random line C, the chance of the two points lying on opposite sides of it is 2ΣΣ′/Ω²; therefore, for all positions of C, the chance is 2Ω⁻²M(ΣΣ′); but the mean value M(ΣΣ′), for all positions of the chord, is

$$\mathrm{M}(\Sigma\Sigma')=\frac{\iint\Sigma\Sigma'\,dp\,d\omega}{\iint dp\,d\omega},\qquad\iint dp\,d\omega=\mathrm{L};$$

and, equating the two expressions for the chance, we obtain (1).

To prove (2), we remark that the mean value of ρ is found by supposing each of the points A, B to occupy in succession every possible position in the area, and dividing the sum of their distances in each case by the whole number of cases, the measure of which number is Ω². Confining our attention to the cases in which the inclination of the distance AB to some fixed direction lies between θ and θ + dθ, let the position of A be fixed (fig. 15), and draw through it a chord HH′ = C at the inclination θ; the sum of the cases found by giving B all its positions is

$$d\theta\int_0^{r}\rho^2d\rho+d\theta\int_0^{r'}\rho^2d\rho=\tfrac{1}{3}(r^3+r'^3)\,d\theta,$$

where r = AH, r′ = AH′. Now let A occupy successively all positions between HH′ and hh′, a chord parallel to it at a distance dp; the sum of all the cases so given will be

$$\tfrac{1}{3}d\theta\,dp\int_0^{\mathrm{C}}(r^3+r'^3)\,dr=\tfrac{1}{6}\mathrm{C}^4d\theta\,dp,$$

since r′ = C − r. Now, if A moves over the whole area, the sum of the cases will be ⅙dθ∫C⁴dp, where p = perpendicular on C from any fixed pole O, the integration extending to all parallel positions of C between two tangents to the boundary. Removing now the restriction as to the direction of AB, and giving it all values from 0 to π, the sum of all the cases is ⅙∬C⁴dp dω, if ω = inclination of p (dω = dθ); whence (2).

The mean value of the reciprocal of the distance AB of two points taken at random in a convex area is, in like manner, easily shown to be

$$\mathrm{M}(\rho^{-1})=\Omega^{-2}\iint\mathrm{C}^2dp\,d\omega.$$

Thus, for a circle of radius a, M(ρ⁻¹) = 16/3πa. It may also be shown that the mean area of the triangle formed by taking three points A, B, C within any convex area is

$$\mathrm{M(ABC)}=\tfrac{1}{4}\Omega-\tfrac{1}{4}\Omega^{-3}\iint(\Sigma-\Sigma')^2\mathrm{C}^3dp\,d\omega.$$

83. If in the last question we had sought for the mean value of the chord HH′ or C, which joins the two points A and B, the sum of the cases when A is fixed and the inclination lies between θ and θ + dθ would have been ½C(r² + r′²)dθ; and, when A lies between HH′ and hh′,

$$\tfrac{1}{2}\mathrm{C}\,d\theta\,dp\int_0^{\mathrm{C}}(r^2+r'^2)\,dr=\tfrac{1}{3}\mathrm{C}^4d\theta\,dp;$$

and finally the mean value of C is

$$\mathrm{M(C)}=\tfrac{1}{3}\Omega^{-2}\iint\mathrm{C}^4dp\,d\omega.$$

Thus the mean value of a chord passing through two points taken at random within any convex boundary is double the mean distance of the points.
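The theorem of art. 83 may be verified for the unit circle, for which the mean distance of two random points has the classical value 128/45π (a value not computed in the text, but well known). The sketch below is our own illustration.

```python
import math
import random

def point_in_disc():
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:
            return (x, y)

TRIALS = 400_000
dist = chord = 0.0
for _ in range(TRIALS):
    (x1, y1), (x2, y2) = point_in_disc(), point_in_disc()
    dx, dy = x2 - x1, y2 - y1
    dist += math.hypot(dx, dy)
    # Chord of the circle through the two points: solve |P1 + t D|^2 = 1
    # for t; the chord length is 2*sqrt(b^2 - a*c)/sqrt(a).
    a = dx * dx + dy * dy
    b = x1 * dx + y1 * dy
    c = x1 * x1 + y1 * y1 - 1
    chord += 2 * math.sqrt(b * b - a * c) / math.sqrt(a)

print("mean distance M(rho):", dist / TRIALS,
      "  known value 128/(45 pi):", 128 / (45 * math.pi))
print("mean chord through the two points:", chord / TRIALS,
      "  (should be double the mean distance)")
```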

84. We have now done enough to give the reader some idea of the subject of local probability. We refer him for fuller information to the very interesting work just published by Emanuel Czuber of Prague, Geometrische Wahrscheinlichkeiten und Mittelwerte, Leipzig, 1884; also to the Educational Times journal, in which most of the recent theorems on the subject have first appeared in the form of questions, under the able editorship of Mr Miller, who has himself largely contributed. The subject is also treated in Williamson's Integral Calculus, and in a paper by Prof. Crofton, Phil. Trans., 1868.

Literature. Besides the works named in the course of this article, see De Morgan's treatise in the Encyclopædia Metropolitana; Laurent, Traité du Calcul des Probabilités, Paris, 1873; Gouraud, Histoire du Calcul des Probabilités, Paris, 1848; J. W. L. Glaisher, "On the Law of Facility of Errors of Observations, and the Method of Least Squares," Trans. R.A.S., vol. xxxix.; Cournot, Théorie des Chances; Liagre, Calcul des Probabilités; General Didion, Calcul des Probabilités appliqué au tir des projectiles. Those who are interested in the metaphysical aspect of the question may consult Boole's Laws of Thought, also J. S. Mill's Logic. To these and the other works we have named we refer the reader for an account of what we have had to omit, but above all to the great work of Laplace, of which it is sufficient to say that it is worthy of the genius of its author, the Théorie analytique des Probabilités. It is no light task to master the methods and the reasonings there employed; but it is, and will long continue to be, one that must be attempted by all who desire to understand and to apply the theory of probability. (M. W. C.)
In the last question if we had sought for the mean value of the chord HH or C, which joins A and B, the sum of the cases when A is fixed and the inclination lies between and 6 + dd would have been and when A lies between II H and hh Cdedp/ V 2 + r -)dr and finally, the mean value of C is Thus the mean value of a chord, passing through two points taken at random within any convex boundary, is double the mean dis tance of the points. 84. } T e have now done enough to give the reader some idea of the subject of local probability. We refer him for fuller information to the very interesting work just published by Emanuel Czuber of Prague, Geometrische WahrschcinlicJikeiten und Mittclwcrtc, Leip- sic, 1884 ; also to the Educational Times Journal, in which most of the recent theorems on the subject have first appeared in the form of questions, under the able editorship of Mr Miller, who has him self largely contributed. In AVilliamson s Integral Calculus, and a paper by Prof. Crofton, Phil. Trans., 1868, the subject is also treated. Literature. Besides the works named in the course of this article, see De Morgan s treatise in the Encyclopedia Metropolitans; Laurent, Tratte au Calcul des Probabilities, Paris, 1873 ; Gourand, llistoire du Calcul des Prob., Paris, 1848; J. V. L. Glaisher, "On the Law of Facility of Errors of Observations, and the Method of Least Squares," Trans. K.A.S., vol. xxxix. ; Cournot, Theorie des Chances; Liagre, Calcul des Prob.; General Didion, Calcul des Prob. applique au tir des projectiles. Those who are interested in the metaphysical aspect of the question may consult lioole s Laws of Thought, also J. S. Mill s Logic. To these and the other works we have named we refer the reader for an account of what we have had to omit, but above all, to the great work of Laplace, of which it is sufficient to say that it is worthy of the genius of its author the Theorie analytique des Probability s. It is no light task to master the methods and the reasonings there employed; but it is, and will long continue to be, one that must be attempted by all who desire to understand and to apply the theory of probability. (M. W. C.)

  1. “There is a sort of leap which most men make from a high probability to absolute assurance . . . analogous to the sudden consilience, or springing into one, of the two images seen by binocular vision, when gradually brought within a certain proximity.”—Sir J. Herschel, in Edin. Review, July 1850.
  2. Archbishop Whately’s jeu d’esprit, Historic Doubts respecting Napoleon Bonaparte, is a good illustration of the difficulties there may be in proving a conclusion the certainty of which is absolute.
  3. So it is said, “the tree is known by its fruits”; “practice is better than theory”; and the universal sense of mankind judges that the safest test of any new invention, system, or institution is to see how it works. So little are we able by a priori speculations to forecast the thousand obstacles and disturbing influences which manifest themselves when any new cause or agent is introduced as a factor in the world’s affairs.
  4. Men were surprised to hear that not only births, deaths, and marriages, but the decisions of tribunals, the results of popular elections, the influence of punishments in checking crime, the comparative values of medical remedies, the probable limits of error in numerical results in every department of physical inquiry, the detection of causes, physical, social, and moral, nay, even the weight of evidence and the validity of logical argument, might come to be surveyed with the lynx-eyed scrutiny of a dispassionate analysis.—Sir J. Herschel.
  5. A very simple proof of this principle is as follows: let a number N be divided into r parts a, b, c, &c. If any two of these, as a and b, are unequal, then, since $ab<\left(\frac{a+b}{2}\right)^2$, it follows that the product abcd... is increased by substituting ½(a + b), ½(a + b) for a and b. Hence, so long as any two of the parts are unequal, we can divide N differently so as to obtain a greater product; and therefore the product is greatest when the parts are all equal, that is,

$$abc\ldots\le\left(\frac{a+b+c+\dots}{r}\right)^r.$$
  6. The familiar expression, not to “put all one’s eggs in the same basket,” shows us how general common sense has recognized this principle.