Math Puzzle

This cool puzzle (and solution) comes from my friend Mike.

Alice secretly picks two different real numbers by an unknown process and puts them in two (abstract) envelopes.  Bob chooses one of the two envelopes randomly (with a fair coin toss), and shows you the number in that envelope.  You must now guess whether the number in the other, closed envelope is larger or smaller than the one you’ve seen.

Is there a strategy which gives you a better than 50% chance of guessing correctly, no matter what procedure Alice used to pick her numbers?

I initially thought there wasn’t, or that the problem was paradoxically defined, but it turns out that it’s perfectly valid and the answer is “Yes.”  See my first comment for an example of one such winning strategy.

This puzzle is similar to, but not the same as, the two envelopes paradox.  For more time-devouring reading, see Wikipedia’s List of Paradoxes.

411 replies on “Math Puzzle”

  1. Strategy:

    Call the number you saw "x". Use the logistic function to calculate p(x)=1/(1+e^-x). Choose a random real number between 0 and 1. If it's lower than p(x), guess "lower". Otherwise, guess "higher".

    In other words, guess "lower" with probability p(x), and "higher" with probability 1-p(x).

    This works because p(x) is a function which is 0 at negative infinity, 1 at positive infinity, and increases monotonically in between. So for any pair of numbers, if you call the smaller one A and the larger one B, you have a (slightly) better chance of guessing “lower” if you saw B than if you saw A. Your overall chances of guessing correctly — given that there is a 50% chance you’re seeing A and a 50% chance you’re seeing B — are:

    N = 0.5 * (1-p(A)) + 0.5 * p(B)

    (remembering that p(x) is 1/(1+e^-x))

    Since p(A) is always smaller than p(B), N is always greater than 50%. If Alice picked 1.0 and pi, your chances of guessing correctly are 61.37%. If Alice picked 10 and 11, your chances of guessing correctly are 50.00053%. No matter what two numbers Alice picked, the chance of guessing right is always better than 50%. Sometimes it’s only better by a tiny, tiny amount, but it’s always better.

    An important thing to keep in mind here is that there’s no such thing as a random sample of all the numbers. There’s no way to choose one integer at random, where every integer has an equal chance of being chosen. (For more on this, see this article.) Similarly, there’s no such thing as a randomly-chosen function out of all the possible functions. You can’t try to model Alice’s process by figuring out the chances her procedure (or numbers) will fall into one category or another because you do not know what procedure she’s using.

    For another version of this puzzle, see question B here and the solution here. If any of this seems wrong to you, try writing a program to simulate the game as you understand it and measure the probabilities yourself. You’ll find that there’s no method Alice can use to pick the numbers that keeps you from having at least a slightly better than even chance of winning (although if she picks two particularly large positive or negative numbers, the difference is probably too small to measure).


  2. You are right about picking a random real number, but you never stated that the number was randomly chosen. You just said, she picked two numbers, stipulating only that they were not equal. If she can pick an arbitrary real number (almost all of which cannot be described by logic, which is countable), then why can’t she choose an arbitrary large one as often as any other one?

    Put another way, if she can come up with an infinite sequence of digits between 1 and 9 (the decimal expansion), why can’t she pick a location out of infinitely many with equal probability to place the decimal point if she wants to?

    Of course, if you are playing this game in real physical reality with a real person or computer, then it is reduced to finite limits, so your method would work well, due to things like Benford’s law that hold in practice but not in theoretical math.

    It’s like the game theory paradox: pick a real number x > 0 and your pay-off will be 1/x. What is your best strategy/Nash Equilibrium? Of course, there is none, because x can be chosen to make your pay-off of 1/x arbitrarily large.


  3. xkcd wrote:
    >> closed envelope is larger or smaller than the one you’ve seen.

    When you say larger, do you mean higher absolute value (farther from zero), or more positive (close to +inf)?


  4. What first came to my mind was: if the number is negative I’d guess the other number is larger, and if it’s positive I’d guess the other number is smaller. Does anyone know where am I making a mistake?


  5. This puzzle posits magic envelopes that hold paper that can hold any real number, and nothing about the envelopes gives you any information about which real numbers are contained therein.


  6. Using the logistic function is unnecessary here: you can just choose any real number y you like, and if x is greater then y say “larger” and if x<y say "smaller".

    Of course, using something like the logistic function and then picking a number uniformly from [0,1] lets you do the probably calculations in the comment above, rather than just prove that you win more than half the time.


  7. Since is solution contains no ABS notation I think it would be safe to assume that he is using the common understanding of ‘larger’.


  8. “Hey Alice, is the number in the other envelope larger or smaller”. The probability of guessing correctly is based on whether or not you trust Alice. Let alone Bob.


  9. Always pick greater than, and you have a 100% chance of winning.

    There is a finite amount of numbers lower than the first one picked, and an infinite amount greater than.

    Is that it?


  10. It is randomly chosen whether the number you see is the larger one or the smaller one, though. The margin of error gets smaller as the numbers become closer together, so you’d have to run many more trials before the barely-better-than-even odds start to weigh in your favor, but it always happens eventually.

    My question: does the specific function used matter as long as it has the same boundary conditions and always-increasing behavior? E.g. would p(x)=1/(1+2^-x) or p(x) = atan(x)/pi + 1/2 also work in the long run?


  11. By the way, I don’t dispute your solution, which is typical of strangeness in Analysis. I just want to assert that there is a difference between an arbitrary real number, which mathematicians take all the time (let x be a real number…) and an evenly distributed random real number, which is, as you state, impossible (the distribution would have to be 0 everywhere but have an integral from -oo to oo equal to 1).

    If you don’t believe me, here is my proof that 0 is not a real number: Let x be a real number. Then, because the probability that x = 0 is exactly 0 (see link in original post), then x != 0. Because x was arbitrary, no real number = 0. Therefore, 0 is not a real number.

    New question: could you still do it if A and B could be equal to each other (the strategy above breaks down in that case)?


  12. Out of curiosity, why is the “pick a random number from 0 to 1” part needed? Why not just say “if p(x) < 0.5, pick that the other envelope has a larger number, else pick that the other envelope has a smaller number"?

    I suppose 0 would have to be decided arbitrarily to be on one side or the other, but if you draw 0, your chances are 50% anyway, right? Or am I missing something here?


  13. “You must now guess whether the number in the other, closed envelope is larger or smaller than the one you’ve seen.”

    No option C) for equal?

    So if Alice picks the same number twice, you’ll end up losing every time.


  14. @Aaron Meurer

    You never stated that x was a randomly chosen real number. Therefore, discussion of probability cannot be done as such a discussion would require knowledge of the distribution.

    True, there is no uniform distribution from negative infinity to positive infinity. There are, however, several random distributions covering the same range. The Normal Distribution is one of these.

    When Mathematicians say “let x be a real number” they are discussing properties relevant to all real numbers, as opposed to a specific number. It’s not really weird.

    Back to the proposed solution: This solution is based on the fact that no process, procedure, person, or computer can generate a truly random number. Any uniformly chosen random number would be infinitely long, to the left AND the right of the decimal point. So I’m guessing that this solution is based on the fact that any person or machine would generate ‘short’ numbers?


  15. You have arbitrarily picked a probability function of the numbers Alice is likely to choose. It is most likely wrong.

    You should have either performed a psychology study to determine which numbers people are likely to pick (the science answer) or stopped at “there’s no such thing as a random sample of all the numbers.” (the mathematics answer).


  16. Edge cases aside* (see below)
    If both numbers has a 50% chance of being larger than 0 (and a 50% chance of being less than 0)

    and you look at one of the numbers (X) and it is larger than 0,
    the hidden number (H) could be any number greater than X or any number less than 0 and less than X.
    the potential number of values of H could be if less than X: ∞ + X or if greater than X: ∞ – X
    the ratio of numbers of H which are potentially less than X is: (∞ + x)/(∞ – h)

    which looks like it would be greater than 1.

    Is that greater than one? intuition says it would be, but i think that
    ∞ – n = ∞ (correct me if i am wrong)
    so (∞ + X)/(∞ – X) = ∞/∞ = 1/1

    So if Alice picks *any* two real numbers (between -∞ and ∞) then there is no more numbers which are larger (or smaller) than the first number you looked at. So there is only a 50% chance of guessing correctly. However if there is a chance of Alice picking a smaller range than that, then the ratio becomes:
    (n + x)/(n – h) which is larger than 1
    (where n is a very large but finite set of real numbers)

    so i think that if you guessed H is less than X when X is >0 (and viceversa)
    then you have at least a 50% chance of being right
    Unless the universe is against you and the coin toss screws you over each time with a probability of 1/(2^n) where n is the number of tosses(guesses)

    *You also get the interesting edge case where (say) one number is 4 and the other is 3.999… which are two different real numbers but they have the same value.


  17. Peter: You actually entered the function correctly.

    It’s not 0 at -10, or 1 at +10.. it merely looks that way because of the limited resolution of the screen.

    The function get’s very, very close, but will never actually reach 0 or 1, except in the limits where x->+- infinity. The lines at y=0 and y=1 are known as asymptotes. See for more details, if you are interested.


  18. Peter: It’s not 0 at -10 and 1 at +10, it just looks close. The function gets asymptotically closer to 0 on the negative side and asymptotically closer to 1 on the positive side, which means it gets closer and closer but never quite reaches it. And monotonically just means it only increases in between going from lower values to higher values of x. It looking like a curve does not preclude this.


  19. Peter, the problem seems to be that Grapher doesn’t have enough precision to distinguish the function from 0 or 1 outside the [-10, 10] interval. The function gets very, very close to zero as you move towards -inf, but never quite reaches it. Similarly, it gets very close to one as you move toward +inf, without ever reaching that either.

    WV: amoebas have (…souls too?)


  20. I feel that the problem can be simplified.

    You randomly pick either the lesser or greater of two numbers.

    If your number is greater than zero, keep it. Otherwise, exchange it.
    (I don’t account for the possibility of the number being zero, just as you fail to account for the possibility of the number equaling p(x).)

    If you did happen to pick the greater, you are less likely to exchange it, because it is more likely greater than zero than the lesser number.


  21. How does one go about picking a random real number, unqualified? Leaving aside the apparent paradox that any result will have odds 1/infinity, I can’t even think of an algorithm that would return an unqualified random real in finite time.


  22. @Peter

    The function seems right. It IS monotonically crescent.

    If x -> -inf, the denominator goes to +inf and the function goes to 0.
    If x = 0, the function goes to 1/(1+1) = 0.5
    If x -> +inf, the denominator goes to 1 and the function goes to 1


  23. I worked on this puzzle some time back. I think, you can choose any probability function (for switching) that is monotonically decreasing. Its easier to understand for a finite range of numbers but the idea can be easily extended to the set of all real numbers.


  24. What an interesting puzzle. After graphing the provided function and being explained there is no process that can choose random numbers with an equal distribution from an infinite set I think I understand it intuitively, but the math doesn’t make too much sense.
    Also, because I’m sure it never gets tiring to hear it, I love your comic and look forward to trying to decode your book. Seeing how I only managed to persevere for about 2 hours before looking up your answer to this and google’s insanely high page-rank rating for the forums I’ll probably end up accidentally or purposefully looking up all the answers.


  25. There’s nothing special about the logistic function here; the same result holds if you compare your given x to a random number selected from any everywhere-positive probability distribution on the reals.

    (Your approach of comparing a random in (0, 1) to p(x) is really just comparing x to a random from (-inf, inf) selected with inverse-logistic distribution.)


  26. OK, here’s what your procedure boils down to, in statistician terms:

    –Assume a distribution over all elements of the range over which Alice draws. This might be the Normal (Gaussian) distribution, or something crazy that focuses on even numbers, or who knows what.
    –Write down its cumulative distribution function. This is $cdf(x)=int_{-infty}^x f(y)dy$, where f(y) is your probability distribution, and you’re integrating from the bottom of the range up to x to find the CDF at x.
    –Make your draw, then look up on your CDF(x) table to find the odds that the true value is less than x. If the integral of the prob. distribution up to x is 92% of the distribution, then you’ve got 92% odds that the other envelope is smaller.

    Of course, the odds you get at the end are in the context of the model you assumed in step one. That is, the odds calculation N = 0.5 * p(x)… uses the function p(.) that you picked, not one that is objectively true, so you’re saying that the odds are better than 50/50 based upon subjective odds that you made up.

    To be precise, the logistic function you use is the CDF of the Logistic distribution (see, so you assumed Alice is making draws from a Logistic distribution at the outset, and then evaluate the odds of getting the right answer using that same assumed distribution.

    You could use other distributions: why not a Normal distribution centered around zero, or a Normal distribution centered around fifteen, both of which have the appropriate domain, and both of which produce a CDF that is of course zero at $-infty$ and one at $+infty$. Or maybe just a point mass at zero, which gets across the same neutral belief that positive is as likely as negative that you express in the Logistic distribution assumption, without the extra baggage?

    I could go on: the setup above also implicitly assumes that A and B are independently drawn. The logistic distribution assumes zero is the halfway point, and that values closer to plus or minus infinity are less likely than values near zero. We assume that 10 is less likely than nine (i.e. that Alice isn’t being human). These are all probably OK, but add structure that was not stated in the problem. Given what was written, the only not-incorrect distributional assumption would be a neutral prior, like the improper uniform distribution (p(x)=1 for all x), where the CDF is hopelessly undefined, and we’re back to not having a way to guess which envelope has the bigger value. Instead, you’ve made a number of assumptions in the logistic distribution, and then used those assumptions to assign subjective probabilities. That’s a fine human way to approach the problem, but works exactly to the extent that you believe the assumptions.


  27. When you say “An important thing to keep in mind here is that there’s no such thing as a random sample of all the numbers.” you overstate the case a little. You have just demonstrated (indirectly) how to take a random sample of the real numbers, just not a uniformly-weighted one. After all, your strategy is isomorphic to ‘Choose a random real number between 0 and 1, and call it “x”. Calculate random real number r(x)=-ln((1/x) – 1). If it’s lower than the number you saw, guess “lower”. Otherwise, guess “higher”.’


  28. Just to nitpick, but it’s not quite right to say that you can’t pick from the real numbers randomly, just that you can’t pick from the real numbers with a uniform distribution. Random != uniform.


  29. Well then, if relying on p(x) is good, then p(x-100) or p(x+100000) is good too?

    That said, do we have to choose just one function to be successful, or can we switch arbitrarily between any of these?


  30. @Aaron Meurer – You said:
    “Put another way, if she can come up with an infinite sequence of digits between 1 and 9 (the decimal expansion), why can’t she pick a location out of infinitely many with equal probability to place the decimal point if she wants to?”

    Because choosing a random place to put the decimal point is the same as picking a random natural number (ie. the position at which to place the decimal point). However, there’s no way to uniformly pick a random natural number.

    @Peter Hosey – The image you linked to is the correct graph of the function, and it satisfies all of the properties that Randall mentioned (monotonicity, etc). Just because something looks close to 1 or 0 on a graph doesn’t mean that it exactly equals 1 or 0 there.


  31. Peter Hosey, you didn’t graph it wrong, nor is it stated wrong. If you zoom in really close at x=-10, you’ll see that it’s only really close to 0. It’s never exactly zero until you reach -infinity.


  32. It was my mistake. I didn’t have Grapher showing me enough decimal precision, and when I sampled the function at x=10, I believed it when it claimed that y=“1.0000”. Zooming in and cranking up the precision revealed that it was, in fact, asymptotic, as I’d initially thought (and dismissed based on the imprecise sample).

    Thanks to Nick Devenish for his gentle, understanding comment on my screenshot, which provoked me to look deeper the second time. Thanks also to Tim Parenti for his comment about the definition of “monotonic”.


  33. I see a problem in your solution. You rely on the fact that it’s impossible to pick a truly random number out of all real numbers and then require in your solution that you pick a truly random number out of [0, 1], despite these sets having the same cardinality. If it’s impossible in one, it’s impossible in the other.

    For anyone reading who doesn’t follow that, it helps if you consider the following. Given a number between 0 and 1: if it’s in [0, .5], double it to get a number – any number – in [0, 1]. Otherwise, it’s in (.5, 1]. Subtract .5 and double to a get a number in (0, 1], and take the inverse to get any number in [1, infinity). This gives a way of mapping any point in [0, 1] to any point in [0, infinity). It should be easy enough to see how to extend that to all positive and negative numbers. Therefore if you can pick a truly random number in [0, 1] you can trivially map it to any number.


  34. @xkcd

    You have calculated the “overall chances of guessing correctly” as (something easily rearranged to)

    N = (1 + p(B) – p(A)) / 2

    and since p, as defined, is strictly monotone increasing (and B>A)

    N > 1/2

    which is your desired outcome.

    However let q be any other strictly monotone increasing function such that 0 < q(x) < 1 for all real x. Note that

    0 < q(B)-q(A) < 1

    but we can't say anything stronger about it than that.

    Now applying the same argument that you used for p (which only relies on the fact that p satisfies the criteria we imposed on q, but not on any other detail of the definition of the logistic function) you also have to conclude

    N = (1 + q(B) – q(A)) / 2

    But N hasn't changed its meaning, so you're forced to conclude

    q(B)-q(A) = p(B)-p(A)

    (for all such q) which clearly is not valid.

    Your argument leads to contradiction.


  35. Here’s the thing: ABC is thinking of canceling Better Off Ted. I suspect you may enjoy–or would enjoy–Better Off Ted, which is often about scientists, generally of the mad variety. If you do (and allow me to suggest watching an episode, or at least a clip, on Hulu, if you have not seen it), please, I’m begging you, say something about it to your legions of followers.

    Also, as regards this puzzle: whoa. The beauty of being in statistics accidentally (via the social sciences) is that I continually have my mind blown by this kind of thing.


  36. Strictly speaking you could use any monotonically increasing function with a bounded range to do this, could you not?


  37. Awesome puzzle, it made my day (and it had been a crummy one so far). 🙂

    @Aaron: He does state that the number is chosen randomly. It’s the third sentence in the solution: “Choose a random real number between 0 and 1.” He doesn’t say Alice’s numbers are randomly chosen, but that isn’t implied anywhere either.


  38. If the real numbers are unbounded on both sides (-infinity,infinity), how can you pick a more probable side of a number? Maybe it’s the “even distribution” that I don’t get, but afaik, any number chosen has both infinite numbers greater and less than it.


  39. So, basically, you should always guess that the other number has the opposite sign? All positive numbers will yield lower as most likely answer and vice verse. This assumes that 0 i in the middle, an interesting assumption.

    I’m guessing, that if you asked a bunch of people, in real life, to be Alice and pick 2 numbers with any arbitrary process, more than 50% of the numbers would be positive. Brains are poor random number generators. =)


  40. @Rikard: No, you shouldn’t guess that the other number has the opposite sign. I got the same impression by looking the p(x) function. But you’re forgetting the step about taking a random number between 0 and 1 comparing that to p(x). I was comparing 0.5 to p(x), which is an intuitive thing to do, but not what the solution states.

    Basically the solution is: the higher the number, the bigger the chance you hold on to that envelope.


  41. I think Rikards thought is correct if you assume a normal distibution around 0 (why shouldn’t we?). From a measure-theoretic point of view, the interval (-∞, 0] has the same measure as [0, ∞) (because the graph is symmetric, this is 0.5).
    There is no way to chose an integer at random. Nevertheless it *makes* sense to talk about probabilities of picking random real numbers from certain intervals. Let’s say, the number you got first is a. Then what you want to have is the probability of b being picked in the interval (-∞, a] (and 1 minus that). So, why can’t you compare that to 0.5 and base your decision on this?


Comments are closed.

%d bloggers like this: