If the Sun shines on day , then the day which is most likely to also see Sun is day , with a chance of approximately .

This is how I found the solution. As usual, there could be other ways to obtain the solution.

First of all, we want to understand the problem. According to the limerick, if the Sun shines on a given day, then it has a one percent chance to shine again on the next day. If it doesn't shine the second day, then on the third day it has a chance of two percent. This repeats until it shines, and it will most certainly shine at least by day on which it would have a chance to shine even if it failed to shine the previous days. Whenever it does decide to shine, then the pattern will of course start over. Therefore, on any given day (past day two) there may be multiple ways in which the Sun will shine.

For instance, day three can shine by two ways:

(1) If the Sun shines on day two (one percent chance) and again on day three (again a one percent chance given that it shone on day two). The chance of this is then .

(2) If the Sun does not shine on day two ( chance) then there is now a chance to shine on day three. In other words, this scenario has a chance of happening.

Then the chance that the Sun will shine on day three is obtained by adding case (1) and case (2) which gives us the total chance of .

Notice that day as given by the setup will always be true. And looking at the scenario of it shining on day , then day will always be true. Then the total number of scenarios which will result in day shining will be , which gets huge very quickly. We can see this will get very complicated, as the further from day one we go, the more "scenarios" there are which will result in the Sun shining on that day. For instance, on day 4 there are the following scenarios (where means the Sun shined and means the Sun didn't shine):

So, to avoid having to perform these numerous calculations, let's take a different approach. Let's pretend that we have some function which tells us the exact probability of the Sun shining on day .

We know that will depend in some way on , , etc., down to if or if .

Now pretend we know the chance of it shining on day . Then day can shine via this particular scenario with a chance of

Moving next to , which is the chance of day being true, we need the scenario false, true, which is

We don't need to look at the case because all scenarios with were included in . Therefore, for each we are only interested in while all in between take . Thus, we obtain

terminating at if or if . We can write a little more succinctly as

where if or if .

Since we know as defined in the problem, then each can be built up starting with , then , and so on as high as we would like. These calculations may be cumbersome by hand, but it is simple to write a loop structure in a program which can handle very large (and very small) numbers. I just implemented a couple lines in PARI/GP to do the calculations for me, using the following code (Note: "?" is the beginning of a command line in PARI/GP) :

? f=vector(5000); ? f[1]=1.0; ? for(n=2,101,f[n]=sum(i=1,n-1,f[i]*(99!/((100-n+i)!))*(n-i)/(100.^(n-i)))); ? for(n=102,5000,f[n]=sum(i=n-100,n-1,f[i]*(99!/((100-n+i)!))*(n-i)/(100.^(n-i)))); ? for(i=2,5000,write("shine_probs.txt",{i," ",f[i]}));

This computes for all and outputs them to a file called "shine_probs.txt" in the same directory you are in while running PARI/GP. By examining this file, you can see the limiting behavior of the probabilities over time as the chance to shine approaches the expected value of (another set of calculations not asked for in the statement of the problem). But there is a clear maximum at day with a chance to shine of , as you can see in this snippet of the output file:

2 0.010000000000000000000000000000000000000 3 0.019900000000000000000000000000000000000 4 0.029503000000000000000000000000000000000 5 0.038623870000000000000000000000000000000 6 0.047098557100000000000000000000000000000 7 0.054791952138999999999999999999999999999 8 0.061603835291470000000000000000000000000 9 0.067472483459970699999999999999999999998 10 0.072375725623108842999999999999999999998 11 0.076329454074908884749999999999999999998 12 0.079383816112694389540299999999999999997 13 0.081617499375789496299642999999999999997 14 0.083130665539032827220429549999999999997 15 0.084037168251357251499712578699999999997 16 0.084456706395836517722907309162999999997 17 0.084507515173309360387921132513869999997 18 0.084300094688822103007229782362676299996 19 0.083932333825561971488875645354793178996 20 0.083486224781764267400384344044402259626 21 0.083026200066502305891420278215012420796 22 0.082598976626996366200920282448253594185 23 0.082234674962842870122315077776895294629 24 0.081948903291329620204862399957767369783 25 0.081745461136546946713671878881667012704 26 0.081619320859135601056771908582577814413 27 0.081559583015775679234911544521974318831 28 0.081552162539795666561298686044850671013 29 0.081582036816964185114928930607826380337 30 0.081634963354108899257474604059547122470 31 0.081698645042514095653415406927112399269

To check my analytical solution, I implemented a Monte Carlo experiment in C++ with the following code:

#include<iostream> #include<fstream> #include<ctime> #include<cstdlib> using namespace std; int main() { srand(time(NULL)); ofstream outfile, randcheck_file; unsigned long long num_days=50; unsigned long long num_trials=10000000; unsigned long long* day_count; unsigned long long rand_check[100]={}; double rand_num; double chance; unsigned long long status=5; outfile.open("shinecheck.txt"); randcheck_file.open("shine_rand_check.txt"); day_count = new unsigned long long [num_days]; for(unsigned long long i=0; i<num_trials; i++) { if((double)i/(double)num_trials>=(double)status/100.0) { cout<<status<<"%-"<<flush; status=status+5; } chance=1.0; for(int j=0; j<num_days; j++) { rand_num=(double)rand(); while(rand_num==(double)RAND_MAX) rand_num=(double)rand(); rand_num=rand_num/(double)RAND_MAX; //checking random number spread for(int m=0; m<100; m++) { if(rand_num<(double)(m+1)/100.0) rand_check[m]++; } //done checking random number spread if(rand_num < chance/100.0) { chance=1.0; day_count[j]++; } else { chance=chance+1.0;; } } } for(int k=0; k<num_days; k++) { outfile << (k+2) << " " << (double)day_count[k]/(double)num_trials << endl; } for(int i=0; i<100; i++) randcheck_file << i+1 << " " << (double)(rand_check[i]*100)/((double)(num_days*num_trials)) << endl; cout<<endl; outfile.close(); randcheck_file.close(); delete[] day_count; return 0; }

This program basically recreates the events using random numbers and stores the results over 10 million trials, finally outputting each day's percent "true"s to a file "shinecheck.txt" which should be very similar to the analytical solution. The above code also includes a random number counter to ensure that the random numbers were spread evenly across the interval. The analytical and experimental solutions agree very closely, as you can see in the graph below.

So we have rather long-windedly arrived at our confident solution.