When is the best time to catch Pokemon in PokemonGo?

Like you all, I’ve been enjoying (or as my wife would say “obsessed with”) PokemonGo since its launch in July. So imagine my excitement when I came across PokemonGoRadar. They’re a company who have built a prediction algorithm for Pokemon spawns. They released this table of average spawn times for each of the 145 Pokemon that can currently be found in the wild. I thought I would scrape the data and see when the best time to catch Pokemon is!

Plotting the distribution of times weighted by the spawn rates, my first answer is that the best time to catch Pokemon is during the early hours of the morning.


I wasn’t necessarily expecting a perfect uniform distribution but I didn’t think it would be so skewed! When I dug deeper, I found that this is because of how annoyingly common Pidgeys and Rattatas are (as anyone who’s played the game would understand). They both have an average spawn time between 1am and 2am and a combined spawn rate of 3,000 / 10,000.
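For anyone who wants to try the weighting idea at home, here is a minimal Python sketch of it (not the code behind the plot above). It assumes each row of the table boils down to a name, an average spawn hour and a spawn rate per 10,000. The `toy_table` rows and the `weighted_hourly_share` helper are made up for illustration and are not values from the PokemonGoRadar table.

```python
from collections import defaultdict

# Toy rows: (name, average spawn hour 0-23, spawns per 10,000).
# All values are invented for illustration, not taken from the real table.
toy_table = [
    ("common_a", 1, 1600),
    ("common_b", 1, 1400),
    ("uncommon_c", 9, 80),
    ("uncommon_d", 17, 40),
]


def weighted_hourly_share(rows):
    """Return {hour: share of all spawns}, weighting each row by its spawn rate."""
    totals = defaultdict(float)
    for _name, hour, rate in rows:
        totals[hour] += rate
    grand_total = sum(totals.values())
    return {hour: weight / grand_total for hour, weight in sorted(totals.items())}


shares = weighted_hourly_share(toy_table)
for hour, share in shares.items():
    print(f"{hour:02d}:00  {share:.1%}")
```

With two very common toy Pokemon sharing the same hour, that hour swamps everything else, which is the same effect Pidgeys and Rattatas have on the real plot.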

But I’m not interested in Pidgeys and Rattatas. I want the rare and more powerful Pokemon like Dragonite and Vaporeon. The real question I wanted to answer was “What is the best time to catch rare Pokemon?”

To answer this question, I first needed to define “rare”. Being a Pokemon expert, I can name all 151 by heart, so I could have come up with a list of the rare Pokemon quite easily. However, the data scientist in me thought there must be a better way. My first choice (really because it sounded cool) was to define rare as Pokemon having a spawn rate of less than 100 per 10,000 (1%). To check if that was reasonable, I plotted the distribution of spawn rates.

There are 119 Pokemon with a spawn rate less than 100 per 10,000. That is about 82% of the 145 in the table, so I should probably change “rare” to “uncommon”.
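Checking a cutoff like this is just a filter and a count. Here is a rough Python sketch, assuming the table reduces to (name, spawns per 10,000) pairs. The `toy_rates` values and the `split_by_rate` helper are hypothetical and only there to show the shape of the check.

```python
# Toy (name, spawns per 10,000) pairs. Invented for illustration only.
toy_rates = [
    ("common_a", 1600),
    ("common_b", 1400),
    ("uncommon_c", 80),
    ("uncommon_d", 40),
    ("uncommon_e", 5),
]


def split_by_rate(rows, threshold=100):
    """Split names into (below_threshold, at_or_above_threshold) lists."""
    below = [name for name, rate in rows if rate < threshold]
    at_or_above = [name for name, rate in rows if rate >= threshold]
    return below, at_or_above


uncommon, common = split_by_rate(toy_rates)
print(f"{len(uncommon)} of {len(toy_rates)} toy Pokemon fall below the cutoff")
```

If most of the list ends up below the cutoff, as it does here, the label “rare” is probably too generous.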

Now here’s the plot showing the distribution of average spawn times just for the uncommon Pokemon.


The best time to catch uncommon Pokemon is still during the early hours of the morning.

So if you want to be the very best, like no one ever was – forget the late afternoon and early evenings, get out in the early mornings!

If this short piece has whetted your appetite for PokemonGo data science, check out this amazing article by Matthew Harris.

Note: There are two caveats. First, this table is based on a predictive model, not on actual spawn rates. Second, the spawn times here are just averages; we don’t have the underlying distribution for each Pokemon.

As usual, you can find the code here.



How good am I at Guess the Correlation?

I looked into some of my data behind the Guess the Correlation game.

Not very good, I suck.

Here, you have a go and tell me if you can do better than me.

To be fair, I actually improved the more I played! When I had to stop because I thought I was going to get addicted, my high score was 412 and #20 in the world was 586.

However, once I saw you could download your data, the whole game changed. I started to think about all the cool analytics I could possibly draw from this. I wanted to see if there was anything non-obvious about the accuracy of my guesses. But to my disappointment (I only realised this after playing for about an hour), the game only saves your last 100 goes. Anyway, I managed to get 300 data points, and here are a couple of interesting things I found:

  1. My estimates are much better at the extreme end. This is expected: I can pick out the signal when it’s very strong, but I’m pretty rubbish when it’s a bit hazy.
  2. While playing, guessing the correlation seemed much easier when the new chart wasn’t much different from the old one. You could argue that when there’s a massive difference, I’m terrible.
  3. I wonder if I would have a better overall score if the sample from my 300 data points was more like the uniform distribution I expected. I reckon if there were more of those really low r’s I would be in the top ten :p. But chart A below shows that there were fewer than expected around 0.25 too (my worst performing area). [Figure: sampled distribution of true correlation]
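The first point in the list above can be checked by binning the true correlations and averaging the absolute error of the guesses in each bin. Here is a minimal Python sketch of that idea (not my original R code). It assumes the downloaded data reduces to (true r, guessed r) pairs with r between 0 and 1. The `toy_guesses` values and the `mean_abs_error_by_bin` helper are invented for illustration.

```python
from collections import defaultdict

# Toy (true r, guessed r) pairs. Invented for illustration, not real game data.
toy_guesses = [
    (0.05, 0.20),
    (0.25, 0.45),
    (0.30, 0.10),
    (0.90, 0.88),
    (0.95, 0.97),
]


def mean_abs_error_by_bin(pairs, bin_width=0.25):
    """Return {bin lower edge: mean |guess - true r|}, assuming 0 <= true r <= 1."""
    n_bins = round(1 / bin_width)
    errors = defaultdict(list)
    for true_r, guess in pairs:
        # Clamp so that true r == 1.0 lands in the last bin.
        index = min(int(true_r / bin_width), n_bins - 1)
        errors[index].append(abs(guess - true_r))
    return {
        round(index * bin_width, 2): sum(errs) / len(errs)
        for index, errs in sorted(errors.items())
    }


error_by_bin = mean_abs_error_by_bin(toy_guesses)
for lower_edge, error in error_by_bin.items():
    print(f"r from {lower_edge:.2f}: mean absolute error {error:.2f}")
```

A small error in the top bin and a bigger one lower down is the pattern described in point 1. With only the last 100 goes saved per download, some bins will be thin, so the averages should be read loosely.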

What was the point of all this?

Well, I got to practise some R and ggplot2 specifically (yay me). But also this is the first on this blog of many (hopefully more useful) posts on data analytics.


I’ve slept on this and I’m wondering what effect this will have on me. At my day job, I am constantly looking at correlations and regressions in trying to infer what causes car accidents. Have I missed out on some correlations because the true correlation wasn’t strong and therefore I thought it was insignificant?