Simulating World Cup Russia 2018

The soccer World Cup has arrived, and mathematics is being used to analyze potential results. The Spanish newspaper El Pais combined current statistics with a parametric model based on Poisson regression and an uncertainty analysis to simulate potential results and to estimate each team's probability of winning individual games and of becoming the new world champion. The results appear at the end of this post. More important than the results, however, is the methodology behind the model. In this post we are interested in the maths, so we present a short summary of the model. Further statistical and mathematical questions about it may be discussed later.

The model can be described in three parts: strength of the team, simulating individual games and simulating the whole tournament.

Strength of team
Some teams (Brazil, Argentina or Germany) are stronger than others (Panama, Egypt or Saudi Arabia). This difference matters a great deal here, because national teams do not play many games together. Thus, individual players such as Neymar may be vital to winning some games. Strength was calculated with the well-known Elo rating system, which estimates the relative skill of each participant from its results against rated opponents. Although originally developed for chess players, the Elo rating system has been successfully applied to several sports. The El Pais model used three different Elo ratings: one for the players, one for the teams, and one based on the goals scored by each team.
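
To make the Elo idea concrete, here is a minimal sketch of the textbook Elo formulas (expected score and rating update). It is not the El Pais implementation: the article does not describe how its three ratings were computed or combined, and the function names, the K-factor of 20 and the scale of 400 are assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B (1 = win, 0.5 = draw, 0 = loss)."""
    # The scale of 400 points is the conventional Elo choice (assumed here).
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=20.0):
    """Return the new ratings of A and B after one game.

    score_a is the actual result for A: 1.0 win, 0.5 draw, 0.0 loss.
    k is the K-factor (hypothetical value), which controls how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the rating total is preserved.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # A 400-point favourite is expected to score about 0.91 per game.
    print(round(expected_score(1900, 1500), 3))
    # An upset moves the ratings much more than an expected result does.
    print(update_ratings(1500, 1900, 1.0))
```

The same update rule can be applied to players, to teams, or to goal-based results; only the definition of the score changes.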

Simulating individual games
Each individual game is simulated through the probability of the number of goals scored by each team. This technique is based on a Poisson regression model proposed by Dixon and Coles (1995). From the goal probabilities, the model derives the probability of victory for each team. The model was calibrated on more than 17 000 games, and the calibration considered different performances for home games, away games and games on neutral ground.
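
As a rough illustration of how goal probabilities turn into win, draw and loss probabilities, the sketch below treats the goals of each team as independent Poisson variables. This is a simplification: the Dixon and Coles model adds a correction for low-scoring results, and the scoring rates used here (1.6 and 1.1 goals per game) are made-up values, not calibrated ones.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def match_probabilities(lam_a, lam_b, max_goals=10):
    """Return (P(A wins), P(draw), P(B wins)) under independent Poisson goals.

    Scorelines above max_goals per team are ignored; for typical football
    scoring rates the neglected probability is tiny.
    """
    p_a = p_draw = p_b = 0.0
    for goals_a in range(max_goals + 1):
        for goals_b in range(max_goals + 1):
            p = poisson_pmf(goals_a, lam_a) * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                p_a += p
            elif goals_a == goals_b:
                p_draw += p
            else:
                p_b += p
    return p_a, p_draw, p_b


if __name__ == "__main__":
    # Hypothetical expected goals for the two teams.
    win, draw, loss = match_probabilities(1.6, 1.1)
    print(f"A wins: {win:.3f}  draw: {draw:.3f}  B wins: {loss:.3f}")
```

In a calibrated model the two rates would come from the regression (team strengths plus home, away or neutral ground) rather than being typed in by hand.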
The calibrated model was evaluated with the Ranked Probability Score, as proposed by Constantinou and Fenton (2012).

Image. Calibration of the model (Source: El Pais)

Simulating the whole tournament
The previous step simulates not only the winner but also the number of goals. This is important for simulating the whole tournament: with the goals of every game, the model can determine the first and second place of each group and hence the matches of the following stages. The last two steps were repeated 10 000 times (10 000 iterations) to account for the different sources of uncertainty. Although there are no details about the probability rules used to move from one iteration to the next, this was an important step, because it allowed the model to estimate each team's probability of winning a specific game, of advancing to the next stage and of becoming the new champion.
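
The repetition idea can be sketched for a single group as follows. Everything specific here is an assumption: the team names and scoring rates are invented, each team is given one scoring rate regardless of opponent (the real model derives rates from the strengths of both teams), and ties in the table are broken by goal difference, goals scored and then at random. The point is only to show how repeating the simulation turns simulated goals into probabilities of advancing.

```python
import math
import random
from collections import Counter
from itertools import combinations


def sample_poisson(rng, lam):
    """Draw a Poisson-distributed number of goals (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    goals, product = 0, rng.random()
    while product > threshold:
        goals += 1
        product *= rng.random()
    return goals


def simulate_group(rng, rates):
    """Play one round-robin group and return the two teams that advance."""
    table = {team: [0, 0, 0] for team in rates}  # points, goal difference, goals scored
    for a, b in combinations(rates, 2):
        goals_a = sample_poisson(rng, rates[a])
        goals_b = sample_poisson(rng, rates[b])
        if goals_a > goals_b:
            table[a][0] += 3
        elif goals_b > goals_a:
            table[b][0] += 3
        else:
            table[a][0] += 1
            table[b][0] += 1
        table[a][1] += goals_a - goals_b
        table[b][1] += goals_b - goals_a
        table[a][2] += goals_a
        table[b][2] += goals_b
    # Sort by points, goal difference, goals scored; remaining ties are random.
    ranked = sorted(rates, key=lambda t: (*table[t], rng.random()), reverse=True)
    return ranked[:2]


def advance_probabilities(rates, iterations=10_000, seed=2018):
    """Estimate each team's probability of finishing in the top two of the group."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(iterations):
        counts.update(simulate_group(rng, rates))
    return {team: counts[team] / iterations for team in rates}


if __name__ == "__main__":
    # Hypothetical expected goals per game for four invented teams.
    group = {"Team A": 2.0, "Team B": 1.4, "Team C": 1.0, "Team D": 0.6}
    for team, probability in advance_probabilities(group).items():
        print(f"{team}: {probability:.3f}")
```

Extending the sketch to the knockout stages would mean feeding the two qualifiers of each group into the bracket and counting, over all iterations, how often each team wins the final.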

The following image shows the results for the best teams. You can also visit the whole table.
Image. Simulation results for the best teams

Satellite images from Hidroituango hydropower dam crisis Colombia 2018

Over the last few days we have been following the Hidroituango hydropower dam crisis in Colombia. We published posts about the timeline of the crisis (link to post) and the analysis from the technical committee (link to post). In this post we will not discuss the event itself. Instead, we present satellite images to visualize the problem, as we did with the Oroville dam (link to post).

Images 1 and 2 compare the site on March 26 (before the event) and on May 17 (during the crisis). We want to point out three details:

  • The first noticeable detail is the water behind the dam (in the reservoir). On March 26 the reservoir is empty. By May 17, in contrast, the water level has risen so much that the reservoir is almost full. The water level is visibly very close to the spillways and the top of the dam.
  • Another important detail is that the landslides are visible. As mentioned in a previous post, this crisis began because of landslides that blocked the tunnels. The image from May 17 clearly shows two landslides on the right bank of the river.
  • The third important detail is the water flow downstream of the dam. The image from May 17 shows water flowing downstream of the dam. This matters because the spillways were not working yet. Thus, the downstream flow is a sign of the seepage described by the technical committee.

Image 1. Hidroituango dam satellite image March 26 (Source: Planet Labs)

Image 2. Hidroituango dam satellite image May 17 (Source: Planet Labs)

The other image pair (Image 3) is shown as an animated GIF (images from May 2 and May 7). This pair has less detail, but it covers a much bigger area. This second comparison shows the backwater effect of the dam: several tributaries were flooded by it.

Image 3. Animated GIF of images from Hidroituango (Source: GIPHY)

5 facts about the Fuego volcano eruption in Guatemala (June 2018)

Last week, the Fuego volcano (Guatemala) erupted. It was the strongest eruption in several decades. We have already posted some basic concepts about volcanoes. This post presents 5 facts about this eruption.

  1. The erupted ash was at about 650 degrees Celsius
  2. The erupted ash rose almost 10 000 m high
  3. The Fuego volcano eruption did not throw out much lava. Instead, it threw out tons of ash
  4. The eruption was the strongest in 4 decades
  5. Currently, the main problem is the so-called lahars, that is, mudflows formed when the volcanic ash mixes with rain.

Check the video