*Fitting the distribution of ticket sales yields unexpected results*

The movie business is risky, but the returns can be high. Better yet, the movie industry's performance is only loosely correlated with that of the rest of the economy, which makes it an attractive diversification option.

In the Netherlands, investing in motion picture production is especially attractive because of the favourable tax regulation. To put it simply, if the movie flops, the losses of the private investors are deducted from the tax they have to pay in that year. The production company usually agrees this with the tax office beforehand. Thanks to that, the potential loss of a private investor is limited to a small fraction of the investment, while the potential returns are high.

Recently I got an offer to invest in a new movie. However attractive this seems at first sight, I need to do my homework before deciding to invest. I have to calculate the expected return and its variance and compare these to other investment options. The movie production company published the projected returns contingent on the movie’s gross box office earnings in the Netherlands, so if I can estimate the box office, I will know the distribution of the investment’s returns. In this blog post I will analyse the historical distribution of movie earnings.
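Once a box-office distribution is available, the return calculation itself is mechanical. The sketch below is only an illustration in Python: the payoff schedule `SCHEDULE`, its thresholds and its percentages are invented for the example and are not the production company's figures, and the function names are my own.

```python
import numpy as np

# Hypothetical payoff schedule: (minimum tickets sold, return on investment).
# All numbers are invented for illustration only.
SCHEDULE = [(0, -0.20), (100_000, 0.00), (250_000, 0.15), (500_000, 0.40)]


def investment_return(tickets):
    """Map ticket sales to the return of the highest threshold reached."""
    thresholds = np.array([t for t, _ in SCHEDULE], dtype=float)
    returns = np.array([r for _, r in SCHEDULE], dtype=float)
    idx = np.searchsorted(thresholds, tickets, side="right") - 1
    idx = np.clip(idx, 0, None)  # guard against negative inputs
    return returns[idx]


def return_statistics(ticket_samples):
    """Mean and (population) variance of the return over sampled outcomes."""
    r = investment_return(np.asarray(ticket_samples, dtype=float))
    return r.mean(), r.var()


# Four equally likely box-office outcomes, again purely illustrative.
mean_return, var_return = return_statistics([50_000, 120_000, 300_000, 600_000])
print(f"expected return = {mean_return:.4f}, variance = {var_return:.4f}")
```

In practice the four sample outcomes would be replaced by draws from whatever distribution of ticket sales the historical data support.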

Existing research [1] shows that the gross earnings of the movies released in the US each year follow a power law distribution with Pareto exponent . To estimate the box office of a Dutch movie, I will need the statistics of the movies’ earnings in the Netherlands.

If I had limited the statistics to the Dutch-language movies released in a single year, I would have got hopelessly little data. To remedy the situation, I assumed that

- The distribution of movie earnings is the same every year.
- Movies have equal chances of success regardless of the language.

I do not have any evidence to support these assumptions, but they sound reasonable. Besides, I cannot get much further without them. Together, they allow me to use the complete movie-going statistics of several years, which provides a sufficient number of data points.

Ticket prices change over time, so instead of looking at gross earnings, I considered the number of tickets sold; this gives equal weight to each year’s results. Luckily, the ticket sales data for the top 1000 movies released from 1991 to 2012 are available from the Dutch Film Distributors’ Association (NVF); the PDF can be downloaded here.

As I was looking for a power law distribution, I plotted the logarithm of the film’s rank versus the logarithm of the number of tickets sold. The data (blue dots) are shown in Figure 1 below.

Figure 1

I expected the dots to lie close to a straight line; that is what a power law distribution looks like on a log-log plot. The best linear fit to the data is shown by the black line. This fit is rather poor; . The Pareto exponent implied by the linear fit is .
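For readers who want to repeat the linear fit, here is a minimal Python sketch. It assumes the rank is on the vertical axis, so that a power law with rank proportional to tickets^(-α) gives a straight line with slope -α. The NVF data are not reproduced here; the `tickets` array is synthetic Pareto-distributed data with an arbitrarily chosen exponent, and the function name is my own.

```python
import numpy as np


def loglog_linear_fit(tickets):
    """Least-squares fit of log10(rank) = intercept + slope * log10(tickets)."""
    tickets = np.sort(np.asarray(tickets, dtype=float))[::-1]  # best seller first
    rank = np.arange(1, tickets.size + 1)
    x, y = np.log10(tickets), np.log10(rank)
    slope, intercept = np.polyfit(x, y, 1)  # highest-degree coefficient first
    y_fit = intercept + slope * x
    r2 = 1.0 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)
    return slope, intercept, r2


# Synthetic stand-in for the real ticket counts (assumption, not NVF data):
# classical Pareto samples with exponent 2 and a minimum of 10,000 tickets.
rng = np.random.default_rng(0)
true_alpha = 2.0
tickets = 10_000 * (1.0 + rng.pareto(true_alpha, size=1000))

slope, intercept, r2 = loglog_linear_fit(tickets)
alpha_hat = -slope  # rank ~ tickets**(-alpha), so the slope equals -alpha
print(f"slope = {slope:.2f}, implied Pareto exponent = {alpha_hat:.2f}, R^2 = {r2:.3f}")
```

On genuinely Pareto-distributed data like this, the points fall close to a straight line, which is the behaviour I was expecting from the real data.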

Surprisingly, a quadratic fit (red line) is very good; . This contradicts the findings in [1], but the data points lie remarkably close to a parabola! I have no explanation for this; I would be curious to know if anyone got similar results from the analysis of the movie earnings distribution elsewhere.
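The comparison between the two fits can be sketched in Python as follows. This is an illustration of the procedure only, not the code behind Figure 1: the data are synthetic, drawn from a lognormal distribution chosen arbitrarily because it produces a curved rank plot (this is not a claim about the Dutch data), and the helper names are my own.

```python
import numpy as np


def r_squared(y, y_fit):
    """Coefficient of determination of a fit."""
    ss_res = np.sum((y - y_fit) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def compare_fits(tickets):
    """R^2 of linear and quadratic fits of log10(rank) against log10(tickets)."""
    tickets = np.sort(np.asarray(tickets, dtype=float))[::-1]  # best seller first
    rank = np.arange(1, tickets.size + 1)
    x, y = np.log10(tickets), np.log10(rank)
    results = {}
    for name, degree in (("linear", 1), ("quadratic", 2)):
        coeffs = np.polyfit(x, y, degree)
        results[name] = r_squared(y, np.polyval(coeffs, x))
    return results


# Synthetic data (assumption): 5846 lognormal "movies", of which only the
# 1000 best sellers are kept, mimicking the truncation of the real list.
rng = np.random.default_rng(1)
sample = rng.lognormal(mean=11.0, sigma=1.0, size=5846)
top_1000 = np.sort(sample)[-1000:]

fits = compare_fits(top_1000)
print(f"R^2 linear = {fits['linear']:.3f}, R^2 quadratic = {fits['quadratic']:.3f}")
```

Because the linear model is a special case of the quadratic one, the quadratic R² can never be lower; what matters is how much it improves.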

Finally, I have to reiterate that the fit was done for the top 1000 movies released from 1991 to 2012. According to the data published on boxofficenl.net, a total of 5846 movies were released in the Netherlands during these 22 years. The top thousand therefore represents about 17% of all released movies.

**References**

- [1] Sitabhra Sinha and Raj Kumar Pan (2005). Blockbusters, Bombs and Sleepers: The income distribution of movies. Available at http://arxiv.org/abs/physics/0504198.

**Disclaimer**

This is not an offer, solicitation or advice to invest. Past performance offers no guarantee for the future.