The habits of Netflix’s users
September 1, 2022
Originally published at notes on cinema
Like other streaming services, Netflix does not make its user data public. To date, there are two exceptions to this privacy. First, Netflix released a large dataset of anonymized user activity, with data between 1998 and 2005, when it offered a one million dollar prize for the best AI model that could predict user ratings. Second, Netflix has been publishing weekly data since July 2021. Each week, the data cover the top ten films and the top ten television shows, ranked by viewing hours. Although ten films and ten TV shows are a tiny slice of the content in the world of Netflix, weekly data on its most popular fare will prove useful for our analysis. For example, Netflix’s top ten data provide breakdowns by country.
We will do the best we can with the data that are available to us. This post will produce snapshots of Netflix’s user behavior. These snapshots are connected to a working hypothesis: Netflix desires that its content be consumed unequally, where some content gets lots of attention and the rest receives little. In response to this hypothesis, a skeptical reader might raise two points. First, Netflix has always worked with a subscription model of payment. Its users are not buying this or that film; they are instead paying for monthly access to a wide library of content. Thus, it seems unnecessary for each user to consume specific titles from the available library – especially when a library of content is digital. Second, Netflix has promoted itself as having the diversity of content, the technological infrastructure and the computer algorithms to deliver user satisfaction for any taste. Netflix is much like a well-stocked gym with a variety of equipment: no single person will use everything. Not a problem, goes the argument. Netflix has value in being a recommender of sorts; the platform helps each of us navigate its massive library of content.
But we should reconsider the strength of this skepticism. In contrast to the idea that Netflix is enabling users to consume content more equitably – the majority of its content is technically accessible to any user – the trends of inequality keep appearing in different snapshots of Netflix’s user behavior. Patterns of inequality are not new in the history of media consumption. For instance, Figure 1 shows a long tail in the distribution of US theatrical revenues in 2015. The distribution of revenues is ranked by size, from largest to smallest. A small handful of films received an overwhelming majority of all theatrical revenues in 2015. To the right of the steep slope lies the long tail of low revenues, which is where the majority of films reside.
Figure 2 uses top ten data to introduce the long tails that exist on Netflix’s platform. The panels on the left present the average distribution of weekly viewing hours for the top ten films and TV seasons. Both film and television on Netflix have, each week, unequal distributions of attention, whereby the viewing hours of content rapidly decline from the heights of the most popular. The panels on the right rank the total hours for each film or TV season. Here the long tails in the distributions are pronounced. A small amount of content is viewed many multiples more than the rest of the distribution.
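To make the idea of a long tail concrete, here is a minimal sketch of how one might rank a set of totals by size and measure how much of the whole the top few items hold. The viewing hours below are invented for illustration, and `top_share` is a hypothetical helper name, not something taken from Netflix’s data or from the figures.

```python
def top_share(values, k):
    """Share of the total held by the k largest values."""
    ranked = sorted(values, reverse=True)  # rank by size, largest first
    return sum(ranked[:k]) / sum(ranked)


# Made-up total viewing hours (in millions) for ten hypothetical titles.
hours = [5, 1, 50, 4, 20, 1, 10, 3, 5, 1]

for k in (1, 2, 3):
    print(f"top {k} share: {top_share(hours, k):.0%}")
```

In this toy example the two largest titles hold 70% of all hours, which is the kind of steep slope followed by a long tail that the ranked panels describe.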
Does Netflix desire this inequality of attention? Would a more equal distribution of content consumption be better for Netflix financially? A theoretical interpretation of risk and the long tails of cultural consumption will come later, in a post that provides a power theory of Netflix’s political economy. For now, we will compare the empirical data for two periods of user behavior on Netflix. Without yet theorizing why user behavior on Netflix changes, we can observe that certain characteristics of present-day consumption, which we will label the “Digital Period”, are different from those of Netflix’s “DVD Period”, for which we have data from 2000 to 2005. Two notable characteristics of the Digital Period are missing or weaker in the DVD Period:
- Popular Netflix content accumulates a lot of attention in the early weeks. This advantages content that can behave like a big Hollywood blockbuster and grab above-average attention from day one.
- Television seasons and films infrequently receive significant waves of attention in later periods of their lifespan. Older content can sometimes resurge in popularity, and there are instances of new content creating incentives to re-watch old content on Netflix; but the dominant trend is for peaks of increased user attention to appear only in early weeks.
The next big hit in popular culture seems able to capture our attention any way it can. Suddenly everyone’s talking about this or that hit on social media or personal conversation. And when you think nobody actually likes this hit, nothing changes. The big hit appears to have been destined to be big.
According to Elberse (2013), there is a release strategy behind the production of a big hit:
… most blockbuster bets in entertainment are released using what is known as a “wide” or “mainstream” release strategy. Wide releases … are not designed with efficiency in mind; instead, the goal is to “break through the clutter” and immediately capture the attention of as large an audience as possible.
Elberse, A. (2013, p. 64). Blockbusters: Hit-Making, Risk-Taking, and the Big Business of Entertainment.
This release strategy does not neatly apply to the entire history of Netflix. During the DVD Period, the big hits were DVD copies of films that were already released in theaters. And any title that was popular was, ultimately, owned by another company – Netflix had only licensed the distribution rights from a Hollywood studio. If there was a blockbuster release strategy during the DVD Period, it was passive, whereby Netflix acquired extra copies of the Hollywood titles that were blockbusters in theaters. The Digital Period, by contrast, has plenty of opportunities for Netflix to actively use the blockbuster release strategy. Not only are users binging shows as soon as they are released on the streaming platform, but Netflix has its own in-house content, which the company wants to market like popular films and television shows of the past.
Whether applied by Netflix or another media company, the blockbuster strategy is unequal in two ways. First, a blockbuster is meant to have a level of success that is much greater than the rest of the content out there. While some content, whether film, television or music, grows to have blockbuster-like success, the blockbuster strategy does not expect there to be an egalitarian world of culture. (Imagine the Variety headline: “Top Gun: Maverick and 500 other films make $400 million in 2022.”) Second, a blockbuster strategy is designed to function at a faster pace than other release strategies. The opening date and the first weeks are when the blockbuster makes a big splash. Other release strategies – labelled with terms like “limited”, “select” and “niche” – have slower, longer journeys to comparable success, if they can even reach blockbuster-like levels.
Figure 3 uses the first eight weeks of each release to investigate Netflix’s blockbuster strategies. In each panel we model the above-average attention of every release above the 90th percentile in the DVD Period and in the TV and Film subsets of the Digital Period. Without disaggregated marketing data, we cannot say that these sets of content became successful from extra marketing. Nevertheless, the blockbuster strategy is designed to grab as much consumer attention as one can in the early weeks. Being above average for part or all of the eight-week period is a good sign for a piece of content.
The figure reveals a difference between the DVD Period and the Digital Period, particularly its TV subset. All of the content represented in this figure is performing above the weekly average, but TV-Digital has top-tier content that is not declining linearly from week one to week eight. This top tier is having massive above-average success around the third, fourth and fifth weeks of release. And compared to the other subsets, TV-Digital has the highest peak in the span of eight weeks.
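One plausible way to measure above-average attention is to express each release’s weekly hours as a multiple of the average for that week of release. The sketch below does only that; it assumes the average is taken across all titles in the same week, which may differ from the calculation behind the figure, and it leaves out the 90th-percentile filter. The titles and numbers are invented.

```python
from statistics import mean

# Hypothetical weekly viewing hours (in millions) for the first four weeks
# of three made-up releases; none of these numbers come from Netflix.
weekly_hours = {
    "Title A": [40, 30, 20, 10],
    "Title B": [10, 10, 10, 10],
    "Title C": [10, 20, 30, 10],
}


def multiples_of_weekly_average(hours_by_title):
    """Express each title's weekly hours as a multiple of that week's average."""
    n_weeks = min(len(h) for h in hours_by_title.values())
    # Average across titles for each week since release (an assumption).
    weekly_avg = [
        mean(h[w] for h in hours_by_title.values()) for w in range(n_weeks)
    ]
    return {
        title: [h[w] / weekly_avg[w] for w in range(n_weeks)]
        for title, h in hours_by_title.items()
    }


multiples = multiples_of_weekly_average(weekly_hours)
for title, values in multiples.items():
    print(title, [round(v, 2) for v in values])
```

A value of 2.0 means a title drew twice the average attention that week; a value below 1.0 means it fell under the average.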
To what does this above-average attention amount? A successful blockbuster is above average from day one, which should let it accumulate lots of views, clicks or dollars for a short-but-intense period. In Netflix’s case, TV in the Digital Period is popular for around eight weeks, but then the intensity of this popularity slows down. Figure 4 shows the cumulative performances of content above the 90th percentile. The windows of the plots are twelve weeks instead of eight; the annotated lines show the maximum values at four, eight and twelve weeks. A cumulative plot highlights the relative smallness of film in Netflix’s Digital Period. While the Film-Digital Period has a similar shape to the DVD Period in Figure 3, the former is not as successful in producing large multiples above the weekly average, which leads to a much smaller accumulation over twelve weeks. By contrast, the content of the DVD Period was able to maintain intense popularity for twelve weeks.
Figure 4 also demonstrates how the first few weeks are crucial to the TV-Digital Period. On the assumption that above-average measures are comparable, the DVD Period eventually beats the TV-Digital Period in cumulative views, but the TV-Digital Period accumulates greater views at the four- and eight-week marks.
Fast accumulation, which is central to the blockbuster strategy, would not be as important if the inequality of consumption evened out later on. Why push for a big opening if all roads lead to the same level of consumer attention? Yet the ability of neglected content to have big peaks of attention late in life might be one of our more popular imaginations about streaming content. Unfortunately, the data do not support this type of imagination. Instead, the data help us understand the high significance of the blockbuster strategy in Netflix’s digital era.
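The logic of a cumulative comparison can be sketched with a running sum that is read off at the four-, eight- and twelve-week marks. The two series below are invented to show how a fast starter can lead early and still be overtaken by week twelve; they are not the values behind Figure 4, and `cumulative_at` is a hypothetical helper.

```python
from itertools import accumulate

# Invented weekly multiples of the average for two made-up releases.
fast_start = [6, 5, 4, 3, 2, 2, 1, 1, 1, 1, 1, 1]  # big opening, quick decline
steady = [2.5] * 12                                 # smaller but constant


def cumulative_at(series, weeks=(4, 8, 12)):
    """Running total of a weekly series, read at the given week marks."""
    running = list(accumulate(series))
    return {w: running[w - 1] for w in weeks}


print("fast start:", cumulative_at(fast_start))
print("steady:    ", cumulative_at(steady))
```

In this toy case the fast starter is ahead at weeks four and eight (18 versus 10, then 24 versus 20) but behind at week twelve (28 versus 30).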
We will use a Python library, scipy.signal, to analyze the frequency of peaks in Netflix data. Much like the annotations of peaks and valleys in stock market data, this library calculates where local maxima and minima reside in a time series. Figure 5 uses Clint Eastwood’s Unforgiven as an example. This film was in the Top 10 of the DVD Period for a total of 29 weeks. There were five peaks: weeks 2, 5, 10, 21, and 27. To estimate the significance of each peak’s timing, the figure shows the average median age of content, i.e., the point at which content is, on average, about to begin its second half of life in the Top 10.
In the case of Unforgiven, three of its five peaks (60%) occur later than the average median age of Top 10 content in the DVD Period. How does this compare to other content in the DVD Period, as well as the Digital one? Table 1 provides summary data of every single piece of content in the datasets. The fourth and fifth columns present the frequencies and percentages of peaks that occur after the average median age. The numbers in the Digital Period are not encouraging; not only are they smaller than in the DVD Period, but they disrupt the idea that streaming technology is, by default, a leveler of inequality in distribution. Rare is the content that finds new peaks of attention late in its life in the top ten.
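A minimal sketch of this peak count might look as follows, using `find_peaks` from scipy.signal with its default settings (the post does not say which function or parameters were used, so this is an assumption). The weekly values are invented and merely shaped so that the peaks land on the weeks reported for Unforgiven. The lifespans used to compute the cutoff are placeholders, and taking the midpoint of each title's run and averaging those midpoints is only one reading of "average median age". A plain comparison stands in if SciPy is not installed.

```python
from statistics import mean, median

try:
    from scipy.signal import find_peaks  # the module named in the post
except ImportError:  # fall back to a simple strict-maximum test
    find_peaks = None


def peak_weeks(series):
    """Return the 1-indexed weeks that are local maxima of the series."""
    if find_peaks is not None:
        idx, _ = find_peaks(series)
        return [int(i) + 1 for i in idx]
    return [i + 1 for i in range(1, len(series) - 1)
            if series[i - 1] < series[i] > series[i + 1]]


# Invented weekly values for a 29-week run; not real Netflix data.
views = [5, 9, 7, 8, 10, 9, 8, 7, 8, 9, 8, 7, 6, 5, 4,
         3, 2, 3, 4, 5, 6, 5, 4, 3, 2, 3, 4, 3, 2]

# Placeholder lifespans (weeks in the Top 10) for four hypothetical titles.
lifespans = [29, 12, 7, 16]
avg_median_age = mean(median(range(1, n + 1)) for n in lifespans)

peaks = peak_weeks(views)
late = [w for w in peaks if w > avg_median_age]
print("peaks:", peaks)
print("late peaks:", late, f"({len(late) / len(peaks):.0%})")
```

With these placeholder inputs the cutoff is 8.5 weeks, so the peaks at weeks 10, 21 and 27 count as late, matching the three-of-five share described above.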
Elberse, A. (2013). Blockbusters: Hit-Making, Risk-Taking, and the Big Business of Entertainment. New York: Henry Holt and Company.