What we can learn from measuring the internet
Internet measurement is a field of research that unlocks the mysteries of where all the traffic goes on the internet, the most complex man-made system in the world.
Measurements from the Global Internet Phenomena Report, released in September 2019 by network equipment company Sandvine, tell us downstream traffic on the internet today is dominated by video-streaming, with 60 percent of usage coming from sites such as Netflix, YouTube, and Amazon Prime. US technology company Cisco predicts video traffic will grow to 80 percent by 2022.
Internet measurement studies have focused on user-generated content (UGC) video sharing sites such as YouTube and discovered fascinating insights. These UGC sites are referred to as Web 2.0 sites because they allow users to contribute content and allow interaction through comments and ratings.
And just as we are seeing a rise of mainstream UGC video-sharing sites, the trend is also being embraced by online pornography sites. YouTube-style porn video streaming sites are massively popular, among them PornHub, one of the best-known and most prolific.
We collected metadata on more than 2.9 million videos from these sites and discovered that, over a period of 10 years, these videos had been viewed more than 354 billion times.
This astonishing number was uncovered in research for a University of Auckland longitudinal comparative analysis of some of the biggest porn-streaming sites in the world. This has been a largely ignored area in the field of internet measurement and, to the best of our knowledge, ours is the first longitudinal comparative analysis of online porn streaming services.
We presented our peer-reviewed research on these sites at the Network Traffic Measurement and Analysis (TMA) Conference 2019 in Paris to shed light on the video characteristics and user engagement on the sites.
Our study found the average video on porn-streaming sites is about three to five minutes in length, which is shorter than the typical YouTube video. We also noticed significant growth in the number of videos being uploaded to the sites, with 50 percent of the videos uploaded in just the last four years.
Another significant finding was that the top 20 percent of most-viewed videos were responsible for 75 to 90 percent of the total views, affirming the ‘Pareto principle’ which researchers have found to be a feature of videos on all UGC video-sharing sites, mainstream and pornographic. The Pareto principle is often mentioned in the context of income distribution where most of the wealth is owned by a small proportion of the population. Similarly, researchers have found that most of the videos on these UGC video-sharing sites are uploaded by a few uploaders.
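To make the Pareto figure concrete, the share of views captured by the most-viewed 20 percent of videos can be computed directly from a list of per-video view counts. The Python sketch below is illustrative only: the function name `top_share` and the sample view counts are hypothetical and are not taken from the study's data.

```python
def top_share(view_counts, fraction=0.2):
    """Return the share of total views held by the most-viewed `fraction` of videos."""
    if not view_counts:
        raise ValueError("view_counts must not be empty")
    # Rank videos from most to least viewed.
    ranked = sorted(view_counts, reverse=True)
    # Always include at least one video in the top group.
    top_n = max(1, int(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


# Hypothetical view counts for ten videos (made-up numbers, not study data).
sample_views = [5000, 3000, 500, 400, 300, 250, 200, 150, 120, 80]
print(f"Top 20% of videos hold {top_share(sample_views):.0%} of views")
```

In this made-up sample, two of the ten videos account for 8,000 of the 10,000 total views, which is the kind of skew the Pareto principle describes.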
Today’s domination of the internet by video-streaming is in sharp contrast to a decade ago, when the largest portion of internet traffic was peer-to-peer file sharing. Sandvine, in the Global Internet Phenomena Report, tells us file sharing using the peer-to-peer BitTorrent protocol is still popular, consuming more than 30 percent of total upstream internet traffic. One reason for this is that users turn to piracy because of delays and restrictions in getting content in certain regions of the world.
The report goes on to tell us that Facebook and Instagram take up 86 percent of downstream online social networking traffic, indicating their dominance in the social media scene. It also provides further evidence of how the internet is being used for diverse purposes.
But why do we need to measure the use of the internet? Why do we need this information on where internet traffic goes?
Well, first of all, let’s understand how the internet works. The internet has transformed how information is exchanged. It is a communication network interconnecting billions of devices. Previously, these devices were primarily digital computers. Increasingly, however, non-traditional “things” such as TVs and fridges are also connecting to the internet, leading to the phenomenon of the Internet of Things.
People use the internet to access services or applications. Video streaming, online social networking, and online gaming are some of the most frequently-accessed services on the internet.
This dynamic system is constantly evolving and growing and, like any large system, it needs to be measured if it is to be managed properly. Measurements can be driven by social, economic, regulatory and technical reasons. Traffic analysis and characterisation is one tool often used in internet measurement activities. Traffic analysis measures the proportion of traffic attributable to each application, which allows us to understand which applications are popular on the internet. This information is useful for several purposes, such as helping network operators to manage and provision their networks.
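As a rough illustration of what measuring the proportion of applications involves, the Python sketch below adds up bytes per application and converts the totals into shares. It assumes the traffic has already been labelled by application, which in practice is the hard part of traffic classification. The function name `application_shares`, the record layout and the byte counts are all hypothetical.

```python
from collections import Counter


def application_shares(flows):
    """Map each application name to its fraction of the total bytes observed.

    `flows` is an iterable of (application_name, byte_count) pairs.
    """
    bytes_per_app = Counter()
    for app, num_bytes in flows:
        bytes_per_app[app] += num_bytes
    total = sum(bytes_per_app.values())
    if total == 0:
        return {}
    return {app: count / total for app, count in bytes_per_app.items()}


# Hypothetical, already-labelled flow records (made-up numbers).
sample_flows = [
    ("video", 600),
    ("social", 150),
    ("gaming", 50),
    ("video", 100),
    ("other", 100),
]

shares = application_shares(sample_flows)
# Print applications from largest to smallest share of traffic.
for app, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{app}: {share:.0%}")
```

A network operator could use a breakdown like this to see which kinds of application consume the most capacity before deciding where to provision more.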
The science of internet measurement has helped demystify many previously unexplored topics, such as the structure of online social networks, the infrastructure deployment of Netflix’s content distribution network, and internet topology. It is an opportunity to analyse large amounts of data from real applications and systems. With society ever more dependent on the internet, the need for internet measurements will remain. Internet measurements have enabled the development of new applications, new communication protocols, and better communication architectures.