Data Skunkworks
Ranking the most viewed people on Wikipedia in 2020 (so far)
Data Engineering Thomas Colin 7/10/20

In the previous posts, we loaded all 1.2 TB of Wikipedia pageviews data and ranked the most popular people by day. The final step is to bring all the data together into a visualisation.

Using Python to scrape Wikipedia for images of the most viewed people in 2020
Data Engineering Thomas Colin 7/10/20

Can we find the most viewed people on Wikipedia each day in 2020 and get a picture of each one?

How to load and analyse 48 billion Wikipedia page views with Google BigQuery
Data Engineering Thomas Colin 7/10/20

This year, the English-language Wikipedia has averaged around 8 billion page views per month, making it one of the most visited websites in the world. The first half of 2020 has been incredibly eventful, and I was interested in building a dataset to see exactly which pages Wikipedia users have been viewing most.


Experiments in data science and the visual display of quantitative information

Contact: thomas@dataskunkworks.com