All of .nyc
An exploration of websites registered under the .nyc domain name program, offered by New York City since 2014. The project includes a web app for exploring the data and an ETL pipeline that pulls the open data, enriches it with queries to the registered domain names, and loads it into an API-accessible database.
Built with:
- ETL pipeline and data analysis
  - NYC Open Data - Official source for .nyc data, and 3000+ other data sets.
  - Jupyter Notebook - "The Jupyter Notebook is the original web application for creating and sharing computational documents. It offers a simple, streamlined, document-centric experience."
  - Python - Programming language used for enriching and processing the data.
  - Pandas - Python data analysis tooling
  - Beautiful Soup - Website scraping tool to extract relevant metadata from websites
- Web app for exploring the data
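As a rough illustration of the "enrich" step in the pipeline, the sketch below pulls a page title and meta description out of an HTML document. It is not the project's actual code: it uses Python's built-in `html.parser` in place of Beautiful Soup so that it runs without extra dependencies, and the names `MetadataParser` and `extract_metadata` are hypothetical.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect the <title> text and the meta description from an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr_map = dict(attrs)
            # Only <meta name="description" content="..."> is of interest here.
            if (attr_map.get("name") or "").lower() == "description":
                self.description = attr_map.get("content")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text can arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_metadata(html):
    """Return the page title and meta description (None when absent)."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip() or None,
        "description": parser.description,
    }
```

In a real pipeline the HTML would come from fetching each registered domain; that network step is left out here so the example stays self-contained.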
What's up with .nyc
Every website's domain name ends in a top-level domain (TLD), which often hints at the site's purpose or origin. Examples include .com, .org, .sucks, .io, and many others.
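To make the idea concrete, here is a minimal sketch that reads the TLD off a URL using only the Python standard library. The function name `get_tld` and the example domains are made up for illustration, and the sketch does not special-case IP addresses.

```python
from urllib.parse import urlsplit


def get_tld(url):
    """Return the last dot-separated label of the URL's hostname, e.g. 'nyc'."""
    # urlsplit only fills in the hostname when a scheme or '//' is present,
    # so bare domains like 'example.nyc' get a '//' prefix first.
    if "//" not in url:
        url = "//" + url
    # Drop a trailing dot (fully qualified form) before looking for labels.
    hostname = (urlsplit(url).hostname or "").rstrip(".")
    if "." not in hostname:
        return None  # no TLD to report, e.g. 'localhost'
    return hostname.rsplit(".", 1)[-1]
```

For example, `get_tld("https://www.example.nyc/about")` yields `"nyc"`, while a single-label host such as `localhost` yields `None`.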
The .nyc TLD is a bit of an outlier. Only a handful of TLDs indicate a specific geographic area that's not an entire country. Said another way, most cities and regions don't have their own TLD. While a geographically focused TLD like .nyc is easy for internet users to navigate, it's certainly not common practice for websites to use one.
That said, I was curious exactly how .nyc has been used since its introduction as a generally available TLD in 2014. Thankfully, NYC Open Data maintains an open data source for .nyc Domain Registrations that can be used to understand the history and use of .nyc domains.
This project is exploratory, without a particular hypothesis in mind. I'm aiming to understand and visualize .nyc TLD usage over time, and connect it to real-world events and trends when possible.
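One simple starting point for looking at usage over time is counting registrations per year. The sketch below assumes each record carries an ISO-formatted registration date; the field names and the sample records are invented for illustration and are not taken from the actual dataset.

```python
from collections import Counter
from datetime import date

# Made-up sample records; the real dataset's column names may differ.
sample_records = [
    {"domain": "example-one.nyc", "registered": "2014-10-08"},
    {"domain": "example-two.nyc", "registered": "2014-11-20"},
    {"domain": "example-three.nyc", "registered": "2016-03-02"},
]


def registrations_per_year(records):
    """Count records by the year of their registration date, sorted by year."""
    years = Counter(date.fromisoformat(r["registered"]).year for r in records)
    return dict(sorted(years.items()))
```

The resulting year-to-count mapping can be handed to any plotting tool, or lined up against a timeline of real-world events.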