Most Read Tech Articles In 2023
Most read tech articles across engineering blogs in 2023 featuring Meta, Airbnb, Netflix, Slack and more!
Happy Thursday!
As it's the last Thursday of 2023, I compiled a list of the most read articles across engineering blogs in 2023.
Putting this list together was not an easy task, as there's no simple way to get the number of "reads" of an article. Articles tend to be published across different blogs powered by various engines.
To tackle this problem, I took into account the engagement across Hacker News, Reddit, and X, as well as the number of opened links in previous Big Tech Digest newsletter issues. With some help from Python and Jupyter, I'm excited to share the final list!
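For the curious, here is a minimal sketch of how engagement signals like these could be folded into a single ranking. It is not the actual notebook behind this list: the `Engagement` fields, the equal weighting, and the sample numbers are all assumptions made for illustration. Each signal is scaled by its maximum so that one large platform does not dominate, and the scaled values are summed.

```python
from dataclasses import dataclass


@dataclass
class Engagement:
    """Hypothetical per-article engagement counts (illustrative only)."""
    title: str
    hackernews: int
    reddit: int
    x: int
    newsletter_clicks: int


SIGNALS = ("hackernews", "reddit", "x", "newsletter_clicks")


def rank_articles(articles):
    """Return articles sorted by the sum of max-normalized signals, best first."""
    # Largest value seen for each signal; fall back to 1 to avoid dividing by zero.
    maxima = {
        s: max((getattr(a, s) for a in articles), default=0) or 1
        for s in SIGNALS
    }

    def score(article):
        return sum(getattr(article, s) / maxima[s] for s in SIGNALS)

    return sorted(articles, key=score, reverse=True)


if __name__ == "__main__":
    # Made-up numbers, not real engagement data.
    sample = [
        Engagement("Article A", 100, 10, 5, 50),
        Engagement("Article B", 50, 20, 10, 100),
        Engagement("Article C", 10, 1, 1, 1),
    ]
    for position, article in enumerate(rank_articles(sample), start=1):
        print(position, article.title)
```

Equal weights are the simplest choice; weighting newsletter clicks more heavily, for example, would only require multiplying each term in `score` by a per-signal factor.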
Before we dive in, I would greatly appreciate it if you tweeted about or mentioned this newsletter to your connections and followed me on X. This helps me grow the newsletter while keeping it free.
// Most read articles in 2023
1. 🥇 "How Meta built the infrastructure for Threads"
by Laine Campbell, Chunqiang (CQ) Tang ⸱ Meta ⸱ 9 min read ⸱ 19 Dec 2023
Discusses the successful launch of Meta's Threads and the infrastructure behind it
Describes the use of ZippyDB, a distributed key/value database, and how it was optimized for the Threads launch
Explores the role of Async, a serverless function platform, in scaling workload execution for Threads
2. 🥈 "Slack's Migration to a Cellular Architecture"
by Cooper Bethea ⸱ Slack ⸱ 9 min read ⸱ 22 Aug 2023
Tells the story of Slack's migration from a monolithic to a cell-based architecture
Introduces the concept of gray failure in distributed systems
Explains how Availability Zones can be drained
Covers the implementation of siloing and traffic-shifting in cellular architecture
3. 🥉 "Migrating Netflix to GraphQL Safely"
by Jennifer Shin, Tejas Shikhare, Will Emmanuel ⸱ Netflix ⸱ 8 min read ⸱ 14 Jun 2023
Describes the migration of Netflix's iOS and Android apps to GraphQL with zero downtime
Explores the use of three key testing strategies: AB Testing, Replay Testing, and Sticky Canaries, to ensure a safe and smooth migration
Covers the phased approach to migration, including the creation of a GraphQL Shim Service and the subsequent transition to GraphQL services owned by domain teams
Discusses the challenges and wins of each testing strategy
Shares insights into the tools developed, such as the Replay Testing framework and Sticky Canaries, to validate functional correctness, performance, and business metrics during the migration
4. "What is an inverted index, and why should you care?"
by Charlie Custer ⸱ Cockroach Labs ⸱ 7 min read ⸱ 17 Aug 2023
Describes how inverted indexes work and their impact on database performance
Explores the downsides of using inverted indexes, specifically their cost to write performance
Covers how to use inverted indexes, including when and how to create them
Shares examples and best practices for using inverted indexes in relational databases
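As a rough illustration of the idea behind that article, here is a toy inverted index over plain text. It is a generic sketch, not how CockroachDB or any other database implements the feature, and the `InvertedIndex` class and its methods are made up for this example. It maps each token to the set of documents containing it, which makes lookups cheap but means a single write has to update one entry per distinct token.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: token -> set of document ids (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def add(self, doc_id, text):
        """Index a document and return how many index entries were touched."""
        self.docs[doc_id] = text
        tokens = set(text.lower().split())
        for token in tokens:
            self.postings[token].add(doc_id)
        # One write to the table fans out into len(tokens) index updates.
        return len(tokens)

    def search(self, *terms):
        """Return ids of documents that contain every given term."""
        sets = [self.postings.get(term.lower(), set()) for term in terms]
        return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    index = InvertedIndex()
    index.add(1, "fast database index")
    index.add(2, "inverted index for search")
    print(index.search("index"))
    print(index.search("inverted", "index"))
```

The return value of `add` makes the write-side cost visible: the more distinct tokens a document has, the more index entries each insert must maintain.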
5. "Scaling the Instagram Explore recommendations system"
by Vladislav Vorotilov, Ilnur Shugaepov ⸱ Meta ⸱ 11 min read ⸱ 9 Aug 2023
Discusses the use of Machine Learning in the Explore recommendation system on Instagram
Describes the use of Two Towers neural networks to make the recommendation system more scalable and flexible
Explores the use of task-specific DSL and a multi-stage approach to ranking in the system
Covers the use of caching and pre-computation with Two Towers neural network to build a more flexible and scalable ranking system
Introduces techniques such as Two Tower NN and user interaction history in the retrieval stage, and the use of Bayesian optimization and offline tuning for parameter tuning
6. "Understanding Real-Time Application Monitoring"
by Ritesh Kapoor ⸱ Expedia Group ⸱ 7 min read ⸱ 13 Jun 2023
Covers the performance indicators and SLI/SLO/SLA concepts for application monitoring
Shares different categories of metrics, including application VM, API, database response, infrastructure, and more
Explores the importance of monitoring distributed tracing for troubleshooting requests with high latency or errors
Gives an overview of the challenges of improving operational performance and the benefits of monitoring applications with the right metrics and tools
7. "Improving Performance with HTTP Streaming"
by Victor ⸱ Airbnb ⸱ 7 min read ⸱ 17 May 2023
Describes how HTTP Streaming can improve page performance and how Airbnb enabled it on an existing codebase
8. "How does B-tree make your queries fast?"
by Mateusz Kuźmik ⸱ Allegro ⸱ 12 min read ⸱ 27 Nov 2023
Introduces B-Tree as a data structure and clarifies B-Trees vs. BSTs
Explains B-Tree organization and search queries
Explores the practical implications of using B-trees on hardware, including CPU caches, RAM, and disk storage
Explains how packing multiple values into a single node reduces random access and enhances query performance
Addresses balancing in a B-Tree
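To make the "many values per node" point concrete, here is a small sketch of B-tree search. It is a simplified illustration rather than code from the article: the `Node` class and `search` function are assumptions for this example, and insertion and rebalancing are left out. Each node holds a sorted list of keys, so a single node visit (one random access) narrows the search across many keys at once.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Node:
    """Simplified B-tree node: sorted keys plus child pointers (empty for leaves)."""
    keys: list
    children: list = field(default_factory=list)


def search(node, key, visited=0):
    """Return (found, nodes_visited) for a lookup starting at `node`."""
    visited += 1
    # Binary search inside the node: cheap, since the node's keys sit together.
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return True, visited
    if not node.children:
        return False, visited
    # Child i covers the keys between keys[i - 1] and keys[i].
    return search(node.children[i], key, visited)


if __name__ == "__main__":
    root = Node(
        keys=[10, 20],
        children=[Node([1, 5]), Node([12, 15]), Node([25, 30])],
    )
    print(search(root, 15))
    print(search(root, 7))
```

With wider nodes the tree gets shallower, so the number of node visits per lookup (the second value returned) stays small even for large key sets.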
9. "Meta developer tools: Working at scale"
by Neil Mitchell ⸱ Meta ⸱ 4 min read ⸱ 27 Jun 2023
Describes Sapling, an open-source version control system designed for extreme scale
Covers Buck2, a build system supporting remote caching and execution for large-scale development
Explores testing and static analysis tools used at Meta, including Infer, RacerD, and Jest
Presents Sapienz, a tool for automatically testing mobile apps
10. "How Gradle Reduced Build Scan Storage Costs on AWS by 75%"
by Oliver White ⸱ Gradle ⸱ 4 min read ⸱ 23 Jun 2023
Describes the challenge faced with inefficient cloud storage using Amazon RDS
Presents the decision to migrate to Amazon S3 as the solution
Shares the immediate 75% reduction in cloud expenses as a result of the migration
Explains the added benefit of enabling automatic deletion for unactivated scans after the migration
11. "Real-time Messaging"
by Sameera Thangudu ⸱ Slack ⸱ 7 min read ⸱ 11 Apr 2023
Describes the architecture used to send real-time messages at scale
Discusses the setup of the Slack client, including the use of Webapp, Envoy, and Gateway Servers (GS) to establish a websocket connection
Explains the process of broadcasting a message to all online clients following the journey of the message through the stack
Covers the different types of events, including regular traffic spikes for reminders, scheduled messages, and calendar events
12. "How Discord Stores Trillions of Messages"
by Bo Ingram ⸱ Discord ⸱ 3 min read ⸱ 6 Mar 2023
Describes problems with a Cassandra database storing billions of messages
Covers the impact of hot partitions on latency and end-user experience
Shares the challenges of cluster maintenance tasks and compactions
Discusses the frequent tuning of JVM's garbage collector and heap settings to address latency spikes
Thanks for reading!
I would greatly appreciate it if you tweeted about or mentioned this newsletter to your connections and followed me on X. This will help me grow the newsletter while keeping it free.
Have a great time during the rest of the festive season!