Big Tech Digest #9
2023-12-21: How Meta built Threads infrastructure, Building documentation with OpenAI at Okta, Replacing REST and WebSockets with gRPC and more!
Happy Thursday 👋
Exciting news: From now on, each linked article additionally contains an estimated reading time, the publish date, and a concise summary.
Let me know what you think and don’t forget to spread the word!
// 📫 Engineering blogs digest
Most notable articles posted in the tech companies’ engineering blogs over the past two weeks.
"How Meta built the infrastructure for Threads" by Laine Campbell, Chunqiang (CQ) Tang ⸱ Meta ⸱ 9 min read ⸱ 19 Dec
Discusses the successful launch of Meta's Threads and the infrastructure behind it
Describes the use of ZippyDB, a distributed key/value database, and how it was optimized for the Threads launch
Explores the role of Async, a serverless function platform, in scaling workload execution for Threads
"Hashnode's Feed Architecture" by Florian Fuchs ⸱ Hashnode ⸱ 6 min read ⸱ 7 Dec
Describes the process of pre-calculating feeds for thousands of users using AWS Step Functions
Walks through the calculation for users with and without cached metadata, showing how it stays efficient without involving the database
"Can gRPC replace REST and WebSockets for Web Application Communication?" by Ian Douglas ⸱ gRPC ⸱ 11 min read ⸱ 4 Dec
Explores the transition from REST to gRPC-Web
Presents practical code samples for replacing WebSocket connections with gRPC-Web, showcasing the advantages of gRPC-Web in server-to-client streaming
"How I Have Fun With Rust" by Matheus Richard ⸱ thoughtbot ⸱ 5 min read ⸱ 21 Dec
Explores using Rust as a secondary language for side projects
Discusses the idea of writing fewer tests thanks to Rust's strong type system
Recommends avoiding references and lifetimes as much as possible
"How I Built an Okta Documentation Chatbot in Python" by Tanish Kumar ⸱ Okta ⸱ 9 min read ⸱ 20 Dec
Describes the creation of a Python chatbot called Oktanaut using OpenAI
Covers the meticulous training process and self-learning mechanism used to enhance the bot's performance
Shares the code and a step-by-step walkthrough for running the chatbot using Google Colab and OpenAI's GPT-3.5 model
"Understanding Generic and Variance in Kotlin" by Sagar Avhad ⸱ Walmart ⸱ 7 min read ⸱ 14 Dec
Describes how generics can help avoid repetitive type checks and casting in Kotlin
Gives an overview of using generic constraints to prevent misuse of generic types
Explains the concepts of covariance and contravariance in Kotlin generics
Presents the use of out and in keywords to enable covariance and contravariance at the use site
"Using DNS to estimate the worldwide state of IPv6 adoption" by Carlos Rodrigues ⸱ Cloudflare ⸱ 7 min read ⸱ 14 Dec
"Code review best practices: How to complete a pull request" ⸱ Capital One ⸱ 5 min read ⸱ 20 Dec
Explains the importance of understanding the high-level purpose of the PR
Covers best practices for reviewing the code, including reusing existing code and ensuring readability
"Personalizing the DoorDash Retail Store Page Experience" by Luming Chen, Yuan Meng, Anthony Zhou ⸱ DoorDash ⸱ 12 min read ⸱ 12 Dec
Gives an overview of the overall framework to generate personalized recommendations for retail store homepages
Dives deep into the ML models, including collection retrieval and item ranking
Describes how the team addresses position bias and diversifies recommendations
Shares future personalization goals, including incorporating restaurant order histories and moving toward MTML architectures
"Why Logic Programming Is the Best Choice for Authorization" by Nicholaos Mouzourakis ⸱ Gusto ⸱ 12 min read ⸱ 11 Dec
Describes the problems with traditional authorization code and the limitations of data-driven approaches like Role-Based Access Control
Presents Prolog as a solution for authorization queries with improved code organization and safety
Introduces Open Policy Agent (OPA) and its Rego language as a modern solution for JSON authorization queries
"Extracting skills from content to fuel the LinkedIn Skills Graph" by Ji Yan ⸱ LinkedIn ⸱ 5 min read ⸱ 13 Dec
Describes the use of AI to extract skills from various content sources across LinkedIn
Presents the multi-step process and machine learning models used to extract and map skills onto the LinkedIn Skills Graph
"Offline LLM Evaluation: Step-by-Step GenAI Application Assessment on Databricks" by multiple authors ⸱ Databricks ⸱ 8 min read ⸱ 14 Dec
Covers setting up external models in Databricks for managing and accessing large language models
Explores using Databricks AI Playground for testing prompts and parameters with LLMs
Shows how to create a GenAI POC with LangChain and log it with MLflow
"Pytest daemon: 10x local test iteration speed" by Ruby Feinstein ⸱ Discord ⸱ 5 min read ⸱ 8 Dec
"Top 3 security best practices for handling JWTs" by Liran Tal ⸱ Snyk ⸱ 11 min read ⸱ 18 Dec
"The Top Pinterest Engineering Blog posts from 2023" ⸱ Pinterest ⸱ 1 min read ⸱ 21 Dec
Thanks for reading Big Tech Digest. Don’t forget to spread the word and get in touch!
Enjoy the festive season and see you next year!
Delivered bi-weekly to your inbox, Big Tech Digest brings you a collection of links to the latest engineering blog posts from 100+ Big Tech companies and startups like Airbnb, Uber, Netflix, and Meta. Aimed at Software Engineers and AI/ML folks at any level, Big Tech Digest focuses on the engineering problems, and their proposed solutions, that I find particularly interesting. No marketing or non-tech stuff.