Big Tech Digest #17
Featuring articles from Discord, Expedia, Pinterest, Booking, and many more!
Happy Thursday!
Welcome to another Big Tech Digest issue. This time around, we have plenty of fresh articles from Discord, Expedia, Pinterest, Booking, and many more published since the last Big Tech Digest issue 2 weeks ago!
There’s just one thing you could do to help me grow Big Tech Digest: go ahead and mention it to your friends and/or teammates. Thank you!
Tech Talks Weekly
I’d like to share an excellent newsletter called Tech Talks Weekly, which delivers all the recently uploaded tech conference videos from over 100 engineering conferences (see the full list) like Devoxx, GOTO, NDC, QCon, LeadDev, and many more (see a recent issue). I highly encourage you to subscribe if you’re into watching tech talks.
Without further ado, let’s get started!
// Must reads
1. "Navigating Web Evolution: An In-Depth Interview with Addy Osmani on Web Development and Developer Growth"
Wix ⸱ 12 min read ⸱ 16 Jul
Addy discusses balancing quality and agility when testing in web development and how to choose which technologies to focus on in an ever-changing industry. He also shares his thoughts on the evolution of browser design and potential areas for improvement, and presents potential game changers in browser performance, including WebAssembly and AI-driven optimizations.
Highly recommended read!
2. "The Perils of Deprecating a Legacy Microservice"
by Shashank Jha ⸱ Expedia ⸱ 4 min read ⸱ 10 Jul
Explores the journey of modernizing the Flight Details stack at Expedia Group
Describes the motivation for the project and the proposal to integrate Cassandra
Covers the challenges faced, including issues with the serialization library and with upgrading infrastructure
Shares the decision to introduce Redis as an alternative and the benefits it brought, including a streamlined PubSub implementation and cost savings
// Optional reads
a.k.a. The Best of the Rest!
"Build your own GPT (BYO-GPT)"
by Rajat Gupta ⸱ Walmart ⸱ 7 min read ⸱ 08 Jul
How would you go about building a customized GPT for personal data without exposing sensitive information to external APIs?
"How Discord Uses Open-Source Tools for Scalable Data Orchestration & Transformation"
Discord ⸱ 2 min read ⸱ 12 Jul
The article presents the decision to use a combination of Dagster and dbt, highlighting Dagster's support for Kubernetes, declarative automation, and user-friendly UI.
"Unlocking knowledge sharing for videos with RAG"
by Alon Faktor ⸱ Vimeo ⸱ 13 min read ⸱ 15 Jul
Describes Vimeo's new video Q&A system using generative AI to chat about video content
Explores the use of RAG (retrieval-augmented generation) for answering questions about specific textual databases
Discusses the bottom-up approach to processing the video transcript for Q&A
Presents the technique of speaker detection without using facial recognition
Covers the process of finding accurate reference points in the video using AI-generated art and automatically generating new questions for viewers to ask
"Optimizing the picking process to enable faster deliveries for Instamart"
by Sonakshi Gupta ⸱ Swiggy ⸱ 7 min read ⸱ 10 Jul
Describes the issue of time delays during peak hours for Instamart's picking process at dark stores
Introduces the proposed solution of batch picking orders to improve efficiency
Goes through the mathematical model and algorithm to implement batch picking
Shares results of a simulation validating the effectiveness of batch picking
Gives an overview of how batch picking reduces picker assignment time and travel time, improving overall efficiency
"AWS Simple Email Service Security"
by Tom Spencer ⸱ Capgemini ⸱ 1 min read ⸱ 12 Jul
"Leverage graph technology for real-time Fraud Detection and Prevention"
by Deepak Patankar ⸱ Booking ⸱ 6 min read ⸱ 10 Jul
Describes the challenges of fraud detection and prevention in the context of Booking.com
Introduces the concept of representing requests in a graph for real-time fraud detection
Shares examples of how a graph evolves over time and reveals suspicious patterns
Explains how graphs are used for real-time fraud detection and the technical requirements for building them
Covers the system components and design of the Fraud Detection Service and Graph Service
"Here’s what we learned at Google I/O Connect Berlin"
by Ed Holloway-George ⸱ ASOS ⸱ 1 min read ⸱ 17 Jul
"A 4-Stage Guide to Identify Insecure Output Handling Exploits in LLMs"
by Zeev Kalyuzhner ⸱ Wix ⸱ 3 min read ⸱ 15 Jul
Explores the vulnerability of insecure output handling in LLMs
Describes how malicious actors can exploit this vulnerability to breach systems and access private data
Covers the concept of training data poisoning and its impact on LLM integrity
Presents a practical scenario for exploiting the vulnerability to hack into systems
"Streamlining GraphQL Service Testing with Karate"
trivago ⸱ 10 min read ⸱ 08 Jul
Describes how trivago refactored its existing GraphQL monolith to a microservice architecture
Explores the challenges of testing GraphQL services, including nested data structures and error handling
Shares how Karate, a testing framework, was integrated to address the testing challenges
Introduces the use of Justfiles for abstracting complex tasks and ensuring cross-platform compatibility
Covers the implementation of a Blue-Green release strategy for testing changes before deployment to production environments
"Building Pinterest Canvas, a text-to-image foundation model"
by Pinterest Engineering ⸱ Pinterest ⸱ 7 min read ⸱ 10 Jul
Describes the development of Pinterest Canvas, a text-to-image foundation model for enhancing existing images and products on the platform
Discusses the training of the base text-to-image model, fine-tuning process for generating photorealistic backgrounds, and in-context learning process for conditioning on image styles
Explains the use of reinforcement learning to encourage Canvas to generate diverse and visually appealing images
Describes the fine-tuning process for background generation, including training stages and incorporating additional information for inpainting
Shares future improvements to the Pinterest Canvas model, including upgrading to a more modern Transformer diffusion architecture and rethinking the binary-masking approach to model conditioning
Thanks for reading Big Tech Digest. If you enjoyed this issue, share it with your friends or teammates.
See you in two weeks!