Friday, July 27, 2018
awesome scalability
Contents
- Principles
- Scalability
- Availability
- Stability
- Performance
- Intelligence
- Architectures
- Ad-hoc
- Interview
- Talks
- Books
Principles
- Designs, Lessons and Advice from Building Large Distributed Systems - Jeff Dean
- On Efficiency, Reliability, Scaling - James Hamilton, VP at AWS
- Principles of Chaos Engineering
- Finding the Order in Chaos
- The Twelve-Factor App
- Clean Architecture
- High Cohesion and Low Coupling
- CAP Theorem and Trade-offs
- CP Databases and AP Databases
- Scale Up vs Scale Out
- Scale Up vs Scale Out: Hidden Costs
- Best Practices for Scaling Out
- ACID and BASE
- Blocking/Non-Blocking and Sync/Async
- Performance and Scalability of Databases
- Database Isolation Levels and Effects on Performance and Scalability
- SQL vs NoSQL
- SQL vs NoSQL - Lesson Learned from Salesforce
- How Sharding Works
- Consistent Hashing
- Uniform Consistent Hashing (used at Netflix)
- Eventually Consistent - Werner Vogels, CTO at Amazon
- Cache is King
- Anti-Caching
- Understand Latency
- Latency Numbers Every Programmer Should Know
- Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO
- Common Bottlenecks
- Life Beyond Distributed Transactions
- Relying on Software to Redirect Traffic Reliably at Various Layers
- Breaking Things on Purpose
- Avoid Over Engineering
- Scalability Worst Practices
- Use Solid Technologies - Don't Re-invent the Wheel - Keep It Simple!
- Why Over-Reusing is Bad
- Performance is a Feature
- Make Performance Part of Your Workflow
- The Benefits of Server Side Rendering Over Client Side Rendering
- Writing Code that Scales
- Automate and Abstract: Lessons from Facebook on Engineering for Scale
- AWS Do's and Don'ts
- (UI) Design Doesn't Scale - Stanley Wood, Design Director at Spotify
- Linux Performance
- How To Design A Good API and Why it Matters - Joshua Bloch
- Building Fast & Resilient Web Applications - Ilya Grigorik
- Design for Loose-coupling
- Design for Resiliency
- Design for Self-healing
- Design for Scaling Out
- Design for Evolution
- Learn from Mistakes
- Code Review Best Practices at Palantir
Scalability
- Microservices and Orchestration
- Microservices Resource Guide - Martin Fowler, Chief Scientist at ThoughtWorks
- Microservices Patterns
- Advantages and Drawbacks of Microservices
- Microservices Scale Cube
- Thinking Inside the Container (8 parts) at Riot Games
- Containerization at Pinterest
- Techniques for Splitting Up a Codebase into Microservices and Artifacts at LinkedIn
- The Evolution of Container Usage at Netflix
- Dockerizing MySQL at Uber
- Testing of Microservices at Spotify
- Organize Monolith Before Breaking it into Services at Weebly
- Lessons learned running Docker in production at Treehouse
- Inside a SoundCloud Microservice
- Microservices at BlaBlaCar
- Operate Kubernetes Reliably at Stripe
- Kubernetes Traffic Routing (2 parts) at Rakuten
- Agrarian-Scale Kubernetes (3 parts) at New York Times
- Mesos, Docker and Ochopod in Localization Services at Autodesk
- Nanoservices at BBC Online
- PowerfulSeal: Testing Tool for Kubernetes Clusters at Bloomberg
- Conductor: Microservices Orchestrator at Netflix
- Making 10x Improvement in Release Times with Docker and Amazon ECS at Nextdoor
- K8Guard: Auditing System for Kubernetes Clusters at Target.com
- Distributed Caching
- Read-Through, Write-Through, Write-Behind, and Refresh-Ahead Caching
- Eviction Policy and Expiration Policy
- EVCache: Caching for a Global Netflix
- Memsniff: Robust Memcache Traffic Analyzer at Box
- Caching with Consistent Hashing and Cache Smearing at Etsy
- Analysis of Photo Caching at Facebook
- Cache Efficiency Exercise at Facebook
- tCache: Scalable Data-aware Java Caching at Trivago
- Reduce Memcached Memory Usage by 50% at Trivago
- Caching Internal Service Calls at Yelp
- Distributed Tracking and Tracing
- Tracking Service Infrastructure at Scale at Shopify
- Distributed Tracing with Pintrace at Pinterest
- Distributed Tracing at HelloFresh
- Analyzing Distributed Trace Data at Pinterest
- Distributed Tracing at Uber
- Data Checking at Dropbox
- Tracing Distributed Systems at Showmax
- Real-time Distributed Tracing at LinkedIn
- Zipkin: Distributed Systems Tracing at Twitter
- osquery Across the Enterprise at Palantir
- Distributed Logging
- The Problem with Logging - Jeff Atwood
- The Log: What Every Software Engineer Should Know
- Using Logs to Build a Solid Data Infrastructure - Martin Kleppmann
- Scalable and Reliable Log Ingestion at Pinterest
- Building DistributedLog at Twitter: High-performance replicated log service
- Logging Service with Spark at CERN Accelerator
- Logging and Aggregation at Quora
- BookKeeper: Distributed Log Storage at Yahoo
- LogDevice: Distributed Data Store for Logs at Facebook
- LogFeeder: Log Collection System at Yelp
- Distributed Security
- Approach to Security at Scale at Dropbox
- Aardvark and Repokid: AWS Least Privilege for Distributed, High-Velocity Development at Netflix
- LISA: Distributed Firewall at LinkedIn
- Distributed Security Alerting at Slack
- Secure Infrastructure To Store Bitcoin In The Cloud at Coinbase
- Distributed Messaging and Event Streaming
- When to use RabbitMQ or Kafka
- Should You Put Several Event Types in the Same Kafka Topic? - Martin Kleppmann
- Kafka at Scale at LinkedIn
- Delaying Asynchronous Message Processing with RabbitMQ at Indeed
- Real-time Data Pipeline with Kafka at Yelp
- Building Reliable Reprocessing and Dead Letter Queues with Kafka at Uber
- Audit Kafka End-to-End at Uber (count each message exactly once, audit a message across tiers)
- Kafka for PaaS at Rakuten
- Publishing with Kafka at The New York Times
- Kafka Streams on Heroku
- Kafka in Platform Events Architecture at Salesforce
- Bullet: Forward-Looking Query Engine for Streaming Data at Yahoo
- Benchmarking Streaming Computation Engines at Yahoo
- Messaging Service at Riot Games
- Event Stream Analytics with Druid (Search Engine meet Column DB) at Walmart
- Deduplication Techniques
- Exactly-once Semantics are Possible: Here's How Kafka Does it
- Real-time Deduping at Scale with Kafka-based Pipeline at Tapjoy
- Delivering Billions of Messages Exactly Once: Deduping at Segment
- Deduplication For Efficient Storage (From 50 PB To 32 PB) At Mail.Ru
- Distributed Searching
- Search Architecture of Instagram
- Search Architecture of eBay
- Improving Search Engine Efficiency by over 25% at eBay
- Search Federation Architecture at LinkedIn (2018)
- Search at Slack
- Search and Recommendations at DoorDash
- Search Service at Twitter (2014)
- Nautilus: Travel Search Engine of Expedia
- Galene: Search Architecture of LinkedIn
- Manas: High Performing Customized Search System at Pinterest
- Sherlock: Near Real Time Search Indexing at Flipkart
- Nebula: Storage Platform to Build Search Backends at Airbnb
- ELK (Elasticsearch, Logstash, Kibana) Stack
- Elasticsearch Performance Tuning Practice at eBay
- Elasticsearch at Kickstarter
- Distributed Troubleshooting Platform with ELK Stack at Target.com
- ELK at Robinhood
- Distributed Storage
- In-memory Storage
- Introduction to In-memory Data - Viktor Gamov, Solutions Architect at Hazelcast
- MemSQL Architecture - The Fast (MVCC, InMem, LockFree, CodeGen) And Familiar (SQL)
- Optimizing Memcached Efficiency at Quora
- Real-Time Data Warehouse with MemSQL on Cisco UCS
- Moving to MemSQL (with Horizontally Scalable, ACID Compliant, MySQL Compatibility) at Tapjoy
- Durable Storage (Amazon S3)
- Reasons for Choosing S3 over HDFS at Databricks
- S3 in the Data Infrastructure at Airbnb
- Quantcast File System on Amazon S3
- Using S3 in Netflix Chukwa
- Yahoo Cloud Object Store - Object Storage at Exabyte Scale
- Ambry: Distributed Immutable Object Store at LinkedIn
- Hammerspace: Persistent, Concurrent, Off-heap Storage at Airbnb
- Relational Databases (MySQL, MSSQL, PostgreSQL)
- Microsoft SQL versus MySQL
- SQL Database Performance Tuning
- Scaling PostgreSQL Using CUDA
- Scaling Distributed Joins
- MySQL System Design at Booking.com
- MySQL Parallel Replication (4 parts) at Booking.com
- Partitioning Main MySQL Database at Airbnb
- PostgreSQL at Twitch
- Scaling MySQL-based Financial Reporting System at Airbnb
- Scaling MySQL at Wix
- Switching from Postgres to MySQL at Uber
- Handling Growth with Postgres at Instagram
- Scaling the Analytics Database (Postgres) at TransferWise
- Updating a 50 Terabyte PostgreSQL Database at Adyen
- Sharding (Horizontal Partitioning)
- Sharding MySQL at Pinterest
- Sharding MySQL at MailChimp
- Sharding MySQL (3 parts) at Evernote
- NoSQL Databases
- Key-Value Databases (DynamoDB, Voldemort, Manhattan)
- Scaling Mapbox infrastructure with DynamoDB Streams
- Manhattan: Twitter's distributed key-value database
- Sherpa: Yahoo's distributed NoSQL key-value store
- Riak inside Chat Service Architecture at Riot Games
- MPH: Fast and Compact Immutable Key-Value Stores at Indeed
- zBase: High Performance, Elastic, Distributed Key-Value Store at Zynga
- Column Databases (Cassandra, HBase)
- Consistent Hashing in Cassandra
- Understanding Gossip (Cassandra Internals)
- When NOT to use Cassandra?
- Avoid Pitfalls in Scaling Cassandra Cluster at Walmart
- Storing Images in Cassandra at Walmart
- Cassandra at Instagram
- Scale Ad Analytics with Cassandra at Yelp
- Store Billions of Messages with Cassandra at Discord
- Scale to 100+ Million Reads/Writes using Spark and Cassandra at Dream11
- Moving Food Feed from Redis to Cassandra at Zomato
- Benchmarking Cassandra Scalability on AWS at Netflix
- Imgur Notification: From MySQL to HBase at Imgur
- Improving HBase Backup Efficiency at Pinterest
- ClickHouse - Open Source Distributed Column Database at Yandex
- Document Databases (MongoDB, SimpleDB, CouchDB)
- eBay: Building Mission-Critical Multi-Data Center Applications with MongoDB
- MongoDB at Baidu: Multi-Tenant Cluster Storing 200+ Billion Documents across 160 Shards
- The AWS and MongoDB Infrastructure of Parse (acquired by Facebook)
- Migrating Mountains of Mongo Data at Addepar
- Couchbase Ecosystem at LinkedIn
- SimpleDB at Zendesk
- Graph Databases
- Handling Billions of Edges in a Graph Database
- Neo4j case studies with Walmart, eBay, Airbnb, NASA, etc.
- FlockDB: Distributed Graph Database for Storing Adjacency Lists at Twitter
- JanusGraph: Scalable Graph Database backed by Google, IBM and Hortonworks
- Amazon Neptune
- Datastructure Databases (Redis, Hazelcast)
- Using Redis To Scale at Twitter
- Scaling Job Queue with Redis at Slack
- Moving persistent data out of Redis at GitHub
- Storing Hundreds of Millions of Simple Key-Value Pairs in Redis at Instagram
- Redis in Chat Architecture of Twitch (from 27:22)
- Learn Redis the hard way (in production) at Trivago
- Optimizing Session Key Storage in Redis at Deliveroo
- Optimizing Redis Storage at Deliveroo
- Time Series Database (TSDB)
- What is Time-Series Data & Why We Need a Time-Series Database
- Time Series Data: Why and How to Use a Relational Database instead of NoSQL
- Beringei: High-performance Time Series Storage Engine at Facebook
- Atlas: In-memory Dimensional Time Series Database at Netflix
- Heroic: Time Series Database at Spotify
- Roshi: Distributed Storage System for Time-Series Event at SoundCloud
- Building a Scalable Time Series Database on PostgreSQL
- Scaling Time Series Data Storage at Netflix
- HTTP Caching (Reverse Proxy, CDN)
- Reverse Proxy (Nginx, Varnish, Squid, rack-cache)
- Stop Worrying and Love the Proxy
- Playing HTTP Tricks with Nginx
- Using CDN to Improve Site Performance at Coursera
- Strategy: Caching 404s Saved 66% On Server Time at The Onion
- Increasing Application Performance with HTTP Cache Headers
- Zynga Geo Proxy: Reducing Mobile Game Latency at Zynga
- Google AMP at Condé Nast
- Running A/B Tests on Hosting Infrastructure (CDNs) at Deliveroo
- HAProxy with Kubernetes for User-facing Traffic at SoundCloud
- Bandaid: Service Proxy at Dropbox
- Load Balancing and Other Network Matters
- Introduction to Modern Network Load Balancing and Proxying
- Load Balancing infrastructure to support more than 1.3 billion users at Facebook
- DHCPLB: Open Source Load Balancer for DHCP at Facebook
- Load Balancing with Eureka at Netflix