Blog Writing Tricks
Writing techniques for the Butterfly theme, in Markdown format.
11711 Advanced NLP: Parallelism and Scaling
Notes on parallelism and scaling from CMU 11-711 Advanced NLP, covering single-GPU training basics, data/tensor/pipeline parallelism, ZeRO memory optimization, and strategy selection for large-scale LLM training.
11711 Advanced NLP: Quantization
Notes on quantization from CMU 11-711 Advanced NLP, covering number representation, 8-bit quantization (absmax, zero-point), LLM.int8(), GGML/GGUF, and quantization-aware training with QLoRA.
11868 LLM Sys: Systems for Mixture-of-Experts Models
Notes on CMU 11-868 LLM Systems Lec16, covering Mixture-of-Experts architecture, Switch Transformer, shared vs. routed experts, expert parallelism with GShard, load balancing, DeepSeek V3 MoE, and DeepSpeed-MoE inference optimization.
15645 Database systems: Query Optimization
Notes and summaries for CMU 15-645 Database Systems.
15645 Database systems: Query Execution
Notes and summaries for CMU 15-645 Database Systems.
11868 LLM Sys: Distributed Training, DDP, and Model Parallelism
Notes on CMU 11-868 LLM Systems Lec13-15, covering distributed training, NCCL collectives, Ring AllReduce, PyTorch DDP, pipeline parallelism, tensor parallelism, and their combination.
AI Agent Engineering Design: A Curated Reading List
A curated collection of essential readings on AI Agent engineering design, covering Claude Code, OpenAI Codex, context management, memory mechanisms, and the OpenClaw architecture. Each article includes key excerpts and core takeaways.
15645 Database systems: Join Algorithms
Notes and summaries for CMU 15-645 Database Systems.
15645 Database systems: Sorting & Aggregation
Notes and summaries for CMU 15-645 Database Systems.




