Recommended Readings

Most of my blog posts state their prerequisites at the beginning. If you are unfamiliar with any of them, please refer to the following table for recommended readings.

I only list references that I found helpful, so the table is neither exhaustive nor objective, and you might find better references. If you do, please let me know! I update the table regularly.

Topic | Light introduction | Deep dive
A/B testing | Vincent Vanhoucke's blog post on A/B testing
AI alignment problem | See "Friendly AI" | See "Friendly AI"
AI safety | See "Friendly AI" | See "Friendly AI"
automatic differentiation | Ari Seff's video
Bayesian statistics | MacKay
classical mechanics | Landau and Lifshitz
convex optimization | Intelligent Systems Lab's video
deep implicit layers | The NeurIPS 2020 tutorial
finite factored sets | Scott Garrabant's lecture
friendly AI | Bostrom; Eliezer Yudkowsky's presentation; Leike et al.; Everitt et al. *
functional decision theory | Yudkowsky and Soares
game theory | Leyton-Brown and Shoham *
Gaussian processes | David MacKay's lecture (slides and alternative upload here) | Rasmussen and Williams *
information theory | MacKay
integral equations | Wazwaz *
LSTMs | Christopher Olah's blog post
model compression | Sam Sučík's blog post
neural differential equations | Kidger *
neural networks | 3Blue1Brown’s series of videos | Roberts, Yaida, and Hanin; Off the Convex Path blog *
neural processes | Garnelo et al.
object-oriented programming | Graham's blog post
reinforcement learning | David Silver's lecture series; Berkeley deep reinforcement learning lecture series *; Sutton and Barto
relativity | Einstein's wonderful nearly-equation-free book | Wald; Misner et al.; Penrose and Rindler
transformers | Dinan et al.
writing | George Orwell's essay; John Wentworth's post on quick and rigorous writing; Michael Nielsen's post on Discovery Fiction

Sources with a “*” are those which I have read only partially.


Dinan, E., Yaida, S. & Zhang, S. Effective Theory of Transformers at Initialization. Preprint at (2023).
Roberts, D. A., Yaida, S. & Hanin, B. The Principles of Deep Learning Theory. (2021).
Wazwaz, A.-M. Linear and nonlinear integral equations: methods and applications. (Higher Education Press, 2011).
Kidger, P. On Neural Differential Equations. arXiv:2202.02435 [cs, math, stat] (2022).
Douglas, Y. The Reader’s Brain: How Neuroscience Can Make You a Better Writer. (Cambridge University Press, 2015).
Wald, R. M. General relativity. (University of Chicago Press, 1984).
Misner, C. W., Thorne, K. S. & Wheeler, J. A. Gravitation. (W. H. Freeman, 1973).
Einstein, A. Relativity: the special and general theory. (2005).
Penrose, R. & Rindler, W. Spinors and space-time. vol. 1 (Cambridge University Press, 1984).
Landau, L. D. & Lifshitz, E. M. Mechanics. (Elsevier, 1982).
Leyton-Brown, K. & Shoham, Y. Essentials of Game Theory: A Concise Multidisciplinary Introduction. Synthesis Lectures on Artificial Intelligence and Machine Learning 2, 1–88 (2008).
Rasmussen, C. E. & Williams, C. K. I. Gaussian processes for machine learning. (MIT Press, 2008).
Everitt, T., Lea, G. & Hutter, M. AGI Safety Literature Review. arXiv:1805.01109 [cs] (2018).
Bostrom, N. Superintelligence: paths, dangers, strategies. (Oxford University Press, 2014).
Leike, J. et al. Scalable agent alignment via reward modeling: a research direction. arXiv:1811.07871 [cs, stat] (2018).
Garnelo, M. et al. Neural Processes. arXiv:1807.01622 [cs, stat] (2018).
Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction.
Yudkowsky, E. & Soares, N. Functional Decision Theory: A New Theory of Instrumental Rationality. arXiv:1710.05060 (2017).
MacKay, D. J. C. Information Theory, Inference, and Learning Algorithms. (Cambridge University Press, 2003).