Andrej Karpathy

I like to train deep neural nets on large datasets 🧠🤖💥


2017 -
I am the Sr. Director of AI at Tesla, where I lead the team responsible for all neural networks of the Autopilot. This includes dataset gathering, neural network training, the science of making it all work, and deployment in production on our custom chip. Our networks learn from the most complicated and diverse scenarios in the world, iteratively sourced from our fleet of nearly 1M vehicles in real time. A full build of Autopilot neural networks involves 48 networks that take 70,000 GPU hours to train. Together, they output 1,000 distinct tensors (predictions) at each timestep.
2015 - 2017
I was a research scientist and a founding member at OpenAI, where I worked on deep learning, computer vision, generative models and reinforcement learning.
2011 - 2015
My PhD was focused on convolutional/recurrent neural networks and their applications in computer vision, natural language processing and their intersection. My adviser was Fei-Fei Li at the Stanford Vision Lab and I also had the pleasure to work with Daphne Koller, Andrew Ng, Sebastian Thrun and Vladlen Koltun along the way during the first year rotation program.

Along the way I squeezed in 3 awesome internships: at (a baby) Google Brain in 2011 working on large-scale unsupervised learning from videos, then again at Google Research in 2013 working on large-scale supervised learning on YouTube videos, and finally at DeepMind in 2015 working on the deep reinforcement learning team.
2009 - 2011
MSc at the University of British Columbia where I worked with Michiel van de Panne on learning controllers for physically-simulated figures. Think: agile robotics but in a physical simulation.
2005 - 2009
BSc at the University of Toronto with a double major in computer science and physics and a minor in math. This is where I first came into contact with deep learning, attending Geoff Hinton's class and reading groups.
teaching
In 2015 I designed and was the primary instructor for the first deep learning class at Stanford - CS 231n: Convolutional Neural Networks for Visual Recognition ❤️. The class became one of the largest at Stanford, growing from 150 students enrolled in 2015 to 330 in 2016 and 750 in 2017.
pet projects
micrograd is a tiny scalar-valued autograd engine (with a bite! :)). It implements backpropagation (reverse-mode autodiff) over a dynamically built DAG, plus a small neural network library on top of it with a PyTorch-like API. See the short sketch below for the flavor of it.
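A minimal sketch of what using the engine looks like, assuming the Value class from micrograd.engine with its PyTorch-style .backward(), .data and .grad - treat this as illustrative rather than the project's exact README example:

```python
from micrograd.engine import Value

# build up a small expression graph of scalar Values;
# each operation records its inputs, dynamically growing a DAG
a = Value(-4.0)
b = Value(2.0)
c = a * b + b**3
d = (c + a).relu()

# reverse-mode autodiff: walk the DAG backwards from the output
d.backward()

print(d.data)           # result of the forward pass
print(a.grad, b.grad)   # gradients of d with respect to a and b
```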
char-rnn was a Torch character-level language model built out of LSTMs/GRUs/RNNs. Related to this, also see the Unreasonable Effectiveness of Recurrent Neural Networks blog post, or the minimal RNN gist.
arxiv-sanity tames the overwhelming flood of papers on arXiv. It allows researchers to track recent papers, search/sort by similarity to any paper, see recent/popular papers, build a library, and get recommendations of new papers. Deployed live at arxiv-sanity.com. My obsession with meta-research has involved many more projects over the years, e.g. see pretty NeurIPS 2020 papers, research lei, scholaroctopus, and biomed-sanity.
neuraltalk2 was an early image captioning project in (lua)Torch. Also see our later extension with Justin Johnson to dense captioning.
I am sometimes jokingly referred to as the reference human for ImageNet because I competed against an early ConvNet on categorizing images into 1,000 classes. This required a bunch of custom tooling and a lot of learning about dog breeds. See the blog post "What I learned from competing against a ConvNet on ImageNet". Also a Wired article.
ConvNetJS is a deep learning library written from scratch entirely in JavaScript. This enables nice web-based demos that train convolutional neural networks (or ordinary ones) entirely in the browser. Many web demos included. I did an interview with Data Science Weekly about the library and some of its back story here. Also see my later followups such as tSNEJS, REINFORCEjs, recurrentjs, and GANs in JS.
How productive were you today? How much code have you written? Where did your time go? For a while I was really into tracking my productivity, and since I didn't like that RescueTime uploads your (very private) computer usage statistics to a cloud, I wrote my own privacy-first tracker - ulogme! That was fun.
misc: I built a lot of other random stuff over time. Rubik's cube color extractor, predator-prey neuroevolutionary multi-agent simulations, more of those, sketcher bots, games for computer game competitions #1, #2, #3, random computer graphics things, Tetris AI, multiplayer coop tetris, etc.
publications
ICML 2017
Tianlin (Tim) Shi, Andrej Karpathy, Linxi (Jim) Fan, Jonathan Hernandez, Percy Liang
ICLR 2017
Tim Salimans, Andrej Karpathy, Xi Chen, Diederik P. Kingma, and Yaroslav Bulatov
2016
Andrej Karpathy
CVPR 2016 (Oral)
Justin Johnson*, Andrej Karpathy*, Li Fei-Fei
ICLR 2016 Workshop
Andrej Karpathy*, Justin Johnson*, Li Fei-Fei
CVPR 2015 (Oral)
Andrej Karpathy, Li Fei-Fei
IJCV 2015
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, Li Fei-Fei
NIPS 2014
Andrej Karpathy, Armand Joulin, Li Fei-Fei
CVPR 2014 (Oral)
Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei
TACL 2013
Richard Socher, Andrej Karpathy, Quoc V. Le, Christopher D. Manning, Andrew Y. Ng
ICRA 2013
Andrej Karpathy, Stephen Miller, Li Fei-Fei
NIPS 2012
Adam Coates, Andrej Karpathy, Andrew Ng
AI 2012
Andrej Karpathy, Michiel van de Panne
SIGGRAPH 2011
Stelian Coros, Andrej Karpathy, Benjamin Jones, Lionel Reveret, Michiel van de Panne

Also on Google Scholar
misc