Previous | Next --- Slide 28 of 79
Back to Lecture Thumbnails
lexicologist

This is relevant to the while loop on the slide before because of how many parameters are required in the call to evaluate_loss_gradient and because the datasets are extremely large (an issue because we want each element to appear in multiple mini batches).

As an aside, GPT-3 is really exciting! This summer, creative programmers used GPT-3 to write some convincing creative fiction (https://www.gwern.net/GPT-3) and fake blogposts (https://www.theverge.com/2020/8/16/21371049/gpt3-hacker-news-ai-blog) among other things.

sagoyal

I think it's interesting that when we take classes like 224N and 231N, we don't really discuss the energy tradeoff and the total compute time required to train the network itself. I hope in the future those classes also evolve to give more time to topics like this.

fizzbuzz

I heard a talk from the author of the KataGo Go-playing AI, and he talked about how the main challenge in competing with Google and the like is the training time, as he has but a small fraction of the compute hardware that they have. In his effort to compete, he discovered a number of ways to make the training easier, which suggests that in general, we might be leaving a lot of training efficiency on the table when we (Google et al) can just throw a bunch of compute at the problem.

felixw17

^Completely agree with sagoyal! Most of the deep learning classes just assume access to tons of compute. I think it's great that in CS149 we got to see this problem from another perspective, and see how much work went into supporting things like GPUs!

jle

^^ I also found this conversation about energy tradeoff in various deep learning classes interesting, and came across this paper if anyone’s interested!

https://www.sciencedirect.com/science/article/pii/S0743731518308773

It essentially overviews how energy consumption is measured/estimated in machine learning applications. I though the coolest thing is that this paper is quite accessible and readable after taking CS149!

Please log in to leave a comment.