Previous | Next --- Slide 28 of 79

lexicologist

This is relevant to the while loop on the slide before because of how many parameters are required in the call to evaluate_loss_gradient and because the datasets are extremely large (an issue because we want each element to appear in multiple mini batches).

As an aside, GPT-3 is really exciting! This summer, creative programmers used GPT-3 to write some convincing creative fiction (https://www.gwern.net/GPT-3) and fake blogposts (https://www.theverge.com/2020/8/16/21371049/gpt3-hacker-news-ai-blog) among other things.

sagoyal

I think it's interesting that when we take classes like 224N and 231N, we don't really discuss the energy tradeoff and the total compute time required to train the network itself. I hope in the future those classes also evolve to give more time to topics like this.

fizzbuzz

I heard a talk from the author of the KataGo Go-playing AI, and he talked about how the main challenge in competing with Google and the like is the training time, as he has but a small fraction of the compute hardware that they have. In his effort to compete, he discovered a number of ways to make the training easier, which suggests that in general, we might be leaving a lot of training efficiency on the table when we (Google et al) can just throw a bunch of compute at the problem.

felixw17

^Completely agree with sagoyal! Most of the deep learning classes just assume access to tons of compute. I think it's great that in CS149 we got to see this problem from another perspective, and see how much work went into supporting things like GPUs!

jle

^^ I also found this conversation about energy tradeoff in various deep learning classes interesting, and came across this paper if anyone’s interested!

https://www.sciencedirect.com/science/article/pii/S0743731518308773

It essentially overviews how energy consumption is measured/estimated in machine learning applications. I though the coolest thing is that this paper is quite accessible and readable after taking CS149!

Please log in to leave a comment.