Excited to release our second episode of Generally Intelligent! This time we’re featuring Sarah Jane Hong, co-founder of Latent Space, a startup building the first fully AI-rendered 3D engine to democratize creativity.
Sarah co-authored Low Distortion Block-Resampling with Spatially Stochastic Networks, published at NeurIPS 2020, a very cool method that lets you realistically resample part of a generated image. For example, maybe you’ve generated a castle but want to change the style of one tower; you could resample just that tower until you get something you like.
In this episode, we touch on:
- What it was like taking classes under Geoff Hinton in 2013
- How not to read papers
- The downsides of only storing information in model weights
- Why using natural language prompts to render a scene is much harder than you’d expect
- Why a model’s ability to scale is more important than getting state-of-the-art results
As always, please feel free to reach out with feedback, ideas, and questions!
Here’s the full transcript.
Generally Intelligent is a podcast made for deep learning researchers (you can learn more about it here).
Our first guest is Kelvin Guu, a senior research scientist at Google AI, where he develops new methods for machine learning and language understanding. Kelvin is the co-author of REALM: Retrieval-Augmented Language Model Pretraining. The conversation is a wide-ranging tour of language models, how computers interact with world knowledge, and much more; here are a few of the questions we cover:
- Why language models like GPT-3 seem to generalize so well to tasks beyond just predicting words
- How you can store knowledge in a database, in the weights of a model, or with a mix of both approaches
- What interesting problems and data sets have been overlooked by the research community
- Why cross-entropy might not be such a great objective function
- Creative and impactful ways language and knowledge models might be used in the future
We love feedback, ideas, and questions, so please feel free to reach out!
Here’s the full transcript.
Over the past few years, we’ve been a part of countless conversations with various deep learning researchers about the hunches and processes that inform their work. These conversations happen as part of informal paper reading groups, lab discussions, or casual chats with friends, and they have proved critical to our own research.
The intimacy and informality of these environments lends itself to stimulating conversations. Yet it often felt like a shame that the deeper understanding and hard-earned lessons from research failures were shared with just a select few, when so many others could use this knowledge to advance the frontier.
That’s why, today, we’re launching “Generally Intelligent,” a publicly available podcast made by deep learning researchers, for deep learning researchers.
Unlike prior methods such as SimCLR and MoCo, the recent Bootstrap Your Own Latent (BYOL) paper from DeepMind demonstrates a state-of-the-art method for self-supervised learning of image representations without an explicitly contrastive loss function. This simplifies training by removing the need for negative examples in the loss function. We highlight two surprising findings from our work on reproducing BYOL:
(1) BYOL often performs no better than random when batch normalization is removed, and
(2) the presence of batch normalization implicitly causes a form of contrastive learning.
These findings highlight the importance of contrast between positive and negative examples when learning representations and help us gain a more fundamental understanding of how and why self-supervised learning works.
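To build intuition for the second finding, here is a minimal sketch in pure Python (with hypothetical toy numbers) of why batch normalization couples examples together: each normalized activation depends on the statistics of the whole batch, so every example is implicitly measured against the others, a weak form of contrast.

```python
def batch_norm(batch, eps=1e-5):
    """Normalize a list of scalar activations by the batch mean and variance."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

# The first example's raw activation (1.0) is identical in both batches...
batch_a = [1.0, 2.0, 3.0]
batch_b = [1.0, 4.0, 10.0]

out_a = batch_norm(batch_a)
out_b = batch_norm(batch_b)

# ...yet its normalized value differs, because it is expressed relative to
# the other examples in the batch. This cross-example dependence is what
# supplies the implicit contrastive signal: if the network tried to collapse
# to the same output for every input, batch-normalized activations would be
# driven toward zero rather than rewarding the collapse.
print(out_a[0], out_b[0])
```

This is only an illustration of the coupling, not of BYOL itself; in the real setting the same effect operates on high-dimensional activations inside the predictor and projector networks.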
The code used for this post can be found at https://github.com/untitled-ai/self_supervised.
Appendix for "Understanding self-supervised and contrastive learning with 'Bootstrap Your Own Latent' (BYOL)"
This post contains the extra data and details for our post “Understanding self-supervised and contrastive learning with ‘Bootstrap Your Own Latent’ (BYOL)”.