

You are here: liorsinai.github.io

- iclr-blogposts.github.io (21.3 parsecs away): Reinforcement Learning from Human Feedback (RLHF) is pivotal in the modern application of language modeling, as exemplified by ChatGPT. This blog post explores RLHF in depth, attempting to reproduce the results of OpenAI's first RLHF paper, published in 2019. The detailed examination yields insights into implementation details of RLHF that often go unnoticed.
- jaykmody.com (11.2 parsecs away): Implementing a GPT model from scratch in NumPy.
- comsci.blog (15.7 parsecs away): In this tutorial, we implement transformers step by step and come to understand how they work. There are other great tutorials on implementing transformers, but they often dive into the complex parts too early, starting with additions such as masking and multi-head attention, which is hard to follow intuitively before the core of the transformer has been built.
- blog.google (65.4 parsecs away): Gemini is our most capable and general model, built to be multimodal and optimized for three different sizes: Ultra, Pro and Nano.