til.simonwillison.net
Here's the pattern I figured out for using the openai Python library to extract structured data from text using a single call to the model.
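
The TIL behind that link describes its own pattern; as a rough sketch of the general idea (not necessarily the post's exact code), the openai SDK's structured-output parse helper can pull a Pydantic model out of free text in one call. The model name and schema here are illustrative placeholders:

```python
# Minimal sketch: structured extraction with the openai SDK's parse helper.
# The schema and model name are illustrative, not taken from the linked TIL.
from openai import OpenAI
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    role: str


client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract the person mentioned in the text."},
        {"role": "user", "content": "Ada Lovelace wrote programs for Babbage's Analytical Engine."},
    ],
    response_format=Person,
)

person = completion.choices[0].message.parsed  # a Person instance
print(person.name, person.role)
```
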
blog.val.town
How to customize OpenAI to your liking

qwenlm.github.io
Today, we're announcing Qwen3-Coder, our most agentic code model to date. Qwen3-Coder is available in multiple sizes, but we're excited to introduce its most powerful variant first: Qwen3-Coder-480B-A35B-Instruct, a 480B-parameter Mixture-of-Experts model with 35B active parameters that natively supports a context length of 256K tokens, and 1M tokens with extrapolation methods, offering exceptional performance in both coding and agentic tasks. Qwen3-Coder-480B-A35B-Instruct sets new state-of-the-art results among open models on Agentic Coding, Agentic Browser-Use, and Agentic Tool-Use, comparable to Claude Sonnet 4.

blog.moonglow.ai
Parameters and data. These are the two ingredients of training ML models. The total amount of computation ("compute") you need to do to train a model is proportional to the number of parameters multiplied by the amount of data (measured in "tokens"). Four years ago, it was well-known that if…
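
As a rough illustration of that parameters-times-tokens relation, here is a back-of-the-envelope sketch; the ~6 FLOPs-per-parameter-per-token constant is the usual rule of thumb rather than something stated in the post, and the model size is a made-up example:

```python
# Back-of-the-envelope sketch: training compute scales with parameters x tokens.
# The factor of 6 FLOPs per parameter per token is the common approximation
# (an assumption here, not from the linked post); the sizes below are hypothetical.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute: ~6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens


# A hypothetical 7B-parameter model trained on 1 trillion tokens:
print(f"{training_flops(7e9, 1e12):.1e} FLOPs")  # ~4.2e+22
```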