This is a reference, not a guide. In a modern LLM, the "weights" consist of several distinct collections of matrices and tensors that serve different functions during inference:

Token Embeddings
- Large matrix mapping token IDs to vector representations
- Used at the very start of inference to convert input tokens to vectors
- Typical shape: [vocab_size, hidden_dim]

Attention Mechanism Weights
- Query/Key/Value projection matrices:
  - Standard attention: 3 separate matrices, each [hidden_dim, hidden_dim]
  - GQA: one Q matrix but fewer (shared) K/V matrices, [hidden_dim, kv_dim]
  - Used to project hidden states into query, key, and value spaces
- Output projection matrix: [hidden_dim, hidden_dim]
  - Maps attention outputs back to the hidden dimension
  - Used after the attention calculation to project back into the main representation

RoPE Parameters
- Not traditional weight matrices but positional embedding tensors
- Used to rotate query/key vectors to encode positional information
- Applied during attention computation by complex multiplication
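To make these shapes concrete, here is a minimal PyTorch sketch of the collections above. The dimensions are small illustrative assumptions, not taken from any particular model, and the `apply_rope` helper is hypothetical; it uses the interleaved-pair RoPE convention, which varies between implementations.

```python
import torch

# Small illustrative dimensions so the sketch runs instantly;
# real models use far larger values (assumptions, not from any model).
vocab_size, hidden_dim = 1_000, 512
n_heads, n_kv_heads = 8, 2            # GQA: fewer K/V heads than Q heads
head_dim = hidden_dim // n_heads      # 64
kv_dim = n_kv_heads * head_dim        # 128

# Token embeddings: [vocab_size, hidden_dim]
tok_emb = torch.randn(vocab_size, hidden_dim)

# Attention projections. Standard attention would use three
# [hidden_dim, hidden_dim] matrices; under GQA, K and V are narrower.
w_q = torch.randn(hidden_dim, hidden_dim)  # [hidden_dim, hidden_dim]
w_k = torch.randn(hidden_dim, kv_dim)      # [hidden_dim, kv_dim]
w_v = torch.randn(hidden_dim, kv_dim)      # [hidden_dim, kv_dim]
w_o = torch.randn(hidden_dim, hidden_dim)  # output projection

# Start of inference: token IDs -> vectors.
token_ids = torch.tensor([3, 14, 159, 265])
x = tok_emb[token_ids]                     # [seq_len, hidden_dim]

# Project hidden states into query/key/value spaces.
q = (x @ w_q).view(-1, n_heads, head_dim)     # [seq, n_heads, head_dim]
k = (x @ w_k).view(-1, n_kv_heads, head_dim)  # [seq, n_kv_heads, head_dim]
v = (x @ w_v).view(-1, n_kv_heads, head_dim)

def apply_rope(t: torch.Tensor, base: float = 10_000.0) -> torch.Tensor:
    """Rotate each (even, odd) channel pair by a position-dependent
    angle, implemented as complex multiplication (hypothetical helper)."""
    seq_len, heads, h_d = t.shape
    inv_freq = 1.0 / base ** (torch.arange(0, h_d, 2).float() / h_d)
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)
    rot = torch.polar(torch.ones_like(angles), angles)  # e^{i*theta}
    t_c = torch.view_as_complex(t.float().reshape(seq_len, heads, h_d // 2, 2))
    return torch.view_as_real(t_c * rot[:, None, :]).reshape(seq_len, heads, h_d)

# RoPE is applied to queries and keys (not values) before attention.
q, k = apply_rope(q), apply_rope(k)
print(q.shape, k.shape, v.shape)  # [4, 8, 64], [4, 2, 64], [4, 2, 64]
```

Note how GQA shows up purely in the shapes: each K/V head is shared by several query heads (four here), which shrinks the K/V projections and the KV cache while the query side is unchanged.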