 
      
**www.ethanepperly.com** (you are here)
**extremal010101.wordpress.com**

Suppose we want to understand under what conditions on $B$ the inequality $\mathbb{E}\, B(f(X), g(Y)) \le B(\mathbb{E} f(X), \mathbb{E} g(Y))$ holds for all test functions, say real-valued $f, g$, where $X, Y$ are some random variables (not necessarily all possible random variables!). If $X = Y$, i.e., $X$ and $Y$ are...
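One choice of $B$ for which the inequality holds for *all* random variables $X, Y$ (my own illustration, not taken from the linked post) is $B(u, v) = \min(u, v)$: since $\min(f(X), g(Y)) \le f(X)$ and $\min(f(X), g(Y)) \le g(Y)$ pointwise, taking expectations bounds the left side by both $\mathbb{E} f(X)$ and $\mathbb{E} g(Y)$. A quick Monte Carlo sanity check (the same pointwise argument applies to empirical means, so the check is exact, not just approximate):

```python
import numpy as np

rng = np.random.default_rng(3)

# B(u, v) = min(u, v) satisfies E B(f(X), g(Y)) <= B(E f(X), E g(Y)) for any X, Y,
# because min(f(X), g(Y)) <= f(X) and min(f(X), g(Y)) <= g(Y) pointwise.
X = rng.exponential(size=100_000)   # arbitrary choices of X, Y, f, g for illustration
Y = rng.normal(size=100_000)
f, g = np.square, np.cos

lhs = np.minimum(f(X), g(Y)).mean()        # E min(f(X), g(Y))
rhs = min(f(X).mean(), g(Y).mean())        # min(E f(X), E g(Y))
assert lhs <= rhs
```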
**nhigham.com**

A norm on $\mathbb{C}^{m \times n}$ is unitarily invariant if $\|UAV\| = \|A\|$ for all unitary $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ and for all $A \in \mathbb{C}^{m \times n}$. One can restrict the definition to real matrices, though the term unitarily invariant is still typically used. Two widely used matrix norms...
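As a numerical sanity check of the definition (my own sketch, not from the linked post): the Frobenius and spectral norms are both unitarily invariant, which we can verify with random unitaries obtained from QR factorizations of random complex matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex matrix A, and random unitaries U, V via QR factorization.
m, n = 4, 3
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
U, _ = np.linalg.qr(rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

# Frobenius norm ("fro") and spectral norm (ord=2) satisfy ||U A V|| = ||A||.
for ord_ in ("fro", 2):
    assert np.isclose(np.linalg.norm(U @ A @ V, ord_), np.linalg.norm(A, ord_))
```

Both norms depend only on the singular values of $A$, which unitary multiplication leaves unchanged; that is why the check passes for any such $U, V$.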
**fa.bianp.net**

The Langevin algorithm is a simple and powerful method to sample from a probability distribution. It's a key ingredient of some machine learning methods such as diffusion models and differentially private learning. In this post, I'll derive a simple convergence analysis of this method in the special case when the...
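A minimal sketch of the (unadjusted) Langevin iteration $x_{k+1} = x_k - \eta\, \nabla U(x_k) + \sqrt{2\eta}\, \xi_k$ for a target density proportional to $e^{-U}$ — my own illustration with a standard Gaussian target, not code from the linked post:

```python
import numpy as np

rng = np.random.default_rng(1)

def langevin_sample(grad_U, x0, step, n_steps):
    """Unadjusted Langevin algorithm: x <- x - step * grad_U(x) + sqrt(2*step) * noise."""
    x = x0
    samples = np.empty(n_steps)
    for k in range(n_steps):
        x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.standard_normal()
        samples[k] = x
    return samples

# Target: standard Gaussian, U(x) = x^2 / 2, so grad_U(x) = x.
samples = langevin_sample(lambda x: x, x0=3.0, step=0.05, n_steps=50_000)
# After burn-in, the empirical mean is near 0 and the variance near 1,
# up to a discretization bias of order `step`.
```

Without a Metropolis correction step, the iterates sample a slightly biased distribution; the step size trades off that bias against mixing speed.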
**swethatanamala.github.io**

In this paper, the authors propose a new simple network architecture, the Transformer, based solely on attention mechanisms, removing convolutions and recurrences entirely. The Transformer is the first transduction model relying entirely...
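The core operation of the Transformer is scaled dot-product attention, $\mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$; a minimal single-head sketch in numpy (my own illustration, not code from the linked summary):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted average of the values

# Toy example: 3 query positions, 4 key/value positions, model dimension 8.
rng = np.random.default_rng(2)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)  # shape (3, 8)
```

Each output row is a convex combination of the value rows, with weights determined by query-key similarity; the $\sqrt{d_k}$ scaling keeps the softmax from saturating as the dimension grows.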