Session WD-3: Optimization in neural architectures II
Stream: Optimization in neural architectures: convergence and solution characterization
Wednesday, 11:25 - 12:40
Room: M:J
Session chair(s):

155. Vanishing Gradients in Reinforcement Finetuning of Language Models
Status: accepted
Noam Razin - Israel

164. A phase transition between positional and semantic learning in a solvable model of dot-product attention
Status: accepted
Hugo Cui - Switzerland
Freya Behrens - Switzerland
Florent Krzakala - Switzerland
Lenka Zdeborová - Switzerland

151. On the spectral bias of two-layer linear networks
Status: accepted
Aditya Varre - Switzerland