218. Understanding neural architectures via projection onto sets of generalized bilinear constraints
Invited abstract in session WC-3: Optimization in neural architectures I, stream Optimization in neural architectures: convergence and solution characterization.
Wednesday, 10:05 - 11:20, Room: M:J
Authors (first author is the speaker)
1. Manish Krishan Lal, TU Munich
Abstract
The geometry associated with matrix-matrix products can be described by generalized sets of bilinear forms. This paves the way for a tensorization approach to the data space in many learning problems. Starting from simple deep neural networks, we exploit this product structure in architectures such as CNNs, autoencoders, GANs, and transformers, and provide alternative frameworks for training these networks. The theory is supported by closed-form projections onto sets of bilinear constraints, by the hidden convexity and SDP duality arising in the nonconvex projection problems, and by sampling in lifted spaces.
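The abstract mentions closed-form projections onto bilinear constraint sets but gives no formulas. As a hypothetical sketch of the flavor of such a projection problem (not the authors' construction), the NumPy snippet below seeks a point on the simplest such set, {(W, X) : WX = Z} for a fixed Z, near an initial pair; the dimensions, names, and the alternating scheme itself are illustrative assumptions.

```python
import numpy as np

def alternating_bilinear_projection(W, X, Z, iters=100, tol=1e-10):
    """Seek a point on the bilinear set {(W, X) : W @ X == Z} near the
    initial pair, by alternating corrections in one factor at a time.
    Illustrative only: the closed-form projections in the abstract concern
    generalized bilinear sets and are not spelled out in the source.
    """
    W, X = W.copy(), X.copy()
    for _ in range(iters):
        # Freeze X: {W : W @ X == Z} is affine in W; move the current W
        # toward it by a least-squares correction via the pseudoinverse.
        W = W + (Z - W @ X) @ np.linalg.pinv(X)
        # Freeze W: the symmetric correction for X.
        X = X + np.linalg.pinv(W) @ (Z - W @ X)
        if np.linalg.norm(W @ X - Z) < tol:
            break
    return W, X

# Toy usage: Z is built to lie on a rank-3 bilinear variety, so the
# constraint set is nonempty and the residual is driven to ~0.
rng = np.random.default_rng(0)
Z = rng.standard_normal((4, 3)) @ rng.standard_normal((3, 6))
W, X = alternating_bilinear_projection(
    rng.standard_normal((4, 3)), rng.standard_normal((3, 6)), Z)
print("||WX - Z|| =", np.linalg.norm(W @ X - Z))
```

The design point the sketch illustrates is that freezing one factor makes the bilinear constraint affine in the other, which is the kind of product structure the abstract proposes to exploit.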
Keywords
- Artificial intelligence based optimization methods and applications
- Analysis and engineering of optimization algorithms
- SS - Semidefinite Optimization
Status: accepted