345. Alternate Through the Epochs Stochastic Gradient for Multi-Task Neural Networks
Invited abstract in session MC-3: First-order methods in modern optimization (Part II), stream Large scale optimization: methods and algorithms.
Monday, 14:00-16:00, Room: B100/4011
Authors (first author is the speaker)
1. Stefania Bellavia, Dipartimento di Ingegneria Industriale, Università di Firenze
2. Francesco Della Santa, Dipartimento di Scienze Matematiche, Politecnico di Torino
3. Alessandra Papini, Dipartimento di Ingegneria Industriale, Università di Firenze
Abstract
We focus on the training phase of Neural Networks for Multi-Task Learning. We consider hard-parameter sharing Multi-Task Neural Networks (MTNNs) and discuss alternate stochastic gradient updates. Traditional MTNN training faces challenges in managing conflicting loss gradients, often yielding sub-optimal performance. The proposed alternate training method updates shared and task-specific weights alternately through the epochs, exploiting the multi-head architecture of the model. This approach reduces computational costs per epoch and memory requirements. Convergence properties similar to those of the classical stochastic gradient method are established. Empirical experiments demonstrate enhanced training regularization and reduced computational demands.
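To make the alternation scheme concrete, the following is a minimal PyTorch sketch of epoch-wise alternate stochastic gradient for a hard-parameter-sharing multi-task network; it is not the authors' implementation, and the architecture sizes, the toy data, and the even/odd alternation schedule are illustrative assumptions.

```python
# Minimal sketch: epoch-wise alternation between shared and task-specific updates.
# Sizes, toy data, and the even/odd schedule are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_tasks, d_in, d_hidden = 2, 10, 32

# Hard-parameter sharing: one shared trunk, one head per task (multi-head model).
trunk = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
heads = nn.ModuleList(nn.Linear(d_hidden, 1) for _ in range(n_tasks))

shared_params = list(trunk.parameters())
task_params = list(heads.parameters())
opt_shared = torch.optim.SGD(shared_params, lr=1e-2)
opt_tasks = torch.optim.SGD(task_params, lr=1e-2)
loss_fn = nn.MSELoss()

# Toy regression data: one target per task.
X = torch.randn(256, d_in)
Y = [torch.randn(256, 1) for _ in range(n_tasks)]

def set_requires_grad(params, flag):
    for p in params:
        p.requires_grad_(flag)

batch_size = 32
for epoch in range(20):
    # Alternate through the epochs: update either the shared block or the heads.
    update_shared = (epoch % 2 == 0)
    set_requires_grad(shared_params, update_shared)
    set_requires_grad(task_params, not update_shared)
    opt = opt_shared if update_shared else opt_tasks

    perm = torch.randperm(X.size(0))
    for start in range(0, X.size(0), batch_size):
        idx = perm[start:start + batch_size]
        opt.zero_grad()
        z = trunk(X[idx])
        # Sum of per-task losses; only the active block of weights is updated.
        loss = sum(loss_fn(head(z), Y[t][idx]) for t, head in enumerate(heads))
        loss.backward()
        opt.step()
```

In this sketch, freezing the inactive block via `requires_grad_` means its gradients are not computed during the corresponding epochs, which is one way the per-epoch computational and memory savings mentioned in the abstract can arise.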
Keywords
- Optimization for learning and data analysis
- First-order optimization
- Multi-objective optimization
Status: accepted