576. Robust Regression and Outlier Detection with DC Programming
Invited abstract in session MC-9: Generalized convexity and monotonicity 2, stream Generalized convexity and monotonicity.
Monday, 14:00-16:00
Room: B100/8013
Authors (first author is the speaker)
| 1. | Marah-Lisanne Thormann | Mathematical Sciences, University of Southampton |
| 2. | Vuong Phan | University of Southampton |
| 3. | Alain Zemkoho | Mathematics, University of Southampton |
Abstract
Robust regression is a popular alternative to Ordinary Least Squares when outliers are present in the data. A commonly used robust regression technique is Least Trimmed Squares (LTS), which uses only a subset of the observations to estimate the regression coefficients. Unfortunately, determining the exact solution is a combinatorial problem whose computation time becomes unmanageable for larger data sets. Therefore, the most popular approach currently is a heuristic called Fast-LTS. In this talk, we propose, as an alternative, the successive Boosted Difference of Convex Functions Algorithm (sBDCA) to solve the classical LTS problem. From a theoretical point of view, the approach can be seen as a combination of the Boosted Difference of Convex Functions Algorithm and the Difference of Convex Functions Algorithm with successive DC decompositions. For the LTS problem, we prove that the algorithm converges to a local solution linearly in the best case and sublinearly in the worst case. We additionally propose a problem-specific preconditioner that corrects the direction given by the gradient of the objective function, further improving the quality of the algorithmic output. In numerical experiments with synthetic and real-world data sets, we show that sBDCA with preconditioning is both significantly faster than Fast-LTS and finds drastically lower objective function values, especially in settings with many independent variables.
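To make the LTS objective concrete, the following is a minimal Python sketch. It is not the authors' implementation, and the function names `lts_objective` and `dc_parts` are hypothetical. It evaluates the trimmed sum of the h smallest squared residuals and illustrates one standard way to write that sum as a difference of two convex functions: the full least-squares loss minus the sum of the n - h largest squared residuals. The abstract does not state which DC decomposition sBDCA uses, so this particular split is an assumption chosen for illustration.

```python
import numpy as np


def lts_objective(X, y, beta, h):
    """Least Trimmed Squares objective: sum of the h smallest squared residuals."""
    r2 = (y - X @ beta) ** 2
    return np.sort(r2)[:h].sum()


def dc_parts(X, y, beta, h):
    """Illustrative DC split (an assumption, not necessarily the talk's choice).

    Returns (g, s) with lts_objective = g - s, where
      g = sum of all squared residuals (convex in beta),
      s = sum of the n - h largest squared residuals (convex in beta,
          as a pointwise maximum over subsets of sums of convex functions).
    """
    r2 = (y - X @ beta) ** 2
    g = r2.sum()
    s = np.sort(r2)[h:].sum()  # the n - h largest values; empty when h == n
    return g, s


# Tiny example: an intercept-only model with one gross outlier (y = 10).
X = np.ones((4, 1))
y = np.array([0.0, 1.0, 2.0, 10.0])
beta = np.array([1.0])

value = lts_objective(X, y, beta, h=3)
g, s = dc_parts(X, y, beta, h=3)
print(value, g - s)
```

With h = 3, the outlying observation is trimmed from the objective, so its large residual does not influence the fit at this value of beta.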
Keywords
- Applications of continuous optimization
- Data driven optimization
- Optimization for learning and data analysis
Status: accepted