3042. ReverseGWAS: a GWAS variant with a novel reformulation of MINLPs as MILPs
Invited abstract in session MC-56: Methods & Models in Computational Biology, stream Computational Biology, Bioinformatics and Medicine.
Monday, 12:30-14:00
Room: Liberty 1.11
Authors (first author is the speaker)
| 1. | Leonid Chindelevitch | Infectious Disease Epidemiology, Imperial College London |
| 2. | Asa Hedman | Pfizer |
| 3. | Dmitri Bichko | Pfizer |
| 4. | Daniel Ziemek | Pfizer |
Abstract
We developed a variant of genome-wide association studies (GWAS) involving multiple phenotypes, where the phenotypic architecture can be described by a logical combination of individual phenotypes, such as a formula in conjunctive or disjunctive normal form (CNF or DNF). Since brute-force enumeration of such logical combinations is impractical, we leverage mixed-integer optimisation. The objective, however, is generally non-linear; for example, we may wish to optimise Fisher exact p-values.
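To make the objective concrete, here is a minimal sketch of scoring a single, fixed DNF combination phenotype against one variant with a two-sided Fisher exact test. It is an illustration only, not the ReverseGWAS implementation: the function names (`dnf_phenotype`, `contingency`, `fisher_exact_two_sided`), the clause encoding, and the toy data are all assumptions made for this example.

```python
from math import comb


def dnf_phenotype(traits, clauses):
    """Evaluate a DNF formula (an OR of AND-clauses) for every individual.

    traits  -- dict mapping a trait name to a list of 0/1 values
    clauses -- list of clauses; each clause is a list of (name, wanted) literals
    """
    n = len(next(iter(traits.values())))
    return [
        int(any(all(traits[name][i] == wanted for name, wanted in clause)
                for clause in clauses))
        for i in range(n)
    ]


def contingency(carrier, phenotype):
    """Counts (a, b, c, d) of the 2x2 table: carrier status x combined phenotype."""
    a = sum(1 for g, y in zip(carrier, phenotype) if g and y)
    b = sum(1 for g, y in zip(carrier, phenotype) if g and not y)
    c = sum(1 for g, y in zip(carrier, phenotype) if not g and y)
    d = len(carrier) - a - b - c
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Sum over all tables that are at most as likely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Toy data (made up): 8 individuals, 3 binary traits, 1 variant.
traits = {
    "A": [1, 1, 1, 0, 0, 0, 0, 0],
    "B": [0, 0, 0, 1, 0, 0, 0, 1],
    "C": [0, 0, 0, 0, 0, 0, 0, 1],
}
carrier = [1, 1, 1, 1, 0, 0, 0, 0]

# Combined phenotype: A OR (B AND NOT C)
clauses = [[("A", 1)], [("B", 1), ("C", 0)]]
combined = dnf_phenotype(traits, clauses)
table = contingency(carrier, combined)
p_value = fisher_exact_two_sided(*table)
```

Scoring one formula this way is cheap; the difficulty the abstract points to is that the number of candidate formulas grows combinatorially with the number of traits and clauses, which is why exhaustive enumeration is replaced by optimisation.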
We successfully reformulate these mixed-integer non-linear programs (MINLPs) as mixed-integer linear programs (MILPs) in three steps. First, we leverage a binary search to constrain the objective both above and below. Second, we use the fast (but not well-known) Imai-Iri algorithm to express this constraint with a piecewise linear function guaranteed to have the fewest segments possible. Lastly, we test the feasibility of the resulting MILP.
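The outer loop of such a scheme can be sketched as a bisection over the objective value, where each probe asks whether any solution reaches a given threshold. This sketch makes several assumptions: the objective is minimised, feasibility is monotone in the threshold, and the feasibility test is a hypothetical callback (`is_feasible`). In the toy usage that callback is a scan over a made-up list of scores, standing in for the piecewise-linear MILP feasibility check, which is not reproduced here.

```python
def bisect_threshold(is_feasible, lo, hi, tol=1e-6):
    """Return (up to tol) the smallest t in [lo, hi] with is_feasible(t) true.

    Assumes monotonicity: if a threshold is achievable, so is any larger one.
    Returns None when even the loosest threshold hi is infeasible.
    """
    if not is_feasible(hi):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_feasible(mid):
            hi = mid   # a solution at least this good exists: tighten from above
        else:
            lo = mid   # no solution this good: raise the lower bound
    return hi


# Toy stand-in for the MILP feasibility test (an assumption for illustration):
# log10 p-values of three made-up candidate phenotypes.
scores = [-2.0, -5.3, -3.1]


def toy_oracle(t):
    # "Is there a candidate whose objective is at most t?"
    return any(s <= t for s in scores)


best = bisect_threshold(toy_oracle, lo=-10.0, hi=0.0)
```

Each probe narrows the interval by half, so the number of feasibility tests grows only logarithmically with the required precision; the cost of the whole scheme is then dominated by how quickly each individual feasibility test can be answered.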
We successfully tested our approach, called ReverseGWAS, on large-scale genomic data (over 300,000 genomic variants in over 300,000 individuals in UK Biobank) to identify promising combination phenotypes, and replicated those in an independent cohort, FinnGen. The approach scales extremely well, with both autoimmune traits and ICD-10 codes used as phenotypes.
In summary, our approach provides a scalable way to translate MINLPs with a non-linear objective function into MILPs. Our method includes the Imai-Iri algorithm, implemented both as standalone R code and as fast C++ code, available via the rgwas package at github.com/Leonardini.
Keywords
- Programming, Nonlinear
- Computational Biology, Bioinformatics and Medicine
- Large Scale Optimization
Status: accepted