3280. Advancing NLP classification with disentangled embeddings: a post-processing approach to separate style from content
Invited abstract in session TC-31: Analytics and the link with stochastic dynamics III, stream Analytics.
Tuesday, 12:30-14:00
Room: 046 (building: 208)
Authors (first author is the speaker)
1. Nico Hambauer, Faculty of Informatics and Data Science, University of Regensburg
2. Michelle Fribance, Friedrich Alexander University
3. Patrick Zschech, Leipzig University
4. Mathias Kraus, University of Regensburg
Abstract
In the field of natural language processing (NLP), identifying cheap talk remains a challenge. One approach to this problem is to distill the pure content from stylized text. This study therefore translates a recent approach known as PISCO from the computer vision (CV) field into the NLP field. The method post-processes embeddings and thereby disentangles them into distinct style and content latent vector coordinates. We then isolate the chunk of neurons that encodes only the content of a text and use this information as input to a classification head. In this way, we improve the objectivity of NLP applications by removing the confounding influence of stylistic variations. With a focus on financial disclosure analysis, we aim to improve the predictive quality of two downstream classification tasks, sentiment and specificity. We demonstrate the technique's applicability by applying it as a step ahead of these final classification tasks. Our initial findings suggest a promising enhancement in predictive performance for both downstream tasks. This demonstrates the potential of disentangled embeddings as a novel intermediary step towards improving NLP classification tasks. Our advances enable more accurate and evidence-based decision-making and insight generation, contributing to a more reliable and, to a certain extent, interpretable use of recent rapid advances in NLP models.
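As a rough illustration of the last step described in the abstract (keeping only the content coordinates and passing them to a classification head), the following minimal sketch may help. It is not the authors' implementation: it assumes the post-processing has already produced embeddings whose first `content_dim` coordinates hold content and whose remaining coordinates hold style, it does not show the disentangling step itself, and it uses a nearest-centroid rule as a stand-in for the classification head. All function names and the toy data are hypothetical.

```python
from statistics import fmean


def content_slice(embedding, content_dim):
    """Keep only the coordinates assumed (hypothetically) to encode content."""
    return embedding[:content_dim]


def fit_centroids(embeddings, labels, content_dim):
    """Average the content slices per label: a simple stand-in classification head."""
    grouped = {}
    for emb, label in zip(embeddings, labels):
        grouped.setdefault(label, []).append(content_slice(emb, content_dim))
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(embedding, centroids, content_dim):
    """Assign the label whose centroid is closest in the content subspace."""
    x = content_slice(embedding, content_dim)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy, made-up embeddings: first 2 coordinates = content, last 2 = style.
train = [
    [1.0, 0.9, 5.0, -3.0],
    [0.8, 1.1, -4.0, 2.0],
    [-1.0, -0.9, 5.0, -3.0],
    [-1.2, -1.0, -4.0, 2.0],
]
labels = ["positive", "positive", "negative", "negative"]

centroids = fit_centroids(train, labels, content_dim=2)
```

Because the style coordinates are dropped before classification, two texts with similar content but very different style coordinates receive the same label in this toy setup, which is the effect the abstract attributes to removing stylistic confounding.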
Keywords
- Analytics and Data Science
- Machine Learning
- Financial Modelling
Status: accepted