PASADENA workshop
Prediction and Analysis of Structured AnD Heterogeneous DAta

Program

9:00 Welcome coffee
9:30 - 10:20 Generalized Concomitant Multi-Task Lasso for sparse multimodal regression
10:20 - 10:50 Coffee break
10:50 - 11:40 Variational inference for Poisson lognormal models: application to multivariate analysis of count data
11:40 - 12:30 Natural Language Processing for social computing: from opinion mining to human-agent interaction
12:30 - 14:00 Lunch (free buffet)
14:00 - 14:50 Infinite Task Learning in RKHSs
14:50 - 15:40 Simulating Alzheimer’s disease progression with personalised digital brain models
15:40 - 16:10 Coffee break
16:10 - 17:00 Structured feature selection in high dimension for precision medicine
17:00 End

Abstracts

Structured feature selection in high dimension for precision medicine
Chloé-Agathe Azencott (CBIO, Mines ParisTech, Institut Curie, INSERM)

This setup poses statistical and computational challenges, and traditional feature selection methods fall short. In my talk, I will present how prior knowledge of the structure of the features can help tackle this problem. In the first part of the talk, I will describe how to integrate additional information on the structure of the features, such as a biological network, to constrain the feature selection procedure. In the second part, I will discuss how to account for the linkage disequilibrium structure of the genome when searching for synergistic (or epistatic) effects between features.

Infinite Task Learning in RKHSs
Romain Brault (Thalès)

Variational inference for Poisson lognormal models: application to multivariate analysis of count data
Julien Chiquet (AgroParisTech, INRA MIA Paris)

Natural Language Processing for social computing: from opinion mining to human-agent interaction
Chloé Clavel (LTCI, Telecom-ParisTech)

The Social Computing topic aims to gather research around computational models for the analysis of social interactions, whether for web analysis or social robotics. The peculiarity of this theme is its multidisciplinary approach: computational models are established in close collaboration with research fields such as psychology, sociology, and linguistics. They are based on methods from various fields: signal processing (e.g. speech signal processing for the recognition of emotions), machine learning (e.g. structured output learning for the detection of opinions in texts), and computer science (e.g. natural language processing for the detection of opinions, and the integration of the socio-emotional component in human-machine interactions). This presentation will describe examples of studies conducted around the Social Computing topic. In particular, we will examine the role of natural language processing in human-agent interaction by presenting our progress on the research topics we are currently working on, such as the analysis of the likes and dislikes of the user during her interactions with a virtual agent, using symbolic methods (Langlet & Clavel, 2016) and machine learning methods (Barriere et al., 2018). Opinion mining methods and their challenges in terms of machine learning will also be tackled (Garcia et al., 2018).

Simulating Alzheimer’s disease progression with personalised digital brain models
Stanley Durrleman (Aramis Lab, ICM)

Simulating the effects of Alzheimer's disease on the brain is essential to better understand, predict and control how the disease progresses in patients. Our limited understanding of how disease mechanisms lead to visible changes in brain images and clinical examination hampers the development of biophysical simulations. Instead, we propose a statistical learning approach, where the repeated observations of several patients over time are used to synthesise personalised digital brain models. They provide spatiotemporal views of structural and functional brain alterations and associated scenarios of cognitive decline at the individual level. We show that the personalisation of the models to unseen subjects reconstructs their progression with errors of the same order as the uncertainty of the measurements. Simulations of synthetic patients generalise the distributions of the data in the training cohort. The analysis of factors modulating disease progression reveals a prominent sexual dimorphism and probable compensatory mechanisms in APOE-ε4 carriers. This first-of-its-kind simulator offers an unparalleled way to explore the heterogeneity of the disease's manifestation on the brain, and to predict its progression in each patient.

Concomitant Lasso with Repetitions (CLaR): beyond averaging multiple realizations of heteroscedastic noise
Joseph Salmon (Université de Montpellier)

Sparsity-promoting norms are frequently used in high-dimensional regression. A limitation of Lasso-type estimators, however, is that the regularization parameter depends on the noise level, which varies between datasets and experiments. Estimators such as the concomitant Lasso address this dependence by jointly estimating the noise level and the regression coefficients. As sample sizes are often limited in high-dimensional regimes, simplified heteroscedastic models are customary. However, in many experimental applications, data is obtained by averaging multiple measurements. Averaging helps reduce the noise variance, yet it dramatically reduces sample sizes, preventing refined noise modeling. In this work, we propose an estimator that can cope with complex heteroscedastic noise structures by using non-averaged measurements and a concomitant formulation. The resulting optimization problem is convex, so thanks to smoothing theory, it is amenable to state-of-the-art proximal coordinate descent techniques that can leverage the expected sparsity of the solutions. Practical benefits are demonstrated on simulations and on neuroimaging applications. This is joint work with Quentin Bertrand, Mathurin Massias and Alexandre Gramfort.
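
To illustrate the idea of jointly estimating the noise level and the regression coefficients, here is a minimal sketch of a smoothed concomitant Lasso with a single (homoscedastic) noise level. It is not the CLaR estimator from the talk, which handles repetitions and richer noise structures. The function name `concomitant_lasso`, the parameters `lam` and `sigma_min`, and the synthetic data are assumptions made for illustration only. The sketch alternates coordinate-descent updates of the coefficients with a closed-form update of the noise level.

```python
import numpy as np


def soft_threshold(x, t):
    """Soft-thresholding: the proximal operator of t * |x| for a scalar x."""
    return np.sign(x) * max(abs(x) - t, 0.0)


def concomitant_lasso(X, y, lam, sigma_min=1e-3, n_outer=50, n_inner=20):
    """Illustrative smoothed concomitant Lasso (single noise level).

    Minimizes over (beta, sigma), with sigma >= sigma_min:
        ||y - X beta||^2 / (2 n sigma) + sigma / 2 + lam * ||beta||_1
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = np.array(y, dtype=float)  # equals y - X @ beta since beta = 0
    col_sq_norms = (X ** 2).sum(axis=0)
    sigma = max(np.linalg.norm(residual) / np.sqrt(n), sigma_min)

    for _ in range(n_outer):
        # With sigma fixed, the problem in beta is a Lasso with penalty lam * sigma.
        for _ in range(n_inner):
            for j in range(p):
                if col_sq_norms[j] == 0.0:
                    continue
                old = beta[j]
                # Correlation of column j with the residual that excludes feature j.
                rho = X[:, j] @ residual + col_sq_norms[j] * old
                beta[j] = soft_threshold(rho, n * lam * sigma) / col_sq_norms[j]
                if beta[j] != old:
                    residual -= X[:, j] * (beta[j] - old)
        # With beta fixed, the optimal sigma has a closed form, clipped at sigma_min.
        sigma = max(np.linalg.norm(residual) / np.sqrt(n), sigma_min)

    return beta, sigma


# Synthetic example (hypothetical data): 3 active features out of 20.
rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = 2.0
y = X @ beta_true + 0.5 * rng.standard_normal(n)

beta_hat, sigma_hat = concomitant_lasso(X, y, lam=0.3)
print("non-zero coefficients:", np.flatnonzero(beta_hat))
print("estimated noise level:", sigma_hat)
```

Because the penalty applied to the coefficients is scaled by the current noise estimate, `lam` can be chosen without knowing the noise level in advance, which is the dependence the concomitant formulation is meant to remove. The estimated noise level is typically biased upward somewhat, since the shrinkage of the coefficients inflates the residual.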