We consider the problem of learning the inhomogeneous intensity of a counting process, under a sparse segmentation assumption. We introduce a weighted total-variation penalization, using data-driven weights that correctly scale the penalization along the observation interval. We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequalities with fast rates of convergence, and consistency for change-point detection.

High-Dimensional Time-Varying Aalen and Cox Models

M. Z. Alaya, T. Allart, A. Guilloux, S. Lemler

Preprint, 2017

We consider the problem of estimating the intensity of a counting process in high-dimensional time-varying Aalen and Cox models. We introduce a covariate-specific weighted total-variation penalization, using data-driven weights that correctly scale the penalization along the observation interval.

Binarsity: a Penalization for One-Hot Encoded Features in Linear Supervised Learning

M. Z. Alaya, S. Bussy, S. Gaïffas, A. Guilloux

Journal of Machine Learning Research, 2019

This paper deals with the problem of large-scale linear supervised learning in settings where a large number of continuous features are available. We propose to combine the well-known trick of one-hot encoding of continuous features with a new penalization called binarsity. In each group of binary features coming from the one-hot encoding of a single raw continuous feature, this penalization uses total-variation regularization together with an extra linear constraint.
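As a rough illustration of the two ingredients named in this abstract, the sketch below one-hot encodes a continuous feature by quantile binning and evaluates a total-variation penalty on each block of coefficients. It is a minimal sketch, not the authors' implementation: the helper names are hypothetical, the weights default to 1 rather than the paper's data-driven weights, and the extra linear constraint is assumed here to be a sum-to-zero constraint within each block.

```python
import math

def quantile_cuts(values, n_bins):
    """Interior cut points taken at empirical quantiles (hypothetical helper)."""
    s = sorted(values)
    return [s[(len(s) * k) // n_bins] for k in range(1, n_bins)]

def one_hot_bin(x, cuts):
    """One-hot encode a scalar x according to the interval it falls into."""
    idx = sum(1 for c in cuts if x >= c)
    return [1 if k == idx else 0 for k in range(len(cuts) + 1)]

def binarsity_penalty(theta_blocks, weights=None, tol=1e-12):
    """Sum over blocks of the (weighted) total variation of the coefficients.

    Assumption: the linear constraint is sum-to-zero per block, so the
    penalty is infinite as soon as one block violates it.
    """
    total = 0.0
    for j, block in enumerate(theta_blocks):
        if abs(sum(block)) > tol:
            return math.inf
        # Unit weights by default (an assumption, not the data-driven weights).
        w = weights[j] if weights is not None else [1.0] * (len(block) - 1)
        total += sum(wk * abs(b - a) for wk, a, b in zip(w, block, block[1:]))
    return total

cuts = quantile_cuts([0.1, 0.4, 0.35, 0.8, 0.9, 0.2, 0.6, 0.75], n_bins=4)
print(cuts)                                    # [0.35, 0.6, 0.8]
print(one_hot_bin(0.5, cuts))                  # [0, 1, 0, 0]
print(binarsity_penalty([[-1.0, -1.0, 2.0]]))  # 3.0
```

Because the total variation only charges jumps between consecutive bins, a block whose coefficients are constant over several adjacent bins is cheap, which is what favors a small number of cut points per feature.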

Collective Matrix Completion

M. Z. Alaya, O. Klopp

Journal of Machine Learning Research, 2019

Matrix completion aims to reconstruct a data matrix based on observations of a small number of its entries. Usually in matrix completion a single matrix is considered, which can be, for example, a rating matrix in a recommender system. However, in practical situations, data is often obtained from multiple sources, which results in a collection of matrices rather than a single one. In this work, we consider the problem of collective matrix completion with multiple and heterogeneous matrices, which can be count, binary, continuous, etc.

Screening Sinkhorn Algorithm for Regularized Optimal Transport

M. Z. Alaya, M. Bérar, G. Gasso, A. Rakotomamonjy

Proceedings Conference NeurIPS, 2019

We introduce in this paper a novel strategy for efficiently approximating the Sinkhorn distance between two discrete measures. After identifying negligible components of the dual solution of the regularized Sinkhorn problem, we propose to screen those components by directly setting them to that value before entering the Sinkhorn problem. This allows us to solve a smaller Sinkhorn problem while ensuring approximation with provable guarantees.
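For context, the plain Sinkhorn iterations that this work builds on can be sketched in a few lines. This is only the standard entropy-regularized scheme, written in pure Python for small problems; the screening step described in the abstract is not reproduced, and the function name and default iteration count are choices made for this illustration.

```python
import math

def sinkhorn(a, b, cost, reg, n_iter=500):
    """Entropy-regularized transport plan between histograms a and b.

    Plain Sinkhorn scaling iterations (no screening). `reg` is the
    entropic regularization strength; `cost` is an n-by-m nested list.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-C / reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Row scaling: match the first marginal a.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling: match the second marginal b.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

plan = sinkhorn([0.5, 0.5], [0.25, 0.75], [[0.0, 1.0], [1.0, 0.0]], reg=0.5)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
print(row_sums, col_sums)  # close to the prescribed marginals
```

Each iteration costs on the order of n times m operations, which is why reducing the number of active dual components, as the abstract proposes, shrinks the problem that has to be solved.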

Open Set Domain Adaptation using Optimal Transport

M. Kechaou, R. Hérault, M. Z. Alaya, G. Gasso

Proceedings Conference ECML-PKDD, 2020

We present a two-step optimal transport approach that performs a mapping from a source distribution to a target distribution. Here, the target has the particularity of containing new classes that are not present in the source domain. The first step of the approach aims at rejecting the samples from these new classes using an optimal transport plan. The second step solves the target (class ratio) shift, again as an optimal transport problem.

Partial Gromov-Wasserstein with Applications on Positive-Unlabeled Learning

L. Chapel, M. Z. Alaya, G. Gasso

Proceedings Conference NeurIPS, 2020

The classical optimal transport problem seeks a transportation map that preserves the total mass between two probability distributions, requiring their masses to be equal. This may be too restrictive in some applications such as color or shape matching, since the distributions may have arbitrary masses and/or only a fraction of the total mass has to be transported.

POT: Python Optimal Transport

R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, L. Gautheron, N.T.H. Gayraud, H. Janati, A. Rakotomamonjy, I. Redko, A. Rolet, A. Schutz, V. Seguy, D. J. Sutherland, R. Tavenard, A. Tong, T. Vayer

Journal of Machine Learning Research, 2021

Optimal transport has recently been reintroduced to the machine learning community thanks in part to novel efficient optimization procedures allowing for medium to large scale applications. We propose a Python toolbox that implements several key optimal transport ideas for the machine learning community.

Binacox: Automatic Cut-Points Detection in High-Dimensional Cox Model, with Applications to Genetic Data

S. Bussy, M. Z. Alaya, A. Guilloux, A.-S. Jannot

Biometrics, 2021

We introduce the binacox, a prognostic method to deal with the problem of detecting multiple cut-points per feature in a multivariate setting where a large number of continuous features are available. The method is based on the Cox model and combines one-hot encoding with the binarsity penalty, which uses total-variation regularization together with an extra linear constraint, and enables feature selection.

Theoretical Guarantees for Bridging Metric Measure Embedding and Optimal Transport

M. Z. Alaya, M. Bérar, G. Gasso, A. Rakotomamonjy

Neurocomputing, 2021

We propose a novel approach for comparing distributions whose supports do not necessarily lie on the same metric space. Unlike the Gromov-Wasserstein (GW) distance, which compares pairwise distances of elements from each distribution, we consider a method that embeds the metric measure spaces in a common Euclidean space and computes an optimal transport (OT) on the embedded distributions. This leads to what we call a sub-embedding robust Wasserstein (SERW) distance.

Heterogeneous Wasserstein Discrepancy for Incomparable Distributions

M. Z. Alaya, M. Bérar, G. Gasso, A. Rakotomamonjy

arXiv, 2021

Optimal Transport (OT) metrics allow for defining discrepancies between two probability measures. The Wasserstein distance has long been the celebrated OT distance frequently used in the literature, and it requires the probability distributions to be supported on the same metric space. Because of its high computational complexity, several approximate Wasserstein distances have been proposed, based on entropy regularization or on slicing and one-dimensional Wasserstein computation.

Statistical and Topological Properties of Gaussian Smoothed Sliced Probability Divergences

A. Rakotomamonjy, M. Z. Alaya, M. Bérar, G. Gasso

arXiv, 2021

Gaussian smoothed sliced Wasserstein distance has been recently introduced for comparing probability distributions, while preserving privacy on the data. It has been shown, in applications such as domain adaptation, to provide performances similar to its non-private (non-smoothed) counterpart. However, the computational and statistical properties of such a metric have not yet been well established. In this paper, we analyze the theoretical properties of this distance as well as those of generalized versions denoted as Gaussian smoothed sliced divergences.

Optimal Transport for Conditional Domain Matching and Label Shift

A. Rakotomamonjy, R. Flamary, G. Gasso, M. Z. Alaya, M. Bérar, N. Courty

Machine Learning, 2021

We address the problem of unsupervised domain adaptation under the setting of generalized target shift (joint class-conditional and label shifts). For this framework, we theoretically show that, for good generalization, it is necessary to learn a latent representation in which both marginals and class-conditional distributions are aligned across domains.

Neutron Spectrum Unfolding using two Architectures of Convolutional Neural Networks

M. Bouhadida, A. Mazzi, M. Brovchenko, T. Vinchon, M. Z. Alaya, W. Monange, F. Trompier

Nuclear Engineering and Technology, 2023

We deploy artificial neural networks to unfold neutron spectra from measured energy-integrated quantities. These neutron spectra are an important parameter for computing the absorbed dose and the kerma, serving radiation protection in addition to nuclear safety. The architectures we build are inspired by convolutional neural networks. The first architecture is made up of residual transposed convolution blocks, while the second is a modified version of the U-net architecture.

Gaussian-Smoothed Sliced Probability Divergences

M. Z. Alaya, A. Rakotomamonjy, M. Bérar, G. Gasso

Transactions on Machine Learning Research, 2024

Gaussian smoothed sliced Wasserstein distance has been recently introduced for comparing probability distributions, while preserving privacy on the data. It has been shown that it provides performances similar to its non-smoothed (non-private) counterpart. However, the computational and statistical properties of such a metric have not yet been well established. This work investigates the theoretical properties of this distance as well as those of generalized versions denoted as Gaussian-smoothed sliced divergences (GSD).
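To make the object under study concrete, here is a minimal Monte Carlo sketch of a Gaussian-smoothed sliced 2-Wasserstein distance between two equal-size point clouds. It assumes the usual construction (project onto random directions, smooth each projected measure with one-dimensional Gaussian noise of standard deviation `sigma`, then compare sorted samples); the function name, the sampling scheme, and the default number of projections are assumptions of this illustration rather than the paper's estimator.

```python
import math
import random

def gaussian_smoothed_sliced_w2(xs, ys, sigma, n_proj=200, seed=0):
    """Monte Carlo estimate for equal-size lists xs, ys of d-dimensional points."""
    rng = random.Random(seed)
    d = len(xs[0])
    total = 0.0
    for _ in range(n_proj):
        # Random direction, uniform on the unit sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in g))
        theta = [c / norm for c in g]
        # Project each point, then add 1D Gaussian noise: a sample-based
        # stand-in for convolving the projected measure with N(0, sigma^2).
        px = sorted(sum(t * c for t, c in zip(theta, x)) + rng.gauss(0.0, sigma)
                    for x in xs)
        py = sorted(sum(t * c for t, c in zip(theta, y)) + rng.gauss(0.0, sigma)
                    for y in ys)
        # Squared 1D W2 between equal-size empirical measures: match sorted samples.
        total += sum((p - q) ** 2 for p, q in zip(px, py)) / len(px)
    return math.sqrt(total / n_proj)

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(x + 3.0, y) for x, y in pts]
print(gaussian_smoothed_sliced_w2(pts, pts, sigma=0.0))  # 0.0
print(gaussian_smoothed_sliced_w2(pts, shifted, sigma=0.0))
```

With `sigma=0.0` the estimate reduces to the ordinary sliced Wasserstein distance; increasing `sigma` injects more noise into each projection, which is the smoothing that the privacy argument relies on.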

talks

Apprentissage pour l’Intensité d’Événements avec Points de Rupture

Published: June 02, 2014

Learning the intensity of time events with change-points

Published: March 24, 2015

Binarsity: Prédiction en Grande dimension via la Sparsité Induite par la Binarisation de Variables

Published: June 02, 2015

Learning high-dimensional time-varying Aalen and Cox models

Published: June 02, 2016

Around Supervised Learning with Weighted Total-Variation Penalization

Published: May 11, 2017

Complétion Jointe de Matrices

Published: May 29, 2018

Collective Matrix Completion

Published: June 13, 2018

Binarsity: a penalization for one-hot encoded features in linear supervised learning

Published: September 20, 2018

Binarsity: a penalization for one-hot encoded features in linear supervised learning

Published: November 26, 2018

Screening Sinkhorn Algorithm for Regularized Optimal Transport

Published: July 09, 2019

Screenkhorn: Screening Sinkhorn Algorithm for Regularized Optimal Transport

Published: September 11, 2019

Binarsity: New Penalization for Supervised Learning

Published: May 30, 2020

Screenkhorn: New Algorithm for Regularized Optimal Transport

Published: June 05, 2020

An application of Optimal Transport in Data Science

Published: November 24, 2020

Prédiction via la sparsité induite par la binarisation de variables

Published: March 30, 2021

Collective Matrix Completion

Published: June 21, 2022

Binarsity

Published: August 29, 2022

PUOT: Partial Optimal Transport with Applications on Positive-Unlabeled Learning

Published: September 26, 2022

L’IA au LMAC, Sciences de Données avec Transport Optimal

Published: October 24, 2022

teaching

A12/ Probability and Statistics

University Pierre and Marie Curie, Department of Engineering, 2012

S13/ Algebra and Geometry

University Pierre and Marie Curie, Department of Mathematics, 2013

A13/ Linear Models II

University Pierre and Marie Curie, Department of Statistics, 2013

S14/ Algebra and Geometry

University Pierre and Marie Curie, Department of Mathematics, 2014

A15/ Time Series

University Pierre and Marie Curie, Department of Statistics, 2015

S16/ Mathematical Statistics

University Pierre and Marie Curie, Department of Statistics, 2016

A16/ Real Analysis and C2I Certificate

University Paris Nanterre, Department of Mathematics, 2016

S17/ Statistics

University Paris Nanterre, Department of Psychology, 2017

A20/A21/ Linear Algebra and Applications

UTC, Department of Computer Sciences, 2021

Foundations of linear algebra; diagonalization, triangularization, and applications to systems of differential equations.

A21/ Elements of Probability

UTC, Department of Computer Sciences, 2021

The notion of randomness and an introduction to probability calculus.

S22/ Functions of Several Real Variables and Applications

UTC, Department of Computer Sciences, 2022

Continuity and differentiability of functions of several real variables. Vector analysis. Curves and surfaces in $\mathbb{R}^3$. Multiple, line, and surface integrals. Integral theorems.

A22/ Review of Analysis and Algebra

UTC, Department of Computer Sciences, 2022

A synthesis of undergraduate mathematics: functions of one or several variables, curves and surfaces, single and multiple integrals, differential equations, and the foundations of linear algebra. The course is taught as a combined lecture and tutorial, based on a document that integrates course material and exercises.

A22/A23/ MT02 - Real Analysis 1

UTC, Department of Computer Sciences, 2023

First part of the introductory mathematics module of the Tronc Commun (common core). It provides the essential foundations for studying functions of one variable.

S21/S22/S23/S24/ Machine Learning

UTC, Department of Computer Sciences, 2024

Machine learning is a branch of artificial intelligence (AI), which is itself a branch of data science. This course presents the methodologies and algorithms of machine learning, both in their concepts and in their typical use cases. These concepts are implemented in the Python programming language.

A24/ Linear Algebra and Applications

UTSEUS, China, 2024

Foundations of linear algebra; diagonalization, triangularization, and applications to systems of differential equations.

Mokhtar Z. Alaya