We propose a Likelihood Matching approach for training diffusion models by first establishing an equivalence between the likelihood of the target data distribution and a likelihood along the sample path of the reverse diffusion. To efficiently compute the reverse sample likelihood, a quasi-likelihood is considered to approximate each reverse transition density by a Gaussian distribution with matched conditional mean and covariance. The score and Hessian functions for the diffusion generation are estimated by maximizing the quasi-likelihood, ensuring a consistent matching of the first two transitional moments between every two time points. A stochastic sampler is introduced to facilitate computation that leverages both the estimated score and Hessian information. We establish consistency of the quasi-maximum likelihood estimation, and provide non-asymptotic convergence guarantees for the proposed sampler, quantifying the rates of the approximation errors due to the score and Hessian estimation, dimensionality, and the number of diffusion steps. Empirical and simulation evaluations demonstrate the effectiveness of the proposed Likelihood Matching and validate the theoretical results.
@article{qian2026LM,
  title={Likelihood Matching for Diffusion Models},
  author={Qian, Lei and Su, Wu and Huang, Yanqi and Chen, Song Xi},
  year={2026},
  eprint={2508.03636},
  archiveprefix={arXiv},
  primaryclass={stat.ML},
}
2025
arXiv
Partially Functional Dynamic Backdoor Diffusion-based Causal Model
Xinwen Liu, Lei Qian, Song Xi Chen, and Niansheng Tang
Causal inference in settings involving complex spatio-temporal dependencies, such as environmental epidemiology, is challenging due to the presence of unmeasured confounding. However, a significant gap persists in existing methods: current diffusion-based causal models rely on restrictive assumptions of causal sufficiency or static confounding. To address this limitation, we introduce the Partially Functional Dynamic Backdoor Diffusion-based Causal Model (PFD-BDCM), a generative framework designed to bridge this gap. Our approach uniquely incorporates valid backdoor adjustments into the diffusion sampling mechanism to mitigate bias from unmeasured confounders. Specifically, it captures their intricate dynamics through region-specific structural equations and conditional autoregressive processes, and accommodates multi-resolution variables via functional data techniques. Furthermore, we provide theoretical guarantees by establishing error bounds for counterfactual estimates. Extensive experiments on synthetic data and a real-world air pollution case study confirm that PFD-BDCM outperforms current state-of-the-art methods.
@article{liu2026PFDBDCM,
  title={Partially Functional Dynamic Backdoor Diffusion-based Causal Model},
  author={Liu, Xinwen and Qian, Lei and Chen, Song Xi and Tang, Niansheng},
  year={2025},
  eprint={2509.00472},
  archiveprefix={arXiv},
  primaryclass={stat.ML},
}
2023
QTQM
An EWMA chart for high dimensional process with multi-class out-of-control information via random forest learning
Mingze Sun, Lei Qian, Amitava Mukherjee, and Dongdong Xiang
Modern manufacturing and quality monitoring involve multi-class out-of-control (OOC) information from the training sample. It is essential to use such information during online monitoring of data streams from complex processes. In this paper, a monitoring framework is designed by combining the random forest technique with the exponentially weighted moving average method for monitoring complex processes with multi-class OOC information. To be specific, a process surveillance technique in the form of a control chart is proposed based on the probability that the online data is classified as an in-control (IC) sample, and the control chart triggers an alarm when the probability is lower than the control limit. Our numerical findings based on Monte Carlo simulation show that the proposed control chart performs more effectively than its competitors under various distributions and data types, especially for high-dimensional cases when multi-class OOC information is known in advance. Moreover, the proposed method is illustrated with an application using data from hard disk manufacturing processes.
@article{sun2023EWMA,
  author={Sun, Mingze and Qian, Lei and Mukherjee, Amitava and Xiang, Dongdong},
  title={An EWMA chart for high dimensional process with multi-class out-of-control information via random forest learning},
  journal={Quality Technology \& Quantitative Management},
  volume={0},
  number={0},
  pages={1--27},
  year={2023},
  publisher={Taylor \& Francis},
  doi={10.1080/16843703.2023.2244213},
  url={https://doi.org/10.1080/16843703.2023.2244213},
  eprint={https://doi.org/10.1080/16843703.2023.2244213},
}
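The monitoring idea in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's actual charting statistic. The function name `ewma_alarm`, the smoothing weight `lam`, the starting value `start`, and the fixed control limit `limit` are all hypothetical choices made for this example. The input `ic_probs` is assumed to hold, for each online observation, the probability that a trained random forest assigns to the in-control class (for instance, one column of scikit-learn's `RandomForestClassifier.predict_proba` output).

```python
from typing import Optional, Sequence


def ewma_alarm(ic_probs: Sequence[float], lam: float = 0.2,
               start: float = 1.0, limit: float = 0.5) -> Optional[int]:
    """Return the index of the first alarm, or None if no alarm occurs.

    All parameter names and default values are illustrative assumptions.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    stat = start  # assumed starting value: process begins in control
    for t, p in enumerate(ic_probs):
        # Standard EWMA recursion applied to the in-control probability.
        stat = lam * p + (1.0 - lam) * stat
        # Alarm when the smoothed probability falls below the control limit.
        if stat < limit:
            return t
    return None
```

In practice the control limit would be calibrated, for example by simulation, to reach a target in-control run length. The fixed value used here is only a placeholder.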