Lin Liu’s Website
I am an Assistant Professor at the Institute of Natural Sciences (INS) at Shanghai Jiao Tong University (SJTU). I am affiliated with the School of Mathematical Sciences and the SJTU-YALE Joint Center for Biostatistics and Data Science. I am also involved in research activities of the Smart Justice Lab of the Koguan Law School at SJTU.
I graduated from the Department of Biostatistics at Harvard University in 2018. My advisors were Professor Franziska Michor and Professor James M. Robins. I am interested in nonparametric and semiparametric statistical theory, causal inference, applied statistics, and computational and mathematical biology.
I am also interested in the theory of deep learning, estimation and inference in inverse problems, and applications of causal inference tools in biomedical research.
I obtained my undergraduate degree from the School of Life Sciences at Tongji University, under the supervision of Professor Yong Zhang.
You can reach me by email: linliu@alumni.tongji.edu.cn or linliu@sjtu.edu.cn
Selected Papers
(You can find all my papers on my Google Scholar profile.) (Italic: co-first authorship; #: (co-)corresponding authorship; $: student mentee)
Working papers:
Sihui Zhao$, Xinbo Wang$, LL#, Xin Zhang. Covariate Adjustment in Randomized Experiments Motivated by Higher-Order Influence Functions. In preparation.
Replicating code and supplementary materials: link.
Xingyu Chen$, LL, Rajarshi Mukherjee. Method-of-Moments Inference for GLMs and Doubly Robust Functionals under Proportional Asymptotics. Under submission.
Chaozhi Zhang, LL#, Xiaoqun Zhang#. Few-shot Multi-Task Learning of Linear Invariant Features with Meta Subspace Pursuit (MetaSP). Submitted.
Xin Zhang, Haitao Chu, LL, Satrajit Roychoudhury. A Robust Score Test for Covariate Adjustment in Randomized Controlled Trials using G-computation. Revision invited.
Xinbo Wang$, Junyuan Liu$, Sheng’en Shawn Hu, Zhonghua Liu, Hui Lu#, LL#. HILAMA: High-dimensional multi-omic mediation analysis with latent confounding. Revision resubmitted.
HILAMA: software link.
LL, Xinbo Wang, and Yuhao Wang. Root-n consistent semiparametric learning with high-dimensional nuisance functions under minimal sparsity. Under review.
Kerollos Wanis, LL, Nelya Melnitchouk, and James M Robins. Falsification using higher order influence functions for double machine learning estimators of causal effects. Revision invited by Biometrics.
LL# and Chang Li$. New $\sqrt{n}$-consistent, numerically stable higher-order influence function estimators. Working draft.
LL, Rajarshi Mukherjee, Whitney Newey, and James M. Robins. Semiparametric efficient empirical higher-order influence function estimators. Under Review.
Statistical and Learning Theory:
LL, Rajarshi Mukherjee, James M Robins, and Eric Tchetgen Tchetgen. Adaptive estimation of nonparametric functionals. (2021). Journal of Machine Learning Research, 22 (99): 1-66.
LL, Rajarshi Mukherjee, and James M Robins. On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning. (2020). Statistical Science, 35 (3): 518-539. (arXiv: 1904.04276)
See the Discussion (arXiv: 2006.09613) of our paper by Edward H. Kennedy, Siva Balakrishnan, and Larry Wasserman and our Rejoinder (arXiv: 2008.03288)
LL, Rajarshi Mukherjee, and James M Robins. Assumption-lean falsification tests of rate double-robustness of double-machine-learning estimators. In press in the Journal of Econometrics. (arXiv: 2306.10590)
Statistical and Causal Inference Methodology:
Qinshuo Liu, Zixin Wang, Xi-An Li, Xinyao Ji, Lei Zhang, LL#, and Zhonghua Liu#. DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation. (2024). International Conference on Machine Learning.
Siqi Xu, LL#, and Zhonghua Liu#. DeepMed: Semiparametric causal mediation analysis with debiased deep learning. (2022). Advances in Neural Information Processing Systems 35: 28238-28251. (arXiv: 2210.04389)
DeepMed: software link.
LL, Zach Shahn, James M Robins and Andrea Rotnitzky. Efficient estimation of optimal regimes under a no direct effect assumption. (2021). Journal of the American Statistical Association: Theory and Methods, 116 (533): 224-239. (arXiv: 1908.10448)
Statistical Computing:
Lei Li, LL, Yuzhou Peng$. A splitting Hamiltonian Monte Carlo method for efficient sampling. (2023). CSIAM Transactions on Applied Mathematics, 4 (1): 41-73. (arXiv: 2105.14406)
Statistical Methods for Experiments and Clinical Trials:
Long-Shen Xie, LL, Shein-Chung Chow and Hui Lu. Determining the Extent and Frequency of On-Site Monitoring: A Bayesian Risk-Based Approach. (2024). BMC Medical Research Methodology, 24: 14.
Mathematical and Computational Biology:
Nana Wei, Yating Nie, LL#, Xiaoqi Zheng# and Hua-Jun Wu#. Secuer: Ultrafast, scalable and accurate clustering of single-cell RNA-seq data. (2022). PLOS Computational Biology, 18 (12): e1010753. (arXiv: 2205.12432)
Secuer: software link
Sheng’en S. Hu, LL, Qi Li, Wenjing Ma, Michael J. Guertin, Clifford A. Meyer, Ke Deng, Tingting Zhang, Chongzhi Zang. Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA. (2022). Nature Communications, 13: 5533.
SELMA: software link
Kyle S Smith, LLL, Shridar Ganesan, Franziska Michor, and Subhajyoti De. Nuclear topology modulates the mutational landscapes of cancer genomes. (2017). Nature Structural & Molecular Biology, 24 (11): 1000-1006.
LLL, Justin Brumbaugh, Ori Bar-Nur, Zachary Smith, Matthias Stadtfeld, Alexander Meissner, Konrad Hochedlinger, and Franziska Michor. Probabilistic modeling of reprogramming to induced pluripotent stem cells. (2016). Cell Reports, 17 (12): 3395-3406.
Philipp M Altrock, LLL, and Franziska Michor. The mathematics of cancer: integrating quantitative models. (2015). Nature Reviews Cancer, 15 (12): 730-745.
Jasmine Foo, LLL, Kevin Leder, Markus Riester, Yoh Iwasa, Christoph Lengauer, and Franziska Michor. An evolutionary approach for identifying driver mutations in colorectal cancer. (2015). PLOS Computational Biology, 11 (9): e1004350.
LL, Subhajyoti De, and Franziska Michor. DNA replication timing and higher-order nuclear organization determine single-nucleotide substitution patterns in cancer genomes. (2013). Nature Communications, 4: 1502.
Biomedical Applications:
Kai Hou, LL, Zhi-Hui Fang, Wei-Xing Zong, Daqiang Sun, Zhigang Guo, Lu Cao. The role of ferroptosis in cardio-oncology. (2024). Archives of Toxicology, 98: 709-734.
Harsh Parikh, Kentaro Hoffman, Haoqi Sun, Wendong Ge, Jin Jing, LL, Jimeng Sun, Aaron Struck, Sahar Zafar, Alexander Volfovsky, Cynthia Rudin, and M. Brandon Westover. Effects of epileptiform activity on discharge outcome in critically ill patients in the USA: A retrospective cross-sectional study. (2023). The Lancet Digital Health, 5 (8): e495-e502. (arXiv: 2203.04920)
Jeremy R. Glissen Brown, Nabil M. Mansour, Pu Wang, Maria Aguilera Chuchuca, Scott B. Minchenberg, Madhuri Chandnani, LL, Seth A. Gross, Neil Sengupta, Tyler M. Berzin. Deep learning computer-aided polyp detection reduces Adenoma Miss Rate: A U.S. multi-center randomized tandem colonoscopy study (CADeT-CS Trial). (2022). Clinical Gastroenterology and Hepatology, 20 (7): 1499-1507.e4.
Miscellaneous:
Zixiao Wang$, Yi Feng$, LL#. Book Review: Semiparametric regression in R by Jaroslaw Harezlak, David Ruppert, and Matt P. Wand. (2022). Journal of the American Statistical Association: Theory and Methods, 117 (540): 2283-2287.
LL. Book Review: Matrix-Based Introduction to Multivariate Data Analysis, 2nd Edition by Kohei Adachi. (2021). Biometrics, 77 (4): 1498-1500.
Papers not intended for publication:
James M Robins, Lingling Li, Lin Liu, Rajarshi Mukherjee, Eric Tchetgen Tchetgen, and Aad van der Vaart. Minimax estimation of a functional on a structured high-dimensional model (2023). (arXiv: 1512.02174).
Note: This is the corrected version of the paper by James M Robins, Lingling Li, Rajarshi Mukherjee, Eric Tchetgen Tchetgen, and Aad van der Vaart. Minimax estimation of a functional on a structured high-dimensional model. (2017). Annals of Statistics, 45 (5): 1951-1987. I am not an author of the published version.
Teaching
Fall 2020: Computational methods (undergraduate students in engineering or economics @ SJTU)
Fall 2020 – now: Advanced mathematical statistics (graduate students in statistics @ SJTU)
Summer 2021 – now: Causal inference methods in data science (graduate students in statistics, biostatistics, applied mathematics, and life sciences @ SJTU)
Spring 2023 – now: Bayesian statistics (undergraduate students in statistics, (applied) mathematics, and economics @ SJTU)
Service
Area Chair (AC) for CLeaR 2023, 2024.
Links
A reading group on interacting particle systems organized by Lei Li