List of works
Journal article
Fourier methods for efficient sufficient dimension reduction in time series
First online publication 10/30/2025
The Canadian journal of statistics = La revue canadienne de statistique, online ahead of print
Dimension reduction has always been one of the most significant and challenging problems in the analysis of high-dimensional data. In the context of time series analysis, our focus is on the estimation and inference of conditional mean and variance functions. By using central mean and variance dimension reduction subspaces that preserve sufficient information about the response, one can estimate the unknown mean and variance functions. While several approaches exist to estimate the time series central mean and variance subspaces (TS-CMS and TS-CVS), they are often computationally intensive and impractical. By employing the Fourier transform, we derive explicit estimators for TS-CMS and TS-CVS. These estimators are consistent, asymptotically normal, and efficient. Simulation studies evaluate the method's performance, showing it is significantly more accurate and computationally efficient than existing ones. Furthermore, the method is applied to the Canadian lynx dataset.
Journal article
Published Autumn 2025
Journal of structural integrity and maintenance, 10, 4, 2558425
The large-scale use of concrete requires reliable quality assessment to ensure workability, mechanical properties, and durability. Conventional testing methods are often costly and time-consuming. This study explores predictive modeling as an efficient alternative, focusing on self-compacting concrete (SCC) produced and cured with seawater, silica fume, and fly ash. Workability indicators, including slump flow, J-ring, visual stability index (VSI), and air content, were used to predict compressive strength and chloride concentration. Artificial neural networks (ANNs) and Classification and Regression Trees (CART) were applied. The ANN models achieved high accuracy, with compressive strength predicted at a minimum mean squared error (MSE) of 0.085638. The chloride content prediction achieved an R² of 0.9429. CART analysis revealed that air content was the most significant factor influencing compressive strength, while the J-ring had the strongest impact on chloride content. A comparative study demonstrated that ANNs outperformed random forest regression in predictive capability. These results highlight the value of machine learning in concrete research, offering a cost-effective and time-saving method for property evaluation. The findings also support the sustainable use of seawater and supplementary cementitious materials in the production of concrete. The novelty of this study lies in predicting the compressive strength and chloride ion concentration of self-compacting concrete produced and cured with seawater and pozzolans. Neural networks and machine learning were applied for this prediction, an approach not previously explored.
Journal article
Envelope Matrix Autoregressive Models
Accepted for publication 08/19/2025
Journal of business & economic statistics, 1 - 28
Matrix-valued data is commonly collected over time in many scientific fields. However, existing methods for handling such data are limited and often suffer from overparametrization. In response, Chen et al. (2021) introduced the matrix autoregressive (MAR) model as an alternative to traditional time series analysis, which relies on vectorization and vector autoregression frameworks. By preserving the original structure of matrices, the MAR model avoids the loss of valuable column and row information. This approach offers a significant reduction in dimensions and enables explicit interpretations of the data. However, when applied to high-dimensional matrix time series, the MAR model faces challenges due to the large size of the coefficient matrices involved. It struggles to differentiate between relevant and irrelevant information, making it inefficient in extracting relevant information from complex data. To address these limitations, we propose envelope-based MAR (EMAR) models that effectively identify and eliminate irrelevant information. Our proposed EMAR approach achieves substantial efficiency gains in estimation and forecasting by reducing parameters and constructing a link between the mean function and covariance structure. This is achieved by utilizing the minimal reducing subspaces of covariance matrices. We establish the asymptotic properties of our proposed estimators and compare their efficiency and accuracy to existing methods through simulation studies under both normality and non-normality conditions. Furthermore, we provide two real-world applications in economics and business to demonstrate the effectiveness of our approach.
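For orientation, the first-order MAR model of Chen et al. (2021) that this abstract builds on is usually written as below. The symbols are assumptions introduced here for illustration: X_t is the m × n observation at time t, A (m × m) and B (n × n) are the row and column coefficient matrices, and E_t is a matrix-valued error. The envelope structure proposed in the paper is not shown.

```latex
X_t = A\,X_{t-1}\,B^{\top} + E_t,
\qquad
\operatorname{vec}(X_t) = (B \otimes A)\,\operatorname{vec}(X_{t-1}) + \operatorname{vec}(E_t)
```

The vectorized form shows the source of the dimension reduction: the two coefficient matrices hold m^2 + n^2 entries, whereas an unrestricted VAR(1) on vec(X_t) would need (mn)^2.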
Journal article
AI-augmented failure modes, effects, and criticality analysis (AI-FMECA) for industrial applications
Published 10/2024
Reliability engineering & system safety, 250, 110308
Design failure modes, effects, and criticality analysis (d-FMECA) is a bottom-up, semi-quantitative risk assessment approach that is used by reliability engineers across all industries (nuclear, chemical, environmental, pharmaceuticals, aerospace, etc.) for identifying the effects of postulated failure modes of components such as solenoid-operated valves (SOV), motor-operated valves (MOV), controllers, pumps, sensors of various types, and printed circuit boards (PCBs). (Footnote: The terms FMEA and FMECA are used interchangeably throughout this paper. The user has the option in the interface to choose either FMEA or FMECA based on their desired application.) This research aims to develop a novel AI-augmented tool that guides the risk analyst, in real time, to a host of potential failure modes and their effects for each component contained in a bigger system. Through a user-friendly graphical interface and a robust statistical modeling backend, the AI-driven tool streamlines the risk assessment process by prompting the risk analyst to input a system’s name and then generating an extensive array of failure modes and associated effects for each constituent component within the system. This AI-augmented tool allows the user to select either a simplified d-FMEA or a detailed d-FMECA for the system under investigation. This novel AI-driven tool offers significant effort and time savings in conducting d-FMECA, which is known to be a labor-intensive engineering task. In addition, this tool can be used for training risk and reliability professionals.
• Developed a novel AI-augmented tool to replace conventional d-FMEA approach.
• Designed a user-friendly graphical interface & robust statistical modeling backend.
• This AI-driven tool offers significant effort & time savings in conducting d-FMECA.
• Help train junior reliability engineers to effectively conduct d-FMEA.
Journal article
Stacking-Based Neural Network for Nonlinear Time Series Analysis
Published 07/2024
Statistical methods & applications, 33, 3, 901 - 924
Stacked generalization is a commonly used technique for improving predictive accuracy by combining less expressive models using a high-level model. This paper introduces a stacked generalization scheme specifically designed for nonlinear time series models. Instead of selecting a single model using traditional model selection criteria, our approach stacks several nonlinear time series models from different classes and proposes a new generalization algorithm that minimizes prediction error. To achieve this, we utilize a feed-forward artificial neural network (FANN) model to generalize existing nonlinear time series models by stacking them. Network parameters are estimated using a backpropagation algorithm. We validate the proposed method using simulated examples and a real data application. The results demonstrate that our proposed stacked FANN model achieves a lower error and improves forecast accuracy compared to previous nonlinear time series models, resulting in a better fit to the original time series data.
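The stacking idea in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's method: the simulated threshold series, the two base models (a linear AR(1) fit and a quadratic fit, standing in for the nonlinear model classes the paper stacks), the three-way data split, and the small one-hidden-layer network trained by plain full-batch backpropagation.

```python
import numpy as np

def simulate_setar(n, seed=0):
    """Simulate a two-regime threshold AR(1) series (illustrative data only)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        phi = 0.7 if y[t - 1] <= 0 else -0.5
        y[t] = phi * y[t - 1] + 0.3 * rng.standard_normal()
    return y

def fit_ar1(x, y):
    """Base model 1 (hypothetical stand-in): least-squares AR(1) with intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda z: beta[0] + beta[1] * z

def fit_quadratic(x, y):
    """Base model 2 (hypothetical stand-in): quadratic function of the lag."""
    X = np.column_stack([np.ones_like(x), x, x**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return lambda z: beta[0] + beta[1] * z + beta[2] * z**2

def train_ffnn(Z, y, hidden=4, lr=0.05, epochs=2000, seed=0):
    """One-hidden-layer tanh network fitted by full-batch backpropagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(Z.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(Z @ W1 + b1)             # forward pass, hidden layer
        err = H @ W2 + b2 - y                # residuals of the linear output
        dH = np.outer(err, W2) * (1 - H**2)  # error propagated through tanh
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (Z.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)
    return lambda Znew: np.tanh(Znew @ W1 + b1) @ W2 + b2

y = simulate_setar(900)
x, target = y[:-1], y[1:]   # one-step-ahead pairs (y_{t-1}, y_t)
a, b = 300, 600             # split points: base fit / meta fit / test

# Level 0: fit each base model on the first block only.
base = [fit(x[:a], target[:a]) for fit in (fit_ar1, fit_quadratic)]

def level1(z):
    """Stack the base-model predictions as inputs for the meta-model."""
    return np.column_stack([m(z) for m in base])

# Level 1: train the network on held-out base predictions, then test.
meta = train_ffnn(level1(x[a:b]), target[a:b])
pred = meta(level1(x[b:]))

mse = {
    "ar1": float(np.mean((base[0](x[b:]) - target[b:]) ** 2)),
    "quadratic": float(np.mean((base[1](x[b:]) - target[b:]) ** 2)),
    "stacked": float(np.mean((pred - target[b:]) ** 2)),
}
print(mse)
```

Training the meta-model on a block the base models never saw is the design choice that matters here: it keeps the network from simply rewarding whichever base model overfits its own training data.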
Code
sdrt: Estimating the Sufficient Dimension Reduction Subspaces in Time Series
Published 03/28/2024
The Comprehensive R Archive Network (CRAN)
The sdrt() function is designed for estimating subspaces for Sufficient Dimension Reduction (SDR) in time series, with a specific focus on the Time Series Central Mean subspace (TS-CMS). The package employs the Fourier transformation method proposed by Samadi and De Alwis (2023) and the Nadaraya-Watson kernel smoother method proposed by Park et al. (2009) for estimating the TS-CMS. The package provides tools for estimating distances between subspaces and includes functions for selecting model parameters using the Fourier transformation method.
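As a rough illustration of what a distance between subspaces can mean, the sketch below compares two subspaces through their orthogonal projection matrices. It is written in Python with NumPy and does not use the sdrt package. The Frobenius-norm measure and the function names `projection` and `subspace_distance` are assumptions chosen for illustration; the entry above does not say which distance the package implements.

```python
import numpy as np

def projection(B):
    """Orthogonal projector onto the column space of B (assumes full column rank)."""
    Q, _ = np.linalg.qr(B)   # reduced QR: columns of Q span col(B)
    return Q @ Q.T

def subspace_distance(A, B):
    """Frobenius norm of the difference between the two projectors."""
    return np.linalg.norm(projection(A) - projection(B), ord="fro")

# Two bases for the same line give distance 0; orthogonal lines give sqrt(2).
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
same = subspace_distance(e1, 2.0 * e1)
orthogonal = subspace_distance(e1, e2)
```

Working with projectors rather than basis matrices makes the measure independent of how each basis is scaled or rotated, which matters because an estimated basis is only identified up to its span.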
Code
itdr: An R Package of Integral Transformation Methods to estimate SDR in Regression
Published 02/26/2024
The Comprehensive R Archive Network (CRAN)
The itdr() routine estimates sufficient dimension reduction subspaces in univariate regression, such as the central mean subspace or the central subspace. This is achieved using Fourier transformation methods proposed by Zhu and Zeng (2006), convolution transformation methods proposed by Zeng and Zhu (2010), and iterative Hessian transformation methods proposed by Cook and Li (2002). Additionally, the mitdr() function provides optimal estimators for sufficient dimension reduction subspaces in multivariate regression by optimizing a discrepancy function using a Fourier transform approach proposed by Weng and Yin (2022), and selects the sufficient variables using Fourier transform sparse inverse regression estimators proposed by Weng (2022).
Preprint
itdr: An R package of Integral Transformation Methods to Estimate the SDR Subspaces in Regression
Posted to a preprint site 04/14/2022
arXiv (Cornell University)
Sufficient dimension reduction (SDR) is an effective tool for regression models, offering a viable approach to address and analyze the nonlinear nature of regression problems. This paper introduces the itdr R package, a comprehensive and user-friendly tool that provides several functions based on integral transformation methods for estimating SDR subspaces. In particular, the itdr package incorporates two key methods, namely the Fourier method (FM) and the convolution method (CM). These methods allow for estimating the SDR subspaces, namely the central mean subspace (CMS) and the central subspace (CS), in cases where the response is univariate. Furthermore, the itdr package facilitates the recovery of the CMS through the iterative Hessian transformation (IHT) method for univariate responses. Additionally, it enables the recovery of the CS by employing various Fourier transformation strategies, such as the inverse dimension reduction method, the minimum discrepancy approach using Fourier transformation, and the Fourier transform sparse inverse regression approach, specifically designed for cases with multivariate responses. To demonstrate its capabilities, the itdr package is applied to five different datasets. Furthermore, this package is the pioneering implementation of integral transformation methods for estimating SDR subspaces, thus promising significant advancements in SDR research.