List of works
Conference proceeding
Toward Human-Aligned LLM Reviews for Scientific Papers
Published 09/15/2025
Proceedings IEEE International Conference on e-Science: eScience 2025, 363 - 364
IEEE International Conference on e-Science: eScience 2025, 09/15/2025–09/18/2025, Chicago, Illinois, USA
The peer review process is strained by increasing submission volumes, reviewer fatigue, and inconsistent standards. While Large Language Models (LLMs) can aid in reviews, they are often overly optimistic and lack technical depth. We developed an innovative prompting strategy that, when applied to ChatGPT-4 on ICLR 2025 papers, reduced score inflation and generated reviews more closely aligned with human reviewer median scores.
Conference proceeding
Training Variational Autoencoders for Population Synthesis in Public Health with Missing Data
Published 12/15/2024
IEEE International Conference on Big Data, 4969 - 4973
IEEE International Conference on Big Data (BigData), 12/15/2024–12/18/2024, Washington, DC, USA
The importance of social determinants of health in shaping equitable public health policies is gaining increasing recognition. Emerging data sources, such as mobility and social media data, are becoming key to public health models. However, privacy concerns often limit access to these sensitive data, as even anonymized datasets are vulnerable to deductive disclosure. A promising solution to this challenge is the use of synthetic populations, which not only safeguard privacy but also enable the exploration of various "what-if" scenarios. Most existing studies on synthetic populations assume full access to complete datasets for training. However, in many public health applications, such as census data, this assumption is unrealistic due to incomplete responses, especially regarding sensitive questions like wealth or education. This paper introduces a novel approach for training Variational AutoEncoders (VAEs) with incomplete data, without resorting to missing value imputation. Instead, the VAEs are trained solely on the observed data. Using the 2019 PUMS dataset for Florida, we successfully train VAEs to generate diverse and flexible synthetic populations. By comparing marginal distributions and utilizing t-SNE for analysis, the results highlight the effectiveness of this method in addressing missing data challenges. This work demonstrates the potential of VAEs in generating synthetic populations for health research, even when complete datasets are unavailable, thereby offering a robust solution to advance public health studies while preserving data privacy.
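One way to read "trained solely on the observed data" is as a reconstruction loss that skips missing entries. The sketch below illustrates that idea only; it is an assumption, not the paper's published loss. The function name `masked_reconstruction_loss` and the use of `None` to mark a missing response are hypothetical, and a real VAE objective would also include a KL-divergence term.

```python
def masked_reconstruction_loss(batch, reconstruction):
    """Mean squared error over observed entries only.

    `batch` holds the raw records, with None marking a missing response
    (a hypothetical encoding). `reconstruction` holds the decoder output
    for the same records. Missing entries add nothing to the loss, so no
    imputed value is ever needed.
    """
    total = 0.0
    n_observed = 0
    for row, row_hat in zip(batch, reconstruction):
        for value, value_hat in zip(row, row_hat):
            if value is None:
                # Missing entry: skip it rather than impute it.
                continue
            total += (value - value_hat) ** 2
            n_observed += 1
    if n_observed == 0:
        return 0.0
    return total / n_observed


# Two records with one missing field; only three entries are scored.
records = [[1.0, None], [0.0, 2.0]]
decoded = [[0.5, 9.0], [0.0, 1.0]]
loss = masked_reconstruction_loss(records, decoded)
```

Here the decoder's value of 9.0 for the missing field has no effect on the loss, which is the property the abstract describes.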
Journal article
Parameter space exploration in pedestrian queue design to mitigate infectious disease spread
Published 2021
Journal of the Indian Institute of Science, 101, 3, 329 - 339
Reducing the interactions between pedestrians in crowded environments can potentially curb the spread of infectious diseases including COVID-19. The mixing of susceptible and infectious individuals in many high-density man-made environments such as waiting queues involves pedestrian movement, which is generally not taken into account in modeling studies of disease dynamics. In this paper, a social force-based pedestrian-dynamics approach is used to evaluate the contacts among proximate pedestrians which are then integrated with a stochastic epidemiological model to estimate the infectious disease spread in a localized outbreak. Practical application of such multiscale models to real-life scenarios can be limited by the uncertainty in human behavior, lack of data during early stage epidemics, and inherent stochasticity in the problem. We parametrize the sources of uncertainty and explore the associated parameter space using a novel high-efficiency parameter sweep algorithm. We show the effectiveness of a low-discrepancy sequence (LDS) parameter sweep in reducing the number of simulations required for effective parameter space exploration in this multiscale problem. The algorithms are applied to a model problem of infectious disease spread in a pedestrian queue similar to that at an airport security check point. We find that utilizing the low-discrepancy sequence-based parameter sweep, even for one component of the multiscale model, reduces the computational requirement by an order of magnitude.
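To illustrate what a low-discrepancy parameter sweep looks like, the sketch below generates sweep points from a Halton sequence, one well-known LDS. This is an assumption chosen for illustration; the abstract does not say which sequence or algorithm the paper uses. The function names and the two example parameters (walking speed and contact radius) are hypothetical.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in `base`.

    The digits of `index` in `base` are mirrored about the radix point,
    which spreads successive indices evenly over [0, 1).
    """
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_sweep(n_points, bounds, bases=(2, 3, 5, 7, 11, 13)):
    """Return `n_points` parameter tuples covering the box given by `bounds`.

    `bounds` is a list of (low, high) pairs, one per parameter. Each
    dimension uses a different prime base so the coordinates decorrelate.
    """
    if len(bounds) > len(bases):
        raise ValueError("not enough prime bases for this many parameters")
    points = []
    for i in range(1, n_points + 1):
        point = tuple(
            low + (high - low) * radical_inverse(i, base)
            for (low, high), base in zip(bounds, bases)
        )
        points.append(point)
    return points


# Hypothetical sweep: walking speed in m/s and contact radius in m.
sweep = halton_sweep(8, [(0.8, 1.6), (0.5, 2.0)])
```

Each tuple in `sweep` would parametrize one simulation run. Unlike a regular grid, the sequence can be extended point by point, and every prefix already covers the parameter box fairly evenly.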
Journal article
From bad to worse: Airline boarding changes in response to COVID-19
Published 2021
Royal Society Open Science, 8
Airlines have introduced a back-to-front boarding process in response to the COVID-19 pandemic. It is motivated by the desire to reduce passengers’ likelihood of passing close to seated passengers when they take their seats. However, our prior work on the risk of Ebola spread in aeroplanes suggested that the driving force for increased exposure to infection transmission risk is the clustering of passengers while waiting for others to stow their luggage and take their seats. In this work, we examine whether the new boarding processes lead to increased or decreased risk of infection spread. We also study the reasons behind the risk differences associated with different boarding processes. We accomplish this by simulating the new boarding processes using pedestrian dynamics and compare them against alternatives. Our results show that back-to-front boarding roughly doubles the infection exposure compared with random boarding. It also increases exposure by around 50% compared to a typical boarding process prior to the outbreak of COVID-19. While keeping middle seats empty yields a substantial reduction in exposure, our results show that the different boarding processes have similar relative strengths in this case as with middle seats occupied. We show that the increased exposure arises from the proximity between passengers moving in the aisle and while seated. Such exposure can be reduced significantly by prohibiting the use of overhead bins to stow luggage. Our results suggest that the new boarding procedures increase the risk of exposure to COVID-19 compared with prior ones and are substantially worse than a random boarding process.
Journal article
Published 2021
Archives of Computational Methods in Engineering, 1 - 26
An overview of high-fidelity modeling of pathogen propagation, transmission and mitigation in the built environment is given. In order to derive the required physical and numerical models, the current understanding of pathogen, and in particular virus transmission and mitigation is summarized. The ordinary and partial differential equations that describe the flow, the particles and possibly the UV radiation loads in rooms or HVAC ducts are presented, as well as proper numerical methods to solve them in an expedient way. Thereafter, the motion of pedestrians, as well as proper ways to couple computational fluid dynamics and computational crowd dynamics to enable high-fidelity pathogen transmission and infection simulations is treated. The present review shows that high-fidelity simulations of pathogen propagation, transmission and mitigation in the built environment have reached a high degree of sophistication, offering a quantum leap in accuracy from simpler probabilistic models. This is particularly the case when considering the propagation of pathogens via aerosols in the presence of moving pedestrians.
Journal article
Architecture-aware modeling of pedestrian dynamics
Published 2021
Journal of the Indian Institute of Science, 101, 341 - 356
The spread of infectious diseases arises from complex interactions between disease dynamics and human behavior. Predicting the outcome of this complex system is difficult. Consequently, there has been a recent emphasis on comparing the relative risks of different policy options rather than precise predictions. Here, one performs a parameter sweep to generate a large number of possible scenarios for human behavior under different policy options and identifies the relative risks of different decisions regarding policy or design choices. In particular, this approach has been used to identify effective approaches to social distancing in crowded locations, with pedestrian dynamics used to simulate the movement of individuals. This incurs a large computational load, though. The traditional approach of optimizing the implementation of existing mathematical models on parallel systems leads to a moderate improvement in computational performance. In contrast, we show that when dealing with human behavior, we can create a model from scratch that takes computer architectural features into account, yielding much higher performance without requiring complicated parallelization efforts. Our solution is based on two key observations. (i) Models do not capture human behavior as precisely as models for scientific phenomena describe natural processes. Consequently, there is some leeway in designing a model to suit the computational architecture. (ii) The result of a parameter sweep, rather than a single simulation, is the semantically meaningful result. Our model leverages these features to perform efficiently on CPUs and GPUs. We obtain a speedup factor of around 60 using this new model on two Xeon Platinum 8280 CPUs and a factor 125 speedup on 4 NVIDIA Quadro RTX 5000 GPUs over a parallel implementation of the existing model. The careful design of a GPU implementation makes it fast enough for real-time decision-making. We illustrate it on an application to COVID-19.
Conference proceeding
Published 03/2020
2020 IEEE Aerospace Conference
IEEE Aerospace Conference, 03/07/2020–03/14/2020, Big Sky, MT, USA
This paper presents an integrated computational modelling framework combining pedestrian dynamics and infection spread models to analyse infectious disease spread during the different stages of air travel. While commercial air travel is central to the global mobility of goods and people, it has also been identified as a leading factor in the spread of several epidemic diseases, including influenza, SARS and Ebola. The mixing of susceptible and infectious individuals in high-density locations such as airports involves pedestrian movement, which needs to be taken into account in modelling studies of disease dynamics. We develop a molecular dynamics-based social force modelling approach for pedestrian dynamics and combine it with a stochastic infection dynamics model to evaluate the spread of viral infectious diseases in airplanes and airports. We apply the multiscale model to various key components of air travel and suggest strategies to reduce the number of contacts and the spread of infectious diseases. We simulate pedestrian movement during boarding and deplaning of some typical commercial airplane models and the movement of people through security check areas. We found specific boarding strategies that reduce the number of contacts. Further, we find that smaller airplanes are more effective in reducing the number of contacts compared to larger airplanes. We propose certain queue configurations that reduce contacts between people and mitigate disease spread.
Conference proceeding
Published 01/31/2020
Cyberinfrastructure for Sustained Scientific Innovation (CSSI) PIs meeting, 02/13/2020–02/14/2020, Seattle, Washington
Pedestrian dynamics provides mathematical models that can accurately simulate the movement of individuals in a crowd. These models allow scientists to understand how different policies, such as boarding procedures on planes, can prevent or worsen the transmission of infections. This project seeks to develop novel software that will provide a variety of pedestrian dynamics models, infection spread models, and data so that scientists can analyze the effect of different mechanisms on the spread of directly transmitted diseases in crowded areas. The initial focus of this project is on air travel. However, the software can be extended to a broader scope of applications in movement analysis and epidemiology, such as in theme parks and sports venues.
Journal article
Multiscale model for the optimal design of pedestrian queues to mitigate infectious disease spread
Published 2020
PLoS One, 15
There is direct evidence for the spread of infectious diseases such as influenza, SARS, measles, and norovirus in locations where large groups of people gather at high densities, e.g., theme parks and airports. The mixing of susceptible and infectious individuals in these high-density man-made environments involves pedestrian movement, which is generally not taken into account in modeling studies of disease dynamics. We address this problem through a multiscale model that combines pedestrian dynamics with stochastic infection spread models. The pedestrian dynamics model is utilized to generate the trajectories of motion and contacts between infected and susceptible individuals. We incorporate this information into a stochastic infection dynamics model with infection probability and contact radius as primary inputs. This generic model is applicable to several directly transmitted diseases by varying the input parameters related to infectivity and transmission mechanisms. Through this multiscale framework, we estimate the aggregate numbers and probabilities of newly infected people for different winding queue configurations. We find that the queue configuration has a significant impact on disease spread for a range of infection radii and transmission probabilities. We quantify the effectiveness of wall separators in suppressing the disease spread compared to rope separators. Further, we find that configurations with short aisles lower the infection spread when rope separators are used.
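The coupling of trajectories to an infection model with a contact radius and an infection probability can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's formulation: each time step spent within the contact radius of an infectious person is treated as an independent exposure with a fixed probability, and all function names and the toy trajectories are hypothetical.

```python
import math


def contact_steps(trajectories, infectious, radius):
    """Count exposures for each susceptible pedestrian.

    `trajectories` maps a pedestrian id to a list of (x, y) positions,
    one per time step. An exposure is one time step spent within
    `radius` of one infectious pedestrian.
    """
    counts = {pid: 0 for pid in trajectories if pid not in infectious}
    n_steps = len(next(iter(trajectories.values())))
    for t in range(n_steps):
        for pid in counts:
            sx, sy = trajectories[pid][t]
            for source in infectious:
                ix, iy = trajectories[source][t]
                if math.hypot(sx - ix, sy - iy) <= radius:
                    counts[pid] += 1
    return counts


def infection_probability(n_exposures, p_per_exposure):
    """Probability of at least one successful transmission,
    assuming independent exposures (an illustrative assumption)."""
    return 1.0 - (1.0 - p_per_exposure) ** n_exposures


def expected_new_infections(trajectories, infectious, radius, p_per_exposure):
    """Sum of per-person infection probabilities over all susceptibles."""
    counts = contact_steps(trajectories, infectious, radius)
    return sum(infection_probability(k, p_per_exposure) for k in counts.values())


# Toy queue: A is infectious, B stands nearby for two steps, C stays far away.
paths = {
    "A": [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    "B": [(1.0, 0.0), (1.0, 0.0), (5.0, 0.0)],
    "C": [(10.0, 0.0), (10.0, 0.0), (10.0, 0.0)],
}
expected = expected_new_infections(paths, {"A"}, radius=1.5, p_per_exposure=0.5)
```

In a full study, the trajectories would come from the pedestrian dynamics simulation of a given queue layout, and the radius and probability would be varied to represent different diseases.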
Journal article
Constrained Linear Movement Model (CALM): Simulation of passenger movement in airplanes
Published 2020
PLoS One, 15
Pedestrian dynamics models the walking movement of individuals in a crowd. It has recently been used in the analysis of procedures to reduce the risk of disease spread in airplanes, relying on the SPED model. This is a social force model inspired by molecular dynamics; pedestrians are treated as point particles, and their trajectories are determined in a simulation. A parameter sweep is performed to address uncertainties in human behavior, which requires a large number of simulations. The SPED model’s slow speed is a bottleneck to performing a large parameter sweep. This is a severe impediment to delivering real-time results, which are often required in the course of decision meetings, especially during emergencies. We propose a new model, called CALM, to remove this limitation. It is designed to simulate a crowd’s movement in constrained linear passageways, such as inside an aircraft. We show that CALM yields realistic results while improving performance by two orders of magnitude over the SPED model.