PI : Qin Zhang
NSF IIS-1633215 (September 2016 - August 2020)
A number of phenomena of societal importance, such as the spread of diseases and contagion processes, can be modeled by stochastic processes on networks. The analysis and control of such network phenomena involve, at their heart, fundamental graph-theoretic problems. The graphs encountered are typically of large-scale (having tens of millions of nodes); further, typical experimental analyses involve large designs with a number of parameters, leading to hundreds of thousands of graph computations. Novel methods for solving these problems are needed, since fast response times are critical to effective decision making.
The overarching goal of this project is to develop efficient distributed algorithms and associated lower bounds for graph-theoretic problems that arise in computational epidemiology and contagion dynamics. This will have a significant impact on these specific applications, through more efficient algorithmic tools for enabling complex analyses.
This is a collaborative project with Prof. Gopal Pandurangan (University of Houston) and Prof. Anil Vullikanti (Virginia Tech).
Papers and Talks
Distributed and Streaming Linear Programming in Low Dimensions
with S. Assadi and N. Karpov
Proc. ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 19), to appear. Amsterdam, The Netherlands, June-July, 2019.
A Practical Algorithm for Distributed Clustering and Outlier Detection [poster]
with J. Chen and E. Sadeqi-Azer
Proc. Annual Conference on Neural Information Processing Systems (NeurIPS 18), pages 2253-2262. Montreal, Canada, December 2018.
Distributed Statistical Estimation of Matrix Products with Applications [conf. talk]
with D. P. Woodruff
Proc. ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 18), pages 383-394. Houston, TX, U.S.A., June 2018.
Communication-Efficient Distributed Skyline Computation [conf. talk]
with H. Zhang
Proc. ACM International Conference on Information and Knowledge Management (CIKM 17), pages 437-446. Singapore, November 2017.
Distributed Partial Clustering (preliminary full version, 20 pages) [conf. talk]
with S. Guha and Y. Li
Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 17), pages 143-152. Washington D.C., U.S.A., July 2017.
Best Paper Award
Invited to special issue for SPAA 2017 papers in ACM Transactions on Parallel Computing (TOPC)
- We give the first systematic study of communication/round-efficient distributed partial clustering, by providing almost tight bounds for partial k-center, k-median, and k-means in both deterministic and stochastic settings.
- As a byproduct, we develop the first algorithms for the partial k-median and k-means objectives that run in subquadratic time.
Communication-Optimal Distributed Clustering [talk] [poster]
with J. Chen, H. Sun and D. Woodruff
Proc. Annual Conference on Neural Information Processing Systems (NIPS 16), pages 3720-3728. Barcelona, Spain, December 2016.
Educational and Other Development
A Trilogy of Courses:
Code libraries for streaming/distributed algorithms:
This project is supported by the National Science Foundation (NSF) under the project: BIGDATA: Collaborative Research: F: Efficient Distributed Computation of Large-Scale Graph Problems in Epidemiology and Contagion Dynamics. Any opinions, findings, and conclusions or recommendations expressed in this project are those of author(s) and do not necessarily reflect the views of the National Science Foundation.