PI : Qin Zhang
NSF IIS-1633215 (September 2016 - August 2020)
A number of phenomena of societal importance, such as the spread of diseases and contagion processes, can be modeled by stochastic processes on networks. The analysis and control of such network phenomena involve, at their heart, fundamental graph-theoretic problems. The graphs encountered are typically of large-scale (having tens of millions of nodes); further, typical experimental analyses involve large designs with a number of parameters, leading to hundreds of thousands of graph computations. Novel methods for solving these problems are needed, since fast response times are critical to effective decision making.
The overarching goal of this project is to develop efficient distributed algorithms and associated lower bounds for graph-theoretic problems that arise in computational epidemiology and contagion dynamics. This will have a significant impact on these specific applications, through more efficient algorithmic tools for enabling complex analyses.
This is a collaborative project with Prof. Gopal Pandurangan (University of Houston) and Prof. Anil Vullikanti (Virginia Tech).
Distributed Statistical Estimation of Matrix Products with Applications [conf. talk]
with D. P. Woodruff
Proc. ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 18), pages 383-394. Houston, TX, U.S.A., June 2018.
Communication-Efficient Distributed Skyline Computation
with H. Zhang
Proc. ACM International Conference on Information and Knowledge Management (CIKM 17), pages 437-446. Singapore, November 2017.
Distributed Partial Clustering (preliminary full version, 20 pages) [conf. talk]
with S. Guha and Y. Li
Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 17), pages 143-152. Washington D.C., U.S.A., July 2017.
Best Paper Award
Invited to special issue for SPAA 2017 papers in ACM Transactions on Parallel Computing (TOPC)
- We give the first systematic study of communication/round-efficient distributed partial clustering, by providing almost tight bounds for partial k-center, k-median, and k-means in both deterministic and stochastic settings.
- As a byproduct, we develop the first algorithms for the partial k-median and k-means objectives that run in subquadratic time.
Communication-Optimal Distributed Clustering
with J. Chen, H. Sun and D. Woodruff
Proc. Annual Conference on Neural Information Processing Systems (NIPS 16), pages 3720-3728. Barcelona, Spain, December 2016.
- Distributed Statistical Estimation of Matrix Products with Applications
At PODS 2018 , Houston, TX, U.S.A., June 2018.
- Communication-Efficient Distributed Computation
At Indiana University SICE Research Horizons, Bloomington, IN, U.S.A., April 2018.
- Communication-Efficient Distributed Skyline Computation
At CIKM 2017 , Singapore. November 2017. (By Haoyu Zhang)
- Distributed Partial Clustering
At SPAA 2017 , Washington D.C., U.S.A., July 2017.
Also at HKUST , Hong Kong. June 2017.
- Some New Questions in Communication Complexity
At Communication Complexity and Applications, II , Banff, Alberta, Canada. March 2017.
- Communication Complexity for Distributed Graphs.
At ADGA: Workshop on Advances in Distributed Graph Algorithms, Paris, France. September 2016.
Educational and Other Development
A Trilogy of Courses:
Code libraries for streaming/distributed algorithms: