CAREER:Foundation of Communication-Efficient Distributed Computation and Monitoring


PI : Qin Zhang
NSF CCF-1844234 (June 2019 - May 2024)


Abstract

Through the massive use of mobile devices, data clouds, and the rise of the Internet of Things, large amounts of data have been generated, digitized, and analyzed for the benefit of society. As data are often collected and maintained at different sites, communication has become necessary for nearly every computational task. Moreover, decision makers naturally want to maintain a centralized view of all the data in a timely manner, which requires frequent queries on the distributed data and, in the extreme, continuous monitoring of the query output. The cost of communication has naturally become the bottleneck for such applications. This project aims to develop communication-efficient solutions for distributed computation and monitoring.

This project targets three fundamental aspects of distributed computation: (1) the tradeoffs between the communication cost and the number of rounds of the computation in distributed one-shot computation, (2) the power of data partitioning, and (3) the connections between distributed one-shot computation and continuous monitoring.


Papers

  1. Parallel Best Arm Identification in Heterogeneous Environments
    with N. Karpov
    Proc. ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 24), pages to appear. Nantes, France, June 2024.

    • We have shown a separation between the homogeneous CL model and the heterogeneous CL model for adaptive algorithms .

  2. Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits
    with N. Karpov
    Proc. AAAI Conference on Artificial Intelligence (AAAI 24). Vancouver, Canada. February 2024.

  3. MinJoin++: A Fast Algorithm for String Similarity Joins under Edit Distance
    with N. Karpov and H. Zhang
    VLDB Journal. Published in August 2023.

  4. Communication-Efficient Collaborative Best Arm Identification
    with N. Karpov
    Proc. AAAI Conference on Artificial Intelligence (AAAI 23). Washington, D.C., USA. February 2023.

  5. SyncSignature: A Simple, Efficient, Parallelizable Framework for Tree Similarity Joins.
    with N. Karpov
    Proc. International Conference on Very Large Databases (VLDB 23). Vancouver, Canada. August-September 2023.

  6. Collaborative Top Distribution Identifications with Limited Interaction (preliminary full version, 47 pages)
    with N. Karpov and Y. Zhou
    Proc. IEEE Symposium on Foundations of Computer Science (FOCS 20). Virtual conference, November 2020.

    • We have shown a strong separation on the top-1 arm identification and top-k arm identifications in the collaborative learning model.

  7. Collaborative Learning with Limited Interaction: Tight Bounds for Distributed Exploration in Multi-Armed Bandits (preliminary full version, 37 pages) [conf. talk]
    with C. Tao and Y. Zhou
    Proc. IEEE Symposium on Foundations of Computer Science (FOCS 19), pages 126-146. Baltimore, MD, U.S.A., November 2019.

    • We conduct a systematic study of the best arm identification problem in the setting of collaborative learning with limited interaction.
    • We obtain almost tight round-speedup tradeoffs for both fixed-time and fixed-confidence settings, and show a complexity separation for the two variants.
    • We develop two new techniques for proving round lower bounds for multi-agent collaborative learning.

  8. MinJoin: Efficient Edit Similarity Joins via Local Hash Minima
    with H. Zhang
    Proc. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 19) (oral presentation), pages 1093-1103. Anchorage, AK, U.S.A., August 2019.

  9. Distributed and Streaming Linear Programming in Low Dimensions
    with S. Assadi and N. Karpov
    Proc. ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 19), pages 236-253. Amsterdam, The Netherlands, June-July, 2019.
    Invited to the special issue for PODS 2019 in ACM Transactions on Database Systems (TODS).


Educational and Other Development

A Trilogy of Courses: Code library for distributed algorithms:

Personnel