COALESCENCE
Project: Coded Computation for Large Scale Machine Learning with Privacy Guarantees
Collaborating Departments: Institute for Communications Engineering (TUM); Department of Electrical and Electronic Engineering (Imperial)
Recent years have witnessed explosive growth in large-scale machine learning (ML) and big data applications. The sheer volume of computation carried out by these applications often requires cloud infrastructure deployed at large data centres. However, these massively distributed cloud platforms suffer from major performance bottlenecks. While aggregate computation power can be considered practically limitless, individual servers are often unreliable due to resource sharing among many tasks, resulting in the straggler problem: the overall computation speed is limited by the slowest servers.
Moreover, once computation power is no longer the limiting factor, communication delay becomes a major bottleneck. Further challenges include privacy against honest-but-curious cloud platforms and security against adversaries. "Coded computing" has recently emerged as a new paradigm in which coding-theoretic ideas are used to optimally inject and exploit redundant computations to overcome these bottlenecks. In this project, we bring together the complementary expertise of the two research groups (the Imperial team in communications and ML, the TUM team in coding theory) to develop new coded computing techniques for distributed ML applications with significant potential academic and commercial impact. The project will not only design novel coded computing techniques, such as rateless coded computation and bivariate polynomial coded computation, but also implement the developed codes on distributed computing platforms for large-scale ML tasks.
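To make the coded computing idea concrete, the following is a minimal sketch of classical (univariate) polynomial coded matrix-vector multiplication, not the project's rateless or bivariate constructions: the matrix A is split into k blocks, encoded into n coded blocks by polynomial evaluation, and A x is recovered from any k of the n worker results, so up to n - k stragglers can simply be ignored. All names and parameters here are illustrative.

```python
import numpy as np

# Sketch of polynomial coded matrix-vector multiplication: split A row-wise
# into k blocks, form n coded blocks by evaluating
#   p(z) = A_0 + A_1*z + ... + A_{k-1}*z^{k-1}
# at n distinct points, and recover A @ x from the k fastest of n workers.

def encode(A, k, n):
    """Return n coded blocks of A and the evaluation points used."""
    blocks = np.split(A, k)                      # row blocks A_0 .. A_{k-1}
    zs = np.arange(1, n + 1, dtype=float)        # n distinct evaluation points
    coded = [sum(B * z**j for j, B in enumerate(blocks)) for z in zs]
    return coded, zs

def decode(fast_results, fast_zs, k):
    """Interpolate the k partial products A_j @ x from any k worker results."""
    V = np.vander(fast_zs, k, increasing=True)   # Vandermonde system V @ U = Y
    Y = np.stack(fast_results)                   # one worker result per row
    U = np.linalg.solve(V, Y)                    # row j of U is A_j @ x
    return np.concatenate(U)

# Toy run: n = 5 workers, any k = 3 results recover the full product.
rng = np.random.default_rng(0)
A, x = rng.standard_normal((6, 4)), rng.standard_normal(4)
coded, zs = encode(A, k=3, n=5)
results = [C @ x for C in coded]                 # computed at the workers
survivors = [0, 2, 4]                            # two stragglers dropped
y = decode([results[i] for i in survivors], zs[survivors], k=3)
assert np.allclose(y, A @ x)
```

The redundancy here is exactly the coding-theoretic injection described above: the n coded tasks form an MDS-style code over the k original tasks, which is what makes any k results sufficient.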
We currently focus on four subtopics within the broader theme of large-scale machine learning: function-correcting codes, neural network compression, sequential goal-oriented compression, and novel optimization techniques for neural network training. Function-correcting codes address the setting in which a transmitter encodes a message so that the receiver can recover only a function of the message, rather than the message itself. We aim to generalize and extend existing work in this area, and potentially to apply our results to functions relevant to the project, such as activation functions in machine learning. We are also exploring (sequential) goal-oriented compression schemes which, in contrast to conventional communication systems that target high accuracy and fidelity, adapt the compression process to the specific task at hand. In neural network compression, we study and optimize existing compression pipelines to produce compact representations of ML models while preserving their pre-compression capabilities to the greatest degree possible (a quantization sketch follows below). Finally, we aim to improve the performance of neural network training algorithms by exploring the trade-off between communication overhead and computational complexity. On this topic, V. Papadopoulou recently submitted a proposal to the 2023 Qualcomm Innovation Fellowship program.
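As a small illustration of the neural network compression subtopic, the sketch below shows generic post-training uniform quantization of a weight tensor to b-bit integer codes plus a single per-tensor scale. This is a standard baseline under assumptions we choose here, not the specific pipeline studied in the project.

```python
import numpy as np

# Illustrative post-training quantization: store a weight matrix as signed
# integer codes plus one float scale, then dequantize and measure the error.

def quantize(W, bits=8):
    """Map weights to signed integer codes with a per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1                   # e.g. 127 for 8 bits
    scale = np.max(np.abs(W)) / qmax
    codes = np.clip(np.round(W / scale), -qmax - 1, qmax).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    return codes.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)
codes, scale = quantize(W, bits=8)
W_hat = dequantize(codes, scale)
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

The point of the example is the trade-off the text refers to: the compressed model needs roughly a quarter of the storage of 32-bit floats, at the cost of a controlled reconstruction error that a compression pipeline tries to keep from degrading model accuracy.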
Conferences:
S. Kobus and D. Gündüz (2023). "Generalized Lossless Compression for Constrained Decoder," 2023 IEEE International Symposium on Information Theory (ISIT) [Submitted].
M. Egger, R. Bitar, A. Wachter-Zeh, D. Gündüz, and N. Weinberger (2023). "Maximal-Capacity Discrete Memoryless Channel Identification," 2023 IEEE International Symposium on Information Theory (ISIT) [Submitted].
V. Papadopoulou (2023). "Neural Network Quantization," presentation at the 2023 Joint Workshop on Communications and Coding.
M. Egger, R. Bitar, A. Wachter-Zeh, and D. Gündüz (2022). "Efficient Distributed Machine Learning via Combinatorial Multi-Armed Bandits," 2022 IEEE International Symposium on Information Theory (ISIT).
Team
Principal Investigator (Imperial)
Professor Deniz Gunduz
Professor in Information Processing | Imperial
Principal Investigator (TUM)
Professor Antonia Wachter-Zeh
Associate Professor of Coding and Cryptography | TUM
Doctoral Candidate (Imperial)
Szymon Kobus