Winter Seminar Series

Fri, April 11, 4:30 PM

60 MINUTES

Developing Dependable and Scalable Coordination Protocols: From Theory to Practice

Abstract

Modern distributed systems heavily rely on the coordination protocols that enable distributed computing primitives (e.g., distributed reset, token passing, leader election). Such primitives are utilized across the software stack for the development of more complex distributed algorithms, from system-on-chip (e.g., cache coherence) to computer networks (e.g., routing, Paxos). Thus, their dependability and scalability have profound impacts on the overall reliability of modern distributed systems (e.g., distributed key-value storage, distributed AI). However, developing protocols that preserve their functional and non-functional properties regardless of their scale is a complex problem, evidenced by protocols whose proof of correctness takes years (e.g., Chord for distributed hashing). The scalability requirement exacerbates the design and verification problems because scalable protocols are required to function correctly for an unbounded number of nodes and unbounded variable domains. Thus, it is highly desirable to develop methods and tools that automate the design and verification of distributed coordination protocols. In this talk, I will first present the most recent research results on the verification and synthesis of protocols that preserve their fault tolerance while being scalable. Then, I will discuss some practical guidelines that help mainstream engineers in the development of dependable and scalable distributed systems. Finally, I will present some real-world applications of automated design, followed by a discussion on open problems and ongoing research.

Presenter

Ali Ebnenasir

Associate Professor @ Michigan Technological University

Ali Ebnenasir is an Associate Professor of Computer Science at Michigan Technological University and a senior member of the ACM. Ali received his PhD in Computer Science from Michigan State University, and then joined the Department of Computer Science at Michigan Tech, where he is currently an associate professor. His research and teaching interests include Software Engineering, Dependable Distributed Computing, Formal Methods, and Distributed Quantum Computing. Ali's research has been sponsored by the National Science Foundation of USA, IEEE Computer Society Technical Committee on Parallel Processing (TCPP), and Michigan Tech's Research Excellence Fund. He has published numerous papers in major international journals and conferences such as IEEE Transactions on Dependable and Secure Computing (TDSC), IEEE Transactions on Software Engineering (TSE), ACM Transactions on Software Engineering and Methodology (TOSEM), ACM Transactions on Autonomous and Adaptive Systems (TAAS), IEEE/ACM International Conference on Automated Software Engineering (ASE) and IEEE International Conference on Distributed Computing Systems (ICDCS).