Columbia University, New York
JUNE 14-18, 2010

Tutorials

Tutorial Track 1 will be in CEPSR 414
Tutorial Track 2 will be in CEPSR 750 (Interschool Lab)

Detailed Schedule

Monday, June 14th - Track 1
8:30am - 10:00am
  Tutorial 1 (Part 1): A Modern Tool for Approximative Queueing Analysis: Theory and Practice
10:30am - 12:00pm
  Tutorial 1 (Part 2): A Modern Tool for Approximative Queueing Analysis: Theory and Practice
1:30pm - 3:00pm
  Tutorial 2 (Part 1): Foundations of Applied Modeling
3:30pm - 5:00pm
  Tutorial 2 (Part 2): Foundations of Applied Modeling
   
Monday, June 14th - Track 2
8:30am - 10:00am
  Tutorial 3 (Part 1): Beyond File Sharing: Recent Technologies and Trends in Peer-To-Peer Systems
10:30am - 12:00pm
  Tutorial 3 (Part 2): Beyond File Sharing: Recent Technologies and Trends in Peer-To-Peer Systems
1:30pm - 3:00pm
  Tutorial 4: Distributed Network Architecture: Variational Principle, Glauber Dynamics and Network Adiabatics
3:30pm - 5:00pm
  Tutorial 5: A Tutorial on Large Deviation Theory for (Non-Theoretical) Computer Scientists
   

Tutorial Details

Foundations of Applied Modeling
Jeffrey P. Buzen, Independent Consultant

Computer systems and communication networks are, for the most part, deterministic in nature. However, they are driven by workloads that vary unpredictably from moment to moment. It is common practice to characterize uncertainty in workload behavior using random variables, thereby providing the basis for stochastic models. These models can be solved analytically or evaluated with numerical methods or Monte Carlo simulation. In all cases, the underlying stochastic models are pure mathematical abstractions that can be fully specified without reference to the properties of any real world system.

This tutorial examines the relationship between mathematical results from the theory of stochastic processes and real world data collected by experimenters and practicing performance analysts. Questions examined in this tutorial include:

  • What do the Ergodic Theorem and the Laws of Large Numbers (strong and weak) imply about the linkage between the mathematical properties of steady state stochastic processes, the output generated by Monte Carlo simulations and data collected by observing and measuring real world systems?
  • What does it mean for a Monte Carlo simulation to be "in steady state" and what does it mean for a real world system to be "steady state"?
  • Can uncertainty in the behavior of deterministic systems be characterized without the use of random variables and stochastic processes?
  • How do alternative characterizations of uncertainty affect the degree of risk in predictions obtained from mathematical models, and how is risk related to the distinction between distributional and trans-distributional properties?
  • Why is it riskier to express service level agreements (SLAs) in terms of response time percentiles rather than average response times?
  • Is it possible to guarantee that the output of a Monte Carlo simulation has converged to the steady state distribution of the underlying stochastic process, and can the output of the simulation be adjusted ("aligned") to improve accuracy if convergence has not occurred?

These questions and others will be explored within the context of simple models that can be described and analyzed using straightforward, easy-to-follow arguments. The insights obtained from these simple examples generalize immediately to a very broad class of steady state stochastic models.
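
To make the flavor of these questions concrete, here is a minimal Python sketch (our illustration, not part of the tutorial material) that simulates an M/M/1 queue via Lindley's recursion and compares the simulated mean and 95th-percentile response times against the steady-state formulas. Re-running it with smaller sample sizes shows the percentile estimate converging far more slowly than the mean, one intuition behind the SLA question above.

  import math, random

  random.seed(1)
  lam, mu = 0.8, 1.0    # arrival and service rates (utilization rho = 0.8)
  n = 200000            # number of simulated customers

  w, resp = 0.0, []
  for _ in range(n):
      s = random.expovariate(mu)     # service time of this customer
      resp.append(w + s)             # its response time = wait + service
      a = random.expovariate(lam)    # time until the next arrival
      w = max(0.0, w + s - a)        # Lindley's recursion for the waiting time

  resp.sort()
  sim_mean = sum(resp) / n
  sim_p95 = resp[int(0.95 * n)]

  # In steady state, the M/M/1 response time is Exp(mu - lam).
  th_mean = 1.0 / (mu - lam)
  th_p95 = -math.log(0.05) / (mu - lam)

  print("mean : sim %.2f vs steady state %.2f" % (sim_mean, th_mean))
  print("95th : sim %.2f vs steady state %.2f" % (sim_p95, th_p95))

The mean estimate is driven by typical customers, while the tail percentile is dominated by rare long busy periods, which is one way the distinction between distributional and trans-distributional properties becomes visible in practice.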

Speaker Biography
Jeff Buzen has been actively involved in the analysis of computer performance for the past four decades. His discovery of the convolution algorithm and the central server model in the early 1970s helped launch a period of intense worldwide interest in the mathematical analysis of queuing networks. In 1975, he co-founded a software company that was the first to commercialize queuing network technology. BGS Systems, which began in the basement of his home, became the leading provider of tools for performance modeling and capacity planning at major data centers on six continents. The company operated independently until 1998 when it was acquired by BMC Software. Since then, he has been a consultant, researcher, author and lecturer, and has served a term as President of the Computer Measurement Group.

Jeff Buzen holds three degrees in Applied Mathematics: an Sc.B. from Brown, and an M.S. and Ph.D. from Harvard. He has been elected to the National Academy of Engineering and is a recipient of the A.A. Michelson award. Prior to co-founding BGS Systems, he was a member of the Harvard faculty where he supervised the Ph.D. dissertations of Ethernet inventor Bob Metcalfe and Internet pioneer John M. McQuillan. He also co-taught a popular course on Operating Systems and Computer Performance attended by several students who have had a profound impact on the field of computing: most notably, Microsoft cofounder Bill Gates.


Beyond File Sharing: Recent Technologies and Trends in Peer-To-Peer Systems

Minghua Chen, The Chinese University of Hong Kong (CUHK); Sudipta Sengupta, Microsoft Research

Peer-to-peer (P2P) systems provide a scalable and cost-effective way to deliver content to multiple receivers over the Internet by allowing peers to exchange their received content. The past decade has witnessed rapid evolution of commercial P2P systems providing various types of services, including file sharing, TV streaming, video-on-demand, video conferencing, and storage. As P2P techniques have grown ever more popular, P2P traffic has become a major component of service traffic on the Internet, often occupying 60% or more of Internet bandwidth.

The flourishing of P2P techniques has attracted a tremendous amount of research effort to evaluate and improve the performance of P2P systems and their interactions with ISPs. This tutorial provides an overview of research in the P2P area. Topics include: an introduction to P2P systems; modeling of P2P systems; challenges, design, and performance of practical P2P systems; optimizing throughput and delay of P2P systems; utility maximization in P2P systems and its application to P2P conferencing; queuing models and quality of experience of dynamic P2P systems; ISP-friendly P2P system design and optimization; network coding in P2P design; authentication and security; incentives and the use of helpers in P2P systems; and P2P as a new service-infrastructure paradigm.

Outline:

  • Introduction and modeling of P2P systems
  • Challenges, design, and performance of practical P2P systems
  • Optimizing throughput and delay of P2P systems (a simple rate bound is sketched after this outline)
  • Utility maximization in P2P systems and its application to P2P conferencing
  • Queuing models and quality of experience of dynamic P2P systems
  • ISP-friendly P2P systems design and optimization
  • Network coding in P2P: design and performance evaluation
  • P2P authentication and security
  • P2P as service infrastructure
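
As a taste of the throughput-optimization topic above, the following minimal Python sketch (our illustration, not the presenters' material) computes the standard fluid-model upper bound on the rate of a single stream in a fully connected P2P system: the rate can exceed neither the server's uplink, nor any receiver's downlink, nor the total uplink capacity shared among the n receivers.

  def max_p2p_streaming_rate(server_up, peer_ups, peer_downs=None):
      # Fluid-model bound: min(server uplink, min peer downlink,
      # (server uplink + sum of peer uplinks) / number of receivers).
      n = len(peer_ups)
      rate = min(server_up, (server_up + sum(peer_ups)) / n)
      if peer_downs is not None:
          rate = min(rate, min(peer_downs))
      return rate

  # Server uplink 2 Mbps; four peers with uplinks 1, 1, 0.5, 0.5 Mbps.
  print(max_p2p_streaming_rate(2.0, [1.0, 1.0, 0.5, 0.5]))   # -> 1.25

In the fluid model this bound is known to be achievable by splitting the stream into appropriate sub-streams, which is one starting point for the throughput analyses covered in the tutorial.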


Intended Audience:
(1) Graduate students and researchers in the areas of networking and computer science.
(2) Practicing professionals in the content provider, multimedia conferencing, (other) peer-to-peer application, and Internet Service Provider (ISP) industries.

Depending on the orientation of the attendee, the tutorial can serve to introduce the state of the art in peer-to-peer applications and/or algorithmic approaches for arriving at efficient solutions for content distribution and multimedia conferencing. Most of the material will be presented in a self-contained manner. We expect some background in basic concepts of graph algorithms/mathematical optimization and in Internet networking protocols and architectures.

Speaker Biographies:
Dr. Minghua Chen received his B.Eng. and M.S. degrees from the Department of Electronics Engineering at Tsinghua University in 1999 and 2001, respectively. He received his Ph.D. degree from the Department of Electrical Engineering and Computer Sciences at the University of California at Berkeley in 2006. He spent one year visiting Microsoft Research Redmond as a Postdoctoral Researcher. He joined the Department of Information Engineering, the Chinese University of Hong Kong, in 2007, where he is currently an Assistant Professor. He wrote the book "A General Framework for Flow Control in Wireless Networks" with Avideh Zakhor in 2008, and the book "IPv6 Principle and Practice" with Haisang Wu, Maoke Chen, Xinwei Hu, and Cheng Yan in 2000. He received the Eli Jury Award from UC Berkeley in 2007 (presented to a graduate student or recent alumnus for outstanding achievement in the area of Systems, Communications, Control, or Signal Processing), the ICME Best Paper Award in 2009, and the IEEE Transactions on Multimedia Prize Paper Award in 2009. His current research interests include complex systems (smart grids), distributed networked systems, distributed and stochastic network optimization and control, peer-to-peer networking, wireless networking, network monitoring and abnormality localization, multi-level trust data privacy, secure network coding, and network coding for security.

Dr. Sudipta Sengupta is currently at Microsoft Research, where he is working on peer-to-peer applications (including video streaming and hybrid P2P CDNs), data center systems and networking, and wireless access. Previously, he spent five years at Bell Laboratories, Lucent Technologies, where he advanced the state of the art in Internet routing, optical switching, network security, wireless networks, and network coding. Dr. Sengupta has taught advanced courses/tutorials on networking at academic/research and industry conferences, including IEEE INFOCOM, ACM SIGMETRICS, ACM MOBIHOC, IEEE Hot Interconnects, IEEE GLOBECOM, the International Workshop on Design of Reliable Communication Networks (DRCN), OPTICOMM, and NFOEC. He received a Ph.D. and an M.S. in Electrical Engineering & Computer Science from the Massachusetts Institute of Technology (MIT), USA, and a B.Tech. in Computer Science & Engineering from the Indian Institute of Technology (IIT), Kanpur, India. He was awarded the President of India Gold Medal at IIT-Kanpur for graduating at the top of his class across all disciplines. He has published 50+ research papers in some of the top conferences, journals, and technical magazines, and has authored 30+ patents (granted or pending) in the area of computer networking. Dr. Sengupta received the IEEE Communications Society Leonard G. Abraham Prize for his work on oblivious routing of Internet traffic. He is a Senior Member of the IEEE.


A Modern Tool for Approximative Queueing Analysis: Theory and Practice

Florin Ciucu, T-Labs/TU-Berlin; Jens Schmitt, TU-Kaiserslautern

Queueing phenomena are present in many systems, such as computer and communication networks. While queueing theory has traditionally been used to solve queueing problems exactly, the stochastic network calculus has recently been developed as an analytical tool for approximative queueing analysis. The calculus offers a tradeoff between problem complexity and solution accuracy. Its main feature is a uniform methodology for analyzing broad classes of arrivals, service, scheduling algorithms, and multi-queue scenarios, which makes it an attractive alternative to traditional queueing theory.

The goal of this tutorial is to illustrate the usefulness of the stochastic network calculus by providing the attendees with a good understanding of how to formulate and solve queueing problems with the calculus tool.

Outline:

  • Part I (Perspective, 20 min.): Brief overview of alternative methods for queueing analysis, including queueing theory, matrix-analytic methods, and large deviations; answers two fundamental questions: 1) Where does the stochastic network calculus stand, from a mathematical point of view, relative to these other methods? 2) Why is the calculus needed at all?
  • Part II (Theory, 90 min.): Overview of the fundamental concepts and formalisms of the calculus (description of arrival and service processes, analysis of backlog, delay and output processes, multi-queue analysis, statistical independence vs. dependence, probabilistic vs. deterministic analysis).
  • Part III (Practice, 70 min.): Insights into the accuracy of the calculus method, obtained by reanalyzing classical queueing systems that admit exact solutions; analysis of inherently difficult problems (e.g., queues with heavy-tailed and self-similar arrivals, overloaded queues). A numerical sketch of a bound in this spirit follows this outline.
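
For a flavor of the kinds of bounds involved (a minimal sketch under our own assumptions, not the presenters' code), consider a discrete-time queue with i.i.d. arrivals and constant service C. A Chernoff-type argument of the kind used in the stochastic network calculus gives P(backlog > b) <= exp(-theta* b), where theta* is the largest theta with Lambda(theta) <= 0 and Lambda is the log moment generating function of the per-slot net increment A - C. The Python sketch below finds theta* by bisection and checks the bound against simulation.

  import math, random

  k, p, C = 3, 0.2, 1.0    # 3 Bernoulli(0.2) arrivals per slot, unit service

  def log_mgf(theta):      # log E[exp(theta * (A - C))], A ~ Binomial(k, p)
      return k * math.log(1 - p + p * math.exp(theta)) - theta * C

  # Largest theta with log_mgf(theta) <= 0, found by bisection.
  lo, hi = 1e-9, 50.0      # log_mgf(lo) < 0 < log_mgf(hi) for these parameters
  for _ in range(100):
      mid = 0.5 * (lo + hi)
      if log_mgf(mid) <= 0:
          lo = mid
      else:
          hi = mid
  theta_star = lo

  # Simulate the queue and estimate the backlog tail directly.
  random.seed(0)
  q, hits, b, T = 0.0, 0, 5, 1000000
  for _ in range(T):
      arrivals = sum(random.random() < p for _ in range(k))
      q = max(0.0, q + arrivals - C)
      hits += q > b

  print("bound      P(Q > %d) <= %.3e" % (b, math.exp(-theta_star * b)))
  print("simulation P(Q > %d) ~= %.3e" % (b, hits / T))

The calculus develops bounds of this general shape systematically, for far more general arrival and service processes, scheduling disciplines, and networks of queues.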

Intended audience: researchers confronted with the performance evaluation of queueing systems
Prerequisites: basic to intermediate background in discrete mathematics, calculus, and probability

Speaker Biographies:

Jens Schmitt and Florin Ciucu have long been active in the network calculus research community, publishing related work at major conferences including ACM Sigmetrics, IEEE Infocom, and the International Teletraffic Congress.

Jens Schmitt is a professor of Computer Science at TU Kaiserslautern. Since 2003 he has been the Head of the Distributed Computer Systems Lab (disco). His research interests are broadly in performance and security aspects of networked and distributed systems. He received his PhD from TU Darmstadt in 2000.

Florin Ciucu was educated at the Faculty of Mathematics, University of Bucharest (B.Sc. in Informatics, 1998), George Mason University (M.Sc. in Computer Science, 2001), and University of Virginia (Ph.D. in Computer Science, 2007). Between 2007 and 2008 he was a Postdoctoral Fellow in the Electrical and Computer Engineering Department at the University of Toronto. Currently he is a Senior Research Scientist with Deutsche Telekom Laboratories (T-Labs). His research interests are in the stochastic analysis of communication networks, resource allocation, and randomized algorithms. Florin is a recipient of the ACM Sigmetrics 2005 Best Student Paper Award.

Distributed Network Architecture: Variational Principle, Glauber Dynamics and Network Adiabatics
Devavrat Shah, MIT

Efficient resource allocation is the fundamental problem that must be resolved to achieve a good communication network architecture. Consequently, the algorithms that resolve contention among the various entities competing for network resources serve as the building blocks of the network architecture. For example, a medium access control (MAC) algorithm resolves contention between nodes competing for access to the wireless medium, a congestion control algorithm resolves contention between flows contending for network capacity, and a switch scheduler in an Internet router resolves contention between packets competing for access to the switch fabric. By design, such algorithms must be distributed and extremely simple while retaining high network performance. Methods for designing such distributed, implementable, high-performance algorithms have posed a major intellectual challenge for decades.

Very recently, important progress has been made toward resolving these long-standing problems, leading to the design of distributed, high-performance algorithms. These algorithms are based on insights from statistical learning theory, statistical physics, and the theory of Markov chains. Specifically, they utilize the classical "variational principle" and "Glauber dynamics". The efficiency of the algorithms strongly depends on the validity of a certain time-scale separation property, which can be verified through the framework of the "network adiabatic theorem" developed recently.
In this tutorial, we shall explain this general method for designing distributed network algorithms using the example of MAC. We shall also describe how this leads to a complete distributed network architecture for end-to-end operation -- MAC, routing, and congestion control. To faithfully represent network dynamics, we shall utilize a combined flow- and packet-level model.
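
To give a concrete feel for the Glauber-dynamics idea (a minimal sketch under our own simplifying assumptions, not the speaker's code), model the wireless network as a conflict graph whose nodes are links and whose edges join links that cannot transmit simultaneously. In each step a randomly chosen link wakes up; if no conflicting neighbor is transmitting, it turns on with probability exp(w)/(1 + exp(w)), else it stays silent. The resulting chain samples independent sets sigma with probability proportional to exp(w * |sigma|); in the adaptive-CSMA setting the weight w is driven slowly by queue lengths, and it is the time-scale separation between the two dynamics that a network adiabatic theorem justifies.

  import math, random

  # Conflict graph: five links arranged in a ring; adjacent links interfere.
  neighbors = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
  w = 2.0            # activation weight (fixed here; queue-driven in practice)
  active = set()     # currently transmitting links (always an independent set)

  random.seed(0)
  T, counts = 200000, [0] * 5
  for _ in range(T):
      i = random.randrange(5)                 # link i wakes up
      if any(j in active for j in neighbors[i]):
          active.discard(i)                   # blocked by a conflicting neighbor
      elif random.random() < math.exp(w) / (1 + math.exp(w)):
          active.add(i)                       # transmit in this step
      else:
          active.discard(i)                   # stay (or go) silent
      for j in active:
          counts[j] += 1

  print([round(c / T, 3) for c in counts])    # long-run service rate per link

Each step uses only local information (the state of conflicting neighbors), which is what makes this style of algorithm distributed and simple enough to implement as a MAC protocol.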

Speaker Biography:

Devavrat Shah is currently a Jamieson Career Development Associate Professor in the Department of EECS at MIT and a member of LIDS. His research focuses on the theory of large complex networks, including network algorithms, stochastic networks, network information theory, and large-scale statistical inference. He received his B.Tech. degree from IIT-Bombay in 1999 with the honor of the President of India Gold Medal, and his Ph.D. from Stanford University in 2004. He was co-awarded best paper awards at IEEE INFOCOM '04 and ACM SIGMETRICS/Performance '06, and best student paper awards at Neural Information Processing Systems '08 and ACM SIGMETRICS/Performance '09. He received the 2005 George B. Dantzig Best Dissertation Award from INFORMS. He received the first ACM SIGMETRICS Rising Star Award in 2008 for his work on network scheduling algorithms.


A Tutorial on Large Deviation Theory for (Non-Theoretical) Computer Scientists

Jun (Jim) Xu, Georgia Tech; Li Zhang, IBM T.J. Watson Research Center

Numerous large deviation techniques (including concentration inequalities) have been developed to bound the probability that the sum of independent random variables exceeds a given threshold. Large deviation principles have also been developed for multi-dimensional spaces, random processes, and sums of dependent random variables. In this tutorial, we will focus on the subset of large deviation theory that is most relevant to the performance evaluation of computer and communication systems. In this spirit, we plan to include example applications from the computer and communication systems literature for the vast majority of the results covered in this tutorial.

We hope to cover the following topics:

  • Large deviation principle (LDP) for the sum of i.i.d. real-valued random variables, including the concept and properties of the rate function, Chernoff bounds, and applications to network performance analysis (a numerical illustration follows this outline).
  • The principle of the largest term and the contraction principle, with applications likely drawn from the rich literature of effective bandwidth theory.
  • LDP for empirical mean, such as Sanov's theorem, with applications to hypothesis testing in network traffic anomaly detection.
  • Concentration inequalities concerning martingales, such as Azuma's inequality and Kolmogorov's maximal inequality, with applications to network queueing analysis.
  • Worst-case LDP: Schur concavity in the sum of independent non-homogeneous Bernoulli random variables, Schur convexity in the weighted sum of exchangeable random variables, and sums of independent random variables in a second-moment class.
  • LDP for the sum of dependent random variables such as Markov sequences, time permitting.
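
As a small numerical illustration of the first topic (our example, not the presenters' material): for S_n the sum of n i.i.d. Bernoulli(p) variables and a threshold a > p, the Chernoff bound reads P(S_n >= n*a) <= exp(-n*I(a)), with rate function I(a) = a*ln(a/p) + (1-a)*ln((1-a)/(1-p)), the relative entropy D(a || p). The Python sketch below compares the bound with a Monte Carlo estimate.

  import math, random

  n, p, a = 100, 0.5, 0.65      # sum of 100 fair coins, threshold 65

  # Cramer rate function for Bernoulli(p): relative entropy D(a || p).
  rate = a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))
  bound = math.exp(-n * rate)

  random.seed(0)
  trials = 200000
  hits = sum(sum(random.random() < p for _ in range(n)) >= n * a
             for _ in range(trials))

  print("Chernoff bound   : %.3e" % bound)
  print("Monte Carlo est. : %.3e" % (hits / trials))

The bound captures the correct exponential decay in n while being loose in the constant factor, the kind of behavior the tutorial's treatment of rate functions makes precise.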

Speaker Biographies:

Dr. Jun (Jim) Xu is an Associate Professor in the College of Computing at Georgia Institute of Technology. He received his Ph.D. in Computer and Information Science from The Ohio State University. His current research interests include data streaming algorithms for the measurement and monitoring of computer networks, algorithms and data structures for computer networks, network security, and performance modeling and simulation. He received the NSF CAREER award in 2003 for his efforts in establishing fundamental lower bound and tradeoff results in networking, and the 2006 and 2008 IBM Faculty Awards for his research accomplishments in network data streaming and large deviation theory, respectively. Dr. Xu is also a co-recipient of the Best Student Paper Award from the 2004 ACM Sigmetrics/IFIP Performance joint conference.

Dr. Li Zhang is the manager of the System Analysis and Optimization group at IBM T.J. Watson Research Center. His research interests include control and performance analysis of computer systems, statistical techniques for traffic modeling and prediction, and scheduling and resource allocation in parallel and distributed systems. He has co-authored over 50 technical articles and 10 patents. He received IBM's Outstanding Technical Achievement Award for developing advanced clock synchronization mechanisms for mainframe computers. A math major at Beijing University, he received his M.S. in Mathematics from Purdue University and his Ph.D. in Operations Research from Columbia University.