ACM SIGMETRICS 2018
Irvine, California, USA
June 18-22, 2018
California Institute of Technology
The Role of Tensors in Deep Learning
Tensors are higher-order extensions of matrices that can incorporate multiple modalities and encode higher-order relationships in data. Tensors play a significant role in machine learning through (1) tensor contractions, (2) tensor sketches, and (3) tensor decompositions. Tensor contractions extend matrix products to higher dimensions. Tensor sketches efficiently compress tensors while preserving information. Tensor decompositions compute the low-rank components that constitute a tensor. We show that tensor contractions are an effective replacement for fully connected layers in deep learning architectures, yielding significant space savings with negligible performance degradation. Tensor contractions also present rich opportunities for hardware optimization through extended BLAS kernels. I will conclude with a number of open challenges in the area.
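To make the first idea concrete, a tensor contraction can replace the flatten-then-multiply pattern of a fully connected layer by contracting each mode of the activation tensor with a small factor matrix. The following NumPy sketch is illustrative only; the shapes and factor sizes are assumptions, not figures from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Activations from a convolutional layer: batch of 32, 8x8 spatial, 16 channels.
x = rng.standard_normal((32, 8, 8, 16))

# Fully connected layer: flatten to 1024 features, multiply by a 1024x10 matrix.
W_fc = rng.standard_normal((8 * 8 * 16, 10))
y_fc = x.reshape(32, -1) @ W_fc          # shape (32, 10), 10240 weight entries

# Tensor contraction layer: contract each mode with a small factor matrix instead.
A = rng.standard_normal((8, 2))          # first spatial mode
B = rng.standard_normal((8, 2))          # second spatial mode
C = rng.standard_normal((16, 10))        # channel mode
y_tc = np.einsum('bijc,ik,jl,cm->bklm', x, A, B, C)  # shape (32, 2, 2, 10)

# Parameter count: 8*2 + 8*2 + 16*10 = 192, versus 10240 for the dense layer.
print(y_fc.shape, y_tc.shape, W_fc.size, A.size + B.size + C.size)
```

The space saving comes from never materializing the large dense weight matrix: the contraction keeps the multi-mode structure of the activations and parameterizes each mode separately.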
Anima Anandkumar's research interests are in the areas of large-scale machine learning, non-convex optimization and high-dimensional statistics. In particular, she has been spearheading the development and analysis of tensor algorithms for machine learning. Tensors are multi-dimensional extensions of matrices and can encode higher order relationships in data. At Amazon Web Services, she is researching the practical aspects of deploying machine learning at scale on the cloud infrastructure.
AT&T Labs Research
Techniques for Monitoring and Measuring Virtualized Networks
Joint work with: Minlan Yu (Harvard University) and Vijay Gopalakrishnan (AT&T Labs Research)
Network virtualization promises to revolutionize how networks are built and operated. While much of the focus has been on the flexibility and cost reduction that software-based virtualized networks promise, the move to software also opens new doors for measuring and monitoring networks and network functions. In this tutorial, we discuss the opportunities, challenges, and advances in monitoring and measuring virtualized networks and network functions. We aim to demonstrate that with these advances one can not only overcome the challenges that come with an unproven, maturing technology, but also gain monitoring and measurement capabilities that have so far eluded large, physical appliance-based networks.
Aman Shaikh is a principal inventive scientist at AT&T Labs Research. He obtained his Ph.D. and M.S. in Computer Engineering from the University of California, Santa Cruz in 2003 and 2000, respectively. He also holds a B.E. (HONS) in Computer Science and an M.Sc. (HONS) in Mathematics from the Birla Institute of Technology and Science, Pilani, India. His current research interests include service quality management, SDN, and NFV. Several tools that have emerged from his research are being used extensively by AT&T operations teams.
Eindhoven University of Technology
Structured Markov Chains
Markov chains are popular stochastic models because of their intriguing mathematical properties and flexible structure; more importantly, they provide a powerful instrument for modeling, analyzing, and understanding a large variety of systems and networks, including manufacturing systems, communication networks, traffic networks, and service systems. This tutorial provides an introduction to Markov chain modeling and analysis, with an emphasis on analytical methods for determining the steady-state behavior of Markov chains. We will classify Markov chains based on their structural properties; as it turns out, these structural properties determine the analytical methods required to solve them. Various analytical methods will be discussed, including generating functions, spectral expansion, and matrix-geometric and matrix-analytic methods. Markov chain modeling and analysis will be demonstrated through specific illustrative problems.
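As a minimal illustration of the steady-state analysis the tutorial focuses on, the stationary distribution of a finite Markov chain can be computed directly by solving the balance equations. The three-state transition matrix below is a made-up example, not one from the tutorial.

```python
import numpy as np

# Transition matrix of a small discrete-time Markov chain (illustrative numbers;
# each row sums to 1).
P = np.array([[0.5, 0.5, 0.0],
              [0.3, 0.2, 0.5],
              [0.0, 0.3, 0.7]])

# The stationary distribution pi solves pi P = pi with the entries of pi
# summing to 1. Rewrite as (P^T - I) pi = 0 and replace one redundant balance
# equation with the normalization condition.
A = P.T - np.eye(3)
A[-1, :] = 1.0                    # normalization row: sum(pi) = 1
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)

print(pi)                         # stationary distribution
print(pi @ P)                     # equals pi: invariant under P
```

Direct solution works for small chains; the generating-function, spectral-expansion, and matrix-geometric methods in the tutorial address chains whose state spaces are infinite but structured, where this brute-force approach is unavailable.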
Ivo Adan studied mathematics at the Eindhoven University of Technology from 1980 to 1987 and wrote a master's thesis on Monotonicity Properties in Queueing Networks under the supervision of Jan van der Wal. Immediately after finishing his master's thesis, he began research for his Ph.D. thesis at the same university, under the guidance of Jaap Wessels and Henk Zijm. The main contribution of his thesis is the development of the so-called compensation approach for analyzing the equilibrium behaviour of two-dimensional Markov processes.
Performance Modeling and Analysis of
Deep Learning Systems
The tutorial will introduce the basic concepts of convolutional neural networks, the training process, and the various systems that support distributed training using multiple GPUs on multiple hosts. It will cover workload characterization and modeling for identifying performance bottlenecks. The resulting performance models are also used for capacity planning, performance optimization, and system design.
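The kind of bottleneck identification described above is often captured with a simple roofline-style model: a layer's time is bounded by whichever resource it saturates, compute or memory bandwidth. The sketch below is a toy example; the peak rates, layer shape, and cost formulas are assumptions for illustration, not the speaker's numbers or models.

```python
# Toy roofline-style estimate of per-layer time on one GPU.
# Peak rates are made-up placeholders.
peak_flops = 10e12        # 10 TFLOP/s peak compute
peak_bw = 500e9           # 500 GB/s peak memory bandwidth

def conv_layer_time(batch, h, w, c_in, c_out, k):
    """Lower-bound time for one convolutional layer's forward pass."""
    # Multiply-accumulate count for a k x k convolution (same-size output).
    flops = 2 * batch * h * w * c_in * c_out * k * k
    # Bytes moved: read input and write output activations, plus the weights
    # (float32, 4 bytes per element).
    bytes_moved = 4 * (batch * h * w * (c_in + c_out) + k * k * c_in * c_out)
    # The layer is bound by whichever resource it saturates.
    return max(flops / peak_flops, bytes_moved / peak_bw)

t = conv_layer_time(batch=32, h=56, w=56, c_in=64, c_out=64, k=3)
print(f"{t * 1e3:.2f} ms")
```

Comparing the two terms inside `max` tells you whether a layer is compute-bound or bandwidth-bound, which is exactly the kind of diagnosis that drives the capacity-planning and optimization decisions the tutorial covers.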
Dr. Li Zhang is the manager of the System Analysis and Optimization group at IBM T.J. Watson Research Center. His research interests include the design and optimization of high-performance big data systems; performance analysis, control, scheduling, and resource allocation in parallel and distributed systems; and traffic modeling and prediction for large-scale computer systems. He has also worked on measurement-based clock synchronization algorithms. He has co-authored over 100 technical articles and over 50 patents. A math major at Beijing University, he received his M.S. in Mathematics from Purdue University and his Ph.D. in Operations Research from Columbia University.