Chapter 1 Fundamentals of Quantitative Design and Analysis
1.1 Introduction
1.2 Classes of Computers
1.3 Defining Computer Architecture
1.4 Trends in Technology
1.5 Trends in Power and Energy in Integrated Circuits 23
1.6 Trends in Cost
1.7 Dependability
1.8 Measuring, Reporting, and Summarizing Performance
1.9 Quantitative Principles of Computer Design
1.10 Putting It All Together: Performance, Price, and Power
1.11 Fallacies and Pitfalls
1.12 Concluding Remarks
1.13 Historical Perspectives and References
Case Studies and Exercises by Diana Franklin
Chapter 2 Memory Hierarchy Design
2.1 Introduction
2.2 Memory Technology and Optimizations
2.3 Ten Advanced Optimizations of Cache Performance
2.4 Virtual Memory and Virtual Machines
2.5 Cross-Cutting Issues: The Design of Memory Hierarchies
2.6 Putting It All Together: Memory Hierarchies in the ARM Cortex-A53 and Intel Core i7 6700
2.7 Fallacies and Pitfalls
2.8 Concluding Remarks: Looking Ahead
2.9 Historical Perspectives and References
Case Studies and Exercises by Norman P. Jouppi, Rajeev Balasubramonian, Naveen Muralimanohar, and Sheng Li
Chapter 3 Instruction-Level Parallelism and Its Exploitation
3.1 Instruction-Level Parallelism: Concepts and Challenges
3.2 Basic Compiler Techniques for Exposing ILP
3.3 Reducing Branch Costs With Advanced Branch Prediction
3.4 Overcoming Data Hazards With Dynamic Scheduling
3.5 Dynamic Scheduling: Examples and the Algorithm
3.6 Hardware-Based Speculation
3.7 Exploiting ILP Using Multiple Issue and Static Scheduling
3.8 Exploiting ILP Using Dynamic Scheduling, Multiple Issue, and Speculation
3.9 Advanced Techniques for Instruction Delivery and Speculation
3.10 Cross-Cutting Issues
3.11 Multithreading: Exploiting Thread-Level Parallelism to Improve Uniprocessor Throughput
3.12 Putting It All Together: The Intel Core i7 6700 and ARM Cortex-A53
3.13 Fallacies and Pitfalls
3.14 Concluding Remarks: What’s Ahead?
3.15 Historical Perspective and References
Case Studies and Exercises by Jason D. Bakos and Robert P. Colwell
Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures
4.1 Introduction
4.2 Vector Architecture
4.3 SIMD Instruction Set Extensions for Multimedia
4.4 Graphics Processing Units
4.5 Detecting and Enhancing Loop-Level Parallelism
4.6 Cross-Cutting Issues