Revised August 11, 2015
CS 450: Computer Architecture
General description
This course gives students an appreciation of modern computer design and its relation to system architecture, compiler technology, and operating system functionality. The course focuses on design that is based on the measurement of performance and its dependence on parallelism, efficiency, latency, and resource utilization.
Logistics
Audience
- CS major students, mostly in fourth year. Students who are interested in computer hardware, or who work with logic designers to specify application-specific processors, will find this course valuable.
Normally available
- Winter
Related courses
- Pre-requisites: (CS 245 or SE 212) and (CS 350 or ECE 354 or MTE 241 or SE 350); Computer Science students only
- Anti-requisites: ECE 429
For official details, see the UW calendar.
Software/hardware used
- UNIX, Verilog Simulator
Typical reference(s)
- M. Dubois, M. Annavaram, and P. Stenström, Parallel Computer Organization and Design, Cambridge University Press, 2012
- Course notes
Required preparation
At the start of the course, students should be able to
- Describe a simple processor architecture for executing a RISC machine language
- Explain basic cache and virtual-memory architectures, and how they can impact performance
Learning objectives
At the end of the course, students should be able to
- Code a simulatable specification for a simple multi-cycle processor
- Explain the structure of statically and dynamically scheduled pipelines
- Evaluate the performance of software executed on statically and dynamically scheduled superscalar pipelines
- Appreciate the uses and limits of instruction-level, data-level, and thread-level parallelism
- Describe memory coherency and consistency protocols
Typical syllabus
Digital hardware design (6 hours)
- Transistors, digital logic, hardware description languages
Instruction set architecture (3 hours)
- Instruction types and mixes, addressing, RISC vs CISC, exceptions, Flynn's taxonomy
Scalar pipelines (3 hours)
- Data dependencies, local and global scheduling, performance
VLIW pipelines (6 hours)
- Local scheduling, loop unrolling, software pipelining, trace scheduling, data speculation, deferred exceptions, predicated execution, IA64
Dynamic pipelines (9 hours)
- Dynamic scheduling, register renaming, speculative loads, prefetching, speculative execution, trace caches
Thread-level parallelism (6 hours)
- Multiprocessors, synchronization, cache coherency, memory consistency, chip multiprocessors, simultaneous multithreading, transactional memory
Data-level parallelism (2 hours)
- GPGPU structure and programming models