Revised December 11, 2013
CS 241: Foundations of Sequential Programs
General description
CS 241 covers the relationship between high-level programming languages and the computer architecture that underlies their implementation. Students learn the fundamentals of the compilation of computer programs to machine-executable code and how to implement simple versions of relevant algorithms. The following topics are discussed:
- Functional components of a computer (CPU, memory, I/O) and their interactions. Representation of data in a computer and its manipulation at the machine-language level and in high-level languages.
- Assembly language and the implementation of an assembler. Overall structure of a compiler and the functions of its components. Purpose and functions of linkers and loaders.
- Specification and recognition of regular languages and their use in lexical analysis (lexical-analysis tools). Specification and parsing of context-free languages (parsing tools). Semantic analysis and code generation for procedural languages.
Logistics
Audience
- 2B Computer Science students
- Optional for Computational Math students
Normally available
- Fall, Winter, and Spring
Related courses
- Predecessors: CS 246 or (CS 138 and enrolled in Software Engineering)
- Successors: Most third-year CS major courses
- Co-requisites: Taking CS 241 together with CS 251 works well
- Conflicts: Other courses that focus on the process of turning programs into executable code
For official details, see the UW calendar.
Software/hardware used
- UNIX/Linux, C++, and possibly Scheme
Typical reference(s)
- D. Patterson, J. Hennessy, Computer Organization and Design, 4th ed., Morgan Kaufmann
- A. Robbins, Unix in a Nutshell, 4th ed., O'Reilly
Required preparation
At the start of the course, students should be able to
- Design, code, test, and debug several-hundred-line programs in C/C++
- Describe the basic properties of the C/C++ memory model: bytes vs. words, memory as an array, pointers as memory addresses, run-time stack and stack frames, memory allocation on the heap vs. automatic allocation on the stack
- Use basic collections (arrays, lists, dictionaries) in C/C++ programs
Individual instructors may choose to use Scheme in addition to C/C++. Assignments written in Scheme may be at the level indicated above for C/C++ (mutatis mutandis). Prior knowledge of Scheme is not required.
Learning objectives
At the end of the course, students should be able to
- Write short machine- and assembly-language programs to perform simple data manipulation
- Write a basic assembler supporting labels
- Give formal specifications for regular languages, including regular expressions and bubble diagrams
- Write a scanner capable of dealing with a typical high-level programming language (given the specification)
- Give a grammar for a context-free language and, given a grammar, produce a derivation for a given string in the language
- Write a parser for an LR(1) language given a low-level representation of the LR-parsing automaton (e.g., as derived from an automatic parser generator)
- Write a simple code generator for an imperative language, i.e., one doing little or no optimization
- Apply appropriate design decisions when programming in C/C++ based on a detailed understanding of the way memory is used by a running C/C++ program
Note: When writing programs, students must be able to design, code, debug, test, and successfully run the programs.
Typical syllabus
Machine architecture and assembly language (6 hours)
- Functional components of a computer: memory, control unit, arithmetic/logic unit, input/output devices
- Data representation
- Machine language: operation codes, addressing modes, indexing, base registers, register designation
Assemblers, linkers, and loaders (6 hours)
- Mnemonic op-codes, pseudo-ops, symbolic constants and addresses, literals
- Assembler algorithm, linker and loader algorithms
Regular languages and scanning (5 hours)
- Architecture of a compiler
- Syntax vs. semantics
- Introduction to formal languages
- Regular languages, regular expressions, and finite state machines
Context-free languages and parsing (8 hours)
- Context-free grammars, derivations, derivation trees, ambiguous grammars
- Introduction to top-down and bottom-up parsing, LL(1) and LR(1) grammars
- Tool-based parser generation
Semantic Analysis and Code generation (6 hours)
- Constructing parse trees
- Code generation
Runtime organization and data layout (5 hours)
- Machine-level implementation of procedure invocation and the runtime stack
- Implications of stack vs. heap allocation
- Main-storage layout for structures, vectors, and arrays