Faculty of Information Technology, BUT

Course details

Advanced Computer Architectures

ARP, Acad. year 2004/2005, Winter semester, 6 credits

The course covers the architecture of processors and parallel systems. Instruction- and thread-level parallelism (ILP, TLP) is studied on scalar, superscalar, VLIW and multithreaded processors. Next, in the context of process-level parallelism, the most frequently used bus-based symmetric multiprocessors are treated. The treatment of interconnection networks follows, as a basis of systems with a distributed shared memory (NUMA) and of multicomputers with local memories, especially the popular clusters of workstations and massively parallel systems. The last part is devoted to parallel vector processors and SIMD-style processing (data parallelism).

Guarantor

Language of instruction

Czech, English

Completion

Credit+Examination (written)

Time span

39 hrs lectures, 16 hrs exercises, 10 hrs projects

Assessment points

60 exam, 10 half-term test, 30 projects

Department

Lecturer

Instructor

Subject specific learning outcomes and competences

Overview of processor microarchitecture and its future trends, principles of parallel system design and interconnection networks, ability to estimate performance of parallel applications.

Learning objectives

To familiarize students with the architecture of the newest processors exploiting instruction-level parallelism, and with its impact on compiler design. To make them understand the features of parallel systems that exploit functional parallelism at the process or thread level, as well as data parallelism.

Study literature

  • Dvořák, V., Drábek, V.: Architektura procesorů. VUTIUM, Brno, 1999.
  • 160 PowerPoint slides available to students.

Fundamental literature

  • Dvořák, V., Drábek, V.: Architektura procesorů. VUTIUM, Brno, 1999.
  • Hennessy, J.L., Patterson, D.A.: Computer Architecture: A Quantitative Approach. 3rd edition, Morgan Kaufmann Publishers, Inc., 2003. (http://mkp.com)

Syllabus of lectures

  • Function- and data-level parallelism, performance figures and speedup laws.
  • Pipeline instruction processing and instruction dependencies. Typical CPU architecture (DLX).
  • FP unit. Eliminating instruction dependencies. Loop-level parallelism, branch prediction.
  • Superscalar CPU. Dynamic instruction scheduling, register renaming, ROB, speculation.
  • Relaxed models of memory consistency. VLIW processors, software pipelining, predication.
  • Thread-level parallelism, support in hardware. Multithreaded processors.
  • Shared memory architectures. Bus scalability, memory organization, cache coherence.
  • MSI and MESI cache coherence protocols. Synchronization of events in multiprocessors.
  • Interconnection and switching networks. Features and specs, routing, control, group communications.
  • Distributed shared memory architectures, shared virtual memory.
  • Message passing architectures. Hardware support for communication, overlapping communication and computation.
  • Data-level parallelism, vector processors and instructions. SIMD machines and SIMD-like processing. Systolic structures.
  • Accelerators and specific architectures for ANN, architectures of future CPUs.

Syllabus of numerical exercises

  • Efficiency and speedup of parallel applications, Amdahl's and Gustafson's laws.
  • Instruction dependencies and hazard elimination at pipeline instruction processing, loop unrolling.
  • Superscalar processing.
  • Midterm examination.
  • VLIW and software pipelining.
  • Multithreading, SMT.
  • Shared memory, bus scalability, SM-system performance.
  • Parameters of interconnection networks, routing algorithms.
  • Vector processors, duration of vector operations.

Progress assessment

Assessment of four small projects and a midterm examination.