Computer Science Colloquium
Time+Place : Tuesday 01/02/2011 14:30 room 337-8 Taub  Bld.
Speaker    : Yoav Etsion
Affiliation: Barcelona Supercomputing Center (BSC)
Host       : Johann Makowsky
Title      : Task Superscalar Multiprocessors
Abstract   :
Parallel programming is notoriously difficult and is still considered an
artisan's job. Recently, the shift towards on-chip parallelism has brought
this issue to center stage. Commonly referred to as the Programmability
Wall, this problem has already motivated the development of simplified
parallel programming models, and most notably task-based models.
In this talk, I will present Task Superscalar Multiprocessors, a conceptual
multiprocessor organization that operates by dynamically uncovering
task-level parallelism in a sequential stream of tasks. Task superscalar
multiprocessors target an emerging class of task-based dataflow programming
models, and thus enable programmers to exploit manycore systems
effectively, while simultaneously simplifying their programming model.
The key component in the design is the Task Superscalar Pipeline, an
abstraction of instruction-level out-of-order pipelines that operates at the
task level and can be embedded into any manycore fabric to manage cores as
functional units. Like out-of-order pipelines that dynamically uncover
parallelism in a sequential instruction stream and drive multiple functional
units, the task superscalar pipeline uncovers task-level parallelism in a
stream of tasks generated by a sequential thread. Utilizing intuitive
programmer annotations of task inputs and outputs, the task superscalar
pipeline dynamically detects inter-task data dependencies, identifies
task-level parallelism, and executes tasks out of order. I will describe the
design of the task superscalar pipeline, and discuss how it tackles the
scalability limitations of instruction-level out-of-order pipelines.
Finally, I will present simulation results that demonstrate the design can
sustain a decode rate faster than one task per 60ns and dynamically uncover data
dependencies among as many as ~50,000 in-flight tasks, using 7MB of on-chip
eDRAM storage. This configuration achieves speedups of 95-255x (average
183x) over sequential execution for nine scientific benchmarks, running on a
simulated multiprocessor with 256 cores.
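
The dependency detection described in the abstract can be illustrated in
software. The following is only a minimal sketch, not the speaker's design:
the function names, the (name, inputs, outputs) task format, and the
conservative handling of write-after-read and write-after-write conflicts
are all assumptions made for illustration. It takes a sequential stream of
annotated tasks, derives inter-task dependencies, and groups tasks into
"waves" that could run in parallel.

```python
from collections import defaultdict

def build_dependencies(tasks):
    """Derive dependencies from input/output annotations.

    tasks: list of (name, inputs, outputs) in sequential program order.
    Task names are assumed to be unique (an assumption of this sketch).
    Returns a dict mapping each task name to the set of earlier tasks
    it must wait for.
    """
    last_writer = {}                  # data object -> last task that wrote it
    readers = defaultdict(list)       # data object -> readers since last write
    deps = {}
    for name, inputs, outputs in tasks:
        wait_for = set()
        for obj in inputs:
            if obj in last_writer:
                wait_for.add(last_writer[obj])   # read-after-write
        for obj in outputs:
            if obj in last_writer:
                wait_for.add(last_writer[obj])   # write-after-write (conservative)
            wait_for.update(readers[obj])        # write-after-read (conservative)
        deps[name] = wait_for
        for obj in inputs:
            readers[obj].append(name)
        for obj in outputs:
            last_writer[obj] = name
            readers[obj] = []
    return deps

def schedule_waves(tasks):
    """Group tasks into waves; all tasks in a wave are mutually independent."""
    deps = build_dependencies(tasks)
    finished = set()
    pending = [name for name, _, _ in tasks]
    waves = []
    while pending:
        # A task is ready once every task it waits for has finished.
        ready = [t for t in pending if deps[t] <= finished]
        waves.append(ready)
        finished.update(ready)
        pending = [t for t in pending if t not in finished]
    return waves

# Hypothetical task stream, as a sequential thread might generate it.
stream = [
    ("t1", [], ["a"]),
    ("t2", [], ["b"]),
    ("t3", ["a"], ["c"]),
    ("t4", ["b"], ["d"]),
    ("t5", ["c", "d"], ["e"]),
    ("t6", [], ["f"]),      # independent: may run before t3, t4, and t5
]
waves = schedule_waves(stream)
print(waves)
```

In this example t6 is issued last but lands in the first wave, which is the
out-of-order behaviour the abstract refers to. A hardware pipeline would
track this state incrementally as tasks are decoded rather than in whole
waves; the sketch only shows the dependency logic.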
Short bio
Yoav Etsion is a Juan de la Cierva Fellow and a Senior Researcher at the
Barcelona Supercomputing Center (BSC), where he is a member of the
Heterogeneous Architectures Group. His research interests span all aspects
of computing systems, and specifically computer architecture, operating
systems, and parallel systems. Yoav received his BSc, MSc, and PhD in 1998,
2002, and 2009, respectively, all in Computer Science, and all from the
Hebrew University.
Wed Jan  5 08:05:02 IST 2011
Technion Math. Net (TECHMATH)
Editor: Michael Cwikel
Announcement from: Hadas Heier