To improve branch prediction accuracy with low area and power overheads, we propose in this paper a novel adaptive learning machine-based shadow dynamic finite state machine (SDFSM).
The researchers propose an event-driven multithreaded dynamic optimization framework, a distributed control path architecture for VLIW processors, a simple divide-and-conquer approach for neural-class branch prediction, and a hardware prefetching technique for chip multiprocessors.
A dynamic branch prediction circuit eliminates idle cycles during execution of change-of-flow instructions, thereby accelerating new and existing StarCore programs by an average of 10%.
But despite the important benefits provided by branch prediction schemes, many branches are still mispredicted.
Similarly, floating-point, computationally intensive tasks will gain some performance from both faster clock speeds and the chip's internal branch prediction mechanisms (as well as from RDRAM, if used).
In particular, it is interesting to note that with perfect branch prediction, the instruction operand sizes are far more predictable than with realistic branch prediction.
Branch prediction, whether dynamic (hardware-based) or static (software-based), makes good guesses about likely branch targets and allows the instruction unit to fetch instructions early.
Smith (University of Wisconsin) for fundamental contributions to high-performance microarchitecture, including saturating counters for branch prediction, reorder buffers for precise exceptions, decoupled access/execute architectures, and vector supercomputer organization, memory, and interconnects.
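The saturating counters mentioned above can be illustrated with a small simulation. This is a minimal sketch of the textbook 2-bit scheme, not code from any of the quoted sources; the names `TwoBitCounter` and `count_mispredictions`, the initial state, and the example branch pattern are all assumptions made for illustration.

```python
class TwoBitCounter:
    """Textbook 2-bit saturating counter (hypothetical helper).

    States 0..3: 0-1 predict not taken, 2-3 predict taken.
    """

    def __init__(self, state=1):
        # Assumed starting point: weakly not taken.
        self.state = state

    def predict(self):
        # Predict taken only in the two upper states.
        return self.state >= 2

    def update(self, taken):
        # Step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes, counter=None):
    """Replay a sequence of branch outcomes (True = taken) and count misses."""
    if counter is None:
        counter = TwoBitCounter()
    misses = 0
    for taken in outcomes:
        if counter.predict() != taken:
            misses += 1
        counter.update(taken)
    return misses


# Example: a loop branch taken 9 times and then falling through, run 3 times.
loop_pattern = ([True] * 9 + [False]) * 3
print(count_mispredictions(loop_pattern))
```

Because a single not-taken outcome moves the counter only one step, the loop-exit misprediction does not flip the prediction for the next run of the loop, which is the usual motivation for using two bits rather than one.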
Data throughput will be boosted by dynamic branch prediction and extensive data bypassing techniques.
The PA-8500 design features improved branch prediction; 140 million transistors; and the world's largest on-chip cache -- 1.5MB -- further boosting processor performance.
For the most part, researchers investigating high-performance processors are looking at evolutionary variations on already known techniques for increasing instruction-level parallelism, for example, ways to increase superscalar issue by some small factor, more elaborate branch prediction methods, ways of increasing cache hit rates by more sophisticated buffering methods, more elaborate cache coherence methods, and so on.
This system will meet or exceed any reasonable requirements for general purpose computing under Windows 3.11 or Windows 95 because the Pentium has a superscalar architecture, separate code and data caches, branch prediction, a high-performance floating point unit, and an enhanced 64-bit data bus.