Universal VLSI Architecture:
A universal architecture has full functional flexibility and functional expandability: it can implement any function without any change to the hardware.
It realizes a function sequentially, and because its control is very flexible, no part of the hardware has to change from one function to the next. It is programmable through the contents of the control memory, i.e. the sequence of control words stored there, which the control sequencer fetches and applies one after another.
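The idea of programming through control memory can be illustrated with a minimal sketch. The control-word format (alu_op, src_a, src_b, dest), the register names, and the run function are all hypothetical and chosen only for illustration. The point is that two different functions run on the same datapath, and only the control memory contents differ.

```python
# Hypothetical control-word format: (alu_op, src_a, src_b, dest).
# The ALU operations stand in for the fixed datapath hardware.
ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def run(control_memory, registers):
    """Act as the control sequencer: fetch each control word in order and apply it."""
    regs = dict(registers)  # copy so the caller's register state is untouched
    for alu_op, src_a, src_b, dest in control_memory:
        regs[dest] = ALU_OPS[alu_op](regs[src_a], regs[src_b])
    return regs


# Two different functions on the same "hardware": only the control memory changes.
sum_of_squares = [
    ("mul", "x", "x", "t0"),
    ("mul", "y", "y", "t1"),
    ("add", "t0", "t1", "out"),
]
difference_of_squares = [
    ("mul", "x", "x", "t0"),
    ("mul", "y", "y", "t1"),
    ("sub", "t0", "t1", "out"),
]

inputs = {"x": 3, "y": 4}
result_sum = run(sum_of_squares, inputs)["out"]          # 3*3 + 4*4
result_diff = run(difference_of_squares, inputs)["out"]  # 3*3 - 4*4
```

A real control word would typically also carry sequencing information such as a next-address field; that is left out here to keep the sketch short.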
A CISC Architecture:
- Larger instructions with variable formats (16-64 bits per instruction)
- More Addressing Modes (12-24)
- Few Registers
- Mostly Microcoded, with a Control Memory
A RISC Architecture:
- LOAD-STORE Architecture
- Fewer Addressing Modes
- Fixed Length Instructions, More Registers
- Designed for Pipeline Efficiency
- Hardwired Control Unit
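The LOAD-STORE property listed above can be sketched as a toy interpreter. The opcode names, the tuple instruction format, and the execute function are assumptions made for illustration and do not correspond to any real instruction set. Only LOAD and STORE touch memory; ADD works on registers alone.

```python
def execute(program, memory):
    """Run a tiny load-store program: only LOAD and STORE access memory."""
    regs = {}
    mem = dict(memory)  # copy so the caller's memory image is untouched
    for op, *args in program:
        if op == "LOAD":       # LOAD rd, addr: memory to register
            rd, addr = args
            regs[rd] = mem[addr]
        elif op == "STORE":    # STORE rs, addr: register to memory
            rs, addr = args
            mem[addr] = regs[rs]
        elif op == "ADD":      # ADD rd, rs1, rs2: registers only
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return mem


# mem[2] = mem[0] + mem[1], written in load-store style.
program = [
    ("LOAD", "r1", 0),
    ("LOAD", "r2", 1),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", 2),
]
final_memory = execute(program, {0: 10, 1: 32, 2: 0})
```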
VLSI Architecture Note:
The architectural approach that translates a function or an algorithm directly into an architecture (a purely combinational, direct-mapped architecture) is expensive and does not scale well, because it replicates processing elements. Direct translation is like hardware synthesis without optimization or circuit reduction.
However, such an architecture can be rearranged to increase performance by reducing the lengths of its combinational logic chains: operations whose data inputs are already available are performed concurrently instead of one after another. For identical gate complexity and power consumption, the rearranged architectures compute the function faster than direct-mapped architectures.
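As an illustration of shorter logic chains at equal gate complexity, the sketch below sums eight values two ways: as a linear chain of adders and as a balanced tree. The function names are hypothetical, and "depth" simply counts adders on the longest path as a stand-in for chain length. Both arrangements use the same number of adders, but the tree has a much shorter critical path.

```python
def chain_sum(values):
    """Linear chain: each adder waits for the output of the previous one."""
    total, adders, depth = values[0], 0, 0
    for v in values[1:]:
        total += v
        adders += 1
        depth += 1  # every adder lies on the critical path
    return total, adders, depth


def tree_sum(values):
    """Balanced tree: independent pairs are added concurrently at each level."""
    level, adders, depth = list(values), 0, 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        adders += len(nxt)
        if len(level) % 2:
            nxt.append(level[-1])  # an odd leftover passes through unchanged
        level = nxt
        depth += 1  # one adder delay per tree level
    return level[0], adders, depth


data = [1, 2, 3, 4, 5, 6, 7, 8]
chain_result = chain_sum(data)  # (sum, adders used, adders on longest path)
tree_result = tree_sum(data)
```

For eight inputs both versions use seven adders, while the longest path drops from seven adders to three.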
Sequential architectures can substantially reduce gate count (and hence overall cost) compared with direct-mapped architectures, for similar performance and power consumption. They do so by avoiding the replication of processing elements (mainly combinational logic) inherent in direct-mapped architectures. Instead, they use a single processing element to perform the processing function multiple times, with different operands, in successive time slots. This is made possible by adding storage elements, data routing circuitry, and a control circuit to the architecture.
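The reuse of a single processing element can be sketched with a dot product. This is a behavioral model only, with hypothetical function names: the direct-mapped version counts one multiplier per operand pair, while the sequential version reuses one multiply-accumulate unit, with an accumulator variable playing the role of the storage element and the loop playing the role of the control and routing circuitry.

```python
def direct_mapped_dot(a, b):
    """One multiplier per element pair, all evaluated in a single pass."""
    products = [x * y for x, y in zip(a, b)]  # len(a) replicated multipliers
    return sum(products), len(products), 1    # result, multipliers, time slots


def sequential_dot(a, b):
    """One multiply-accumulate unit reused in successive time slots."""
    acc = 0    # storage element holding the running result
    slots = 0
    for x, y in zip(a, b):  # control and routing pick one operand pair per slot
        acc = acc + x * y
        slots += 1
    return acc, 1, slots    # result, multipliers, time slots


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
direct_result = direct_mapped_dot(a, b)
sequential_result = sequential_dot(a, b)
```

Both versions return the same value; the sequential one uses a single multiplier spread over four time slots instead of four multipliers in one.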
Pipelining increases the sustained throughput of the architecture by segmenting long chains of combinational logic into shorter chains and placing a storage register, called a pipeline register, at the end of each segment to hold the intermediate result it generates. This arrangement allows the inputs to change at a faster rate and thereby increases throughput. The penalty is additional hardware, namely the pipeline registers added at the end of each segment.
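The throughput gain can be estimated with a simple timing model. The stage delays and register overhead below are made-up numbers in picoseconds, and the model is a common textbook simplification: without pipelining, a new input is accepted only after the whole chain settles; with pipelining, the clock period is set by the slowest segment plus the pipeline register overhead.

```python
def unpipelined_time_ps(n_inputs, stage_delays_ps):
    """Each input must traverse the full combinational chain before the next starts."""
    return n_inputs * sum(stage_delays_ps)


def pipelined_period_ps(stage_delays_ps, register_overhead_ps):
    """The clock period is set by the slowest segment plus register overhead."""
    return max(stage_delays_ps) + register_overhead_ps


def pipelined_time_ps(n_inputs, stage_delays_ps, register_overhead_ps):
    """The first result appears after one cycle per stage, then one result per cycle."""
    period = pipelined_period_ps(stage_delays_ps, register_overhead_ps)
    return (len(stage_delays_ps) + n_inputs - 1) * period


# Assumed values for illustration only.
stage_delays = [400, 500, 450, 350]  # delay of each logic segment, in ps
register_overhead = 50               # pipeline register overhead per stage, in ps

period = pipelined_period_ps(stage_delays, register_overhead)
time_without = unpipelined_time_ps(1000, stage_delays)
time_with = pipelined_time_ps(1000, stage_delays, register_overhead)
```

With these assumed numbers, 1000 inputs take 1,700,000 ps without pipelining and 551,650 ps with it. The latency of a single input rises from 1700 ps to 4 × 550 = 2200 ps, which is the usual trade of pipelining: higher sustained throughput at the cost of extra registers and somewhat longer latency per input.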
More specialized, domain-specific architectures (instruction-set based and still programmable) have emerged across various high-performance application areas, such as digital signal processing, graphics processing (GPUs), and image and video processing, particularly for real-time applications.
All of them internally combine parallelism and pipelining with optimized processing element design and memory organization to increase their throughput.