Over the last few decades, the two best-known FPGA suppliers in the Programmable Logic Device (PLD) market have been Altera and Xilinx. Altera is now part of Intel (PSG) and Xilinx was acquired by AMD. In an earlier post we discussed the logic resources of AMD Xilinx FPGAs and the Configurable Logic Block (CLB). This post compares the logic fabric of Intel and AMD FPGAs and walks through the generations of Altera, and later Intel, FPGAs.
Altera (Intel) FPGAs and Devices
Having manufactured PLDs since 1983, Altera introduced the Flex series of FPGAs, the precursor to the modern Altera and Intel lines. The series started with the Flex 8000 in 1992 and was followed by several generations of Cyclone, Arria, and Stratix devices, the newer Agilex, and several other programmable logic series.
Many of the earlier devices blur the lines between FPGAs and PLDs, but they have many typical FPGA characteristics. The table below shows each Altera (Intel) family with the AMD (Xilinx) family released around the same time:
The Excalibur is an interesting device worthy of its own blog post, as it was one of the earliest devices to couple an ARM processor with programmable logic. It even has an AMBA AHB interconnect of sorts.
LEs, LABs, and ALMs
In the original Flex 8000, the logic was based on individual logic elements, or LEs. Each LE contained a LUT, a flip-flop, carry logic, and a cascade chain that let you stack multiple LEs to build larger expressions.
LEs (Logic Elements)
LEs are grouped into Logic Array Blocks (LABs) of 8 LEs each. A LAB resembles an AMD Xilinx CLB (Configurable Logic Block) in that the logic elements inside a LAB share similar control signal restrictions. The boundaries between elements work a bit differently than in AMD CLBs, though. CLBs largely do not have explicitly defined subsidiary logic elements, which can make comparing resources between vendors tricky at times.
LABs (Logic Array Blocks)
LABs additionally contain carry chains to allow for low-latency data transfer between LEs.
The building blocks of each Logic Element are a 4-input LUT, similar to the Xilinx LUT4 of the time, and a flip-flop.
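To make the LUT-plus-flip-flop idea concrete, here is a minimal Python sketch of a single LE. It is a toy behavioral model, not vendor code: the class name `LogicElement`, the helper `mask_from_function`, and the input ordering are assumptions made for illustration, and the carry and cascade chains are left out.

```python
from dataclasses import dataclass


@dataclass
class LogicElement:
    """Toy model of one LE: a 4-input LUT feeding a D flip-flop (hypothetical names)."""

    lut_mask: int  # 16-bit truth table; bit i is the LUT output for input value i
    q: int = 0     # current flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational output: select one bit of the truth table."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.lut_mask >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> int:
        """Clock edge: capture the LUT output in the flip-flop."""
        self.q = self.lut(a, b, c, d)
        return self.q


def mask_from_function(fn) -> int:
    """Build the 16-bit LUT mask for any Boolean function of four inputs."""
    mask = 0
    for index in range(16):
        a = (index >> 0) & 1
        b = (index >> 1) & 1
        c = (index >> 2) & 1
        d = (index >> 3) & 1
        if fn(a, b, c, d):
            mask |= 1 << index
    return mask


# Example: a 4-input AND only sets the last entry of the truth table.
and4 = LogicElement(lut_mask=mask_from_function(lambda a, b, c, d: a & b & c & d))
```

Any function of four inputs fits in one such LUT because the mask simply stores all 16 possible outputs. Wider functions need more LEs, which is where the cascade chain comes in.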
Intel Stratix
Let’s jump ahead 10 years. It’s now 2002 and the Stratix has been released. There are a lot of similarities to the old architecture:
- The logic is still organized into LABs of LEs
- LEs still consist of the same types of elements
- The LUTs remain 4-input LUTs
However, there are a few very important differences:
- LABs now have 10 LEs instead of 8
- LEs have a built-in dynamic arithmetic mode to make better use of resources
- Control signal routing inside of LABs has improved significantly
- Memory resources on the devices have improved significantly
Intel Stratix II
Stratix II, however, introduced several foundational changes:
- Logic elements are now Adaptive Logic Modules (ALMs), a change that persists to this day.
- ALMs are more similar to the Xilinx CLB in that each contains 2 FFs and 2 LUTs, rather than being separated into individual cells as before.
- LABs consist of 8 ALMs (16 LUTs and 16 FFs), a significant upgrade from the 10 LEs in the original Stratix.
- ALMs have 2 outputs to the FPGA logic rather than LEs’ single output.
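The per-LAB arithmetic behind these figures can be sketched in a few lines of Python. The table below uses only the counts quoted in this post; the dictionary layout and the `lab_totals` helper are hypothetical names chosen for illustration.

```python
# Per-LAB resources using only the figures quoted in this post.
# Assumption: each LE counts as 1 LUT + 1 flip-flop, each Stratix II ALM as 2 LUTs + 2 flip-flops.
LAB_LAYOUT = {
    "Flex 8000":  {"cells_per_lab": 8,  "luts_per_cell": 1, "ffs_per_cell": 1},
    "Stratix":    {"cells_per_lab": 10, "luts_per_cell": 1, "ffs_per_cell": 1},
    "Stratix II": {"cells_per_lab": 8,  "luts_per_cell": 2, "ffs_per_cell": 2},
}


def lab_totals(family: str) -> tuple[int, int]:
    """Return (LUTs per LAB, flip-flops per LAB) for one family."""
    layout = LAB_LAYOUT[family]
    luts = layout["cells_per_lab"] * layout["luts_per_cell"]
    ffs = layout["cells_per_lab"] * layout["ffs_per_cell"]
    return luts, ffs


for family in LAB_LAYOUT:
    luts, ffs = lab_totals(family)
    print(f"{family}: {luts} LUTs and {ffs} FFs per LAB")
```

Treat the LUT column with care: a fixed 4-input LUT in an LE and an adaptive LUT in an ALM are not equivalent units, so the raw counts understate the difference between the generations.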
ALMs (Adaptive Logic Modules)
The ALMs provide unique functionality thanks to their adaptive LUTs. The 2 LUTs can be combined to divide the ALM's inputs between functions in different ways, e.g. a 6-input function and a 2-input function.
ALMs also contain 2 full adders, which are used when the LUTs are set to arithmetic mode.
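As a rough illustration of why two full adders per ALM matter, the Python sketch below models each ALM as handling two bit positions of an addition, with the carry passed on to the next ALM. This is a simplified behavioral model under our own assumptions: the function names are hypothetical, and the LUT logic that sits in front of the adders in real hardware is ignored.

```python
def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum bit, carry out) for three 1-bit inputs."""
    total = a + b + carry_in
    return total & 1, total >> 1


def alm_add_2bits(a_bits, b_bits, carry_in):
    """One ALM in arithmetic mode (simplified): two chained full adders."""
    s0, c0 = full_adder(a_bits[0], b_bits[0], carry_in)
    s1, c1 = full_adder(a_bits[1], b_bits[1], c0)
    return (s0, s1), c1


def add(a: int, b: int, width: int) -> tuple[int, int]:
    """Add two unsigned `width`-bit values; return (sum, number of ALMs used)."""
    carry = 0
    result = 0
    alms_used = 0
    for pos in range(0, width, 2):
        a_bits = ((a >> pos) & 1, (a >> (pos + 1)) & 1)
        b_bits = ((b >> pos) & 1, (b >> (pos + 1)) & 1)
        (s0, s1), carry = alm_add_2bits(a_bits, b_bits, carry)
        result |= (s0 << pos) | (s1 << (pos + 1))
        alms_used += 1
    # The final carry becomes the bit just above the last ALM's two positions.
    result |= carry << (2 * alms_used)
    return result, alms_used
```

Under this model an N-bit adder needs roughly N/2 ALMs, for example `add(255, 1, 8)` walks through 4 ALMs, whereas a one-adder-per-cell architecture would need one cell per bit.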
If we jump to the Stratix V, the devices look largely similar, but there are now 4 FFs per ALM instead of 2. Additionally, with the direct link feature, a LAB can drive up to 30 ALMs via interconnects to adjacent LABs.
It’s worth noting that if you want to pack your ALMs well, you need to follow some guidelines. Intel states that for a Stratix 10, packing two 5-input functions into one ALM requires them to share at least 2 common inputs.
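The packing guideline follows from a simple input budget, which the Python sketch below models. It assumes that an ALM exposes 8 data inputs and that two functions of up to 5 inputs each fit as long as their combined distinct inputs stay within that budget. Both helper names are hypothetical, and the real fitter applies more rules than this (for example for 6-input functions), so read it as a back-of-the-envelope check only.

```python
ALM_DATA_INPUTS = 8  # assumption: data inputs available to the LUTs of one ALM


def shared_inputs_needed(n_a: int, n_b: int) -> int:
    """Minimum number of common inputs for two functions to fit in one ALM."""
    return max(0, n_a + n_b - ALM_DATA_INPUTS)


def can_share_alm(inputs_a, inputs_b, max_fn_inputs: int = 5) -> bool:
    """Check the input budget for two functions, given their input signal names."""
    a, b = set(inputs_a), set(inputs_b)
    if len(a) > max_fn_inputs or len(b) > max_fn_inputs:
        raise ValueError("this simplified model only covers functions of up to 5 inputs")
    # Shared signals are only counted once against the budget.
    return len(a | b) <= ALM_DATA_INPUTS


# Two 5-input functions sharing d and e need 8 distinct inputs: they fit.
fits = can_share_alm(["a", "b", "c", "d", "e"], ["d", "e", "f", "g", "h"])

# Sharing only e leaves 9 distinct inputs: they do not fit.
too_wide = can_share_alm(["a", "b", "c", "d", "e"], ["e", "f", "g", "h", "i"])
```

With this budget, `shared_inputs_needed(5, 5)` gives 2, matching the guideline above, while a 5-input and a 3-input function need no shared inputs at all.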
ALMs have several operating modes, which the tools select for you during synthesis. In the Stratix 10, for example, an ALM can operate in Normal mode, in Extended LUT mode, which allows certain functions of up to 8 inputs, or in Arithmetic mode, which uses the two full adders.
In the Agilex 7 series, LABs still contain 10 ALMs, but the interconnects have been improved to the point that one LAB can drive 60 ALMs in adjacent LABs. Agilex ALMs are largely the same as Stratix 10 ALMs.
Conclusion
This post has looked at Altera and Intel’s core logic elements through the generations. When comparing against AMD Xilinx, be aware of the structural differences between CLBs and LABs built from groups of ALMs.
Because ALMs/LABs and CLBs do not map one-to-one, comparing AMD and Intel FPGA resource counts can be very tricky.
How you code will directly affect both ALM and CLB packing. AMD and Intel both provide resources on good practices for efficient resource utilization.