Lightelligence developed a photonic interposer technology that can connect cores on an electronic application-specific IC (ASIC) in arbitrary, on-chip network topologies, including full mesh or toroidal configurations, offering performance benefits and simpler software versus nearest-neighbor configurations, Mo Steinman, VP of engineering at Lightelligence, told EE Times.
The company developed its own 64-core AI inference accelerator ASIC, with cores connected in a topology that allows all-to-all broadcast via the company’s optical network-on-chip (oNoC) interposer technology, assembled in a system-in-package (SiP) it calls Hummingbird. This offers latency and power efficiency benefits, Steinman said, declining to reveal performance figures or benchmarks.
“This is a tool we can use for addressing interconnect challenges at favorable density and power characteristics, but also to simplify software development, it’s really to avoid the challenge of the scheduling problem,” he added.
Steinman described the scheduling problem that arises in many-core or chiplet designs where each core or chiplet can only communicate with its nearest neighbors.
Lightelligence Hummingbird SiP contains electronic and photonic dies. (Source: Lightelligence)
“If I have to jump one, two, three or four [cores or chiplets] away, the electrical interface power characteristics and capabilities start to become a challenge,” Steinman said. “But for optics, the definition of what’s short reach and what’s long reach are very different than for electronics… Even at, say, wafer scale, the attenuation [for photonics] is very manageable… The power and latency are fairly independent of that distance.”
Topologies like toroidal configurations are challenging to achieve with electrical interconnects.
“[With our oNoC technology] there isn’t necessarily a predisposed recipe to the types of topologies we can entertain,” he said. “So it’s a powerful tool that we can use to work with partners to solve their connectivity problems which may be unique—not mapped to a preconceived topology—there’s a lot of flexibility there.”
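To put rough numbers on the hop-count argument, the sketch below compares the average number of hops between cores for three topologies on a 64-core part. The 8 x 8 grid layout, the function name `mean_hops`, and the treatment of an all-to-all optical link as a single hop are illustrative assumptions, not details Lightelligence has disclosed.

```python
from itertools import product


def mean_hops(side, wrap):
    """Average shortest-path hop count between distinct cores on a side x side grid.

    wrap=False models a nearest-neighbor mesh; wrap=True models a torus,
    where each row and column also connects its two ends.
    """
    def axis_dist(a, b):
        d = abs(a - b)
        # On a torus the shorter way around the ring may be the wrapped one.
        return min(d, side - d) if wrap else d

    cores = list(product(range(side), repeat=2))
    total = 0
    pairs = 0
    for (r1, c1), (r2, c2) in product(cores, repeat=2):
        if (r1, c1) == (r2, c2):
            continue  # skip a core talking to itself
        total += axis_dist(r1, r2) + axis_dist(c1, c2)
        pairs += 1
    return total / pairs


mesh_hops = mean_hops(8, wrap=False)   # nearest-neighbor mesh, 64 cores
torus_hops = mean_hops(8, wrap=True)   # toroidal wrap-around, 64 cores
all_to_all_hops = 1.0                  # assumption: every core reaches every other directly

print(f"mesh: {mesh_hops:.2f}  torus: {torus_hops:.2f}  all-to-all: {all_to_all_hops:.2f}")
```

Under these assumptions the mesh averages about 5.33 hops and the torus about 4.06, while a direct all-to-all link is always one hop. The sketch counts hops only; it says nothing about the power or latency of each hop.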
Lightelligence’s Hummingbird is a SiP combining its 64-core AI inference accelerator ASIC with the optical network-on-chip interposer. This is the first specific implementation of Lightelligence’s oNoC technology, which carries data, encoded onto light, in an all-to-all broadcast mode between the 64 cores.
“For convolution, which is a big part of AI, that allows us to do a very interesting mathematical function where each core is doing a piece of the work and then simultaneously blasting it to every other core every clock cycle,” Steinman said.
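The pattern Steinman describes can be sketched in a few lines: every core computes a partial result, then all cores exchange those results in a single broadcast step. The data sizes, the summing workload, and the helper names `local_work` and `broadcast_step` below are hypothetical; this is a software toy, not a model of Hummingbird's actual dataflow.

```python
NUM_CORES = 64  # matches the 64 cores described in the article


def local_work(core_id, data, chunk):
    """Each core handles its own slice of the input (here, a simple sum)."""
    start = core_id * chunk
    return sum(data[start:start + chunk])


def broadcast_step(partials):
    """All-to-all broadcast: after one step every core holds every partial."""
    return [list(partials) for _ in range(len(partials))]


data = list(range(128))          # hypothetical input, two elements per core
chunk = len(data) // NUM_CORES

partials = [local_work(core, data, chunk) for core in range(NUM_CORES)]
views = broadcast_step(partials)

# Every core can now combine all partials without any multi-hop forwarding.
totals = [sum(view) for view in views]
```

In a nearest-neighbor design the same exchange would take several forwarding steps, which software would have to schedule explicitly.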
Hummingbird’s accelerator is a SIMD (single instruction, multiple data) machine with a “fairly simple” proprietary instruction set, he said. Each of the identical cores has SRAM and compute for scalar and vector operations, plus embedded transmitter and receiver circuitry that converts between the electrical and optical domains.
There’s an analog interface on the ASIC that is coupled to the photonic interposer. As light from a laser mounted on the interposer passes beneath this interface, circuitry on the ASIC alters the refractive index of the silicon waveguide below, modulating the light (complete darkness isn’t required for a zero; the light just has to be modulated sufficiently to distinguish it from a one).
At the other end, there’s a receiver photodiode that converts incoming light pulses into an electrical current. This current is amplified and analog circuitry does threshold detection to convert the signal into a bitstream. Features like error correction code (ECC), framing, encoding and more can be layered on top, Steinman said.
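The threshold-detection step can be illustrated with a minimal sketch. The signal levels, the ripple values, and the function name `detect_bits` are made up for illustration. The point is only that a zero does not need to be dark, as long as it sits clearly on the other side of the threshold from a one.

```python
def detect_bits(samples, threshold):
    """Turn amplified receiver samples into a bitstream by thresholding."""
    return [1 if s > threshold else 0 for s in samples]


# Hypothetical normalized signal levels: a zero still carries some light.
ZERO_LEVEL = 0.3
ONE_LEVEL = 1.0

sent = [1, 0, 1, 1, 0, 0, 1, 0]
# Fixed, made-up ripple standing in for noise on each sample.
ripple = [0.05, -0.04, 0.02, -0.06, 0.03, 0.06, -0.01, -0.05]
samples = [(ONE_LEVEL if bit else ZERO_LEVEL) + r for bit, r in zip(sent, ripple)]

threshold = (ZERO_LEVEL + ONE_LEVEL) / 2  # midpoint between the two levels
received = detect_bits(samples, threshold)
```

Error correction, framing and encoding would sit above this layer and are not modeled here.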
The analog circuitry on the electronic die can be calibrated to account for process variabilities.
“[Refractive index] will vary from die-to-die and transmitter-to-transmitter, so our electronic circuitry is able to adjust to those characteristics,” he said. “One of the things we do early in the powerup sequence is to calibrate the design—run known patterns through it and see what the circuit response is—so we can adjust knobs on the analog side.”
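A rough sketch of that calibration idea: send a known bit pattern over each link, measure the response, and set a per-link decision threshold from what comes back. Treating the threshold as the "knob" being adjusted is an assumption made for illustration, as are the link names and the measured values. The article does not say which analog parameters Lightelligence actually tunes.

```python
def calibrate_threshold(known_bits, measured):
    """Pick a decision threshold midway between the average measured
    level for known ones and the average level for known zeros."""
    ones = [m for b, m in zip(known_bits, measured) if b == 1]
    zeros = [m for b, m in zip(known_bits, measured) if b == 0]
    if not ones or not zeros:
        raise ValueError("calibration pattern needs both ones and zeros")
    return (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2


pattern = [1, 0, 1, 0]  # known pattern sent at power-up

# Hypothetical responses from two links with different process variation.
link_a = [1.0, 0.3, 1.0, 0.3]
link_b = [0.5, 0.1, 0.5, 0.1]

threshold_a = calibrate_threshold(pattern, link_a)
threshold_b = calibrate_threshold(pattern, link_b)


def decode(measured, threshold):
    return [1 if m > threshold else 0 for m in measured]


# With its own threshold, link B decodes correctly; with link A's it does not.
good = decode(link_b, threshold_b)
bad = decode(link_b, threshold_a)
```

With these made-up numbers, link B's ones fall below link A's threshold, which is why a single fixed setting would not work across every transmitter and die.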
While Lightelligence uses an optical NoC on its PACE optical compute product, the technology on Hummingbird is quite different, Steinman said.
“There’s a little bit of IP reuse, but it’s a little different because of the type of communication—this is high speed digital versus PACE where there’s an analog computation, it’s not just ones and zeroes,” he said.
Hummingbird is available on a PCIe card. Building a whole system complete with an AI software stack was necessary to work out all the kinks, according to Steinman.
“Our belief is that if we’re going to develop some new kind of interconnect, there’s bound to be implications at every level,” he said. “In a computer system there’s digital design, in our case we also have analog and photonic design, there’s packaging, there’s system design, there’s software implications, and everything has some sort of implication or second- or third-order effect.”
One thing Lightelligence learned was that it needed another interposer layer, a laminate interposer, between the electronic and photonic dies to deliver power to the electronic chip. Future generations of the technology will enable direct connections between the two dies.
“3D technologies are cutting edge, and we didn’t want to wait for the full enablement of that to bring this out,” Steinman said. “We felt this was the way we could do a first implementation that will only get better when we have the 3D stacking, when we can eliminate [the laminate interposer] layer.”
Lightelligence also has a full AI software stack up and running, he said, which can run PyTorch models. The overall aim is to abstract away any “exotic” technologies, presenting simply a PCIe card with a software stack that can be used like any other AI accelerator.
The aim for Hummingbird is to prove out the software stack and get customer feedback on functionality, Steinman said.
“We don’t have any illusions that this is going to supplant Nvidia, it’s more about the possibilities of the technology—we need a legitimate, functioning proof point,” he added.
“We want to use Hummingbird primarily as a vehicle to enable conversations, to get to purpose-built semi-custom implementations with partners,” he said. “The next generation will probably be semi-custom implementations working with partners, then maybe develop a standard interface template that’s a bit more generic. I think those first few adopters will want to do a very close collaboration, but we’re open to any model; we don’t want to pre-suppose the way people want to do business, and we’re flexible enough to do that at this point.”
Future generations of Hummingbird will use reticle-stitching technology (which aligns adjacent reticle exposures so that features such as waveguides continue across the reticle boundary) to allow photonic interposers bigger than the reticle limit to support many-chiplet architectures. Future technology generations may also see separate photonic transmitter/receiver chiplets connected electrically to compute and memory chiplets, and/or licensed transmitter/receiver IP embedded into customer chiplets.
Hummingbird PCIe cards have been sampled to an early partner, with full availability of the card and the software development kit coming in Q3 2023.