Photonic computing system rethinks how to power familiar AI tool

Researchers from MIT have demonstrated a photonics-based computing system that could lead to machine-learning programs several orders of magnitude more powerful than the one behind ChatGPT. The researchers reported a greater than 100-fold improvement in energy efficiency and a 25-fold improvement in compute density — a measure of system power — over state-of-the-art digital computers for machine learning.

In the near term, the team said, its experimental, laser-based system could be further developed to improve both metrics by another two orders of magnitude.

According to research team member Dirk Englund, an associate professor in MIT’s Department of Electrical Engineering and Computer Science, ChatGPT is limited in size by the power of modern supercomputers. Deep neural networks (DNNs) like the one behind ChatGPT are based on large machine-learning models that simulate how the brain processes information. However, the digital technologies behind DNNs are reaching their limits even as the field of machine learning grows. These models require massive amounts of energy and are largely confined to data centers. It is also not economically viable to train models that are much bigger, Englund said.

The newly demonstrated system uses hundreds of micron-scale vertical-cavity surface-emitting laser (VCSEL) arrays to perform computations based on the movement of light rather than electrons. Although optical neural networks (ONNs) typically use a great deal of energy, the researchers believe their system could be advanced to cut down on energy usage while also offering the potential for larger bandwidths, said researcher and lead author Zaijun Chen.

The researchers said that by supporting the development of large-scale optoelectronic processors that accelerate machine-learning tasks everywhere from data centers to decentralized edge devices, the approach could enable cellphones and other small devices to run programs that can currently only be computed at large data centers. They acknowledged that the components involved in ONNs are bulky and take up significant space. And, although ONNs are quite good at linear calculations such as adding, they are not great at nonlinear calculations such as multiplication and “if” statements.
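To see why that linear/nonlinear split matters, consider a single neural-network layer: it is a linear step (a weighted sum of inputs, the kind of operation optics handles naturally) followed by a nonlinear activation (an element-wise "if"-like decision, which is hard to realize with light alone). Below is a minimal NumPy sketch of such a layer; the shapes, random weights, and ReLU activation are illustrative assumptions, not details of the MIT system.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))   # layer weights (4 outputs, 8 inputs)
x = rng.standard_normal(8)        # input activations

# Linear part: a matrix-vector product (weighted sums).
# This is the kind of operation an optical processor can accelerate.
linear_out = W @ x

# Nonlinear part: element-wise ReLU, effectively an "if value > 0" per
# element. Operations like this are the hard part for purely optical hardware.
nonlinear_out = np.maximum(linear_out, 0.0)
```

In a deep network, these two steps repeat layer after layer, which is why a system that speeds up only the linear step still needs an efficient way to interleave the nonlinearities.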