Rumor has it that the forthcoming higher-end M5 Pro and M5 Max processors will be architected differently than the M5, presumably to improve scalability. They are expected to debut in the next wave of MacBook Pros.
The change in architecture would make it easier and more efficient to speed up the chips, reduce heat generation and possibly bump battery life a bit. To get there, Apple has reportedly switched from TSMC's system-on-chip layout used by the M5 to a custom variation of TSMC's system-on-integrated-chip, molding-horizontal layout, called SoIC-MH.
What's changed?
With the SoC architecture Apple uses for its M5, everything but the memory is fabricated on a single die, which is then packaged with the system memory to form the chip. The company has essentially been scaling up to its Pro, Max and Ultra chips by building bigger dies and, for the Ultra, by fusing two Max dies together.
That's why, for example, the GPU core counts on the Max and Ultra chips have generally been double those of the lower-end versions. But that's always been a rather inefficient and inflexible way to scale performance.
TSMC's SoIC fabrication creates separate dies (or chiplets) for selected operational groups, which are then linked via tiny high-speed connections into a single package and combined with memory for the final chip.
In Apple's case, the expectation is that it will move the GPU to its own chiplet to make it easier to scale GPU performance independently of the CPU. This is essential since demand for tensor (for AI) and graphics processing power is growing rapidly, while the need for more powerful CPUs is a lot less pressing at the moment.
SoICs generally have stacked chiplets, which allows for the tiniest and, therefore, fastest possible interconnects. But Apple may be using a custom SoIC-MH layout, in which the chiplets are laid out next to the main die rather than stacked on top of it.
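To make the scaling argument concrete, here is a minimal sketch of the difference between scaling by combining identical dies and scaling with a separate GPU chiplet. Every core count in it is a made-up placeholder, not an Apple spec or a rumored configuration, and the function names are hypothetical; it only illustrates why decoupling the GPU from the CPU opens up more configurations.

```python
from itertools import product

# Hypothetical per-die core counts, chosen for illustration only (not Apple specs).
BASE_CPU_CORES = 10
BASE_GPU_CORES = 20


def fused_die_configs(max_dies: int) -> list[tuple[int, int]]:
    """Scale by combining identical dies: CPU and GPU counts grow in lockstep."""
    return [(BASE_CPU_CORES * n, BASE_GPU_CORES * n) for n in range(1, max_dies + 1)]


def chiplet_configs(cpu_options: list[int], gpu_options: list[int]) -> list[tuple[int, int]]:
    """Scale with a separate GPU chiplet: any CPU die can pair with any GPU chiplet."""
    return list(product(cpu_options, gpu_options))


# Each tuple is (CPU cores, GPU cores).
fused = fused_die_configs(max_dies=2)
mixed = chiplet_configs(cpu_options=[10, 14], gpu_options=[20, 32, 40])

print("Combined dies:", fused)
print("GPU chiplet:  ", mixed)
```

With combined dies, the only way to get more GPU cores in this toy model is to take more CPU cores too. With a GPU chiplet, a modest CPU can be paired with the largest GPU, which is the kind of flexibility the rumored layout is supposed to deliver.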
A potentially critical step
In the absence of any solid information, it's hard to predict how much of an effect this would have on performance.
For instance, Apple could conceivably put fewer GPU cores in some system configurations rather than more, though I really doubt it would want to reduce performance. Plus, the layout could change Apple's CPU core count choices, depending upon what Apple chooses to do with the space that opens up on the die, or if the company chooses to just shrink the die.
But when Apple announced the M1 Ultra for the Mac Studio, my first thought on seeing that it only used integrated graphics was that this was not a great long-term chip architecture strategy.
With every generation, I've been waiting for the other shoe to drop: The M series still doesn't support discrete graphics cards, only the chips' integrated GPU, which means the company needs a better way to increase on-chip GPU power over time. Apple addressed AI tensor processing needs in the M5 by adding a neural accelerator to each GPU core, but the one-accelerator-per-core approach is also a slow road to take.
I don't know if a GPU chiplet is the best solution to these problems, but it would hopefully let Apple cram as many cores into the GPU as possible, without the constraint of fitting into the fixed available space on the die. In turn, that might enable Apple to offer, say, a 14-inch MacBook Pro that can really compete when it comes to large machine-learning workloads.
If Apple really does implement this type of architecture, then we'll definitely see M5 Pro and Max systems by the company's annual Worldwide Developers Conference, because Apple will need to excite developers about gaming (again) and AI (again).
And given the pressure on laptop prices from the severe shortage of memory (among other obstacles), the ability to offer chip options with GPU and CPU performance decoupled this way may help control costs, or at least make it easier for you to find a configuration that fits your budget among more granular options.

