January 7, 2025

Taylor Daily Press

Complete News World

DeepMind creates new AI that solves 50-year-old matrix multiplication problem – IT Pro – News

What this AI did was to discover an algorithm that was faster than the fastest algorithm we know about a very specific problem. As far as the authors know, the Strassen algorithm is the fastest algorithm 4 by 4 matrices on a finite field to double it. This is a very specific problem, particularly the “via finite field” part, which rules out, for example, matrix multiplication over integers, and as far as I can quickly see, they don’t make a strong argument for why this particular problem is so relevant.

How do you come to the conclusion that optimizations are only achieved using 4×4 matrices? This example was given because it is relatively simple and no human has been able to find a better way for 50 years while AI succeeds, which is an achievement in itself. I also got from the original paper that they made improvements for example in 5x5x5 (98->96), 3x4x5 (48->47), 4x4x5 (64->63) and 4x5x5 (80->76). In addition, the dimension is not so important, it has to do with the way it can be widely used.

The second, and perhaps most important point, is that this AI is specifically looking for algorithms to multiply the matrix of a fixed dimension in a fixed field. Strassen’s algorithm, which they mentioned earlier as an important example in which they found an improvement, multiplies a matrix with a variable dimension in any field. This means that the algorithms that this AI finds are only applicable to a very limited scale,

Any matrix can be divided into blocks, even if they are not square. For example, the matrix 8×4 A and the multiplication of the matrix 4×8 B can be divided into 4x4A0, 4x4A1, 4x4B0 and 4x4B1. The result is then 4x4A0 x 4x4B0 + 4x4A1 x 4x4B1. This is also called tiling and is very important when executing on a GPU due to the high degree of parallelization.

It is likely that for any application where an efficient matrix multiplication algorithm is found, the AI ​​will have to be retrained (although of course some dimensions and fields are more important than others, for example 3×3 matrices on rational numbers are common in cases involving on the three-dimensional physical world).

Going further, artificial intelligence must be trained on each new type of device. I think this is inevitable, because executing on tensor cores can achieve very different performance than executing on cuda cores. On the one hand, this is great, because the performance is validated instantly against a real-world application and you are always sure that you are getting the most out of it. On the other hand, it takes more work.

See also  PostNL employees in the Netherlands forced to strike | Abroad