Improved use of embedded Experts is the secret to DeepSeek’s success
The division of labour within insect colonies leads to greater efficiency. This specialisation of tasks has an LLM analogue in which several expert networks collaborate to process tokens more efficiently. The number of experts varies widely between models, from a handful to several hundred.
This approach is called a Mixture of Experts; the benefit accruing from experts is determined by the gating network that allocates tasks to the individual experts. Individual expert networks do not correspond to the human concept of an expert. Instead, they define the internal structure of the LLM in a way that improves the processing of vectors.
This is a well-known method of simplifying problems that was employed long before LLMs existed. LLMs recognise and intercept input vectors and allocate them to experts that provide fast and accurate solutions. Of course, realising the full benefit of experts depends upon the LLM identifying that it has the dedicated resources to process certain types of vectors efficiently.
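The allocation step described above can be illustrated with a minimal sketch of top-k gating. This is an illustration under stated assumptions, not DeepSeek's implementation: the function names (`route`, `moe_forward`), the toy gate weights, and the "experts" that simply scale their input are all hypothetical, and real systems use learned weights and full neural networks as experts.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(token, gate_weights, top_k=2):
    """Return [(expert_index, weight), ...] for the top_k experts."""
    # One gate score per expert: dot product of the token vector
    # with that expert's row of the (hypothetical) gate matrix.
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalise so the weights of the chosen experts sum to 1.
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(token, gate_weights, experts, top_k=2):
    # Only the selected experts run; the rest are skipped entirely,
    # which is where the efficiency gain comes from.
    out = [0.0] * len(token)
    for idx, weight in route(token, gate_weights, top_k):
        expert_out = experts[idx](token)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out

def make_scaling_expert(factor):
    # Toy stand-in for an expert network: scales the vector by a constant.
    return lambda vec: [factor * v for v in vec]

experts = [make_scaling_expert(f) for f in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.0, -1.0],
]
token = [2.0, 1.0]
routing = route(token, gate_weights, top_k=2)
output = moe_forward(token, gate_weights, experts, top_k=2)
print(routing)
print(output)
```

With four experts and `top_k=2`, only half of the expert computation is performed for this token, yet the output is still a weighted blend of the most relevant experts.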
The DeepSeek team could not use the fastest GPUs: access was restricted to devices that complied with US export regulations, which would otherwise have made the training and operation of their LLM excessively slow. As always, necessity is the mother of invention; an important part of their groundbreaking numerical innovations was the efficient selection, training and use of experts.
This enabled the team to build their LLM product using 2048 derated GPUs in three months at an estimated cost of just $5M.
In addition, the DeepSeek team knew that all LLMs contain a large amount of redundancy, which increases the time taken to generate an output. This redundancy was removed by processing millions of inputs through the LLM and eliminating any part of the model that never contributed to generating an output.
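The idea of removing parts that never contribute can be sketched as usage counting followed by pruning. This is only an illustration of the principle as described here, not DeepSeek's documented procedure: the router, the sample data, and the names `count_expert_usage` and `prune_unused` are hypothetical, and the prunable units are assumed to be whole experts.

```python
from collections import Counter

def count_expert_usage(samples, router, num_experts):
    """Run every sample through the router and count how often each expert fires."""
    usage = Counter({i: 0 for i in range(num_experts)})
    for sample in samples:
        for idx in router(sample):
            usage[idx] += 1
    return usage

def prune_unused(experts, usage):
    """Keep only the experts that were selected at least once."""
    return {i: e for i, e in experts.items() if usage[i] > 0}

def toy_router(sample):
    # Hypothetical router: selects one expert from {0, 1, 2},
    # so expert 3 is never used in this toy setup.
    return [sum(sample) % 3]

# Placeholder experts; in a real model these would be sub-networks.
experts = {i: f"expert_{i}" for i in range(4)}
samples = [[n, n + 1] for n in range(1000)]

usage = count_expert_usage(samples, toy_router, num_experts=4)
kept = prune_unused(experts, usage)
print(sorted(kept))
```

In practice a threshold above zero and a check on output quality after pruning would be needed, since a rarely used component may still matter for unusual inputs.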
What lessons does DeepSeek have for LLMs and the Creative Industries?
Firstly, there are many further opportunities to improve the efficiency of data-driven AI, which will promote the products offered by lower-cost competitors to NVIDIA.
Secondly, end-to-end AI solutions are not always a desirable choice for companies that have unique expertise within a business sector. The best AI solution for a company may be a strategy that incorporates existing expertise.
Thirdly, it is probable that SMEs, and certainly trade organisations, will be able to fund and operate private AI systems that exclusively serve a single vertical market.
We will explore the use of more specialised types of experts within Creative Industries in future discussions.