Google DeepMind’s FunSearch is touted as the first AI system of its kind capable of making reliable mathematical discoveries. That reliability stems from a built-in “anti-hallucination” mechanism that filters out erroneous or useless results, which significantly improves the quality of its answers.
The Power of FunSearch
FunSearch pairs a pre-trained large language model (LLM) with an automated scoring system, and Google DeepMind claims it can solve complex mathematical problems. According to the company, such problems suit the approach because the generated solutions can be verified quickly and automatically. Working together, the two components allow proposed solutions to evolve over many iterations, sometimes leading to scientific discoveries.
Comparing FunSearch with Other AI Tools
Google DeepMind’s latest AI tools, such as GNoME, owe their remarkable capabilities to their specificity. These models, trained on precise datasets within a single domain (like chemistry), are less prone to errors than general-purpose LLMs such as ChatGPT or Gemini. The latter, trained on large and diverse datasets, are often susceptible to hallucinations.
The Challenge of Hallucinations in AI
The propensity of LLMs to hallucinate has been a point of contention. Some experts argue that LLMs work much as scientists do when solving problems: they generate numerous ideas, only some of which prove useful. On this view, the future of LLMs hinges less on their size than on their ability to generate reliable answers or to filter out unreliable ones.
The Solution: FunSearch
Google DeepMind engineers present their new tool, FunSearch, as a significant stride toward that kind of filtering. Its pre-trained LLM proposes candidate solutions, and its automated scoring system verifies them quickly, which is what makes complex mathematical problems manageable for the tool.
The Future of AI with Google DeepMind’s FunSearch
Up to 90% of the results generated for a complex problem can be either useless or incorrect, so filtering is essential. The complete system takes as input a description of the problem together with the source code of an initial solution. The LLM then generates a set of new candidate solutions, and the automated scoring system checks each one for correctness. According to Google DeepMind, this closed loop generates millions of potential solutions and eventually converges on a result that may improve on the best-known solution.
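The closed loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DeepMind’s implementation: `evolve`, `toy_propose`, and `toy_score` are hypothetical names, a random mutation stands in for the LLM, and a trivial numeric check stands in for the automated scoring system.

```python
import random


def evolve(seed, propose, score, rounds=200, pool_size=5, rng=None):
    """Generate-score-select loop (illustrative, not FunSearch itself).

    propose(parent, rng) stands in for the LLM.
    score(candidate) stands in for the automated scorer and returns
    None when a candidate is invalid.
    """
    rng = rng or random.Random(0)
    seed_score = score(seed)
    if seed_score is None:
        raise ValueError("the seed candidate must be valid")
    pool = [(seed_score, seed)]
    for _ in range(rounds):
        _, parent = rng.choice(pool)          # pick a surviving candidate
        child = propose(parent, rng)          # ask the "LLM" for a variation
        child_score = score(child)
        if child_score is None:
            continue                          # filter out invalid results
        pool.append((child_score, child))
        pool.sort(key=lambda pair: pair[0], reverse=True)
        del pool[pool_size:]                  # keep only the best few
    return pool[0]


def toy_score(candidate):
    # Hypothetical scorer: values outside 0..9 are invalid,
    # and candidates closer to (7, 7, 7) score higher.
    if any(not 0 <= x <= 9 for x in candidate):
        return None
    return -sum((x - 7) ** 2 for x in candidate)


def toy_propose(parent, rng):
    # Hypothetical stand-in for the LLM: nudge one coordinate at random.
    child = list(parent)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-3, -1, 1, 3])
    return tuple(child)


best_score, best = evolve((0, 0, 0), toy_propose, toy_score)
print(best, best_score)
```

Many of the proposals in this toy run are rejected or score worse than their parent, yet the pool can only improve, because invalid candidates never enter it and weak ones are dropped.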
FunSearch in Action
Google DeepMind researchers tested their new system by challenging it to find a solution to the “cap set problem.” This task involves finding the largest possible set of points in a high-dimensional grid such that no three points lie on a straight line. FunSearch produced a set of 512 points in eight dimensions, the largest such set ever found in that dimension. In their paper, the researchers describe these as the first discoveries made for open problems using an LLM.
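Part of what makes the cap set problem a good fit is that a proposed answer is cheap to verify. The sketch below assumes the standard formulation, which the article does not spell out: points are vectors with coordinates in {0, 1, 2}, and three distinct points lie on a line exactly when they sum to zero modulo 3 in every coordinate. The function name `is_cap_set` is a hypothetical choice for this example.

```python
from itertools import combinations


def is_cap_set(points):
    """Return True if no three distinct points lie on a line (mod 3)."""
    pts = set(map(tuple, points))
    for a, b in combinations(pts, 2):
        # The only point completing a line through a and b is -(a + b) mod 3.
        c = tuple((-x - y) % 3 for x, y in zip(a, b))
        if c in pts:
            return False
    return True


print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)]))  # True
print(is_cap_set([(0, 0), (1, 1), (2, 2)]))          # False
```

The check looks at every pair of points once, so even a 512-point candidate in eight dimensions is verified almost instantly.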
The second challenge was the bin-packing problem, which involves packing items of different sizes into as few bins as possible. FunSearch produced results that outperformed the algorithms typically used to solve this type of problem.
These bin-packing results could make FunSearch applicable in the transport and logistics sectors. Other AI approaches, such as neural networks and reinforcement learning, can also tackle similar problems, but deploying them requires significant resources. FunSearch, by contrast, produces computer code that Google engineers say can be easily tested and deployed.
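To make the bin-packing task concrete, here is a minimal sketch of a packer whose placement rule is a small, swappable function. It shows two classic textbook heuristics, first fit and best fit, and not the rule FunSearch found; the names `pack_online`, `first_fit`, and `best_fit` are assumptions made for this example, and items are assumed to arrive one at a time.

```python
def pack_online(items, capacity, priority):
    """Place each item in the fitting bin with the highest priority.

    Returns the remaining free space of every bin that was opened.
    """
    bins = []
    for item in items:
        if item > capacity:
            raise ValueError("item does not fit in an empty bin")
        fits = [i for i, free in enumerate(bins) if free >= item]
        if fits:
            # max() keeps the earliest bin when priorities are tied.
            chosen = max(fits, key=lambda i: priority(item, bins[i]))
            bins[chosen] -= item
        else:
            bins.append(capacity - item)  # open a new bin
    return bins


def first_fit(item, free):
    # Every bin is equally good, so the earliest fitting bin wins.
    return 0


def best_fit(item, free):
    # Prefer the bin that would be left with the least free space.
    return -(free - item)


items = [5, 7, 3, 5]
print(len(pack_online(items, 10, first_fit)))  # 3
print(len(pack_online(items, 10, best_fit)))   # 2
```

Because the whole strategy lives in one short function, a different rule can be dropped in and compared on the same items, which illustrates why code-based solutions are simple to test.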