Meta, the parent company of Facebook, has revealed an AI language model called Toolformer that can teach itself to use external tools such as calculators, search engines, and calendars, without sacrificing its core language modeling abilities. While language models like ChatGPT have revolutionized natural language processing, they often struggle with basic tasks like fact-checking and arithmetic.
The secret to Toolformer’s success is its ability to use APIs (application programming interfaces), which allow different applications to communicate with each other in an automated and seamless way. During training, the researchers gave Toolformer a small set of human-written examples showing how each API is used; the model then annotated a large language-modeling dataset with potential API calls in a “self-supervised” way, without explicit human guidance. Crucially, a sampled call is kept only if executing it actually helps the model predict the rest of the text.
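To make that filtering idea concrete, here is a minimal sketch of the self-supervised annotation step in Python. The `lm_loss` and `execute_api` functions are illustrative stubs, and the loss comparison is a simplification of the paper's actual criterion; none of these names come from Meta's code.

```python
# A minimal sketch of the self-supervised filtering step, assuming
# placeholder interfaces. None of these functions are from Meta's code.

def lm_loss(model, text: str) -> float:
    """Cross-entropy loss the language model assigns to `text` (stub)."""
    ...

def execute_api(name: str, query: str) -> str:
    """Run the named external tool on `query` and return its text output (stub)."""
    ...

def annotate(model, text: str, candidates: list, threshold: float) -> str:
    """Keep a sampled API call only if splicing the call and its result into
    the text lowers the model's loss by at least `threshold` (a simplified
    version of the paper's criterion)."""
    baseline = lm_loss(model, text)
    kept = []
    for pos, name, query in candidates:   # each candidate: (position, tool, query)
        result = execute_api(name, query)
        call = f"[{name}({query}) -> {result}] "
        augmented = text[:pos] + call + text[pos:]
        if baseline - lm_loss(model, augmented) > threshold:
            kept.append((pos, call))
    for pos, call in sorted(kept, reverse=True):  # splice right-to-left so
        text = text[:pos] + call + text[pos:]     # earlier positions stay valid
    return text
```

In the paper the comparison is somewhat more refined (a weighted loss over the tokens that follow the call), but the keep-it-only-if-it-helps logic is the same.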
Toolformer learned to predict each text-based API call as if it were any other piece of text. When generating a response to human input, it can insert the calls wherever they are needed, deciding on its own which tool fits the context and how to invoke it.
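At inference time, this amounts to a decoding loop that watches for the bracketed call notation, pauses generation to run the tool, and splices the result back into the text. The sketch below illustrates that flow under stated assumptions: `model.next_token`, the regular expression, and the single-token handling of `]` are simplifications for the example, not a published interface.

```python
import re

# Illustrative decoding loop: the model emits an API call as ordinary text;
# the runtime detects it, runs the tool, splices in the result, and lets
# generation continue. `model.next_token` is an assumed interface.

CALL = re.compile(r"\[(\w+)\(([^)]*)\)$")    # an as-yet-unclosed "[Tool(args)"

def execute_api(name: str, query: str) -> str:
    ...  # dispatch to calculator, search engine, calendar, etc. (stub)

def generate_with_tools(model, prompt: str, max_tokens: int = 256) -> str:
    text = prompt
    for _ in range(max_tokens):
        token = model.next_token(text)       # assumed generation interface
        if token == "]":                     # the model is closing an API call
            match = CALL.search(text)
            if match:
                result = execute_api(*match.groups())
                text += f" -> {result}]"     # splice in the result, keep going
                continue
        text += token
    return text
```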
This API-calling ability lets Toolformer use external software tools such as search engines, calculators, language translators, and factual references. For instance, Toolformer can hand arithmetic off to a calculator program, working around large language models' notoriously poor math skills. And if someone wanted an LLM-based assistant to add a date to their calendar, Toolformer could handle the task through an API connection to a calendar app.
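A toy tool registry makes this dispatch concrete. The calculator and calendar wrappers below are hypothetical stand-ins chosen for the example; the paper does not prescribe these interfaces.

```python
from datetime import date

# Hypothetical tool registry for the examples above; not Meta's actual APIs.
TOOLS = {
    # eval() is fine for a demo but unsafe on untrusted input
    "Calculator": lambda q: f"{eval(q, {'__builtins__': {}}):.2f}",
    "Calendar": lambda q: f"event '{q}' added for {date.today().isoformat()}",
}

def execute_api(name: str, query: str) -> str:
    return TOOLS[name](query)

# An annotated sentence in Toolformer's bracket notation might then read:
#   "Out of 1400 participants, 400 [Calculator(400 / 1400) -> 0.29] passed."
print(execute_api("Calculator", "400 / 1400"))   # 0.29
```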
Toolformer is based on a pre-trained GPT-J model with 6.7 billion parameters. In experiments on a variety of tool-using tasks, it outperformed the much larger GPT-3 model, which contains 175 billion parameters.
Researchers have tried to address the limitations of language models before, but most approaches either require large amounts of human annotation or are confined to task-specific settings. Toolformer, by contrast, learns to use a range of tools in a generalized way, without specialized training for each task.
With Toolformer, language models could become far more versatile and reliable assistants. But the same ability cuts both ways: a model could damage user data or cause trouble in the outside world through APIs it invokes unintentionally while composing an answer.
In conclusion, Toolformer represents a significant advancement in natural language processing and offers exciting possibilities for the future of AI language models.