Language models like ChatGPT have revolutionized the field of natural language processing, but they still struggle with some basic tasks like arithmetic and fact-checking. Last Thursday, researchers at Meta revealed Toolformer, an AI language model that can teach itself to use external tools such as search engines, calculators, and calendars without sacrificing its core language modeling capabilities.
The key to Toolformer is that it can use APIs (application programming interfaces), which are sets of protocols that allow different applications to communicate with one another, often seamlessly and in an automated manner. During training, the researchers gave Toolformer a small set of human-written examples showing how each API is used, then let it annotate a large language modeling dataset with potential API calls. It did so in a “self-supervised” manner, meaning it could learn without needing explicit human guidance.
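The filtering idea behind that self-supervision, roughly, is that a candidate API call is worth keeping only if inserting its result makes the model’s prediction of the text that follows measurably easier. The sketch below illustrates that criterion; `lm_loss` and `call_api` are hypothetical stand-ins for a real language model scorer and tool executor, not Meta’s actual code.

```python
# Toy illustration of Toolformer-style self-supervised filtering.

def lm_loss(prefix: str, continuation: str) -> float:
    """Placeholder for a language model's loss on `continuation` given `prefix`.
    A real implementation would sum token-level negative log-likelihoods."""
    return float(len(continuation))  # dummy stand-in, not a real LM score

def call_api(tool: str, query: str) -> str:
    """Placeholder tool executor; a real one would hit a calculator, search
    engine, calendar, etc."""
    return {"Calculator": "350", "Calendar": "Today is Thursday."}.get(tool, "")

def keep_call(prefix: str, rest: str, tool: str, query: str,
              margin: float = 1.0) -> bool:
    """Keep a candidate API call only if splicing in its result lowers the
    model's loss on the rest of the text by at least `margin`."""
    result = call_api(tool, query)
    annotated_prefix = f"{prefix} [{tool}({query}) -> {result}] "
    return lm_loss(prefix, rest) - lm_loss(annotated_prefix, rest) >= margin
```

Calls that survive this filter become part of the training data, so the model learns where a tool genuinely helps rather than being told by human annotators.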
The model learned to predict each text-based API call as if it were any other form of text. When it’s running, generating text in response to human input, it can insert the calls as needed. Toolformer can also “decide” which tool is right for a given context and how to use it.
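As a rough sketch of what that could look like at inference time, a decoding loop might pause whenever the model emits a call marker, execute the tool, and splice the result back into the text before resuming. The `generate_until` and `call_api` interfaces below are assumptions made for illustration, not the paper’s actual API.

```python
import re

# Matches a bracketed call like "[Calculator(1400 / 4)" emitted by the model.
CALL = re.compile(r"\[(\w+)\(([^)]*)\)")

def generate_with_tools(prompt: str, generate_until, call_api) -> str:
    """Decode text, pausing to execute any API call the model emits.

    `generate_until(text)` is assumed to return (new_chunk, hit_call), where
    hit_call is True if decoding stopped right after emitting "[Tool(args)".
    """
    text = prompt
    while True:
        chunk, hit_call = generate_until(text)
        text += chunk
        if not hit_call:
            return text  # model finished without requesting a tool
        match = CALL.search(chunk)
        if match is None:
            return text
        tool, query = match.groups()
        text += f" -> {call_api(tool, query)}]"  # splice result, resume decoding
```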
This API-calling capability allows Toolformer to use external software tools such as search engines, calculators, language translators, and factual references. For example, large language models (LLMs) are known to be poor at arithmetic. Toolformer can circumvent this limitation by using a calculator program. Or if someone wanted an LLM-based assistant to add a date to their calendar, Toolformer could do that job via an API link to a calendar app.
An illustration provided by Meta researcher Timo Schick shows Toolformer making an API call to a calendar app.

An illustration provided by Meta researcher Timo Schick shows Toolformer making an API call to a calculator app.

An illustration provided by Meta researcher Timo Schick shows Toolformer making an API call to an external factual reference.
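The calculator case mentioned above is easy to make concrete. Below is a minimal, illustrative sketch of a calculator tool of the sort an LLM could invoke through an API; it is not code from the Toolformer paper or Meta’s implementation.

```python
import ast
import operator

# Operators the toy calculator supports.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression: str) -> str:
    """Safely evaluate a basic arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return str(walk(ast.parse(expression, mode="eval").body))

print(calculator("1400 / 4"))  # -> 350.0
```

The language model never has to do the arithmetic itself; it only has to learn when to hand the expression off and where to place the returned answer.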
Toolformer is based on a pre-trained GPT-J model with 6.7 billion parameters. Experiments the researchers conducted on a range of tool-using tasks appear to show that Toolformer performs far better than the much larger GPT-3 model, which contains 175 billion parameters.
This isn’t the first time researchers have attempted to overcome limitations in language models. In fact, the Bing Chat model making news this week can perform web searches on its own when needed, and others have tried integrating language models with browsers, calculators, and search engines. According to the Meta researchers, most existing approaches to integrating tools into language models have relied on large amounts of human annotation or have been limited to particular task-specific settings. In contrast, Toolformer can learn to use a range of tools in a general way that doesn’t require specialized training for specific tasks.
Using techniques such as those found in Toolformer, we can look toward a potential future where LLMs, complemented by the ability to use external apps, become far more versatile and (allegedly) more reliable assistants. But the ability to make API calls could also increase an LLM’s capacity to harm user data (in apps) or cause problems in the outside world (via a web browser or communication tools), capabilities it might inadvertently invoke while providing an answer.