While the rest of us were still just chatting with ChatGPT, someone was already using it to control robots.
That someone is none other than Microsoft, OpenAI's major backer, which only recently "reinvented the search engine" with ChatGPT.
Until now, the technical threshold for getting a robot to do something new has been high, and the process has been slow:
Engineers need to stay in the loop, constantly writing new code and specifications by hand to correct the robot's behavior; in addition, different robots may require different programming languages and environments.
With ChatGPT's help, engineers no longer even need to write the code by hand: they describe what they want in plain language, and the AI translates it into code the robot can run.
This means that, on the one hand, professionals can interact with robots far more efficiently; on the other hand, the technical threshold drops sharply, so that even laypeople can take part in debugging and come up with new uses.
A simple example: having a drone inspect shelves automatically.
First, the operator only needs to make a request to ChatGPT in natural language; the AI then translates it into code and directs the drone's actions. (The drone's flight path can also be specified.)
No wonder Andrej Karpathy, Tesla's former AI director, quipped:
The hottest new programming language is English.
In fact, ChatGPT has plenty more tricks up its sleeve.
For example, an operator says to the AI: "I'm thirsty, please help me find something to drink."
At this point, the AI does not go straight off to find water. Instead, it asks a sensible follow-up question:
What kind of drink would you like? There are several drinks here, such as coconut water and cola.
The operator, of course, is no pushover. Rather than telling the AI directly which one to choose, he says: "I just got back from the gym. Please help me find a healthier drink."
Then things get even more impressive:
The AI first infers that he wants coconut water, and then writes a piece of code on its own (complete with comments):
Once the code is written, it directs the drone to find the coconut water:
In addition to drones, ChatGPT can also easily control other small devices, such as cameras and robotic arms.
For example, it can have a camera look for something in the room that can heat up lunch.
It can also command a robotic arm to assemble the Microsoft logo. (A bit of sneaky self-promotion.)
Seeing this, some netizens connected the dots and asked:
Are they building the all-powerful Skynet?
Others joked that the AI might even be able to write the instructions for launching a nuclear bomb:
That said, reality is still a long way from what these netizens describe. After all, humans still need to be involved.
As the examples above show, this versatile AI not only communicates smoothly with people but can also communicate quickly with machines.
This is mainly thanks to a set of APIs and high-level function libraries developed specially by the Microsoft team.
They did not have the large language model (LLM) behind ChatGPT generate one fixed type of code, because robotics is a diverse field, and different scenarios may call for a great deal of fine-tuning.
Under the new framework, each kind of robot has its own dedicated function library, so a single AI can adapt to different robots and different tasks.
On the one hand, these function libraries can be connected to the robot control system to manage the underlying hardware, as well as the code and function modules that perform basic movements.
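To make the layering concrete, here is a minimal sketch of a high-level function library sitting on top of a low-level controller. Everything in it is an assumption made for illustration: `SimulatedDroneDriver`, `DroneLibrary`, and the method names `fly_to`, `take_off`, and `land` are hypothetical and are not taken from Microsoft's code.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedDroneDriver:
    """Hypothetical stand-in for a low-level flight controller."""
    position: tuple = (0.0, 0.0, 0.0)
    log: list = field(default_factory=list)

    def set_position_target(self, x, y, z):
        # A real driver would talk to hardware; this one only records the call.
        self.log.append(("set_position_target", x, y, z))
        self.position = (x, y, z)


class DroneLibrary:
    """Hypothetical high-level library: the only functions the LLM may call."""

    def __init__(self, driver):
        self._driver = driver

    def get_position(self):
        return self._driver.position

    def fly_to(self, x, y, z):
        self._driver.set_position_target(float(x), float(y), float(z))

    def take_off(self, altitude=1.0):
        x, y, _ = self._driver.position
        self.fly_to(x, y, altitude)

    def land(self):
        x, y, _ = self._driver.position
        self.fly_to(x, y, 0.0)
```

Swapping `SimulatedDroneDriver` for a real controller would leave the high-level function names unchanged, which is the property the framework relies on.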
On the other hand, for ChatGPT to follow the rules of the function library, predefined function names are crucial. Clear function names establish meaningful connections between the APIs and ultimately lead to higher-quality answers.
One requirement is that every API name must describe the function's overall behavior. For example, the detect_object(object_name) function can be linked internally to an OpenCV function or a computer vision model.
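As a rough illustration of how a descriptively named function can hide its backend, the sketch below wraps a pluggable detector behind `detect_object(object_name)`. Only the function name comes from the article; the return value (an `(x, y, z)` tuple or `None`), the `make_detect_object` helper, and the dictionary-based stub backend are assumptions standing in for an OpenCV routine or a vision model.

```python
from typing import Callable, Optional

# Assumed backend contract: take an object name, return (x, y, z) or None.
Detector = Callable[[str], Optional[tuple]]


def make_detect_object(backend: Detector):
    """Build a detect_object function bound to a specific vision backend."""

    def detect_object(object_name):
        """Return the (x, y, z) position of object_name, or None if not found."""
        return backend(object_name.strip().lower())

    return detect_object


# Stub backend for illustration only; a real one might call OpenCV or a model.
KNOWN_OBJECTS = {
    "coconut water": (2.0, 1.0, 0.5),
    "cola": (2.0, 1.5, 0.5),
}


def lookup_backend(name):
    return KNOWN_OBJECTS.get(name)


detect_object = make_detect_object(lookup_backend)
```

Because the name and signature stay the same whichever backend is plugged in, code generated against `detect_object` does not need to change when the vision stack does.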
After designing the library and API, Microsoft wrote a text prompt for ChatGPT that describes the target task and states explicitly which functions in the library are available; the prompt can also specify which programming language ChatGPT should use to generate code.
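A prompt of this kind could be assembled along the following lines. This is only a sketch: the wording of the prompt, the `build_prompt` helper, and the two example functions are hypothetical, and Microsoft's actual prompts may look quite different.

```python
import inspect


def fly_to(x, y, z):
    """Fly the drone to the given coordinates."""


def detect_object(object_name):
    """Return the position of the named object, or None if not found."""


def build_prompt(task, functions, language="Python"):
    """Describe the task, the allowed functions, and the output language."""
    lines = [
        f"You are controlling a robot. Write {language} code only.",
        "You may call only these functions:",
    ]
    for fn in functions:
        doc = inspect.getdoc(fn) or "No description."
        lines.append(f"- {fn.__name__}{inspect.signature(fn)}: {doc}")
    lines.append(f"Task: {task}")
    return "\n".join(lines)


prompt = build_prompt("Find a healthy drink.", [fly_to, detect_object])
```

Generating the function list from the library itself, rather than typing it by hand, keeps the prompt in step with the functions that actually exist.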
It is worth mentioning that the quality of AI-generated content is positively correlated with the quality of the human prompt. To this end, Microsoft has also built PromptCraft, a collaborative open-source platform where anyone can share prompt strategies for different types of robots.
At this point, the behind-the-scenes setup is essentially complete, and the user can control the robot indirectly simply by speaking plain language.
To check whether the AI-generated code has bugs, the user can inspect it directly in the chat box at any time or test it in a simulator, and can then use natural language to guide the AI in making corrections.
The ChatGPT-generated code is deployed to the robot only once the user is satisfied with the solution.
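The check-then-deploy loop might be sketched as follows. All names here (`SimulatedRobot`, `run_in_simulator`, `deploy_if_approved`) are hypothetical, and the use of `exec` with an emptied builtins table is an illustrative shortcut, not a real security sandbox.

```python
class SimulatedRobot:
    """Hypothetical simulator that records actions instead of moving hardware."""

    def __init__(self):
        self.actions = []

    def fly_to(self, x, y, z):
        self.actions.append(("fly_to", x, y, z))


def run_in_simulator(generated_code, robot):
    """Run generated code against the simulator; return (ok, message)."""
    # Expose only the whitelisted library functions to the generated code.
    namespace = {"__builtins__": {}, "fly_to": robot.fly_to}
    try:
        exec(generated_code, namespace)
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


def deploy_if_approved(generated_code, approved, deploy):
    """Deploy only if the simulation succeeds and the human has approved."""
    ok, _ = run_in_simulator(generated_code, SimulatedRobot())
    if ok and approved:
        deploy(generated_code)
        return True
    return False
```

An error message returned by `run_in_simulator` can be pasted back into the chat so that the AI can be asked, in plain language, to correct its code.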
Finally, if it were up to you, what would you have a ChatGPT-controlled robot do?
Paper address: https://www.microsoft.com/en-us/research/uploads/prod/2023/02/ChatGPT___Robotics.pdf
Reference link:
[3] https://github.com/microsoft/PromptCraft-Robotics#promptcraft-robotics