OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models-Hardware News-php.cn

Home

Hardware Tutorial

Hardware News

OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models

DDD

Sep 19, 2024 am 03:22 AM

openai laptop test Notebook review reviews tests reports netbook STEM o1 o1-mini

OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models

OpenAI o1 and o1-mini have arrived. These AI LLMs perform much better on coding, math, and science problems and tasks than prior models such as GPT-4o by taking more time to think.

Complex problems in STEM tend to require more than a quick online search for correct answers. By giving the o1 AI more time to think, the AI can reason more carefully and accurately. The o1-mini model has been specifically tuned to answer STEM questions with faster speed and lower demand on computer resources, and it is notably better at coding than the o1 model.

Across a range of standardized AP exams and STEM tests for LLMs, the o1 models perform with high accuracy. Specifically, on the AP Calculus, AP Chemistry, AP Physics 2, LSAT, and SAT evidence-based reading & writing tests, the o1 models perform at or above the B-grade level (~80% or higher). The models answer accurately at the A-grade level on PhD-level physics questions, at the B-grade level on tough 2024 American Invitational Mathematics Examination math questions, and at the high B-grade level on Codeforces coding problems. Because o1 has been tuned for answering STEM questions, its performance on AP English Language and AP English Literature is at or below the C-grade level.

Interestingly, while GPT-4o is dumbfounded by the cryptographic challenge of decoding “oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz” when given the hint “oyfjdnisdr rtqwainr acxz mynzbhhx” means “Think step by step”, o1 had no issues thinking through the problem to come up with the correct answer “There are three r’s in strawberry”. This new power will delight hobby cryptographers at home as well as the NSA.

Closet evil-doers will want to know that while the uncensored o1 models are apt to give troubling replies, OpenAI has neutered these models for release. The o1 models have been tested to resist answering questions about making bioweapons, producing naughty images, jailbreaking itself, and harassing and threatening. Unfortunately, the OpenAI o1 models remain gender and race biased when tested, despite tuning efforts.

ChatGPT Plus and Team users along with API usage tier 5 developers have access to o1 models immediately, and ChatGPT Edu and Enterprise users will gain access on the week of September 16. ChatGPT Free users will gain access to o1-mini in the near future. The o1 models cannot browse the web or accept uploaded files and images to answer questions, so OpenAI recommends users continue using their GPT-4o models for general questions.

Users who want to ask AI questions now have a wide-range of capable LLM models to interact with besides those from OpenAI, including Anthropic Claude, Microsoft CoPilot, Google Gemini, and X Grok. Every AI has specific advantages, so it is worth testing several AI models to find one that best suits individual needs. Some of these AI are built into smart glasses (like these on Amazon) and voice recorders (like this one on Amazon), and some upcoming autonomous humanoid robots use proprietary AI to cook and clean.

OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models

The above is the detailed content of OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

How to fix KB5055523 fails to install in Windows 11?

4 weeks ago By DDD

How to fix KB5055518 fails to install in Windows 10?

4 weeks ago By DDD

Roblox: Grow A Garden - Complete Mutation Guide

3 weeks ago By DDD

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

How to fix KB5055612 fails to install in Windows 10?

3 weeks ago By DDD

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial

1664

CakePHP Tutorial

1423

Laravel Tutorial

1317

PHP Tutorial

1268

C# Tutorial

1243

Related knowledge

Huawei Watch GT 5 smartwatch gets update with new features Oct 03, 2024 am 06:25 AM

Huawei is rolling out software version 5.0.0.100(C00M01) for the Watch GT 5 and the Watch GT 5 Prosmartwatchesglobally. These two smartwatches recently launched in Europe, with the standard model arriving as the company’s cheapest model. This Harmony

$Tekken\'s Colonel Sanders dream fried by KFC$ Tekken\'s Colonel Sanders dream fried by KFC Oct 02, 2024 am 06:07 AM

Katsuhiro Harada, the Tekken series director, once seriously tried to bring Colonel Sanders into the iconic fighting game. In an interview with TheGamer, Harada revealed that he pitched the idea to KFC Japan, hoping to add the fast-food legend as a g

Cybertruck FSD reviews praise quick lane switching and full-screen visualizations Oct 01, 2024 am 06:16 AM

Tesla is rolling out the latest Full Self-Driving (Supervised) version 12.5.5 and with it comes the promised Cybertruck FSD option at long last, ten months after the pickup went on sale with the feature included in the Foundation Series trim price. F

Garmin releases Adventure Racing activity improvements for multiple smartwatches via new update Oct 01, 2024 am 06:40 AM

Garmin is ending the month with a new set of stable updates for its latest high-end smartwatches. To recap, the company released System Software 11.64 to combat high battery drain across the Enduro 3, Fenix E and Fenix 8 (curr. $1,099.99 on Amazon).

New Xiaomi Mijia Graphene Oil Heater with HyperOS arrives Oct 02, 2024 pm 09:02 PM

Xiaomi will shortly launch the Mijia Graphene Oil Heater in China. The company recently ran a successful crowdfunding campaign for the smart home product, hosted on its Youpin platform. According to the page, the device has already started to ship to

First look: Leaked unboxing video of upcoming Anker Zolo 4-port 140W wall charger with display Oct 01, 2024 am 06:32 AM

Earlier in September 2024, Anker's Zolo 140W charger was leaked, and it was a big deal since it was the first-ever wall charger with a display from the company. Now, a new unboxing video from Xiao Li TV on YouTube gives us a first-hand look at the hi

Samsung Galaxy Z Fold Special Edition revealed to land in late October as conflicting name emerges Oct 01, 2024 am 06:21 AM

The launch of Samsung's long-awaited 'Special Edition' foldable has taken another twist. In recent weeks, rumours about the so-called Galaxy Z Fold Special Edition went rather quiet. Instead, the focus has shifted to the Galaxy S25 series, including

$Manjaro 24.1 \'Xahea\' launches with KDE Plasma 6.1.5, VirtualBox 7.1, and more$ Manjaro 24.1 \'Xahea\' launches with KDE Plasma 6.1.5, VirtualBox 7.1, and more Oct 02, 2024 am 06:06 AM

With a history of over one decade, Manjaro is regarded as one of the most user-friendly Linux distros suitable for both beginners and power users, being easy to install and use. Mostly developed in Austria, Germany, and France, this Arch-based distro

See all articles