


OpenAI o1 and o1-mini arrive as AIs that handle STEM questions better than prior models
OpenAI o1 and o1-mini have arrived. These LLMs perform much better on coding, math, and science tasks than prior models such as GPT-4o by taking more time to think.
Complex problems in STEM tend to require more than a quick online search for correct answers. Given more time to think, o1 can reason more carefully and accurately. The o1-mini model has been specifically tuned to answer STEM questions faster and with lower demand on computing resources, and it is notably better at coding than the o1 model.
Across a range of standardized exams and STEM tests for LLMs, the o1 models perform with high accuracy. Specifically, on the AP Calculus, AP Chemistry, AP Physics 2, LSAT, and SAT evidence-based reading & writing tests, the o1 models perform at or above the B-grade level (~80% or higher). The models answer at the A-grade level on PhD-level physics questions, at the B-grade level on tough questions from the 2024 American Invitational Mathematics Examination, and at the high B-grade level on Codeforces coding problems. Because o1 has been tuned for answering STEM questions, its performance on AP English Language and AP English Literature is at or below the C-grade level.
Interestingly, while GPT-4o is dumbfounded by the cryptographic challenge of decoding “oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz” when given the hint that “oyfjdnisdr rtqwainr acxz mynzbhhx” means “Think step by step”, o1 has no trouble thinking through the problem to reach the correct answer, “There are three r’s in strawberry”. This new power will delight hobby cryptographers at home as well as the NSA.
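The encoding rule is not spelled out here, but one rule consistent with the hint pair is that every two ciphertext letters stand for one plaintext letter, found by averaging their alphabet positions (a=1 through z=26). The Python sketch below works under that assumption; `decode_pairs` is a hypothetical helper name, and the decoded text comes out lowercase and without the apostrophe.

```python
def decode_pairs(ciphertext: str) -> str:
    """Decode text by averaging the alphabet positions of each letter pair.

    Assumption: each plaintext letter was encoded as two letters whose
    positions (a=1 ... z=26) average to the plaintext letter's position.
    """
    words = []
    for word in ciphertext.lower().split():
        letters = []
        # Walk the word two letters at a time.
        for first, second in zip(word[0::2], word[1::2]):
            # ord("a") is 97, so subtracting 96 gives a=1 ... z=26.
            mean = ((ord(first) - 96) + (ord(second) - 96)) // 2
            letters.append(chr(mean + 96))
        words.append("".join(letters))
    return " ".join(words)


hint = decode_pairs("oyfjdnisdr rtqwainr acxz mynzbhhx")
answer = decode_pairs(
    "oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"
)
print(hint)    # think step by step
print(answer)  # there are three rs in strawberry
```

The first pair, "oy", shows the idea: o is letter 15 and y is letter 25, and their average, 20, is t.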
Closet evil-doers will want to know that while the uncensored o1 models are apt to give troubling replies, OpenAI has neutered these models for release. The o1 models have been tested to resist requests to help make bioweapons, produce explicit images, jailbreak the models themselves, or harass and threaten people. Unfortunately, the OpenAI o1 models still show gender and racial bias when tested, despite tuning efforts.
ChatGPT Plus and Team users, along with developers on API usage tier 5, have access to the o1 models immediately, and ChatGPT Edu and Enterprise users will gain access in the week of September 16. ChatGPT Free users will gain access to o1-mini in the near future. The o1 models cannot browse the web or accept uploaded files and images, so OpenAI recommends that users continue using its GPT-4o models for general questions.
Users who want to ask AI questions now have a wide range of capable LLMs to interact with besides those from OpenAI, including Anthropic Claude, Microsoft Copilot, Google Gemini, and xAI Grok. Every AI has specific advantages, so it is worth testing several models to find the one that best suits individual needs. Some of these AIs are built into smart glasses and voice recorders, and some upcoming autonomous humanoid robots use proprietary AI to cook and clean.