
Zhipu AI and Tsinghua KEG Release CogVLM-17B, an Open-Source Multimodal Large Model


Bianews, October 12: Zhipu AI and Tsinghua KEG have released and directly open-sourced the multimodal large model CogVLM-17B in the ModelScope (Moda) community. CogVLM is reported to be a powerful open-source visual language model that uses a visual expert module to deeply fuse language and vision representations, and it has achieved state-of-the-art (SOTA) performance on 14 authoritative cross-modal benchmarks.

CogVLM-17B currently ranks first in overall performance on authoritative multimodal academic leaderboards, achieving first- or second-place results on 14 datasets. CogVLM's effectiveness rests on the idea of "visual priority": giving visual understanding higher priority within the multimodal model. It uses a 5B-parameter visual encoder and a 6B-parameter visual expert module, for a total of 11B parameters devoted to modeling image features, even more than the 7B parameters devoted to text.
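The visual expert described above can be pictured as a parallel set of attention weights that image tokens route through, while text tokens keep the language model's original weights, so both modalities attend jointly in one sequence. The toy PyTorch sketch below illustrates only this routing idea; the class name, shapes, and masking scheme are assumptions made for illustration and are not taken from the released CogVLM-17B code.

# Illustrative sketch only: a toy attention layer with a "visual expert".
# All names, shapes and the token-routing scheme are assumptions for
# illustration, not the released CogVLM-17B implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualExpertAttention(nn.Module):
    """Self-attention where image tokens use their own QKV projections
    (the "visual expert") while text tokens keep the language model's
    original projections; both modalities share one attention space."""

    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        # Original language-model projections (frozen in the real model).
        self.qkv_text = nn.Linear(dim, 3 * dim)
        # Visual expert: a parallel set of projections for image tokens.
        self.qkv_image = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, is_image: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim); is_image: (batch, seq) bool mask of image tokens.
        b, s, d = x.shape
        qkv = torch.where(
            is_image.unsqueeze(-1),   # route each token to its expert weights
            self.qkv_image(x),
            self.qkv_text(x),
        )
        q, k, v = qkv.chunk(3, dim=-1)
        # Reshape to (batch, heads, seq, head_dim) and attend jointly over
        # the mixed image+text sequence.
        q, k, v = (t.view(b, s, self.num_heads, self.head_dim).transpose(1, 2)
                   for t in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v)
        attn = attn.transpose(1, 2).reshape(b, s, d)
        return self.out(attn)

# Tiny smoke test with random features: 4 image tokens then 6 text tokens.
layer = VisualExpertAttention(dim=64, num_heads=4)
x = torch.randn(1, 10, 64)
is_image = torch.tensor([[True] * 4 + [False] * 6])
print(layer(x, is_image).shape)  # torch.Size([1, 10, 64])

The design point this sketch tries to capture is why the image-side parameter count can exceed the text side: each layer carries an extra, vision-only copy of its projection weights, which adds image-modeling capacity without overwriting the pretrained language weights.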

Source: sohu.com