OpenAI GPT-4 Multimodal AI Arriving Mid-March 2023

Andreas Braun, CTO of Microsoft Germany, confirmed that GPT-4 will be multimodal and arrive within a week of March 9, 2023. Multimodal AI means the model can work with several kinds of input, such as video, images, and sound.

Multimodal Large Language Models

The announcement’s main takeaway is that GPT-4 is multimodal (as SEJ predicted in January 2023).

The term “modality” refers to the kind of input that, in this instance, is handled by a large language model.

Text, speech, images, and video are all examples of modalities.

GPT-3 and GPT-3.5 operated in only one modality: text.

GPT-4 may be able to operate in at least four modalities, according to the German news report: images, text, sound (auditory), and video.

The reporting lacked specifics about GPT-4, so it is unclear whether what was shared about multimodality was specific to GPT-4 or applied to multimodal models in general.

Holger Kenn, Director of Business Strategy for Microsoft, spoke about multimodality, but it wasn’t clear in the reporting whether he was referring to GPT-4 multimodality or multimodality in general.

Another interesting detail is that Microsoft is working on “confidence metrics” to ground its AI in facts and make it more reliable.

Microsoft Kosmos-1

The fact that Microsoft released a multimodal language model called Kosmos-1 at the beginning of March 2023 appears to have been underreported in the United States.

“…the team subjected the pre-trained model to various tests, with good results in classifying images, answering questions about image content, automated labeling of images, optical text recognition, and speech generation tasks.

…Visual reasoning, i.e. drawing conclusions about images without using language as an intermediate step, seems to be a key here…”

GPT-4 goes further than Kosmos-1 because it adds video as a third modality and also appears to include sound.

Works in a variety of languages

GPT-4 seems to work in every language. It is said to be able to receive a question in German and answer in Italian.

That is an odd example because who would ask a question in German and expect an Italian response?

Because it can transfer knowledge between languages, the model transcends language. Therefore, if the answer exists only in Italian, the model will know it and can provide the answer in the language in which the question was asked.

As a result, it would be comparable to the objective of Google’s MUM multimodal AI. MUM is said to be able to answer questions in English when the data are only available in another language, like Japanese.

GPT-4 Applications

Where GPT-4 will be deployed has not been announced at this time. However, Azure-OpenAI was mentioned specifically.

Google is struggling to catch up to Microsoft by integrating a competing technology into its own search engine. This development exacerbates the perception that Google is trailing behind and lacks leadership in consumer-facing AI.

Multiple Google products, including Google Lens, Google Maps, and other areas where customers interact with Google, already integrate AI. This strategy uses AI as an assistive technology to help people with minor tasks.

Microsoft’s implementation is more visible, and as a result, it is drawing all of the attention and reinforcing the image of Google as floundering and struggling to catch up.

Raeesa Sayyad

Published by

Raeesa Sayyad
