The new, multimodal LLM from Open AI could become a real game changer

On 13 May 2024, Open AI published the new Large Language Model GPT-4o. The ‘o’ stands for ‘omni’ and already points to one of the model's major strengths, namely the intelligent interaction of text, audio and images. In my blog post, I show what these new possibilities mean for Gen AI use cases in the financial sector.

The new features of GPT-4o

Multimodal input and output in real time

GPT-4o is a flagship model that can simultaneously process text, audio and images in real time. The output can also be multimodal. For example, it can respond to audio input within an average of 320 milliseconds. This is roughly equivalent to what is perceived as a normal response time in interpersonal communication. These capabilities now open up exciting possibilities for seamless human-computer interaction.

Better understanding of text, images and sound in interaction

Compared to previous models, GPT-4o not only boasts improved text comprehension, but also advances in image and audio processing. It can process visual information such as images and graphics and understand audio input better. Significant progress has been made in the more precise recognition of intentions, feelings or content ‘between the lines’. This makes GPT-4o a powerful tool for applications such as real-time translation, meeting support or customer service.

Cost reduction for tokens

GPT-4o achieves the text and code performance of GPT-4 Turbo and is significantly faster. Thanks to an improved token design, the costs for API usage are significantly lower - by up to 50 per cent, according to Open AI. This cost saving makes it attractive for even more applications for which a closed-source model was previously out of the question.

Gen AI use cases in the financial sector

So what does this mean for use cases in the financial sector? Here is a selection:

Searching for and summarising documents

Banks spend a lot of time searching for information in contracts, internal guidelines and regulatory requirements. Gen AI can effectively help bank employees to find and understand complex information so that they can spend more time with their customers. Images and graphics can now also be better recognised and interpreted. Examples include property photos in mortgage lending or graphics in business reports for corporate customers.

Conversational bank assistant

Imagine a virtual bank assistant that is controlled by Gen AI. It holds natural conversations with customers and answers questions about account balances, transaction histories and investment options. Beyond basic FAQs, it can offer personalised financial advice and increase customer satisfaction. Bank customers can now communicate much more easily and in multiple languages with a virtual agent, in writing or verbally, 24 hours a day, 7 days a week.

Content creation

Creating reports, summaries and other content can be time-consuming. Gen AI can generate documents, reports with tables and graphs, a credit opinion or investment decisions at the touch of a button. One specific example is checking contracts with critical third-party ICT providers for DORA compatibility. Although this was already possible in the past, the improvements in terms of intent now allow for more precise and therefore more legally compliant answers to specific questions in the form of prompts.

Intuitive data access

Gen AI analyses historical financial data, identifies trends and forecasts market movements. It enables investment teams to make informed decisions, optimise portfolios and manage risks. By automating data analysis, strategic planning is accelerated. Chart analyses can now be carried out better and more reliable forecasts can be created in combination with time series models.

Strategic decision-making

A bank's specialised departments such as risk management, finance or sales controlling can use Gen AI to gain forward-looking insights, explain deviations and recommend strategic measures. By automating routine tasks, finance professionals can focus even more on high-impact activities. One example is the creation of intuitive visualisations of trends and correlations to prepare strategic decisions. Significant progress can be expected here in the area of business intelligence.

Regulatory compliance

Compliance with ever-changing regulations is crucial for financial institutions. Gen AI monitors changes, interprets complex rules and notifies compliance officers. It ensures that regulatory requirements are met and risks and penalties are minimised. Improved language and text understanding reduces room for interpretation. Intelligent prompt engineering, the enrichment of models with domain-specific knowledge, parameter-efficient fintuning (PEFT) and knowledge graphs are further methods for improving the robustness of the generated content.

Conclusion

To summarise, the possibilities of GPT-4o and the potential of Gen AI for the financial sector are very promising.

Open AI is certainly still the top dog with GPT-4o. But the other model providers such as Meta, Google, Anthropic, Mistral and Aleph Alpha will soon follow suit when it comes to multimodal real-time processing and the improved understanding of text, images and audio in combination.

Development is progressing rapidly. We will soon discover even more innovative use cases that will change the financial industry. Stay tuned for the next wave of AI-driven advances!

You can find more exciting topics from the world of adesso in our previous blog posts.

GenAI@adesso

Would you like to find out more about GenAI and how we can support you? Then take a look at our website. Podcasts, blog posts, events, studies and much more - we offer you a compact overview of all topics relating to GenAI.

Learn more about GenAI on our website

Picture Andreas Strunz

Author Andreas Strunz

Andreas Strunz heads adesso's Competence Centre Banking AI and focuses on feasible, beneficial AI use cases at the interface between banking business and technology.

Save this page. Remove this page.