AT&T clinched the top spot for generative AI (genAI) execution accuracy in a benchmark test, placing it above tech giants including IBM and Alibaba Group.

In a blog post, Mark Austin, VP of data science at AT&T, explained that the Big Bench for Large-scale Database Grounded Text-to-SQL Evaluation (BIRD) tracks which genAI platforms are best at translating plain-language questions into SQL queries.

An AT&T representative told Mobile World Live (MWL) the BIRD text-to-SQL parsing took place on 1 September.

The cross-domain benchmark includes 12,751 unique question-SQL pairs and 95 large databases with a total size of 33.4 GB.

It covers more than 37 professional domains including blockchain, healthcare, hockey and education.

The operator launched its Ask AT&T genAI-based tool in June 2023 using an early version of OpenAI’s ChatGPT. 

“What this ranking means is that Ask AT&T is exceptionally good at taking standard, plain English questions and turning them into computer code to find insights in data,” Austin stated.

The representative said AT&T achieved accuracy of more than 72 per cent, compared with 40 per cent for ChatGPT 3.5 and 50 per cent for ChatGPT 4.
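To illustrate what an execution accuracy figure measures, the sketch below assumes the common definition: a generated query counts as correct when running it returns the same set of rows as the reference query. The toy `players` table, the sample queries and the helper names are hypothetical, and this is not the benchmark's own evaluation script.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same set of rows."""
    try:
        predicted_rows = set(conn.execute(predicted_sql).fetchall())
    except sqlite3.Error:
        # A generated query that fails to run is scored as wrong.
        return False
    gold_rows = set(conn.execute(gold_sql).fetchall())
    return predicted_rows == gold_rows


def execution_accuracy(conn, pairs):
    """Share of (predicted, gold) query pairs whose results match."""
    matches = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return matches / len(pairs)


# Hypothetical toy database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, goals INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Stars", 30), ("Bo", "Stars", 12), ("Cy", "Kings", 25)],
)

pairs = [
    # Different wording, same rows: counted as correct.
    ("SELECT name FROM players WHERE goals > 20",
     "SELECT name FROM players WHERE goals >= 21"),
    # Runs, but returns different rows: counted as wrong.
    ("SELECT name FROM players WHERE team = 'Kings'",
     "SELECT name FROM players WHERE team = 'Stars'"),
    # Misspelled column, so the query fails: counted as wrong.
    ("SELECT nme FROM players", "SELECT name FROM players"),
    # Different aggregate expression, same result: counted as correct.
    ("SELECT COUNT(*) FROM players", "SELECT COUNT(name) FROM players"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {accuracy:.0%}")
```

Under this definition a query does not need to match the reference text word for word; it only needs to produce the same result.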

AT&T’s submission to the BIRD benchmark answered more than 12,000 questions.

To combine database technology with genAI, AT&T linked database schemas to advanced models such as ChatGPT and GPT-4.
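One way to picture linking a schema to a language model is to place the database's table definitions in the prompt alongside the question. The sketch below is an assumption about the general approach rather than AT&T's implementation: the `accounts` table, the prompt wording and the helper names are hypothetical, and no model is called.

```python
import sqlite3


def describe_schema(conn):
    """Collect the CREATE TABLE statements stored by SQLite."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(conn, question):
    """Combine the schema and a plain-English question into one prompt."""
    return (
        "Given the database schema below, write one SQLite query "
        "that answers the question.\n\n"
        f"Schema:\n{describe_schema(conn)}\n\n"
        f"Question: {question}\nSQL:"
    )


# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, plan TEXT, churned INTEGER)"
)

prompt = build_prompt(conn, "How many accounts on the prepaid plan have churned?")
print(prompt)
```

The resulting text would then be sent to whichever model is in use, and the SQL it returns would be run against the database.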

Austin explained Ask AT&T allows approved employees to pose specific questions without needing a data scientist or programmer to write code or use algorithms to generate actionable results.

The operator decided early on to build its own genAI platforms and tools, training large language models on its vast store of internal data.

“Off-the-shelf genAI platforms can be incredibly useful,” Austin noted. “But we’ve found that the accuracy and value increase dramatically when you fine tune models on our data or use approaches such as retrieval-augmented generation.”

Austin said the next steps for genAI include moving from “ask data” to “explain data”, which includes building models to spell out what is occurring across areas including churn, sales and fraud.

From there, he stated the evolution of genAI will include “act on data”.

AIOps
The operator currently generates about 1 billion tokens per day.

A genAI token, whether powered by a large or small language model, “is essentially equivalent to a generated word”, Austin wrote.

“So genAI is cranking out the equivalent of about a billion words per day at AT&T.”

“These are everything from automated summaries of inbound customer calls, which saves between 30 seconds to several minutes per call, to suggestions on relevant products and services for customers, to lines of computer code and more.”

Austin said genAI is enabling a level of automation and information that “simply wasn’t possible before”, which makes employees much more efficient at their jobs.