Training an Arabic LLM that reflects local values

The Arab world did not play a key role in the PC, internet and mobile eras. In the AI era, it will be different. (Shutterstock)

Advances in the large language models that underpin generative AI are changing everything, from medicine and education to entertainment.

Our relationship with technology is becoming more intimate as machines change from passive tools into active assistants that amplify our innate human abilities.

This new era poses both a challenge and an opportunity for the Middle East.

The challenge is that the leaders in this new field, such as OpenAI’s ChatGPT and Google’s Gemini, come from Silicon Valley or from China, where my team at 01.AI has built models that rival those of the Americans. In Europe, too, startups such as France’s Mistral have entered the race.

The opportunity is for the Middle East to join this league and make sure its voice is heard.

Inspired by my latest trip to Riyadh, I decided to test how the current crop of AI models would handle a simple request. I imagined myself as a young Saudi getting ready to host a dinner party and asked ChatGPT to prepare a menu.

The food it recommended sounded delicious — stuffed grape leaves, tabouleh salad, mandi and stuffed dates. But the beverages were a problem.

Aside from drinks such as mint lemonade and jallab, a mixture of dates, grape molasses and rose water, ChatGPT also offered this: “For alcoholic beverages, you could offer a selection of international wines, beers, or non-alcoholic mocktails.”

To its credit, when I repeated the question, it offered only non-alcoholic drinks.

If a model recommends breaking both the law and cultural norms, imagine how it might answer other, more sensitive questions about politics or religion. Indeed, researchers have shown that some models have exhibited an anti-Muslim bias.

My modest test underlines the urgent need to develop an Arabic large language model that reflects local values.

The first step to building this is creating enough high-quality Arabic digitized data to properly train a new generation of models.

Although there are 400 million Arabic speakers, only an estimated 2 percent of online content is in Arabic. Meta’s open-source LLM Llama is trained overwhelmingly on English data, with Arabic comprising less than 0.1 percent of the data.

The lack of data naturally skews the results. To fix this dearth of data, either a visionary entrepreneur or a government-backed organization should collect, digitize and convert the many Arabic books into training data for Arabic models.
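The imbalance described above can be made concrete by measuring how much of a text collection is actually written in Arabic script. The following is a minimal sketch under stated assumptions: the function name `arabic_share` and the two-line sample are hypothetical, and counting letters by Unicode character name is only a rough proxy for language share, not how any particular model's training mix is measured.

```python
import unicodedata

def arabic_share(texts):
    """Return the fraction of alphabetic characters that are Arabic-script letters."""
    arabic = 0
    letters = 0
    for text in texts:
        for ch in text:
            if ch.isalpha():
                letters += 1
                # Unicode names of Arabic-script letters begin with "ARABIC".
                if unicodedata.name(ch, "").startswith("ARABIC"):
                    arabic += 1
    return arabic / letters if letters else 0.0

# Hypothetical two-document sample: one English line, one Arabic word.
sample = ["Hello world", "مرحبا"]
print(f"Arabic share of letters: {arabic_share(sample):.1%}")
```

Run over a real corpus, a figure like this would show how far a data-collection effort has moved the balance toward Arabic.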

Once the data is gathered, it can be fed into the breakthrough pre-training process, which reads trillions of words and creates its own virtual concept space or model of the world. This concept space has been shown to be mostly in English and Chinese.

Adding a sizable number of texts in Arabic, which has enormous cultural output and significance, will make the concept space more knowledgeable about Arabic and more balanced in its concepts and views.

After such pre-training, the model needs to be fine-tuned on data and labels from the Arab world so that it aligns with the values of the region. This would set it apart from American models, which are aligned to US values, and Chinese models, which reflect Chinese values.

The collection of alignment data, the coordination of human labeling and the alignment process will need to be done in-region by AI experts.

Finally, safety modules will need to be added to ensure legal compliance and to avoid harm. These will also need to be developed locally.
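As a rough illustration of what such a module might check, the sketch below scans a model's reply against a per-country rule table before it reaches the user. Everything here is an assumption made for illustration: the `LOCALE_RULES` table, the `flag_violations` function and the keyword patterns are hypothetical, and a real safety module would rely on locally developed classifiers and legal review rather than a keyword list.

```python
import re

# Hypothetical rule table (illustrative only, not a legal reference):
# country code -> patterns that a reply should not recommend.
LOCALE_RULES = {
    "SA": [r"\bwines?\b", r"\bbeers?\b", r"(?<!non-)\balcoholic\b"],
}

def flag_violations(response, locale):
    """Return the rule patterns that the response text matches for this locale."""
    patterns = LOCALE_RULES.get(locale, [])
    return [p for p in patterns if re.search(p, response, flags=re.IGNORECASE)]

reply = ("For alcoholic beverages, you could offer a selection of "
         "international wines, beers, or non-alcoholic mocktails.")
print(flag_violations(reply, "SA"))                       # three patterns match
print(flag_violations("Mint lemonade and jallab", "SA"))  # no matches
```

The negative lookbehind keeps "non-alcoholic" from being flagged, which shows how quickly even a toy rule set needs local knowledge to get right.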

The above steps will create localized, sovereign models that reflect the traditions of the Middle East. Whether privately developed or government-backed, they could be the foundation for a new wave of Arabic AI innovation.

A new Arabic-enhanced large language model could encourage entrepreneurs and developers to build new applications tailored to the needs of their nations.

Imagine an AI tool that could find, summarize, organize and write insightful content, an AI teacher that makes learning fun and customized, an AI doctor that is more knowledgeable than any human, an AI engineer that can write software and applications, and an AI assistant that knows its owner better than the owner themselves.

The Arab world did not play a leading role in the PC, internet and mobile eras. In the AI era, it will be different.

This transformation is by no means an easy feat. It will require an unprecedented investment of money, energy and human capital.

Middle Eastern leaders like Saudi Crown Prince Mohammed bin Salman and others have shown that they have the vision, determination and resources to lead their countries into the future.

Standing on my hotel balcony in Jeddah recently, overlooking the King Abdullah University of Science and Technology, I saw part of that vision coming to fruition.

Universities such as KAUST and the Mohamed bin Zayed University of Artificial Intelligence in the UAE are striking examples of the resources that have already been poured into this transformation.

These world-class academic institutions can attract and retain top-tier global talent. It is especially important to bring in the world’s best computer engineers to help fulfill this vision of the future of AI.

Our team at 01.AI has shown what a group of talented and motivated computer scientists can achieve in just one year. With the right commitment of resources and drawing upon the best talent, countries like Saudi Arabia can easily catch up with their global peers.

The Middle East can also lead the world in the use of renewables to run power-hungry generative AI models.

As it seeks to diversify its economy, Saudi Arabia is actively promoting the use of alternative energy sources such as solar, which could power server farms and reduce their carbon footprint — a growing concern as AI becomes more widespread.

It may take time for countries to figure out their strategy for building a sovereign AI. But it is critical for the Arab world to quickly catalyze the creation of culturally appropriate LLMs and build a rich ecosystem to allow AI-powered Arabic apps to blossom.

A recent encounter with a female sales assistant at a computer store in Riyadh served as an apt reminder of what is at stake. Dressed in jeans and sporting a tattoo, she was a reminder of the transformative changes that the country is undergoing.

Where are you from, I asked. “I’m Saudi,” she said. “One day I want to be Saudi Arabia’s Elon Musk.” I hope on my next visit she will pitch me a homegrown AI app.

Kai-Fu Lee is a computer scientist, CEO of 01.AI, chairman of Sinovation Ventures, former president of Google China, and author of “AI 2041” and “AI Superpowers”
 

Disclaimer: Views expressed by writers in this section are their own and do not necessarily reflect Arab News' point of view

Russia open to hearing Trump’s proposals for ending the war, an official says

  • Russian Deputy Foreign Minister Sergei Ryabkov said Moscow and Washington were “exchanging signals” on Ukraine via “closed channels”
  • Russia is ready to listen to Trump’s proposals on Ukraine provided these were “ideas on how to move forward in the area of settlement”

KYIV: Russia is open to hearing President-elect Donald Trump’s proposals on ending the war, an official said, as a Russian drone killed one person and wounded 13 in the Ukrainian port city of Odesa and the European Union foreign policy chief held talks in Kyiv after the change in US leadership.
Russian Deputy Foreign Minister Sergei Ryabkov said Moscow and Washington were “exchanging signals” on Ukraine via “closed channels.” He did not specify whether the communication was with the current administration or Trump and members of his incoming administration.
Russia is ready to listen to Trump’s proposals on Ukraine provided these were “ideas on how to move forward in the area of settlement, and not in the area of further pumping the Kyiv regime with all kinds of aid,” Ryabkov said Saturday in an interview with Russian state news agency Interfax.
In Kyiv, Foreign Minister Andrii Sybiha told reporters that Ukraine is ready to work with the Trump administration.
“Remember that President (Volodymyr) Zelensky was one of the first world leaders ... to greet President Trump,” he said. “It was a sincere conversation (and) an exchange of thoughts regarding further cooperation.”
“Also during the telephone conversation, further steps to establish communication between teams were discussed and this work has also begun. Therefore, we are open for further cooperation and I’m sure that a unified goal of reaching just peace unites all of us,” Sybiha said.
Sybiha appeared alongside EU foreign policy chief Josep Borrell, who said his visit was meant to stress the European Union’s support for Ukraine.
“This support remains unwavering. This support is absolutely needed, for you to continue defending yourself against Russian aggression,” he said.
Borrell urged “faster deliveries and fewer self imposed red lines” in getting Western weapons to Ukraine. He had appealed to allies in August to lift restrictions on Ukraine’s use of Western-supplied long-range weapons to strike Russian military targets.
In Odesa, regional Gov. Oleh Kiper said high-rise residential buildings, private houses and warehouses in the Black Sea port city were damaged overnight by the “fall” of a drone. He did not specify whether the drone had been shot down by air defenses.
A further 32 Russian drones were shot down over 10 Ukrainian regions, while 18 were “lost,” according to Ukraine’s air force, likely having been electronically jammed.
A Russian aerial bomb struck a busy highway overnight in the northeastern Kharkiv province, Kharkiv Mayor Ihor Terekhov said. No casualties were reported.
Russia is mounting an intensified aerial campaign that Ukrainian officials say they need more Western help to counter. However, doubts are deepening over what Kyiv can expect from a new US administration. Trump has repeatedly taken issue with US aid to Ukraine, made vague vows to end the war and praised Russian President Vladimir Putin.
In Russia, the Defense Ministry said 50 Ukrainian drones were destroyed over seven Russian regions — more than half over the Bryansk region, bordering Ukraine.


Vinícius nets hat trick in win as three Real Madrid players go down injured

  • Madrid had heard jeers in their previous two home games
  • Vinícius got his second hat trick of the season

BARCELONA: Vinícius Júnior scored a hat trick to lead Real Madrid to a 4-0 win over Osasuna on Saturday in a much-needed victory that was dampened when teammate Éder Militão left on a stretcher.
Madrid had heard jeers in their previous two home games: a 4-0 loss to fierce rival Barcelona and a 3-1 defeat to AC Milan.
But the easy victory may have come at a high price.
Militão was taken off after the central defender crumpled to the turf and clutched the back of his right knee shortly before halftime. Rodrygo and Lucas Vázquez also were unable to continue after apparently sustaining muscle injuries in the first half.
Vinícius, who felt overlooked when the Ballon d’Or went to Spain’s Rodri last week, got his second hat trick of the season and took his overall tally to 12 goals. Jude Bellingham added a goal to make it 2-0.
Madrid are in second place in the Spanish league, six points behind leader Barcelona.


Dutch PM to skip climate summit during probe into soccer violence

Updated 09 November 2024
  • “Due to the major social impact of the events of last Thursday night in Amsterdam, I will remain in the Netherlands,” he said on X
  • “Violence and hate in all their manifestations have no place in sports,” the Palestine Football Association said

AMSTERDAM: Dutch Prime Minister Dick Schoof will miss the COP29 climate summit after clashes in Amsterdam this week between Israeli soccer fans and pro-Palestinian protesters as his government investigates if warning signs from Israel were missed.
“I will not be going to Azerbaijan next week for the UN Climate Conference COP29. Due to the major social impact of the events of last Thursday night in Amsterdam, I will remain in the Netherlands,” he said on social media platform X.
Dutch Climate Minister Sophie Hermans will still attend the Nov. 11-22 environment meeting while a climate envoy will replace Schoof, the premier added, saying Thursday night’s violence in Amsterdam would be discussed at Monday’s cabinet meeting.
At least five people were injured during the unrest involving fans of the visiting Maccabi Tel Aviv soccer team who lost 5-0 to Ajax in the Europa League.
Justice Minister David van Weel said in a letter to parliament that information was still being gathered, including on possible warning signs from Israel, and whether the assaults were organized and had an antisemitic motive.
Fast-track justice would be applied with maximum efforts to find every suspect, he vowed.
Four people remain in custody over the unrest, police said.
Political leaders from Schoof down have denounced the attacks as antisemitic and urged swift justice.
Videos of the unrest on social media showed riot police in action, with some attackers shouting anti-Israeli slurs.
Footage also showed Maccabi Tel Aviv supporters chanting anti-Arab slogans before the match.
Israel sent planes to the Netherlands to bring fans home.
“Violence and hate in all their manifestations have no place in sports,” the Palestine Football Association (PFA) said.
Amsterdam banned demonstrations at the weekend and gave police emergency stop-and-search powers.
Antisemitic incidents have surged in the Netherlands during the Gaza war, with many Jewish organizations and schools reporting threats and hate mail.


Waring holds one-shot Abu Dhabi lead as McIlroy struggles

Updated 09 November 2024
  • A day after setting a course record 61, the 39-year-old Waring was the only player among the top 29 on the leaderboard to post an over-par score for a total 18-under par 198
  • Fast-rising Dane Niklas Norgaard Moller hit a third round 69 to cut Waring’s five-shot overnight lead

ABU DHABI: England’s Paul Waring shot a one-over par 73 and held a one-shot lead going into the final round of the Abu Dhabi Championship on Saturday as Northern Ireland’s Rory McIlroy continued to struggle.
A day after setting a course record 61, the 39-year-old Waring was the only player among the top 29 on the leaderboard to post an over-par score for a total 18-under par 198.
Fast-rising Dane Niklas Norgaard Moller hit a third round 69 to cut Waring’s five-shot overnight lead.
World number three Rory McIlroy dropped a big number in his closing holes for the second day in a row, this time a double bogey on the par-5 18th after an errant tee shot found water on the left side, to sit five shots off the lead.
On Friday, the Northern Irishman had made a triple bogey on the par-3 17th.
“If you’d given me a one-shot lead going into the final round at the beginning of the week, I would have snatched your hand,” said Waring, who is looking for his first win since the 2018 Nordea Masters.
“A little disappointed, because I felt like I could have really moved forward today and put myself out of sight.
“You’ve got to have an average day, don’t you?”
Three shots back, Ireland’s Shane Lowry (66), the 2019 tournament winner, was tied for third with Englishman Tommy Fleetwood (71), Dane Thorbjoern Olesen (71) and Swede Sebastian Soederberg (68) at 15-under par.
With the wind picking up toward the afternoon and the greens becoming firmer and faster, the conditions were challenging after two benign days.
Waring had taken advantage of the conditions with rounds of 64 and 61 and started the day at 19-under.
An early birdie extended his advantage, but a three-putt bogey on the par-3 fourth hole frayed his nerves, after which he struggled to get his speed and line right with the putter.
British Masters champion Norgaard made his first bogey of the tournament on the ninth hole, but three birdies on the back nine kept him in the hunt for a second title this year.
“Very satisfied with today,” said the 32-year-old, who is almost guaranteed a PGA Tour card next season as one of the top 10 players from the DP World Tour’s Race to Dubai rankings.
A disappointed McIlroy closed with a three-under-par 69 and dropped to tied 13th position on 13-under-par 203.
He still felt confident of getting his hands on the trophy in Abu Dhabi for the first time in his career.
“Playing the last two holes two-over two days in a row is not ideal. Cost myself a few shots there,” said McIlroy, who is seeking to secure his sixth DP World Tour Order of Merit crown next week in Dubai and match the legendary Spaniard Seve Ballesteros.
“The leaders weren’t getting away, which was nice and I was making a little bit of a charge. And yeah, just one mistake, that drive on 18, and with it playing so much into the wind.
“It was an untimely mistake, just like yesterday on the 17th, and I dug myself a little bit of a hole to get out of, but depending on what the leaders do, I can still go into tomorrow feeling like I have half a chance.
“I just need to put it all together and play the way I’ve been playing and keep the big mistakes and big numbers off my card and if I can do that and post a score, you never know.”


Croatia arrests four over attack on foreign workers

Updated 09 November 2024
  • Police said on Saturday that the four arrested were being investigated over a “hate crime”
  • The attack was immediately followed by three other incidents targeting foreign food-delivery workers, also in Split

ZAGREB: Police in Croatia on Saturday said that four men were arrested over a racially motivated attack against foreign workers, which was followed by three similar incidents that left one Nepali seriously injured.
The European Union country of 3.8 million people is struggling to overcome a chronic labor shortage as it faces mass emigration and a shrinking population.
Traditionally reliant on seasonal workers from its Balkan neighbors, Croatia is increasingly counting on laborers from Nepal, India, the Philippines and elsewhere to fill tens of thousands of jobs, notably in construction and its key tourism sector on the Adriatic coast.
Police said on Saturday that the four arrested, who are suspected of physically attacking a food-delivery worker in the coastal town of Split, were being investigated over a “hate crime.”
Late Friday, a 41-year-old foreign national and one attacker sustained minor injuries, a police statement said.
The attack was immediately followed by three other incidents targeting foreign food-delivery workers, also in Split, in which one Nepali was seriously injured.
Another victim was Indian, while the nationalities of the other two were not disclosed.
Police said a search for the perpetrators was ongoing.
The government condemned the incidents, labelling them “shocking and disturbing” and vowed on social media “not to allow Croatia to become a country where violence and hatred toward foreign workers are normalized.”
“Foreign workers filled a segment on the labor market that we obviously could not,” Prime Minister Andrej Plenkovic told reporters, citing the construction and tourism sectors.
Croatia in 2023 provided nearly 120,000 non-EU nationals with work permits, 40 percent more than the previous year.
This year the figure will be surpassed as nearly 150,000 work permits have so far been issued to non-EU nationals.
The number of attacks on foreign workers, notably those delivering food, has been increasing, police in the capital Zagreb said earlier this year.
In most cases, they were not racially motivated but were robberies.
Migrants have been regularly pilloried online with the new labor force facing language barriers and negative attitudes toward foreigners.
Ethnic Croats make up more than 90 percent of Croatia’s population — nearly 80 percent of whom are Roman Catholics.