Training an Arabic LLM that reflects local values

The Arab world did not play a key role in the PC, internet and mobile eras. In the AI era, it will be different. (Shutterstock)

Advances in the large language models that underpin generative AI are changing everything, from medicine and education to entertainment.

Our relationship with technology is becoming more intimate as machines change from passive tools into active assistants that amplify our innate human abilities.

This new era poses both a challenge and an opportunity for the Middle East.

The challenge is that the leaders in this new field, such as OpenAI’s ChatGPT and Google’s Gemini, come from Silicon Valley, or from China, where my team at 01.AI has built models that rival their American counterparts. In Europe, too, startups such as France’s Mistral have entered the race.

The opportunity is for the Middle East to join this league and make sure its voice is heard.

Inspired by my latest trip to Riyadh, I decided to test how the current crop of AI models would handle a simple request. I imagined myself as a young Saudi getting ready to host a dinner party and asked ChatGPT to prepare a menu.

The food it recommended sounded delicious — stuffed grape leaves, tabouleh salad, mandi and stuffed dates. But the beverages were a problem.

Aside from drinks such as mint lemonade and jallab, a mixture of dates, grape molasses and rose water, ChatGPT also offered this: “For alcoholic beverages, you could offer a selection of international wines, beers, or non-alcoholic mocktails.”

To its credit, when I repeated the question, it offered only non-alcoholic drinks.

If a model recommends breaking both the law and cultural norms, imagine how it might answer other, more sensitive questions about politics or religion. Indeed, researchers have shown that some models exhibit an anti-Muslim bias.

My modest test underlines the urgent need to develop an Arabic large language model that reflects local values.

The first step to building this is creating enough high-quality Arabic digitized data to properly train a new generation of models.

Although there are 400 million Arabic speakers, only an estimated 2 percent of online content is in Arabic. Meta’s open-source LLM Llama is overwhelmingly trained on English data, with Arabic making up less than 0.1 percent of the data.

The lack of data naturally skews the results. To fix this dearth of data, either a visionary entrepreneur or a government-backed organization should collect, digitize and convert the many Arabic books into training data for Arabic models.
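
As a rough illustration of how the Arabic share of a text collection might be audited before training, here is a minimal Python sketch. It counts letters in the basic Arabic Unicode block as a proxy. This is an assumption made for illustration, not the method behind the 2 percent or 0.1 percent figures above, and it will overcount because languages such as Persian and Urdu use the same script. The function names are hypothetical.

```python
def is_arabic_letter(ch: str) -> bool:
    """True for letters in the basic Arabic Unicode block (U+0600 to U+06FF)."""
    return "\u0600" <= ch <= "\u06FF" and ch.isalpha()


def arabic_share(documents) -> float:
    """Fraction of all alphabetic characters that are Arabic-script letters.

    Digits, punctuation and whitespace are ignored. Returns 0.0 when the
    documents contain no letters at all.
    """
    arabic = 0
    letters = 0
    for text in documents:
        for ch in text:
            if ch.isalpha():
                letters += 1
                if is_arabic_letter(ch):
                    arabic += 1
    return arabic / letters if letters else 0.0


if __name__ == "__main__":
    # Toy corpus: one English word and one Arabic word (written with escapes).
    corpus = ["hello", "\u0633\u0644\u0627\u0645"]
    print(f"Arabic share of letters: {arabic_share(corpus):.2%}")
```

A real audit would more likely rely on a language-identification model and count tokens or documents rather than characters, but even a crude measure like this shows how lopsided a corpus is.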

Once the data is gathered, it can be fed into the breakthrough pre-training process, which reads trillions of words and creates its own virtual concept space or model of the world. This concept space has been shown to be mostly in English and Chinese.

Adding a sizable number of texts in Arabic, which has enormous cultural output and significance, will make the concept space more knowledgeable about Arabic and more balanced in its concepts and views.

After such pre-training, the model needs to be fine-tuned with data and labels from the Arab world, which will align it with the values of the region. That sets it apart from American models, which are aligned to US values, and Chinese models, which reflect Chinese values.

The collection of alignment data, the coordination of human labeling and the alignment process will need to be done in-region by AI experts.

Finally, safety modules will need to be added to ensure legal compliance and to avoid harm. These will also need to be developed locally.

The above steps will create localized, sovereign models that reflect the traditions of the Middle East. Whether privately developed or government-backed, they could be the foundation for a new wave of Arabic AI innovation.

A new Arabic-enhanced large language model could encourage entrepreneurs and developers to build new applications tailored to the needs of their nations.

Imagine an AI tool that could find, summarize, organize and write insightful content, an AI teacher that makes learning fun and customized, an AI doctor that is more knowledgeable than any human, an AI engineer that can write software and applications, and an AI assistant that knows its owner better than the owner themselves.

The Arab world did not play a leading role in the PC, internet and mobile eras. In the AI era, it will be different.

This transformation is by no means an easy feat. It will require an unprecedented investment of money, energy and human capital.

Middle Eastern leaders like Saudi Crown Prince Mohammed bin Salman and others have shown that they have the vision, determination and resources to lead their countries into the future.

Standing on my hotel balcony in Jeddah recently, overlooking the King Abdullah University of Science and Technology, I saw part of that vision coming to fruition.

Universities such as KAUST and the Mohamed bin Zayed University of Artificial Intelligence in the UAE are striking examples of the resources that have already been poured into this transformation.

These world-class academic institutions can attract and retain top-tier global talent. It is especially important to bring in the world’s best computer engineers to help fulfill this vision of the future of AI.

Our team at 01.AI has shown what a group of talented and motivated computer scientists can achieve in just one year. With the right commitment of resources and drawing upon the best talent, countries like Saudi Arabia can easily catch up with their global peers.

The Middle East can also lead the world in the use of renewables to run power-hungry generative AI models.

As it seeks to diversify its economy, Saudi Arabia is actively promoting the use of alternative energy sources such as solar, which could power server farms and reduce their carbon footprint — a growing concern as AI becomes more widespread.

It may take time for countries to figure out their strategy for building a sovereign AI. But it is critical for the Arab world to quickly catalyze the creation of culturally appropriate LLMs and build a rich ecosystem to allow AI-powered Arabic apps to blossom.

A recent encounter with a female sales assistant at a computer store in Riyadh served as an apt reminder of what is at stake. Dressed in jeans and sporting a tattoo, she was a reminder of the transformative changes that the country is undergoing.

Where are you from, I asked. “I’m Saudi,” she said. “One day I want to be Saudi Arabia’s Elon Musk.” I hope on my next visit she will pitch me a homegrown AI app.

Kai-Fu Lee is a computer scientist, CEO of 01.AI, chairman of Sinovation Ventures, former president of Google China, and author of “AI 2041” and “AI Superpowers”
 

Disclaimer: Views expressed by writers in this section are their own and do not necessarily reflect Arab News' point of view

German president: accept that US won’t heed international rules

  • “We have to accept that and we can deal with it,” Steinmeier said at the Munich Security Conference

FRANKFURT: Germany’s President Frank-Walter Steinmeier said the international community will have to deal with a disregard by the new US administration for established diplomatic rules.
“The new American administration has a very different world view to ours, one that has no regard for established rules, partnership and grown trust,” said the German head of state, whose office is largely ceremonial.
“We have to accept that and we can deal with it. But I am convinced that it is not in the interests of the international community for this world view to become the dominant paradigm,” he added, speaking at the Munich Security Conference on Friday.


Israel says received names of 3 hostages to be freed Saturday

  • One of them is being held by Hamas’s militant ally Islamic Jihad
  • Israel had warned Hamas that it must free three living hostages this weekend or face a resumption of the war

JERUSALEM: Israel said Friday it had received the names of three hostages to be freed by militants this weekend, after a crisis in the ceasefire threatened to plunge Gaza back into war.
The hostages due for release Saturday are Israeli-Russian Sasha Trupanov, Israeli-American Sagui Dekel-Chen and Israeli-Argentinian Yair Horn, the office of Prime Minister Benjamin Netanyahu said in a statement.
One of them is being held by Hamas’s militant ally Islamic Jihad, which participated in the October 7, 2023 attack on Israel that sparked the war in Gaza.
Israel had warned Hamas that it must free three living hostages this weekend or face a resumption of the war, after the group said it would pause releases over what it described as Israeli violations of the Gaza truce.
The January 19 ceasefire has been under massive strain since President Donald Trump proposed a US takeover of the territory, under which Gaza’s population of more than two million would be moved to Egypt or Jordan.
Arab countries have come together to reject the plan, and Saudi Arabia will on February 20 host the leaders of Egypt, Jordan, Qatar and the United Arab Emirates for a summit on the issue.
The releases of Israeli hostages in exchange for Palestinian prisoners, as agreed under the terms of the truce, have brought much-needed relief to families on both sides of the war, but the emaciated state of the Israeli captives freed last week sparked anger in Israel and beyond.
“The latest release operations reinforce the urgent need for ICRC access to those held hostage,” the International Committee of the Red Cross, which has facilitated the exchanges, said in a statement Friday.
“We remain very concerned about the conditions of the hostages.”
Following Hamas’s handover ceremony last week, during which the captives were forced to speak, the ICRC appealed for future handovers to be more private and dignified.
Israeli-American hostage Keith Siegel, who was released in a previous exchange nearly two weeks ago, described mistreatment during his captivity in a video message.
“I am a survivor. I was held for 484 days in unimaginable conditions, every single day felt like it could be my last,” he said.
“I was starved and I was tortured, both physically and emotionally.”
Trump, whose proposal to take over Gaza and move its 2.4 million residents to Egypt or Jordan sparked global outcry, warned this week that “hell” would break loose if Hamas failed to release “all” remaining hostages by noon on Saturday.
Israel later insisted Hamas release “three living hostages” on Saturday.
“If those three are not released, if Hamas does not return our hostages, by Saturday noon, the ceasefire will end,” said government spokesman David Mencer.
If fighting resumes, Israeli Defense Minister Israel Katz said it would not just lead to the “defeat of Hamas and the release of all the hostages,” but also “allow the realization of US President Trump’s vision for Gaza.”
US Secretary of State Marco Rubio was due in Israel, Saudi Arabia and the United Arab Emirates to discuss the ceasefire after attending the Munich Security Conference, where he was to hold talks on Ukraine.
Katz last week ordered the Israeli army to prepare for “voluntary” departures from Gaza, and the military said it had already begun reinforcing its troops around the territory.
Mairav Zonszein of the International Crisis Group said despite their public disputes, Israel and Hamas were still interested in maintaining the truce and have not “given up on anything yet.”
“They’re just playing power games,” she said.
Arab countries have put on a rare show of unity in their rejection of Trump’s proposal for Gaza.
After the Riyadh summit, the Arab League will convene in Cairo on February 27 to discuss the same issue.
Trump has threatened to cut off a vital aid lifeline to long-standing allies Jordan and Egypt should they refuse to come on board.
Jordan is already home to more than two million Palestinian refugees. More than half of the country’s population of 11 million is of Palestinian origin.
Egypt put forward its own proposal for the reconstruction of Gaza under a framework that would allow for the Palestinians to remain in the territory.
Palestinians in Gaza have also voiced opposition to the plan.
For Palestinians, any forced displacement evokes memories of the “Nakba,” or catastrophe — the mass displacement of their ancestors during Israel’s creation in 1948.
“Who is Trump? Is he God almighty? The land of Jordan is for Jordanians, and the land of Egypt belongs to Egyptians,” said Gaza City resident Abu Mohamed Al-Husari.
“We are here, deeply rooted in Gaza — the resilient, besieged and unbreakable Gaza.”
Hamas’s October 7, 2023 attack on Israel resulted in the deaths of 1,211 people, mostly civilians, according to an AFP tally of official Israeli figures.
Militants also took 251 hostages, of whom 73 remain in Gaza, including at least 35 the Israeli military says are dead.
Israel’s retaliatory campaign has killed at least 48,239 people in Gaza, the majority of them civilians, according to figures from the health ministry in the Hamas-run territory that the UN considers reliable.


Habib Bank, S&P Global launch Pakistan’s first index to track manufacturing sector

  • The index will be a standardized economic indicator based on a survey of a diverse panel of industries
  • It will help track economic developments in Pakistan, support decision making by financial institutions

ISLAMABAD: Pakistan’s largest bank, Habib Bank Limited (HBL), and global financial information and analytics firm S&P Global have launched a new index to track the country’s manufacturing sector, the companies said on Friday.
Rising taxes and power tariffs have led to social unrest and hammered industries in Pakistan’s $350 billion economy, as it navigates a tricky path to recovery under a $7 billion International Monetary Fund (IMF) program approved in September.
The HBL S&P Global Purchasing Managers’ Index will be a standardized economic indicator based on a survey of a diverse panel of industries.
It will be Pakistan’s first comprehensive manufacturing index and a welcome source of information for investors in a country where economic data is scarce.
The industries will be asked about their perceptions of current business conditions and future expectations and the index will be released on the first working day of each month, the companies said in a statement.
“The launch of Pakistan’s first ever PMI is a significant event contributing to the accessibility of timely and high-frequency data to track economic developments in Pakistan and support decision making by financial institutions, investors and businesses,” said Luke Thompson, Managing Director of S&P Global Market Intelligence, in a statement.
Muhammad Nassir Salim, president and CEO of HBL, said the series will enhance investor confidence and transparency in Pakistan’s economy.


Saudi Arabia praises US-Russia call, welcomes possible summit in Kingdom

Ukrainian President Zelensky, US President Trump and Russian President Putin. (AFP/AP)
  • Foreign ministry statement reaffirmed Kingdom’s commitment to mediating resolution to war in Ukraine

RIYADH: Saudi Arabia on Friday welcomed a recent phone call between US President Donald Trump and Russian President Vladimir Putin, as well as the possibility of hosting a summit between the two leaders in the Kingdom, according to a statement issued by the Ministry of Foreign Affairs.

“The Kingdom of Saudi Arabia commends the phone call that took place between His Excellency President Donald J. Trump, President of the United States of America, and His Excellency President Vladimir Putin, President of the Russian Federation, on February 12, 2025,” the statement read.

It further expressed Saudi Arabia’s readiness to host any potential summit and reaffirmed its commitment to mediating a resolution to the ongoing war in Ukraine.

Crown Prince Mohammed bin Salman has reiterated the Kingdom’s support for mediation since the beginning of the war, including during separate calls with Putin and Ukrainian President Volodymyr Zelensky on March 3, 2022.

“The Kingdom affirms its continued efforts to achieve lasting peace between Russia and Ukraine,” the statement added, underscoring Riyadh’s ongoing diplomatic initiatives over the past three years.


LEAP 2025 concludes with multi-billion dollar investments and global expansion

LEAP 2025, the fourth edition of the event, was held in Riyadh and reinforced Saudi Arabia’s position as a global hub for artificial intelligence, cloud computing, and digital infrastructure investments, securing total investments exceeding $14.9 billion.

The region’s premier technology event concluded at the Riyadh Exhibition and Convention Center in Malham under the theme “Into New Horizons.”

The milestone highlighted the Kingdom’s growing influence in the digital economy and innovation landscape.

The success of LEAP 2025 is a testament to the unwavering support of Saudi Crown Prince Mohammed bin Salman bin Abdulaziz, whose vision continues to drive Saudi Arabia’s digital transformation.

This aligns with Vision 2030’s objectives, solidifying the Kingdom’s leadership in technology, AI, and entrepreneurship.

As a global platform, LEAP brings together top thinkers, industry leaders, and investors to drive innovation and accelerate the shift toward a prosperous and sustainable digital economy.

Organized by the Ministry of Communications and Information Technology, in collaboration with the Saudi Federation for Cybersecurity, Programming, and Drones and Tahaluf Company, LEAP 2025 featured over 200,000 attendees; participation from 1,800 international and local entities; more than 1,000 global speakers; a showcase of 680 startups; and major deals with asset managers overseeing $22 trillion in portfolios.

These figures underscore Saudi Arabia’s role as a global innovation hub, attracting top-tier companies and talent in the tech sector.

LEAP 2025 witnessed the announcement of landmark investments that will accelerate digital growth in the Kingdom, including a $1.5 billion partnership between Groq and Aramco Digital to boost AI-powered cloud computing investments; a $2 billion investment between Saudi-based Alat and China’s Lenovo to establish an advanced AI and robotics-driven manufacturing and technology hub; and the launch of several digital infrastructure and innovation projects, reinforcing Saudi Arabia’s position as a top destination for global tech investments.

Eng. Abdullah Alswaha, minister of communications and information technology, emphasized that LEAP 2025 reflects the Kingdom’s leadership in global technology innovation.

He said that the international investments and success achieved by Saudi Arabia’s technology sector “are a direct result of the support and empowerment” of the crown prince.

“This momentum drives Saudi Arabia toward its Vision 2030 goals, strengthening its position as a leader in AI, digital transformation, and entrepreneurship,” he added.

Faisal Al-Khamissi, chairman of the Saudi Federation for Cybersecurity, Programming, and Drones and chairman of Tahaluf, said: “None of these achievements would have been possible without the support and vision of the crown prince, who has transformed Saudi Arabia into a global hub for advanced technology, innovation, and a sustainable digital economy.

“LEAP 2025 was not just a tech event — it was the largest and most impactful edition to date. It connected startups with investors, drove innovation, and unlocked new opportunities for entrepreneurs.”

Al-Khamissi added: “LEAP will expand globally, with two editions planned for 2026: one in Riyadh and another in Hong Kong. This marks a significant step in Saudi Arabia’s leadership in the global digital economy and AI-driven innovation.”

LEAP 2025 has already accelerated the growth of several startups, with notable success stories including Ejari, which started with a small booth at LEAP and secured a $1 million seed round, followed by another $15 million investment. It has since expanded its team to over 30 employees and attracted $50 million in business opportunities.

Data Lexing, a Saudi startup that debuted at LEAP 2022, has since expanded to 10 international markets with over 2,000 clients, 65 percent of whom are outside Saudi Arabia.

Quant secured $2 million in investment in a single day at LEAP.

With its record-breaking deals, global partnerships, and visionary expansion plans, LEAP continues to shape the future of AI, innovation, and technology investment worldwide.