Training an Arabic LLM that reflects local values


Advances in the large language models that underpin generative AI are changing everything, from medicine and education to entertainment.

Our relationship with technology is becoming more intimate as machines change from passive tools into active assistants that amplify our innate human abilities.

This new era poses both a challenge and an opportunity for the Middle East.

The challenge is that leaders in this new field, like OpenAI’s ChatGPT and Google’s Gemini, come from Silicon Valley, or from China, where my team at 01.AI has built models that rival those of the Americans. In Europe, too, startups such as France’s Mistral have entered the race.

The opportunity is for the Middle East to join this league and make sure its voice is heard.

Inspired by my latest trip to Riyadh, I decided to test how the current crop of AI models would handle a simple request. I imagined myself as a young Saudi getting ready to host a dinner party and asked ChatGPT to prepare a menu.

The food it recommended sounded delicious — stuffed grape leaves, tabouleh salad, mandi and stuffed dates. But the beverages were a problem.

Aside from drinks such as mint lemonade and jallab, a mixture of dates, grape molasses and rose water, ChatGPT also offered this: “For alcoholic beverages, you could offer a selection of international wines, beers, or non-alcoholic mocktails.”

To its credit, when I repeated the question, it offered only non-alcoholic drinks.

If a model recommends breaking both the law and cultural norms, imagine how it might answer other, more sensitive questions about politics or religion. Indeed, researchers have shown that some models exhibit an anti-Muslim bias.

My modest test underlines the urgent need to develop an Arabic large language model that reflects local values.

The first step to building this is creating enough high-quality Arabic digitized data to properly train a new generation of models.

Although there are 400 million Arabic speakers, only an estimated 2 percent of online content is in Arabic. Meta’s open-source LLM Llama is overwhelmingly trained on English data, with Arabic comprising less than 0.1 percent of the data.

The lack of data naturally skews the results. To fix this dearth of data, either a visionary entrepreneur or a government-backed organization should collect, digitize and convert the many Arabic books into training data for Arabic models.

Once the data is gathered, it can be fed into the breakthrough pre-training process, which reads trillions of words and creates its own virtual concept space or model of the world. This concept space has been shown to be mostly in English and Chinese.

Adding a sizable number of texts in Arabic, which has enormous cultural output and significance, will make the concept space more knowledgeable about Arabic and more balanced in its concepts and views.

After such pre-training, the model needs to be fine-tuned with data and labels from the Arab world, which will align it with the values of the region. This sets it apart from American models, which are aligned to US values, and Chinese models, which reflect Chinese values.

The collection of alignment data, the coordination of human labeling and the alignment process will need to be done in-region by AI experts.

Finally, safety modules will need to be added to ensure legal compliance and to avoid harm. These will also need to be developed locally.

The above steps will create localized, sovereign models that will reflect the traditions of the Middle East. Privately developed or government-backed, they could be the foundation for a new wave of Arabic AI innovation.

A new Arabic-enhanced large language model could encourage entrepreneurs and developers to build new applications tailored to the needs of their nations.

Imagine an AI tool that could find, summarize, organize and write insightful content, an AI teacher that makes learning fun and customized, an AI doctor that is more knowledgeable than any human, an AI engineer that can write software and applications, and an AI assistant that knows its owner better than the owner themselves.

The Arab world did not play a leading role in the PC, internet and mobile eras. In the AI era, it will be different.

This transformation is by no means an easy feat. It will require an unprecedented investment of money, energy and human capital.

Middle Eastern leaders like Saudi Crown Prince Mohammed bin Salman and others have shown that they have the vision, determination and resources to lead their countries into the future.

Standing on my hotel balcony in Jeddah recently, overlooking the King Abdullah University of Science and Technology, I saw part of that vision coming to fruition.

Universities such as KAUST and the Mohamed bin Zayed University of Artificial Intelligence in the UAE are striking examples of the resources that have already been poured into this transformation.

These world-class academic institutions can attract and retain top-tier global talent. It is especially important to bring in the world’s best computer engineers to help fulfill this vision of the future of AI.

Our team at 01.AI has shown what a group of talented and motivated computer scientists can achieve in just one year. With the right commitment of resources and drawing upon the best talent, countries like Saudi Arabia can easily catch up with their global peers.

The Middle East can also lead the world in the use of renewables to run power-hungry generative AI models.

As it seeks to diversify its economy, Saudi Arabia is actively promoting the use of alternative energy sources such as solar, which could power server farms and reduce their carbon footprint — a growing concern as AI becomes more widespread.

It may take time for countries to figure out their strategy for building a sovereign AI. But it is critical for the Arab world to quickly catalyze the creation of culturally appropriate LLMs and build a rich ecosystem to allow AI-powered Arabic apps to blossom.

A recent encounter with a female sales assistant at a computer store in Riyadh served as an apt reminder of what is at stake. Dressed in jeans and sporting a tattoo, she was a reminder of the transformative changes that the country is undergoing.

Where are you from, I asked. “I’m Saudi,” she said. “One day I want to be Saudi Arabia’s Elon Musk.” I hope on my next visit she will pitch me a homegrown AI app.

Kai-Fu Lee is a computer scientist, CEO of 01.AI, chairman of Sinovation Ventures, former president of Google China, and author of “AI 2041” and “AI Superpowers”
 

Disclaimer: Views expressed by writers in this section are their own and do not necessarily reflect Arab News' point of view

Pakistan and Denmark discuss multilateral cooperation, strengthening economic ties

  • Both countries began their two-year terms as non-permanent Security Council members in January
  • They discuss boosting trade and investment by leveraging public-private partnerships, says the foreign office

ISLAMABAD: Pakistan’s Deputy Prime Minister Ishaq Dar on Saturday discussed cooperation on multilateral platforms with Denmark’s Foreign Minister Lars Løkke Rasmussen, with the foreign office in Islamabad describing their conversation as “productive.”
Both Pakistan and Denmark began their two-year terms as non-permanent members of the United Nations Security Council in January 2025, joining the 15-member body responsible for maintaining international peace and security.
The Council comprises five permanent members — China, France, Russia, the United Kingdom, and the United States — and ten non-permanent members elected for two-year terms by the General Assembly.
Non-permanent members play a crucial role in shaping the Council’s agenda, participating in decision-making processes, and contributing to resolutions on global issues.
“Deputy Prime Minister and Foreign Minister Senator Mohammad Ishaq Dar had a productive phone conversation with Denmark’s Foreign Minister, Lars Løkke Rasmussen,” the foreign office said in a social media post. “The two leaders expressed their resolve to strengthen the long-standing friendship between Pakistan and Denmark into a strong economic partnership.”
“They explored ways to expedite collaboration in trade and investment through the promotion of Public-Private Partnerships,” it continued. “Additionally, as non-permanent members of the UN Security Council for 2025 and 2026, they pledged to cooperate on multilateral platforms to advance mutual interests, peace, and sustainable development.”
In August 2022, Pakistan and Denmark signed the Green Framework Engagement Agreement, aiming to enhance collaboration on climate change mitigation, renewable energy and sustainable development. While the volume of trade between the two countries remains modest, Pakistan has sought to intensify economic diplomacy with European nations in recent years.
Relations between the two sides have also faced challenges in the past. In 2005, anti-Islam caricatures published in a Danish newspaper led to public protests in Pakistan.
In 2008, the Danish embassy in Islamabad was targeted in a bombing that led to several casualties, with Al-Qaeda claiming responsibility and citing the cartoons as its motivation.
More recently, Denmark faced criticism from Muslim states for allowing public burnings of Islamic scripture.
In December 2023, however, the Danish government enacted a law criminalizing the public desecration of religious texts, de-escalating tensions and aiding in the normalization of diplomatic relations with Muslim-majority countries.


In war-torn Sudan, a school offers a second chance at education


PORT SUDAN: In a worn-down classroom in eastern Sudan, men and women watch attentively from a wooden bench as a teacher scribbles Arabic letters on a faded blackboard.
Nodding approvingly in the corner is the school’s 63-year-old founder Amna Mohamed Ahmed, known to most as “Amna Oor,” which partly means lion in the Beja language of eastern Sudan.
She has spent the last three decades helping hundreds return to their education in Port Sudan, now the country’s de facto capital.
The educator, who wears an orange headscarf wrapped neatly around her head, said she started the project in 1995 because of widespread illiteracy in her community.
“That’s what pushed me to act. People wanted to learn — if they didn’t, they wouldn’t have kept coming,” she told AFP.
Ahmed’s classes offer a second chance to those who missed out on formal education, particularly women who were denied schooling due to cultural or financial barriers.
A fresh start
For 39-year-old Nisreen Babiker, going back to school has been a long-held dream.
She left school in 2001 after marrying and taking on the responsibility of raising her younger siblings following her father’s death.
“My siblings grew up and studied, and my children too,” she said.
“I felt the urge to return to school. Even after all these years, it feels like I’m starting fresh,” she told AFP.
Ahmed’s school has also become a haven for those displaced by Sudan’s ongoing conflict, which erupted in April 2023 between army chief Abdel Fattah Al-Burhan and his former deputy Mohamed Hamdan Dagalo, who leads the paramilitary Rapid Support Forces (RSF).
The war has killed tens of thousands, uprooted over 12 million, and driven swathes of the country into hunger and famine.
Maria Adam is among those who fled their homes after war broke out. She arrived in Port Sudan seeking safety and a better future.
“When I arrived in Port Sudan, I heard about this place and joined,” said the 28-year-old, noting that she dropped out of school when she was 11.
Changing lives
“I want to finish my education so I can help my children,” Adam told AFP.
Sudan’s education system has been shattered by the conflict, with the United Nations estimating that over 90 percent of the country’s 19 million school-age children now have no access to formal learning.
Across the nation, most classrooms have been converted into shelters for displaced families.
Even before the war, a 2022 Save the Children analysis ranked Sudan among the countries most at risk of educational collapse.
But the determination to learn remains strong at the Port Sudan school, where many students have gone on to enter high school and some have even graduated from university.
In one corner of the classroom, a mother joins her young son in a lesson, hoping to reshape both their futures.
“To watch someone go from not knowing how to read or write to graduating from university, getting a job, supporting their family — that is what keeps me going,” Ahmed said.
“They go from being seen as a burden to becoming productive, educated members of society,” she added.


Ukrainian attacks on Russian border have killed 652 civilians so far, TASS reports


Ukrainian attacks on Russian regions on and near the border with Ukraine have killed 652 civilians so far, the head of Russia's Investigative Committee told the TASS news agency in remarks published on Sunday, without providing evidence.
Twenty-three children were among those killed, Alexander Bastrykin, the head of the committee, told TASS. Nearly 3,000 have been wounded, he added.
Both sides deny targeting civilians in their attacks in the war that Russia launched with its full-scale invasion of Ukraine three years ago. But thousands of civilians have died in the conflict, the vast majority of them Ukrainian.


US secretary of state signs declaration to expedite delivery of $4 billion in military aid to Israel

Updated 02 March 2025
  • Since January 20, the Trump administration has approved nearly $12 billion in major foreign military sales to Israel
  • On Friday, the US approved the potential sale of nearly $3 billion worth of bombs, demolition kits and other weaponry to Israel

WASHINGTON: US Secretary of State Marco Rubio said on Saturday he had signed a declaration to expedite delivery of approximately $4 billion in military assistance to Israel.
The Trump administration, which took office on January 20, has approved nearly $12 billion in major foreign military sales to Israel, Rubio said in a statement, adding that it “will continue to use all available tools to fulfill America’s long-standing commitment to Israel’s security, including means to counter security threats.”
Rubio said he had used emergency authority to expedite the delivery of military assistance to Israel, a US ally in the Middle East now in a fragile ceasefire with Hamas militants in their war in Gaza.
The Pentagon said on Friday that the State Department had approved the potential sale of nearly $3 billion worth of bombs, demolition kits and other weaponry to Israel.
The administration notified Congress of those prospective weapons sales on an emergency basis, sidestepping a long-standing practice of giving the chairs and ranking members of the House Foreign Affairs and Senate Foreign Relations Committees the opportunity to review the sale and ask for more information before making a formal notification to Congress.
Friday’s announcements marked the second time in recent weeks that President Donald Trump’s administration has declared an emergency to quickly approve weapons sales to Israel. The Biden administration also used emergency authority to approve the sale of arms to Israel without congressional review.
On Monday, the Trump administration rescinded a Biden-era order requiring it to report potential violations of international law involving US-supplied weapons by allies, including Israel. It has also eliminated most US humanitarian foreign aid.
The January 19 Israel-Hamas ceasefire agreement halted 15 months of fighting and paved the way for talks on ending the war, while leading to the release of 44 Israeli hostages held in Gaza and around 2,000 Palestinian prisoners and detainees held by Israel.
Hours after the first phase of the agreed ceasefire was set to expire, Israel said early on Sunday it would adopt a proposal by Trump’s envoy, Steve Witkoff, for a temporary ceasefire in Gaza for the Ramadan and Passover periods.
Israel and Hamas have accused each other of violating the ceasefire, casting doubt over the second phase of the deal meant to include releases of additional hostages and prisoners as well as steps toward a permanent end of the war.


Private US spaceship hours from Moon landing attempt

Updated 02 March 2025
  • Firefly Aerospace’s Blue Ghost Mission 1 is targeting landing no sooner than 3:34 a.m. US Eastern time (0834 GMT) on Sunday
  • Landing site target is near Mons Latreille, a volcanic feature in Mare Crisium on the Moon’s northeastern near side

WASHINGTON: After a long journey through space, a US company is just hours away from attempting a daring lunar touchdown — its spacecraft poised to become only the second private lander to achieve the feat if it succeeds.
Firefly Aerospace’s Blue Ghost Mission 1 is targeting landing no sooner than 3:34 a.m. US Eastern time (0834 GMT) on Sunday, aiming for a site near Mons Latreille, a volcanic feature in Mare Crisium on the Moon’s northeastern near side.
“Blue Ghost is ready to take the wheel!” the company posted on X on Saturday evening, adding that flight controllers had just initiated a key maneuver to lower the spacecraft’s orbit.
Nicknamed “Ghost Riders in the Sky,” the mission comes just over a year after the first-ever commercial lunar landing and is part of a NASA partnership with industry to cut costs and support Artemis, the program aiming to return astronauts to the Moon.
The golden lander, about the size of a hippopotamus, launched on January 15 on a SpaceX Falcon 9 rocket, capturing stunning footage of Earth and the Moon along the way. It shared a ride with a Japanese company’s lander set to attempt a landing in May.
Blue Ghost carries ten instruments, including a lunar soil analyzer, a radiation-tolerant computer and an experiment testing the feasibility of using the existing global satellite navigation system to navigate the Moon.
Designed to operate for a full lunar day (14 Earth days), Blue Ghost is expected to capture high-definition imagery of a total eclipse on March 14, when Earth blocks the Sun from the Moon’s horizon.
On March 16, it will record a lunar sunset, offering insights into how dust levitates above the surface under solar influence — creating the mysterious lunar horizon glow first documented by Apollo astronaut Eugene Cernan.

This undated image released by Firefly Aerospace, shot from Blue Ghost's top deck while in lunar orbit, shows the Moon’s south pole on the far left. (Handout / Firefly Aerospace / AFP)

Blue Ghost’s arrival will be followed on March 6 by Intuitive Machines’ IM-2 mission, featuring its lander Athena.
In February 2024, Intuitive Machines became the first private company to achieve a soft lunar landing — also the first US landing since the crewed Apollo 17 mission of 1972.
However, the success was tempered by a mishap: the lander came down too fast and tipped over on impact, leaving it unable to generate enough solar power and cutting the mission short.
This time, the company says it has made key improvements to the hexagonal-shaped lander, which has a taller, slimmer profile than Blue Ghost, and is around the height of an adult giraffe.
Athena launched on Wednesday aboard a SpaceX rocket, taking a more direct route toward Mons Mouton — the southernmost lunar landing site ever attempted.
Its payloads include three rovers, a drill to search for ice and the star of the show: a first-of-its-kind hopping drone designed to explore the Moon’s rugged terrain.

Landing on the Moon presents unique challenges due to the absence of an atmosphere, making parachutes ineffective.
Instead, spacecraft must rely on precisely controlled thruster burns to slow their descent.
Until Intuitive Machines’ first successful mission, only five national space agencies had accomplished this feat: the Soviet Union, the United States, China, India and Japan, in that order.
Now, the United States is working to make private lunar missions routine through NASA’s $2.6 billion Commercial Lunar Payload Services (CLPS) program.
The missions come at a delicate moment for NASA, amid speculation that it may scale back or even cancel its Artemis lunar program in favor of prioritizing Mars exploration — a key goal of both President Donald Trump and his close adviser, SpaceX founder Elon Musk.