Norway's 2 petabytes of Huawei flash storage and LLM training

TL;DR

Norway’s National Library is developing a Norwegian-language large language model (LLM) using 2 petabytes of Huawei flash storage. The project aims to create a sovereign AI that understands Norwegian culture and language. Training is under way, and the team is still working through technical challenges.

Norway’s National Library is actively training a Norwegian-language large language model (LLM) using 2 petabytes of Huawei OceanStor Dorado flash storage, aiming to develop a sovereign AI that reflects Norwegian language, culture, and history.

The project was discussed by Marius Husnes, Head of IT at the National Library, during Huawei’s ID Forum 2026 in Paris. Norway’s government tasked the library with creating a sovereign AI to preserve and represent Norwegian cultural heritage, leveraging the library’s extensive digital collection, which includes books, newspapers, web content, and multimedia, totaling around 20 petabytes of unique data.

The library has been digitizing its collection since 2005 and now holds approximately 60 petabytes of data across multiple media types, including optical disks and tapes. Husnes explained that the main challenge is not compute power but data quality, cleaning, and pipeline throughput. The data pipeline involves ingestion, cleaning, deduplication, normalization, and validation, and its storage infrastructure uses Huawei OceanStor Dorado all-flash arrays for low-latency processing.
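The five stages named above can be sketched as a chain of Python generators. This is an illustrative sketch only, not the library's actual pipeline: every function name, the exact-match SHA-256 deduplication, and the `min_chars` threshold are assumptions made for the example.

```python
import hashlib
import unicodedata
from typing import Iterable, Iterator, List


def _canonical(text: str) -> str:
    """Unicode NFC form with runs of whitespace collapsed to single spaces."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def ingest(raw_docs: Iterable[str]) -> Iterator[str]:
    """Ingestion: hand documents downstream one at a time (here, from memory)."""
    yield from raw_docs


def clean(docs: Iterable[str]) -> Iterator[str]:
    """Cleaning: drop control/format characters (keeping newlines and tabs)."""
    for doc in docs:
        kept = "".join(
            ch for ch in doc
            if ch in "\n\t" or unicodedata.category(ch)[0] != "C"
        )
        yield kept.strip()


def deduplicate(docs: Iterable[str]) -> Iterator[str]:
    """Deduplication: keep the first document for each canonical, case-folded hash."""
    seen = set()
    for doc in docs:
        key = hashlib.sha256(_canonical(doc).casefold().encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            yield doc


def normalize(docs: Iterable[str]) -> Iterator[str]:
    """Normalization: emit the canonical form of each surviving document."""
    for doc in docs:
        yield _canonical(doc)


def validate(docs: Iterable[str], min_chars: int = 20) -> Iterator[str]:
    """Validation: reject documents shorter than a (hypothetical) minimum length."""
    for doc in docs:
        if len(doc) >= min_chars:
            yield doc


def run_pipeline(raw_docs: Iterable[str], min_chars: int = 20) -> List[str]:
    """Run the stages in the order the article lists them."""
    stages = normalize(deduplicate(clean(ingest(raw_docs))))
    return list(validate(stages, min_chars))


if __name__ == "__main__":
    sample = [
        "Bjørnstjerne  Bjørnson skrev mange dikt.\x00",
        "bjørnstjerne bjørnson skrev mange dikt.",
        "kort",
        "Henrik Ibsen skrev Peer Gynt i 1867.",
    ]
    for doc in run_pipeline(sample):
        print(doc)
```

Because each stage is a generator, documents stream through one at a time rather than being loaded into memory together. Only the hash set in the deduplication step grows with the size of the corpus.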

The training process utilizes Norway’s national supercomputer, Sigma2 Olivia, equipped with 448 GPUs and over 64,000 CPU cores, connected to a 5.3 petabyte Cray storage system. Husnes highlighted difficulties in moving data from the large, cost-optimized preservation system to the AI pipeline, which requires high throughput and low latency. The project is still in progress, with ongoing efforts to develop evaluation tools, governance policies, and system orchestration.
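A short calculation shows why throughput matters at this scale. The sketch below estimates how long it takes to move a dataset at a given sustained rate. The 2 petabyte size comes from the article, but the throughput figures are hypothetical, since the source gives no transfer rates. It uses decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes).

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_PB = 10**15  # decimal petabyte
BYTES_PER_GB = 10**9   # decimal gigabyte


def transfer_days(size_pb: float, throughput_gb_s: float) -> float:
    """Days needed to move size_pb petabytes at a sustained throughput_gb_s GB/s."""
    if throughput_gb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (size_pb * BYTES_PER_PB) / (throughput_gb_s * BYTES_PER_GB)
    return seconds / SECONDS_PER_DAY


def required_throughput_gb_s(size_pb: float, days: float) -> float:
    """Sustained GB/s needed to move size_pb petabytes within the given days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (size_pb * BYTES_PER_PB) / (days * SECONDS_PER_DAY) / BYTES_PER_GB


if __name__ == "__main__":
    # Hypothetical sustained rates; the article does not state real figures.
    for rate in (1, 10, 50):
        print(f"2 PB at {rate} GB/s: {transfer_days(2, rate):.1f} days")
    print(f"2 PB in one day needs {required_throughput_gb_s(2, 1):.1f} GB/s")
```

Under these assumptions, a sustained 1 GB/s link would take roughly 23 days to move 2 PB, while 10 GB/s would take a little over two days.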

Why It Matters

This project demonstrates the increasing role of Huawei’s flash storage solutions in European AI infrastructure, especially for small nations seeking to build sovereign AI models. It highlights the technical and governance challenges involved in developing language-specific LLMs, which are crucial for preserving cultural identity and ensuring national autonomy in AI development.

For countries with less dominant languages, creating a local LLM ensures better representation of local history, culture, and news, addressing limitations of globally trained, English-centric models. Norway’s initiative may serve as a blueprint for other non-English-speaking nations aiming for sovereignty in AI technology.

Background

Norway’s effort is part of a broader global trend where smaller nations seek to develop localized AI models to preserve cultural identity and ensure control over their data and AI applications. The project builds on Norway’s extensive digital archives, digitized since 2005, and responds to the lack of commercial solutions tailored for Norwegian language and culture. The use of Huawei storage solutions indicates significant involvement of Chinese technology in European AI infrastructure, amidst geopolitical considerations.

“No private company has this.”

— Marius Husnes, Head of IT at the Norwegian National Library

“The bottleneck is not compute; it’s data quality, cleaning, and pipeline throughput.”

— Husnes

“We are still learning about evaluation, governance, and orchestration.”

— Husnes

What Remains Unclear

It is not yet clear how effective the Norwegian LLM will be in real-world applications or how governance and access policies will be finalized. The development of evaluation tools and standards for such a language-specific model remains ongoing, and the project’s long-term impact is still uncertain.

What’s Next

The next steps include completing the training process, developing evaluation and governance frameworks, and integrating the LLM into practical applications. Norway plans to refine its models and policies, with potential public deployment once these challenges are addressed. Monitoring how the model performs and is adopted will be key.

Key Questions

Why is Norway developing its own language model?

Norway aims to create a sovereign AI that accurately reflects its language, culture, and history, which is not fully possible with globally trained, English-centric models.

What role does Huawei storage play in this project?

Huawei’s OceanStor Dorado all-flash arrays provide high-performance, low-latency storage crucial for managing and processing the large-scale datasets used in training the Norwegian LLM.

What are the main technical challenges faced?

The primary challenges are data quality, pipeline throughput, and efficiently moving petabyte-scale datasets from archival storage to AI training environments.

When will the Norwegian LLM be ready for deployment?

The project is still ongoing; no specific deployment date has been announced. Completion depends on resolving evaluation, governance, and technical issues.

Could this approach be adopted by other countries?

Yes, especially for small or non-English-speaking nations seeking to preserve their language and culture through AI, though technical and governance challenges will vary by context.

Source: Hacker News

Author

Tech Trend Trove Team
