Introduction: When the world’s most profitable AI company suddenly starts focusing on PCs

Three years ago, hardly anyone would have believed that Jensen Huang would devote a globally watched keynote to personal computers.

Today, NVIDIA is no longer a company that lives on gaming GPUs. Over the past two years, demand for training large models has driven an explosion in global computing requirements and made NVIDIA's GPUs the backbone of the AI industry. From OpenAI to Microsoft, from Meta to Amazon, nearly every leading tech company is placing orders with NVIDIA. The data center business has become NVIDIA's largest revenue stream, while the relative importance of the consumer market continues to decline. Against this backdrop, many were puzzled when the RTX Spark was launched: why would a company that became the world's most valuable by selling AI servers return to the PC market?

In fact, if you treat RTX Spark as just another PC chip, you will miss the real message Jensen Huang wants to convey. The keynote was not really about computers; it was about the terminal entry point of the AI era. As AI models become more powerful and training capacity stops being the industry's bottleneck, new questions emerge: where will these capabilities ultimately be used, and who will own the first entry point through which users reach AI? For Jensen Huang, the competition of the next decade will be fought not only between data centers but also between terminal platforms.

The AI revolution is shifting from the 'training era' to the 'application era'.

Over the past three years, the entire AI industry has revolved around a single theme: training larger models. Whether the model was GPT-4, Claude, or Gemini, the industry's focus stayed on parameter scale, training data, and computational investment. The capital market formed its own consensus: whoever has more GPUs holds the greater competitive advantage. Global tech companies therefore raced to secure data center resources, build supercomputing clusters, and procure more AI chips. In this process, NVIDIA became the biggest beneficiary.

However, the tech industry tends to follow a familiar pattern. Once the infrastructure is in place, the focus of competition shifts from the supply side to the demand side. This was true in the internet era and in the mobile internet era, and it will be no different in the AI era. Today's large models are already capable enough for many everyday tasks, and the computational resources owned by major global tech companies far exceed what they had three years ago. What the industry truly lacks is not model capability but ways to bring that capability into ordinary people's work and daily life at scale.

From this perspective, RTX Spark is not an isolated product but NVIDIA's response to the next stage of AI competition. Jensen Huang's bet is that the biggest opportunity of the next decade may lie not in data centers but in terminal devices. As AI becomes a productivity tool and Agents start to replace traditional software, computers will once again become the main platform for technological innovation.

Why 128GB and not 64GB?

Among all the announcements, the one most likely to spark discussion is the 128GB of unified memory. Many consumers might find this number excessive. After all, mainstream computers on the market today still hover around 16GB and 32GB configurations, with 64GB considered high-end workstation territory. So the question arises: why does NVIDIA believe the computer of the future needs 128GB of memory?

The reason is that people still think about AI-era computers with the mindset of the traditional software era. For decades, computers have run software. Whether it is Office, Photoshop, or a browser, resource consumption has grown, but it has stayed within predictable limits. Large models are completely different. For a model with 70 billion parameters, the weights alone take up roughly 140GB at 16-bit precision and still about 35GB when quantized to 4 bits. Add long-term memory, local knowledge bases, multimodal understanding, and multiple Agents running in coordination, and the memory demand quickly exceeds what a traditional PC can offer.
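The arithmetic behind the 70-billion-parameter example can be sketched in a few lines. This is a rough estimate only: it counts the weights and nothing else, and the precision labels and the helper name `weight_footprint_gb` are illustrative assumptions, not anything NVIDIA has published.

```python
# Approximate bytes needed to store one parameter at common precisions.
# Real quantization formats add some overhead (scales, metadata) that is ignored here.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Return the approximate size of the model weights in decimal GB."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        size = weight_footprint_gb(70e9, precision)
        print(f"70B parameters at {precision}: about {size:.0f} GB")
```

Under these assumptions, a 70-billion-parameter model does not fit in a 32GB or 64GB machine at 16-bit or 8-bit precision, and even the 4-bit version leaves a 64GB machine with limited room for everything else.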

More importantly, future users may not run just one AI but several at the same time. Just as today's computers run multiple applications at once, future operating systems may simultaneously host writing Agents, research Agents, office Agents, and programming Agents. These Agents need to stay resident, keep persistent memory, and access local data in real time. Operating this way means future computers will need far more memory than today's software does.
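To make the multi-Agent scenario concrete, here is a minimal memory-budget sketch. Every number in it is a hypothetical assumption chosen for illustration: the model sizes, the quantization levels, the per-Agent working allowance, and the 16GB reserved for the operating system do not come from any announced product.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    params_billions: float  # hypothetical model size, in billions of parameters
    bits_per_param: int     # hypothetical quantization level
    working_gb: float       # hypothetical allowance for context, caches, local data

    def resident_gb(self) -> float:
        # Billions of parameters times bytes per parameter gives decimal GB.
        weights_gb = self.params_billions * self.bits_per_param / 8
        return weights_gb + self.working_gb


def total_resident_gb(agents: list[Agent]) -> float:
    """Memory needed to keep every Agent loaded at the same time."""
    return sum(agent.resident_gb() for agent in agents)


def fits(agents: list[Agent], machine_gb: float, reserved_gb: float = 16.0) -> bool:
    """True if all Agents fit after reserving memory for the OS and other apps."""
    return total_resident_gb(agents) <= machine_gb - reserved_gb


# Hypothetical line-up matching the four kinds of Agent named in the text.
AGENTS = [
    Agent("writing", 8, 4, 2.0),
    Agent("research", 70, 4, 8.0),
    Agent("office", 8, 8, 2.0),
    Agent("programming", 32, 4, 4.0),
]

if __name__ == "__main__":
    print(f"All Agents resident: {total_resident_gb(AGENTS):.0f} GB")
    for machine_gb in (32, 64, 128):
        print(f"{machine_gb} GB machine fits them all: {fits(AGENTS, machine_gb)}")
```

With these made-up figures the four Agents need 79GB in total, which rules out 32GB and 64GB machines but leaves headroom on a 128GB one. Different assumptions would move the line, but the shape of the argument stays the same.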

Therefore, 128GB is not a simple spec bump but a new requirement placed on computing architecture. It is sized not for today's usage scenarios but for those of the next five to ten years.

What Jensen Huang really wants to do is to pull AI back from the cloud to the local environment.

In recent years, AI applications have been built almost entirely on a cloud computing model. Users send questions to ChatGPT, data is transmitted to remote servers, and after the model completes inference, results are sent back. This model has driven the rapid proliferation of generative AI, but it has also brought increasingly evident issues.

First is cost. Each inference consumes data center resources. When the user base reaches hundreds of millions, even giants like Microsoft and OpenAI must bear enormous operating costs. Second is privacy. More and more companies are realizing that core business data is not suited to long-term storage on public cloud platforms. Third is latency. When an Agent needs to call local software in real time, analyze local files, and complete complex tasks, round trips to the cloud often cannot meet the efficiency requirements.

As a result, the industry is forming a new trend: training stays in the cloud, and inference returns to the terminal. Future data centers will still be responsible for training the most advanced large models, but more and more practical applications will run on personal devices. This is also why Jensen Huang keeps emphasizing the importance of AI PCs. In his vision, future computers will no longer be mere terminals connected to the cloud; they will be true computing platforms with local intelligence.

This isn't a PC upgrade; it's a shift in computing paradigms.

Many analysts like to compare AI PCs with past upgrade cycles to judge whether they will trigger a massive upgrade wave the way smartphones did. But this comparison is inaccurate, because the greatest value of AI PCs lies not in better performance but in a change of computational logic.

In the past, computers were tools. Users opened software, input commands, and then waited for results.

In the future, computers might act as assistants. Users set the goals, and the Agent proactively completes tasks and delivers results.

This change may seem small, but it actually signifies a fundamental reconstruction of the relationship between humans and computers. When computers start to have long-term memory, can understand context, and can autonomously call software, they cease to be traditional productivity tools and begin to evolve into digital partners.

From mainframes to personal computers, and from personal computers to smartphones, every computing revolution redefines the relationship between humans and machines. And AI PCs are driving the next transformation.

Conclusion

Looking back at the Computex conference, RTX Spark may not be the most important product, and the 128GB unified memory may not be the most critical spec. What truly deserves attention is the judgment Jensen Huang revealed: the AI industry is shifting from infrastructure competition to terminal competition. The most valuable entry point in the future may not be data centers but rather the computers on everyone's desks.

Over the past three years, NVIDIA has helped build the computational infrastructure for the AI era worldwide. In the next decade, it hopes to participate in defining the terminal platforms of the AI age. RTX Spark is just the first step in this strategic transition, while the 128GB unified memory is the infrastructure that Jensen Huang is preparing for the future.

Years later, people may not remember how many GPU cores RTX Spark had or its benchmark scores. But they might remember Computex 2026 as the turning point at which AI started moving from the cloud to personal computers. What Jensen Huang really wants to reinvent has never been just a chip but the next generation of personal computing.