Huang Renxun: Nvidia's AI computing power is effectively being sold at 95% off

Author | Ling Zijun, Li Yuan

Editor | Jing Yu

Image source: Generated by Unbounded AI

Huang Renxun, wearing his signature leather jacket, stood on a blue surfboard and struck a few surfing poses.

This was not VidCon, the American "influencer festival", but a scene at the developer conference of Snowflake, the well-known US data platform.

On June 26 local time, Nvidia founder Huang Renxun and Snowflake CEO Frank Slootman discussed how to bring generative AI to enterprise users. The conversation was moderated by a former Greylock GP who now runs the investment firm Conviction.

At the event, next to the measured, seasoned professional manager Frank, the "leather-jacketed godfather" was as striking as ever: he described the partnership as "We are lovers, not fighters", and even joked that the pre-trained models Nvidia provides through Snowflake amount to giving customers a 95% discount.

On the same day, Nvidia and Snowflake jointly announced another big move: the world's No. 1 chip company and the most popular cloud data platform launched a partnership under which **Snowflake users can use Nvidia's pre-trained AI models to analyze their own company's data directly on the cloud platform, without leaving it, and build AI applications on top of that data.**

"The current major changes come from data + AI algorithm + computing engine. Through our cooperation, we can bring these three points together." Huang Renxun said.

Talking Points:

  • Large language model + enterprise-specific database = AI application for specific problems;
  • It used to be data going to work; now it is work going to data: let the computing go to where the data lives, avoiding data silos;
  • The pre-trained models Nvidia provides were trained in Nvidia's AI factories at a cost of tens of millions of dollars, so calling the computing engine on Snowflake already amounts to a 95% discount;
  • In the software 3.0 era, based on models and databases, enterprises can build their own exclusive applications within a few days;
  • In the future, enterprises will be able to produce many intelligent agents and run them;
  • For enterprises, the real problem is how to mobilize mixed structured and unstructured data. This may lead to an update of the business model.

The following is the main content of the dialogue between the two parties, edited by Geek Park:

01 Talk about cooperation: bring the best computing engine to the most valuable data

Frank:

NVIDIA is playing an important role in history right now. For us, this is about bringing this technology, and the entire service stack around it, to large enterprises and their data so they can use it effectively. I don't want to describe it as "a match made in heaven", but it opens a real door of opportunity.

Huang Renxun:

We are lovers, not fighters. We want to bring the world's best computing engine to the world's most valuable data. Looking back, I've been working a long time, but I'm not that old. Frank, you are older (laughs).

Data has become huge and, for well-known reasons, precious. It must be kept safe. Moving data is hard; data gravity is real. So it was much easier for us to bring our computing engine to Snowflake. Our partnership is about accelerating Snowflake, but it's also about bringing AI to Snowflake.

**The core is the combination of data + AI algorithms + computing engine; our partnership brings all three together:** incredibly valuable data, incredibly great artificial intelligence, an incredibly great computing engine.

What we can do together is help customers take their proprietary data and use it to write AI applications. The big breakthrough here is that, for the first time, you can take a large language model, put it in front of your data, and talk to your data the way you talk to a human; the data augments the large language model.

A large language model plus a knowledge base equals an AI application. **It's that simple: a large language model turns any knowledge base of data into an application.**

Think of all the amazing apps people have written. At the core there has always been some valuable data. Now you have a general query engine in front of it that is super smart; you can have it respond to you, but you can also connect it to an agent, which is the breakthrough that LangChain and vector databases bring. This layering of large language models on top of data is happening everywhere, and everyone wants to do it. Frank and I will help you do that.
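The "query engine in front of your data" pattern Huang describes is, in essence, retrieval-augmented generation: find the records relevant to a question and splice them into the model's prompt. A minimal sketch, with a toy bag-of-words "embedding" and a hypothetical `call_llm` stub standing in for a real embedding model, vector database, and hosted LLM (the document names none of these specifics):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a model-based
    # vector stored in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank the company's records by similarity to the question.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a hosted large language model.
    return f"[answer grounded in: {prompt}]"

def answer(query: str, docs: list[str]) -> str:
    # The retrieved proprietary data is spliced into the prompt, so the
    # general-purpose model responds in terms of the company's own records.
    context = "\n".join(retrieve(query, docs))
    return call_llm(f"Context:\n{context}\n\nQuestion: {query}")

docs = [
    "Q3 churn rose to 12 percent in the grocery segment",
    "The Hopper launch event in March drew record attendance",
]
print(answer("what is our churn rate", docs))
```

The key design point is that the data never moves: only the few retrieved rows travel to the model, which is what makes "talking to your data" workable at warehouse scale.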

02 Software 3.0: Build an AI application to solve a specific problem

Host:

Looking at this change as an investor: software 1.0 was deterministic code written function by function by engineers; software 2.0 was optimizing a neural network on carefully collected, labeled training data.

You are helping people leverage software 3.0: a set of foundation models that are incredibly capable on their own, but that still need to work with enterprise data and custom datasets. Developing applications on top of them is much cheaper.

**A question for those looking deeply at this field: the foundation models are very general. Can they do everything? Why do we need custom models and enterprise data?**

Frank:

We have very generalized models that can write poetry, summarize The Great Gatsby, do math problems.

But in business, we don't need these. What we need is a Copilot to gain extraordinary insights on a very narrow but very complex data set.

We need models that understand business models and business dynamics. That need not be so computationally expensive, because the model does not need to be trained on a million things; it only needs to know a few topics, deeply.

For example: I'm on the board of Instacart, one of our big customers. Like DoorDash and many other businesses, they have a problem: they keep ramping up marketing spend, a customer comes in and places an order, and then either doesn't come back or comes back after 90 days. It's very unstable. They call this churn.

Analyzing it is complex, because there can be many reasons a customer does not come back. People want answers to these questions, and the answers are in the data, not on the public Internet, and artificial intelligence can find them. This is an example of where great value can be generated.
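The churn question Frank describes lives entirely in the order data, so even before any AI is involved it can be posed as an ordinary query. A minimal sketch using sqlite3 with invented sample orders (the table, day numbers, and 90-day rule are illustrative, not Instacart's actual schema): flag every customer whose most recent order is more than 90 days old.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_day INTEGER)")
# Invented sample data: integer day numbers instead of dates; "today" is day 200.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("alice", 10), ("alice", 190), ("bob", 30), ("carol", 105), ("carol", 199)],
)

TODAY = 200
# A customer counts as churned if their latest order is over 90 days old.
churned = conn.execute(
    """
    SELECT customer
    FROM orders
    GROUP BY customer
    HAVING ? - MAX(order_day) > 90
    ORDER BY customer
    """,
    (TODAY,),
).fetchall()
print(churned)  # only bob's last order (day 30) is older than 90 days
```

The hard part Frank is pointing at is not this query but the *why*: once churned customers are identified, an LLM with access to the surrounding records is what turns the list into an explanation.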

Host:

How should these models interact with enterprise data?

Huang Renxun:

Our strategy and products are state-of-the-art pre-trained models of all sizes. Sometimes you need to create a very large pre-trained model so that it can be used to teach smaller models.

Smaller models can run on almost any device, with very low latency. However, their generalization ability is lower, and their zero-shot (zero-sample learning) ability may be more limited.

So you may have several models of different types and sizes, but in each case you have to do supervised fine-tuning, and you have to do RLHF (reinforcement learning from human feedback) so the model stays consistent with your goals and principles, and you need to augment it with something like a vector database. It all comes together on one platform. We have the skills, the knowledge, and the underlying platform to help customers create their own AI and then connect it to their data in Snowflake.

Now, **the goal of every enterprise customer shouldn't be to figure out how to build a large language model; their goal should be: how do I build an AI application that solves a specific problem?** That application may take 17 questions to finally arrive at the correct answer. Then you might say: I want to write a program, maybe in SQL, maybe in Python, so I can do this automatically in the future.

**You still have to guide the AI so that it finally gives you the correct answer.** But after that, you can create an application that runs 24/7 as an agent, watching for relevant situations and reporting to you in advance. So our job is to help customers build these AI applications, specific and customized, with safety guardrails.

Ultimately, we're all going to be intelligence manufacturers. We'll still employ people, of course, but we'll also create a bunch of agents, built with something like LangChain: models, knowledge bases, and other APIs connected together, deployed in the cloud, and hooked up to all the data in Snowflake.

You can operate these AIs at scale and improve them continuously. Each of us will make AI and run an AI factory. We will put the infrastructure on Snowflake's database, where customers can use their data, train and develop their models, and operate their AI. Snowflake will be your data repository and bank.

With their own goldmine of data, all will run AI factories on Snowflake. This is the goal.
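The "agents running 24/7" idea above can be pictured as a monitoring loop: a policy watches a feed of metrics coming out of the warehouse and reports only when something deserves attention. A minimal sketch with an invented metric feed and a hard-coded rule standing in for the model (in a real deployment the policy would be an LLM wired to the data with something like LangChain, and the loop would run continuously rather than once):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Alert:
    metric: str
    value: float
    note: str

def policy(metric: str, value: float) -> Optional[Alert]:
    # Stand-in for a model-driven decision: a real agent would ask an LLM,
    # with retrieved context, whether this reading deserves attention.
    if metric == "churn_rate" and value > 0.10:
        return Alert(metric, value, "churn above 10 percent, investigate")
    return None

def run_agent(readings: list) -> list:
    # One pass over the feed; a deployed agent would poll the warehouse
    # forever and file reports as situations arise.
    alerts = []
    for metric, value in readings:
        alert = policy(metric, value)
        if alert:
            alerts.append(alert)
    return alerts

# Invented sample feed: (metric name, value) pairs from the warehouse.
feed = [("orders", 1450.0), ("churn_rate", 0.08), ("churn_rate", 0.12)]
for a in run_agent(feed):
    print(a)
```

Separating the policy from the loop is what makes the agent improvable "at scale": the loop stays fixed while the model behind `policy` is fine-tuned against the company's data.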

03 The "nuclear bomb" is expensive, but using the models directly is like getting 95% off

Huang Renxun:

We have built five AI factories at NVIDIA; four of them rank among the world's top 500 supercomputers, and the fifth is coming online. We use these supercomputers to build pre-trained models. So when you use our NeMo AI foundation service in Snowflake, you get a state-of-the-art pre-trained model that already cost tens of millions of dollars to train, not counting R&D.

Around it there's a whole set of other models used for fine-tuning and RLHF; training all of these yourself would be far more expensive.

So now you adapt the pre-trained model to your features and your guardrails, optimize it for the skills you want it to have, and augment it with your data. That is a far more cost-effective approach.

More importantly, it takes days, not months: you can develop AI applications connected to your data in Snowflake.

You should be able to quickly build AI applications in the future.

We're seeing it happen in real time now. There are already apps that let you chat with data, such as ChatPDF.

Host:

**Yes, in the software 3.0 era, 95% of the training cost is already covered by someone else.**

Huang Renxun:

(laughs) Yeah, 95% off. I can't imagine a better deal.

Host:

That's the real motivator. As an investor, I've seen very young companies in analytics, automation, legal, and other fields whose applications have achieved real business value in six months or less. Part of that is that they start from these pre-trained models, which is a huge opportunity for businesses.

Huang Renxun:

Every company will have hundreds, maybe even a thousand, AI applications, each connected to some kind of data in the company. So all of us have to get good at building these things.

04 It used to be data going to work; now it is work going to data

Host:

One question I keep hearing from big enterprise players is: we have to invest in AI, so do we need a new stack? How should we think about connecting it with our existing data stack?

Frank:

I think it's evolving. Models are gradually becoming simpler, safer, and better managed. So we don't yet have a really clear view of the reference architecture everybody will be using. Some will set up a central service; Microsoft has a version of AI in Azure, and many of their customers interact with it through Azure.

**But we don't know which model will dominate; we think the market will sort itself out on things like ease of use and cost.** This is just the beginning, not the final state.

Security will also get involved, and copyright issues will have to be worked out. We are fascinated by the technology right now, but the real problems will be dealt with in parallel.

Huang Renxun:

We are now experiencing the first fundamental computing platform change in 60 years. If you read the IBM System/360 press release, you'll find central processing units, IO subsystems, DMA controllers, virtual memory, multitasking, scalable computing, forward and backward compatibility. All of those concepts date from 1964, and they have helped us scale the CPU for the past six decades.

That scaling went on for 60 years, but it has come to an end. Everyone now understands that we can no longer scale the CPU, and all of a sudden, software changes: the way software is written, the way software runs, and what software can do are all very different from before. We call the previous era software 2.0; now it's software 3.0.

The truth is, **computing has fundamentally changed. We are seeing two fundamental dynamics happening at the same time, which is why things are shaking so violently right now.**

For one thing, you can no longer just keep buying CPUs. If you buy another bunch of CPUs next year, your computing throughput will not increase, because CPU scaling has ended. You'll pay a whole lot more and get no more throughput. So the answer is that you have to go to acceleration (the Nvidia accelerated computing platform). Turing Award winners have talked about acceleration, Nvidia pioneered it, and accelerated computing has now arrived.

On the other hand, the computer's entire operating system has changed profoundly. We have a layer called NVIDIA AI Enterprise, and its data processing, training, and inference deployment have been, or are being, integrated into Snowflake. So from the initial data processing to the final deployment of a large model, the computing engine behind it all is accelerated. We're going to power Snowflake, and you'll be able to do more, with less.

If you go to any cloud, you'll see that NVIDIA GPUs are the most expensive computing instances there. But put a workload on them and you'll see how fast we run it; it's like getting 95% off. We are the most expensive computing instance, but we have the best TCO.

So if your job is to run a workload, maybe training a large language model, maybe fine-tuning one, then by all means accelerate it.

**Accelerate every workload: this is a reshaping of the entire stack.** Processors change because of it, operating systems change because of it, large language models are different, and the way you write AI applications is different.

In the future, we will all write applications. With a few Python commands, we will each connect our own context to a large language model and to our own or our company's database, and develop our own applications. Everyone will be an app developer.

Host:

But one thing stays the same: it's still your data, and you still need to fine-tune on it.

Frank:

It turns out we all assume that faster is always more expensive. Suddenly, faster is actually cheaper, which is counter-intuitive. So sometimes people cut back on capacity thinking it's cheaper, and it turns out to be more expensive.

Another reversal: **it used to be data going to work, but now it is work going to data.** For the past sixty years or more, we have moved data to the business, which has produced large-scale information silos. If you want an AI factory, the old way will be very difficult; we must bring computing to where the data is. I think what we're doing now is the right way.

05 How enterprises can obtain the fastest and greatest value

Frank:

Being the fastest and getting the most value are actually two very different problems.

For the fastest value, **you will soon see AI-enhanced search going live in databases everywhere, because it is the easiest function to add.** It's incredible: even someone with no technical training can now get valuable information out of data, the ultimate democratization of interaction. Search is greatly enhanced; you just ask the main interface a question, and it takes the question to the data and runs the query for you. That's the low-hanging fruit, the easiest part; we think of it as stage one.

Next, we start focusing on the real problem: proprietary enterprise data, mixed structured and unstructured, all of it. **How do we mobilize this data?**

I already mentioned the churn and supply-chain management problems consumer companies face. When a supply chain is particularly complex and an event occurs, how do we re-plan the chain so it keeps working? What should we do now? A supply chain is made up of many different entities, not a single enterprise. Historically, this is a problem that has never been solved computationally. Supply chain management has never been a platform; it's mostly e-mail and spreadsheets, with a few minor exceptions. So this is extremely exciting.

Or we can rethink the investment in large call centers and optimize retail pricing. As I said, **this is the real potential that CEOs of large companies have been waiting for: redefining the business model.**

06 Advice for enterprises

Huang Renxun:

**I would ask myself, first: what is my single most valuable database? Second: if I had a super, super, super smart person, and all the data in the enterprise flowed through that super intelligence, what would I ask that person?**

The answer differs from company to company. Frank's customer database is very important because he has many customers. My own company doesn't have that many customers, but our supply chain is super complicated, and our design database is super complicated.

**For NVIDIA, we can't build a GPU without artificial intelligence, because no engineer can iterate and explore for us the way AI can.** So when we built artificial intelligence, the first application was inside our own company. Hopper (NVIDIA's GPU architecture) could not have been designed without artificial intelligence.

We also apply our own AI to our own data. Our bug database is a perfect use case. If you look at the amount of code in NVIDIA AI, we have hundreds of software packages that, combined, let an application run. Something we're working on right now is using AI to figure out how to patch it securely and how best to maintain it, so we stay backward compatible without disturbing the whole application layer above.

AI can give you these answers. We can use a large language model to answer these questions, find the answer for us, or surface something to us, and then engineers can fix it. Or the AI can recommend a fix, and human engineers can confirm whether it's a good one.

I don't think everyone realizes how much intelligence, insight, and influence is hidden in the data they process every day. That is why we all need to get involved and help bring this future about.

Now, for the first time, the data you store in the data warehouse can be connected to an artificial intelligence factory. **You will be able to produce information intelligence, the most valuable commodity in the world.** You're sitting on a gold mine of natural resources, your company's proprietary data, and we're now hooking it up to an AI engine whose other end generates information intelligence directly, every day, with an incredible amount of intelligence pouring out, even while you sleep. It's the best thing ever.
