Don't let ChatGPT get away

Mark Wiens


On the face of it, ChatGPT's recent decline is consistent with what AI industry naysayers have always said -- that, like Deep Blue, which defeated world chess champion Garry Kasparov, or AlphaGo, which beat Go champion Lee Sedol, AI tools that explode in popularity eventually fade away.

These impressive tools all face the same unavoidable question: where are the prospects for commercialization?

In terms of usage scenarios, no one except professional players needs to spar with a chess engine every day. As for ChatGPT -- a large model trained on vast amounts of text, with more than 170 billion parameters -- its best-established use case so far seems to be academic writing: drafting content outlines, standardizing citation formats, and helping authors reduce the risk of trouble in review. Indeed, ChatGPT is so good at this that it became an open secret among students, prompting a student to build an app called GPTZero to detect ChatGPT-generated content in assignments.

But that's about it. On the cost side, development and deployment costs in the tens of millions of dollars give outsiders reason to dismiss it: this so-called intelligent chat tool is simply too expensive. Meanwhile, its most impressive qualities -- the grasp of human language and conversational logic, the "generative" creation of answers -- are quickly losing their magic as examples of embarrassing failures pile up. The optimism that it would displace search engines and upend intelligent voice assistants is fading.

In the public eye, ChatGPT seems to be going the way of past AI tools: dazzling like a meteor, then falling silent.

The first man to build an airplane

Sheng, a doctoral student at Tsinghua University who studies pre-trained large models, spoke of ChatGPT with a mixture of excitement and nervousness.

"As recently as two years ago, the academic community was still debating whether to go in the direction of pre-training large models." Sheng said that for the reason mentioned above, the cost of training a large model is too high, and the results can be uncertain. Few people are willing to take risks. Domestic players of relevant directions once preferred to use large and small models to cooperate with each other to improve the effect of AI tools, because the traditional view is that the effect of training on relatively small models is not necessarily worse than that of large models. And according to more than one AI practitioner, the industry hasn't paid enough attention to human-flagged data to expect the kind of reinforcement learning that ChatGPT uses based on human feedback to work so well.

That is, until OpenAI introduced ChatGPT.

"There is as much intelligence as there is labor." It's an oft-mocked phrase in the field of artificial intelligence, and it's apt to describe ChatGPT. As a pre-training large model, it embodies the word "big" very well. On the one hand, the parameter size of GPT3 has increased by nearly 1500 times compared to GPT1. On the other hand, due to the so-called "self-supervised learning" mechanism, the model can be trained using a large amount of text data on the Internet.

Large models of this magnitude have never been seen before.

"Recent research tells us that when models reach a certain scale, there is something that emerges as an emergent ability." Sheng said.

To some extent, ChatGPT's developer, OpenAI, took a gamble. No one knew whether the path would work, and it was their persistent and costly investment that ultimately proved that large pre-trained models have cognitive understanding and generalization capabilities that ordinary models do not. In other words, the pre-trained large model comes very close to AI's ideal of a "universal model."

Unlike AlphaGo, which was designed specifically for Go, ChatGPT is not an AI tool built for one narrow problem. Instead, ChatGPT looks more like an immature general-purpose AI model with the ability to answer open-ended questions, showing the potential to be deployed flexibly across many domains.

This is why ChatGPT is important: it demonstrates the power of pre-trained large models, and it means the third wave of AI has reached a critical juncture after more than a decade of development.

"ChatGPT/GPT-3.5 is an epoch-making product that is almost as different from the common language models of the past as a missile and a bow, and must be given the highest attention." An article that attempts to help the open source community reproduce the GPT3.5 technology roadmap begins with a serious point about this.

Sheng likens ChatGPT's creation to the Wright brothers' invention of the airplane. "Everyone knew that airplanes could be built in theory, but no one had ever seen one. ChatGPT is as if someone suddenly put a plane in front of you: even if it can only fly 100 meters and crashes easily, it is there."

Bigger than bigger: how much more potential does the big model have?

ChatGPT's flaws and weak commercial prospects pale beside the significance of the turning point in the AI wave that it reveals. Moreover, for many practitioners, the shortcomings ChatGPT has exposed are not insurmountable.

One frequently cited issue is the training-data cutoff. ChatGPT was trained on a fixed dataset with a cutoff of September 2021, which means it knows nothing about anything that has happened in the world since then, from the launch of the iPhone 14 to the US midterm elections, or even today's weather. On such questions, ChatGPT does worse than any current smart voice assistant.

But technically the problem is not hard to solve. In fact, Microsoft, which has a strategic partnership with OpenAI, will release a new version of Bing with AI-powered conversation in March, combining ChatGPT's capabilities with its search engine. Microsoft even plans to bring the same capabilities into its Office suite.

The much-discussed cost problem also has many optimization and iteration ideas at the algorithm level. For example, since ChatGPT has shown that, with special training, a machine can convincingly imitate human behavior when answering questions, one worthwhile direction is to have it imitate how people consult references: for purely factual queries, instead of relying on knowledge stored in its own parameters, the model could retrieve content directly from the network. That way, large models could be scaled down without sacrificing performance, and training costs would fall.
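The "consult instead of memorize" idea above can be sketched with a toy retriever. In this illustration -- the corpus, scoring, and names are all invented here, and real systems use learned embeddings and live web search -- a factual query is answered by fetching the most relevant external document at answer time, rather than from knowledge frozen into the model's parameters.

```python
# Toy sketch of retrieval at answer time: rank external documents by
# word overlap with the query instead of storing the fact in parameters.
def retrieve(query, corpus):
    """Return the document sharing the most words with the query."""
    q = set(query.lower().split())
    def overlap(doc):
        return len(q & set(doc.lower().split()))
    return max(corpus, key=overlap)

# Stand-in for an external, updatable source (e.g. a search index):
corpus = [
    "The iPhone 14 was announced in September 2022.",
    "Go is a board game played on a 19x19 grid.",
]
answer_source = retrieve("when was the iPhone 14 announced", corpus)
```

Because the corpus lives outside the model, updating it (adding tomorrow's news) requires no retraining -- which is exactly why this route promises smaller models and lower training costs.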

As for commercial applications, beyond the relatively well-established areas of text generation and intelligent assistants, there are frankly still large swaths of territory left to develop -- yet many practitioners remain optimistic.

"The hard part is the original innovation from zero to one, and everything after that is not a problem." "Especially in China, where the market is so big and everyone is so involved, now that the big model approach is proven to work, all the smart people will soon join in," said an AI researcher who works for Dachang. Sheng also predicts that commercial products based on pre-training large models will appear in a year or two.
