It is time to talk about AI and open source.
This is clearly a problem developers have to face. Ever since 2006, the question of what open source means has been one of the top issues.
Matt Asay is responsible for marketing at MongoDB. Prior to that, he was a principal at Amazon Web Services and head of developer ecosystem at Adobe.
Before joining Adobe, Asay held a series of positions at open source companies: VP of Business Development, Marketing and Community at MongoDB; VP of Business Development at real-time analytics company Nodeable (later acquired by Appcelerator); VP of Business Development and Interim CEO at mobile HTML5 startup Strobe (later acquired by Facebook); COO of Canonical, the company behind Ubuntu Linux; and head of Americas for content management startup Alfresco.
Asay is also an emeritus director of the Open Source Initiative (OSI) and holds a J.D. from Stanford University.
Matt Asay once accused Google and Yahoo of holding back on open source, and he was rebuked for it.
In hindsight, the criticism was fair.
Tim O'Reilly observed that in the cloud era, one of the main motivations for sharing code has slowly disappeared: the need to hand over a copy of the source so that others can run your program.
O'Reilly goes on to point out that sharing is not only unnecessary but, for the largest applications, no longer possible. Over the past decade, this impossibility of sharing has upended the original definition of open source. Today, the new definitions are affecting the way we think about artificial intelligence. As Mike Loukides points out, collaboration on AI has never been more important, nor has it ever been more difficult. Just as with cloud computing in 2006, the companies doing the most interesting work in AI will likely struggle to open source it in the traditional way. Even so, that does not mean they cannot be open in a more meaningful way.

## Open Infrastructure

Loukides argues that although many companies now say they are doing AI, only three are really pushing the industry forward: Meta, OpenAI and Google. The three have one thing in common: they can all run large models at scale. That ability depends on powerful infrastructure and technical expertise, which most individuals and companies do not have.

It is true that you can download the source code of OPT-175B from Meta, but you cannot train it on the hardware you have on hand. Even for universities and other research institutions, OPT-175B is too large. Moreover, even Google and OpenAI, which have ample computing resources, cannot easily reproduce OPT-175B. The reason is simple: OPT-175B is too tightly coupled to Meta's own infrastructure (including custom hardware) and is difficult to port elsewhere.

In other words, Meta is not trying to hide anything about OPT-175B; it is simply very hard to build comparable infrastructure. Even those with the money and the technology would end up with a different version.
That is exactly the point Yahoo's Jeremy Zawodny and Google's Chris DiBona made at OSCON 2006.
But then again, it’s hard to trust an AI if you don’t understand the scientific principles inside the machine.
So, we need to find some way to make the infrastructure open for use.
Loukides believes that external researchers and early adopters should be given free access. That does not mean handing them a master key to the data centers of Meta, Google or OpenAI; access would instead come through a public API.
This may not be the "open source" that most people expect, but it is actually acceptable.
Seen in this light, Matt Asay's accusations against Google and Yahoo no longer hold up.
Since 2006, Google has packaged and open sourced critical infrastructure to meet strategic needs.
In Matt Asay's view, TensorFlow is the on-ramp to open source and Kubernetes is the off-ramp. Open sourcing an industry standard for machine learning is meant to drive workloads to Google Cloud, while ensuring portability between clouds gives Google Cloud a chance to win more workloads.
The people who came up with this are smart, but it's not open source in the Pollyanna sense.
Google is not alone in this; it simply does open source better than other companies. Open source is inherently self-interested: companies and individuals will always open up the code that benefits themselves or their customers.
Always has been, and always will be.
Loukides believes that AI should be open in a meaningful way, despite the gap between the three AI giants and everyone else, but the openness he has in mind is not open source in the usual sense. Why?
The reason is that traditional open source, great as it is, has never solved the problem DiBona and Zawodny raised at OSCON in 2006, either for the creators or for the consumers of software: the problem of open source in the cloud.
More than ten years have passed and we are still no closer to the answer.
Or rather, in one respect, we are a little closer.
Matt Asay believes that we need to look at open source in a new way.
His thinking is close to Loukides's: the key is to give researchers enough access that they can discover how a specific AI model succeeds or fails.
"They don't need full access to all code and infrastructure to run these models." As he puts it, full access to the code only makes sense if developers can run open source programs on their laptops and create derivative works.
Given the scale and unique complexity of code run by Google or Microsoft today, this makes no sense — we won’t have full access to cloud code at scale.
We need to understand that open source is not the lens through which to view openness, and in the cloud era we live in today it is becoming less and less useful as one.
Our goal, both as companies and as individuals, should be to open up access to software in ways that benefit customers and third-party developers and make the software easier to understand, rather than trying to retrofit decades-old open source concepts onto the cloud. That approach has not worked for open source, just as it does not work for AI.
It’s time to change your mind.