Defining the measurable impact of Open-Source creation
As a member of the committee that reviews and evaluates Ansys’ open-source projects, I am regularly part of discussions about what value and impact our open-source projects have for “the greater community”. This always triggers some interesting conversations, especially with those who are new to open source and do not yet have a good understanding of what it means to be part of an open-source community. I also find that “corporate open source” comes with the overwhelming “more is better” ethos that corporate thinkers tend to follow. For open-source communities, though, more is not automatically better.
Now, to be completely transparent, our community is not that big. It turns out that if you start whittling down the numbers from “every developer in the world” to “people who know enough about Ansys to use our products” you get to a MUCH smaller number very quickly. We are not likely to release a project that has 10,000 contributors and millions of users. This is perfectly OK.
Relevant vs Good
For the majority of the open-source world, “impact” in real terms is about much more than the stuff you create. It is about your presence and participation and, perhaps most importantly, about the quality of those contributions and projects. I have come to realize that people make a critical mistake here. They are confusing:
- “quality” – how “good, presentable, legible” something is, with
- “relevance” – how much “appeal, interest, attraction” something will have.
I think that, overall, it is not possible to gauge relevance from the outside. You cannot know whether something is relevant to a topic unless you are an expert in it. The only group who will ever know if something is relevant is the group of people interested in that thing. Relevance is an ephemeral measure and depends on a great many factors that are hard to judge.
We can all make a basic judgment on quality, as it is much more objective: is it legible, is the code/docs/information clean and well formatted, do the words make sense, is it free of grammar mistakes, and so on. In this way, something that we want to make public, be it through open source or any other distribution mechanism, may not be particularly relevant – it may cater to an extremely small group of users, or even none at all. It should, however, always be of at least acceptable quality – as I have said many times over the years, “‘because it’s open source’ is not an excuse to do whatever you want in a low-quality way.”
The other issue that this discussion highlighted is a sentiment around open source that says “more is better, at all costs.” I think this reflects a very immature take on open source, which aligns with our relatively recent entry into it. This attitude also exposes a more fundamental question: how much open source is “enough”?
I believe numbers tell the truth, and we should trust them despite our opinions. I wanted to know how “everyone else” does it, because maybe my underlying assumptions were off and my insistence on quality over quantity is counter to the rest of the OSS world. Is the idea that:
“Putting out poor quality stuff is just fine, because hey, it’s open source and lots of it will be Dead on Arrival, because just look how much open source we do!”
something the rest of the industry actually does? To answer that question, I drew on my own experience, did some research, and confirmed what I found by asking my peers at the companies that we seek to mirror. The answer is very clear:
No. Quality is never secondary to quantity.
Open source isn’t about putting up big numbers to show how much of a heavyweight you are in OSS. It is not the “guaranteed outcome” for every side project and passion project. At the companies we seek to emulate, it is a privilege to be recognized as an OSS project, not a foregone conclusion. It’s about releasing things that might have value, sure, but most importantly, what is released is high quality and consumable. It is also about meaningful contributions back to the larger OSS ecosystem, but more on that in a different article.
Having projects with zero stars and no contributions is fine – they are not appealing to anyone and so get no traction (and are often culled or retired by the organization – AWS is ruthless about this; MS and Google keep them online, but archived). They are not, however, riddled with spelling and grammar mistakes, half-formed ideas, or just something that was fun that someone felt like making open source. In this way, they expose the underlying process for releasing OSS that is in place at these organizations. It is that process which ensures that what makes it out the door might be irrelevant but is most assuredly not low quality.
Open source... enough.
Along these lines, I felt it important to see if I could discern what is “enough” open-source contribution to be a meaningful member of that fraternity. A lot of people at Ansys have the impression that quantity of projects is the most important metric. It turns out there is a simple ratio that describes how much the big players release: public repositories as a percentage of headcount. It comes out at roughly 1% of your workforce, or about one repository per 100 employees.
How do I know this? I took the time and did the math on the top OSS companies and their OSS portfolios:
- Microsoft – a staggering 5.8k repositories – 221,000 employees – 2.62% (lots of archives, though; this number is way smaller for just active projects, much closer to 1%)
- Google – 2.6k repositories – 140,000 employees – 1.86% (again, archives bloat this; it is right around 1% once old/archived stuff is removed)
- IBM – 3k repositories – 345,000 employees – 0.87%
- Intel – “only” 1.1k repositories – 131,900 employees – 0.83%
- AWS – 405 repositories – 136,000 employees – 0.30%. They are RUTHLESS about OSS quality and an example of extreme process control; not everyone just does whatever and hopes for the best in OSS land!
- Meta – 124 repositories – 86,000 employees – 0.14%, and they have several of the largest OSS community projects going (React, Docusaurus, and their new LLM stuff)
By this same measure, we have more “presence per capita” than Google, IBM, Intel, AWS, and Meta, and are a mere 0.5 percentage points behind Microsoft in terms of open source!
We also only started 3 years ago; all of them have been at it for a decade or more and have clear, refined processes and policies. So no, we do not need to “push more things out the door”. OSS does not mean “do whatever you want” and it also doesn’t mean more is always better! All you need to do is look at this one basic ratio to see that for everyone else, standards and quality are the focus, not quantity.
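For anyone who wants to check the math or run it for their own organization, the ratio is simply public repositories divided by employees, expressed as a percentage. The sketch below uses the rounded repository and employee counts quoted in the list above; the function and variable names are my own illustrative choices, and the counts are point-in-time figures that will drift.

```python
# Repository and employee counts as quoted in this article (rounded,
# point-in-time figures; treat them as illustrative inputs).
PORTFOLIOS = {
    "Microsoft": (5800, 221_000),
    "Google": (2600, 140_000),
    "IBM": (3000, 345_000),
    "Intel": (1100, 131_900),
    "AWS": (405, 136_000),
    "Meta": (124, 86_000),
}


def repos_per_100_employees(repos: int, employees: int) -> float:
    """Return public repositories as a percentage of headcount."""
    if employees <= 0:
        raise ValueError("employees must be positive")
    return 100.0 * repos / employees


def ranked(portfolios: dict) -> list:
    """Return (company, percentage) pairs, highest ratio first."""
    rows = [
        (name, repos_per_100_employees(repos, employees))
        for name, (repos, employees) in portfolios.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, pct in ranked(PORTFOLIOS):
        print(f"{name:<10} {pct:5.2f}%")
```

To compare your own organization, add an entry with your public repository count and headcount. Counting only active (non-archived) repositories will give a lower and arguably fairer number.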
Is it enough?
To bring this interesting research project to a close:
- None of us can judge the appeal of a project – we do not know what will hit and what won’t (to be fair, I think any of us can have a pretty good idea, but we could also be wrong).
- You should encourage submissions of new ideas – innovation comes from this, with the clear understanding that:
- You do not need to release every project as OSS – there are many ways to make things public, and they do not all have to be open source. (Public != Open Source and vice versa).
- If you are measuring and are not lagging behind in OSS quantity, you do not need to “just publish as much as possible” to be relevant. (Contribution is a whole other article.)
For your own open-source metrics, I encourage you to continue to look at consumption and adoption, but also take a real look at quality and ensure your process supports the goal of meaningful contributions. It is easy to just push stuff out the door; companies do it all the time. The truly meaningful contributions to open source are the ones that focus on quality and impact over sheer volume.