We just returned from the annual SciPy conference full of energy as usual. This conference is unique because it includes core maintainers from most of the core Python data projects (Jupyter, Pandas, Numpy, Scikit-Learn, Dask, Xarray, Numba, Matplotlib and many more). It’s an accident of history that we all transitioned into software development from scientific careers. It’s a wonderful community of science nerds that unexpectedly changed the world.
This community is also wonderful because everyone opted in to it. While today these developers can take home $500k+ salaries, ten or twenty years ago this was a thankless labor of trying to make the world better in our spare time. The folks who showed up at that time are, invariably, wonderful. They were all mission-driven. That mission evolved over time:
Mission 1: Write software to accelerate science
At first, we were all just trying to cobble together software that would accelerate our own research or the research of our colleagues.
Mission 2: Write software to enable society to understand data
Then projects like Pandas and Scikit-Learn came about, and the data science and machine learning revolutions arose. Suddenly this code was useful not just in scientific labs, but in banks and hedge funds and insurance companies and big agriculture and everywhere else. We then tried to shift the stack to be more useful for these domains.
We succeeded. We weren’t just accelerating science, we were the de facto platform on which the world’s data science ran.
Mission 3: Get paid
Cool. We had accomplished our original mission, and then some. Most hallway conversation at this point shifted to funding models.
Could we find a postdoc that would pay us just to write software?
Are professorships off the table?
Did you hear about company X that just pays for a person to write OSS?
We succeeded and we did get jobs. These were new jobs and career paths often created just for us. It was great.
Mission 4: Sustainable open source
After working 60-hour weeks on things that we loved, many of us grew tired and started wondering how to get even more people paid to do this. We shifted to thinking about larger funding models:
Could we get government grants to sustain OSS?
How about philanthropic agencies?
Maybe consulting companies are the answer?
Should we be making our own product companies?
And so many of us did this as well. My sense is that we’re succeeding today.
Due either to effort and genius or to a sequence of fortunate historical accidents, the SciPy community has been wildly successful at these missions. This software stack is among the most widespread and most impactful in the world. The original developers of the stack are financially secure and well respected (sometimes wildly so) and we have money pouring into the ecosystem from a diverse array of funding sources.
To make this explicit, I’m going to list a few folks from my generation here:
Chelle Gentemann (Pangeo) runs TOPS, an open source initiative at NASA with a $10,000,000 annual budget to transform NASA to open science
Gael Varoquaux (Scikit-Learn) runs a research lab at INRIA, and does things like brief the French president
Andy Mueller (Scikit-Learn) works at Microsoft and says things like “I can pretty much work on whatever I want”
Brian Granger (Jupyter) works at Amazon, helping them to productize Jupyter in SageMaker
Ralf Gommers (Numpy) leads Quansight Labs, supporting OSS projects through industry collaboration
Katy Huff (Nuclear) is the assistant secretary of energy for the US government, literally the ranking nuclear engineer in the country
Joe Hamman (Xarray) co-started CarbonPlan, a non-profit to help the world understand carbon sequestration technologies
… I could go on for quite a while
We’re well paid. We have budget. We have headcount. We have the attention of the world.
What do we do next?
I asked a variety of folks at SciPy what our next mission should be and heard several answers. I’d like to share them below.
By Matthew Rocklin
First, just to set expectations, it’s totally ok for us to all just go and buy houses along some lake and drink beer together. It’s ok to be done if we think that we’ve accomplished our mission and retire off to some comfortable and well-earned normal career.
What are some other possibilities though?
SciPy: The Next Generation
By Jason Grout, Jupyter / Databricks
We’re just one generation in a sequence of generations. Just as our generation was enabled by earlier folks (Travis, Peter, Brian, Fernando), we should make space and opportunities for the next generation coming into their own now.
Universal Data Access
By Andy Mueller, Scikit-Learn / Microsoft
Public data and metadata should be universally accessible, and making them so should be a project of the size and scope of the space program or the US interstate highway system.
Today we have lots of data, yes, but it’s not universally accessible or understandable. We should invest in data access technologies and metadata standards to allow the world to ask questions without munging.
By Paige Bailey, TensorFlow/JAX / Google AI
Data/ML/AI may no longer require a PhD, but it does require substantial education and resources. We need to continue pushing on both increased accessibility and increased education so that data-driven thinking and insights are available to the entire populace from a young age.
The Data-8 program at UC Berkeley is a wonderful case study of this. We should go further.
Science in Production
By Joe Hamman, Xarray / CarbonPlan
We need to move beyond writing papers.
We have excellent tools. Let’s use them to change the world.
Opt-in, value-driven communities are fun
I’ve loved working with these folks over the last decade. We all came together not because it paid well or because it furthered our careers. We came together because we believed in something, and we worked together to achieve it. This was both fulfilling and fun. Thank you all for the experience.
I’d love to work with you all again on something that we all believe in. We all have more agency now. I look forward to seeing the emergence of future missions, like those mentioned above.