I build and maintain open source software for Python’s data science ecosystem. This is part of a broader effort to make computation more accessible, accelerate science, and enable more informed policy decisions.
I coordinate and maintain several libraries within Python’s numeric computing ecosystem, particularly around efficient and scalable computing.
I am primarily known for my work on Dask, a library for scalable computing with dynamic task scheduling. Dask combines a high-speed task scheduler with parallel algorithms to scale existing Python libraries like NumPy, pandas, and scikit-learn.
More generally, I work with other core developers within the ecosystem to promote its overall health and efficiency. I contribute to and maintain dozens of libraries. A more complete record of my contributions is available on GitHub: github.com/mrocklin.
I write frequently about my work at matthewrocklin.com/blog.