Matthew Rocklin
Continuum Analytics
Credit: Dan Allan
Visualization ...
But distributed computing is more complex
What Dask needed:
Bokeh
Bokeh Server
from bokeh.models import ColumnDataSource
tasks = ColumnDataSource({'start': [], 'stop': [], 'color': [],
                          'worker': [], 'name': []})
from bokeh.plotting import figure
plot = figure(title='Task Stream')
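# One horizontal bar per task, from start to stop on its worker's row
# (hbar rather than rect, since rect needs a center and a width)
plot.hbar(source=tasks, left='start', right='stop', y='worker',
          height=0.9, color='color')
plot.text(source=tasks, x='start', y='worker', text='name')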
while True:
    collect_diagnostics_data()
    tasks.data.update({'start': [...], 'stop': [...], 'color': [...],
                       'worker': [...], 'name': [...]})
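
The update loop above hands the data source one list per column, while diagnostics usually arrive as one record per task. A minimal sketch of that reshaping step, in plain Python, might look like the following. The `TaskRecord` type and the `to_columns` helper are hypothetical names introduced for illustration and are not part of Dask or Bokeh; only the column names come from the slide.

```python
from collections import namedtuple

# Hypothetical record type; the field names mirror the columns on the slide.
TaskRecord = namedtuple('TaskRecord',
                        ['start', 'stop', 'color', 'worker', 'name'])


def to_columns(records):
    """Turn a list of per-task records into a dict of equal-length columns."""
    columns = {field: [] for field in TaskRecord._fields}
    for record in records:
        for field, value in zip(TaskRecord._fields, record):
            columns[field].append(value)
    return columns


# Made-up sample records, for illustration only.
records = [
    TaskRecord(0.0, 1.5, 'blue', 'worker-1', 'inc'),
    TaskRecord(0.2, 0.9, 'red', 'worker-2', 'add'),
]
new_data = to_columns(records)
```

The resulting dict has the shape the data source expects, so it could be passed to `tasks.data.update(new_data)` to replace the columns, or to `tasks.stream(new_data)` to append to them.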
About 700 lines of Python
Jupyter Lab
JupyterLab + Dask + Bokeh
Credit: Work by Luke Canavan
Overview, custom algorithms, some machine learning
Fine-grained parallelism, scheduling motivation and heuristics
Plotcon - November, 2016:
Visualization of distributed systems
I recently tweeted these slides under @mrocklin