
Dagger Development and Roadmap

Hey all! As Dagger.jl's maintainer, I'd like to give an update on where Dagger stands this year and what the plans are for the rest of 2023!

Aside: I'll be at JuliaCon this year, so come find me if you want to discuss anything about Dagger or parallelism in Julia! I'll also be giving two Dagger talks, one at the Data minisymposium, and one at the HPC minisymposium, so come listen in to hear what Dagger can do for you!

Developments so far:

GSoC and the DArray

This year we've got a Google Summer of Code student, @fda-tome, working on Dagger's DArray, with a focus on updating its internal implementation, adding MPI support, and implementing more linear algebra operations. He's made great progress so far: the DArray now matches Julia's AbstractArray interface much more closely, and MPI support is rapidly approaching feature parity with the existing Distributed support. Some of these changes are going to be a part of my HPC minisymposium talk, so come check it out!
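If you haven't tried the DArray yet, here's a rough sketch of what the interface looks like, using the `distribute`/`Blocks` partitioning API from Dagger's docs:

```julia
using Dagger

# Partition a 100x100 matrix into 25x25 chunks that Dagger can
# schedule across threads and workers.
A  = rand(100, 100)
DA = Dagger.distribute(A, Dagger.Blocks(25, 25))

# AbstractArray-style operations dispatch to parallel implementations.
DB = map(x -> x + 1, DA)  # elementwise map, chunk by chunk
s  = sum(DB)              # parallel reduction
C  = collect(DB)          # gather the result back into a plain Matrix
```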

Documentation overhaul

Dagger's documentation has historically been pretty unapproachable for new and experienced users alike, with too little detail in important areas and too much in areas that few people care about. I've spent some time working on the docs to improve this situation, with a new "Quickstart" introduction to Dagger on the first page, and a reordering of some documentation so that related details live in the same sections. The docs could always use more love, so if anyone is willing to help out or just point out where the docs could be improved, please reach out!

Improved GPU support

A few long-overdue changes have landed in DaggerGPU.jl lately, providing improved AMDGPU integration, Metal integration (thanks @Ronis_BR and Eric Hallahan!), and direct support for compiling and launching KernelAbstractions.jl kernels. Together with Dagger's new spawn_sequential task queue for in-order kernel launches (details in my HPC talk), utilizing GPUs with Dagger has never been easier!
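For the curious, here's roughly what the spawn_sequential task queue looks like in use (a sketch; see the docs for the full semantics):

```julia
using Dagger

# Tasks spawned inside the block run strictly in launch order, even if
# they have no data dependencies on each other - exactly what in-order
# GPU kernel launches need.
Dagger.spawn_sequential() do
    t1 = Dagger.@spawn println("kernel 1")
    t2 = Dagger.@spawn println("kernel 2")  # runs only after t1 finishes
    wait(t2)
end
```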

Improved file/out-of-core support

I've been putting together a set of changes for MemPool.jl (Dagger's storage and I/O dependency) that make it possible to use files as lazy inputs to Dagger tasks. A new set of APIs (Dagger.File and Dagger.tofile) will be landing in Dagger to expose these features, and they'll be wired into the DTable as well for easier and more efficient table ingest from files.
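As a teaser, here's how I imagine these APIs being used; note that the signatures below are my sketch of the forthcoming design, not the final word:

```julia
using Dagger

# Hypothetical usage of the forthcoming file APIs named above - the
# exact signatures are assumptions, not the landed design.
f = Dagger.tofile([1, 2, 3], "values.jls")  # write data out, keep a lazy file handle
t = Dagger.@spawn sum(f)                    # the file is only read when the task runs
fetch(t)                                    # == 6
```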

Thanks to changes by @krynju, out-of-core support has also become much easier to set up via the new Dagger.enable_disk_caching! API, which configures disk caching across multiple Julia processes in one call; see the docstring for details!
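For example, something like the following - the argument meanings here are my assumptions, so treat the docstring as authoritative:

```julia
using Distributed, Dagger

# Assumed arguments (check the enable_disk_caching! docstring for the
# real options): keep in-memory data to ~30% of RAM, then spill up to
# 16 GiB to disk, configured across all Julia processes.
Dagger.enable_disk_caching!(30, 16 * 1024)
```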

Website

Dagger has a new website: https://daggerjl.ai. We've got some information on what Dagger is and why you'd want to use it, as well as a blog (where this post will also be available). The source is available at https://github.com/jpsamaroo/www.daggerjl.com, so please feel free to add any Dagger-related content, including any blog posts and benchmarks!

Dagger also has a new logo!

[Image: the new Dagger.jl logo]

I figured this was long overdue, and I'm quite happy with the result!

Roadmap for 2023:

Dagger 0.18

The release of Dagger 0.18.0 will ideally happen sometime before Thursday the 27th, to coincide with the two Dagger talks I'll be giving! This release will include any of the above work that hasn't yet made it into a release. I'm bumping the minor version because the DArray's API is changing to better match Julia's AbstractArray API; otherwise, there shouldn't be any breaking changes since 0.17.0.

Machine Learning

As machine learning and AI have become substantially more relevant and powerful over the last year, it's high time for Julia's support for these technologies to improve. In particular, distributed model training and inference should be available and easy to use; to this end, I'm planning to work with the Flux/Lux maintainers and ecosystem to pick up DaggerFlux.jl development and add support for Distributed Data Parallel (DDP) and other parallelism strategies. I'm also keen to see strong AMDGPU support in DaggerFlux, as well as support for other accelerators that have the requisite ML operators. Please reach out if you're interested in helping!

Improved mutation support

Dagger has always pursued a functional programming approach: tasks are generally expected to operate out-of-place and allocate their results, which keeps behavior reliable under automatic multithreaded and distributed execution. However, out-of-place operations aren't always feasible when working with large data, a limitation that's especially visible to users of the DTable and DArray (neither of which currently supports mutating tables/arrays in-place). It's my intention to add more formal mutable-data support to Dagger, and in doing so, allow the DTable and DArray to add in-place operators like those in DataFrames.jl and most AbstractArray implementations, respectively.
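To make the contrast concrete, here's a minimal sketch of the out-of-place style Dagger expects today:

```julia
using Dagger

# Dagger's current functional style: each task allocates and returns a
# new result rather than mutating its inputs, which stays safe under
# multithreaded and distributed execution.
A  = rand(100, 100)
t1 = Dagger.@spawn map(x -> x + 1, A)  # out-of-place: allocates a new matrix
t2 = Dagger.@spawn sum(t1)             # task results feed directly into later tasks
fetch(t2)
```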

Distributed graphs

After presenting on Dagger in Toronto this year at WAW23, and listening in on what users there were working on, I'm interested in implementing a Dagger-powered distributed graph abstraction, which would allow graph theory research and graph operations (like graph neural networks) to operate across multiple processes and automatically benefit from multithreading and possibly GPU support. I'm not amazingly well-versed in the best way to implement something like this, so if someone more graph-oriented is willing to work with me or take the lead, that would be amazing!

Intel GPUs, GraphCore IPUs, and other accelerators

With the impending deployment of the Aurora supercomputer (sporting all Intel GPUs), and with support for GraphCore IPUs available thanks to @giordano, I'm planning to add integrations for these and other accelerators in DaggerGPU soon! Intel GPU support should be particularly easy given that Intel GPUs already have a reasonably mature array and KernelAbstractions backend, so if anyone wants to tackle this one, I'm happy to provide guidance!

Conclusion

This has been an excellent year for Dagger's development, with many amazing possibilities not far away. I hope that as an ecosystem, we can push Dagger to become the best parallel programming API for many use cases and ensure that its performance is as good as possible.

I'm also very interested in hearing what people are excited to use Dagger for, and what features they're interested in having! Dagger needs community support to thrive, so I'd encourage people to ask questions, provide suggestions, and talk about how their experience with Dagger has been!

CC BY-SA 4.0 Julian P Samaroo. Last modified: July 22, 2023.