Path: blob/master/site/en-snapshot/federated/collaborations/notes/2022-08-24.md
Notes from the 8/24/2022 meeting of TFF collaborators
Sparse tensor support in TFF:
EW - We have Keras models that we want to port to TFF; they contain sparse tensors
Simply mapping to dense tensors results in unacceptable memory cost and slowness in our use case, so we’re looking to avoid that
ZG on existing sparse tensor support in TFF
Issues mentioned on GitHub mostly related to tf.data.Dataset
Mostly works otherwise, but it requires some DIY, particularly w.r.t. aggregations, where we can’t just naively do a sparse sum on the triple of constituent tensors; that wouldn’t have the desired outcome
(question about relative importance)
EW - this is not blocking for us, but a good performance/resource footprint optimization
ZG - with respect to the GitHub issues, might work around by hiding dataset inside the TFF computation, so it’s not part of the input-output boundary
KO - clarifying that our “it mostly works” comment refers to the common practice of representing/handling sparse tensors as tuples of dense tensors. Have you tried dealing with sparse as tuples of dense tensors for datasets usage as well?
EW - haven’t tried yet
KO - sparse in this conversation has come up in two places - for model parameters, but also for sparse input data - are both equally important?
EW - would ideally have both
KO - one action item for Ewan to try to work with tuples of dense tensors that represent the constituent parts.
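The workaround discussed above can be sketched as follows. This is a minimal, illustrative Python sketch of representing a sparse (COO-style) tensor as a tuple of dense constituent tensors (indices, values, dense shape) so it can cross a computation's input/output boundary; the function names are hypothetical and not part of any TFF API.

```python
# Hypothetical sketch: represent a sparse tensor as a triple of dense
# tensors (indices, values, dense_shape). Names are illustrative only.

def to_dense_triple(sparse_entries, dense_shape):
    """Decompose a {index_tuple: value} mapping into (indices, values, dense_shape)."""
    indices = sorted(sparse_entries)
    values = [sparse_entries[i] for i in indices]
    return indices, values, dense_shape

def from_dense_triple(indices, values, dense_shape):
    """Rebuild the {index_tuple: value} mapping from the dense triple."""
    return dict(zip(indices, values))

# Example: a 3x4 sparse tensor with two nonzero entries.
triple = to_dense_triple({(0, 1): 2.0, (2, 3): 5.0}, (3, 4))
restored = from_dense_triple(*triple)
```

In practice each component of the triple would be an ordinary dense tensor, so existing dense-tensor plumbing (including datasets) can carry it unchanged.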
KO - this still leaves a question about better APIs/helpers for sparse tensor handling, but can unblock this particular use case. Thoughts on the API?
EW - ideally could this just be transparent (no need to do anything special for sparse by the customer using TFF and it just works)
KO, ZG - in some cases, it’s not obvious, e.g., for aggregation - there’s potentially more than one way to aggregate the constituent parts of sparse tensors, a choice ideally to be made by the customer
KR - probably having a small family of dedicated “sparse sum” symbols is most actionable
KO - perhaps we can start by prototyping the version of sparse sum needed by EW and upstream it to TFF as a generic sparse sum operator to seed this, and build on that (to follow up on this offline - maybe on discord)
EW +1
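To illustrate why a naive per-tensor sum over the constituent triples fails (the point raised above): two clients' sparse tensors generally have different nonzero index sets, so summing their `values` tensors position-by-position mixes unrelated coordinates. A correct sparse sum merges entries by index. This is an illustrative sketch only, not a TFF aggregation API; the function name is hypothetical.

```python
# Hypothetical sketch of a correct "sparse sum" over sparse tensors
# represented as (indices, values, dense_shape) triples: entries are
# merged by index, rather than summed position-by-position.

def sparse_sum(triples):
    """Sum sparse tensors given as (indices, values, dense_shape) triples."""
    dense_shape = triples[0][2]
    acc = {}
    for indices, values, shape in triples:
        assert shape == dense_shape, "all inputs must share a dense shape"
        for idx, val in zip(indices, values):
            acc[idx] = acc.get(idx, 0) + val
    merged = sorted(acc)
    return merged, [acc[i] for i in merged], dense_shape

# Two clients with overlapping but unequal index sets.
a = ([(0, 0), (1, 2)], [1.0, 2.0], (2, 3))
b = ([(1, 0), (1, 2)], [4.0, 3.0], (2, 3))
indices, values, shape = sparse_sum([a, b])
# The shared index (1, 2) is merged into a single entry.
```

A naive elementwise sum of `a` and `b`'s values tensors would instead add the value at (0, 0) to the value at (1, 0), which is the undesired outcome noted earlier.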
Jeremy’s proposal, continuing from 2 weeks ago:
(todo for all to review it later as it was just shared shortly before the meeting)
(Jeremy is presenting)
JL - proposing the “task store” abstraction for exchanging requests between a “Cloud” and the per-client executors (e.g., in browsers), with the latter pulling tasks from a centralized “task store”. Has something like this been considered in any other context?
KR - yes, in failure handling scenarios
Hairier problems there, though - state transfer across executors is difficult; not sure how much carries over to the scenario presented by Jeremy
HV - can the executors in the leaves be stateless?
JL - this would make it more like the SysML paper on cross-device
(question about performance in this scenario, compared to bi-directional streaming in a way that more closely resembles the native TFF protocol)
JL - ack that there are latency considerations
bi-directional streaming not supported in some transports, so not always a viable option
(ran out of time)
(to be continued in 2 weeks - first point of agenda for the next meeting, Jeremy will join)