Notes from the 9/22/2022 meeting of TFF collaborators
[Ajay Kannan, Michael Reneer] Managing versioning/dependencies
[Michael] Two concerns
Versions of TF and Python that TFF depends on
Python - can we support old versions, can we support new ones
We support 3.9 for now, soon 3.10
[A] Could negotiate specific versions - let’s unpack
[M] Why 3.9
Mostly for pytype
There may be other features - these could be flag-guarded
(lots of back and forth on nuts and bolts - didn’t take notes)
Resolution/action items:
TFF to downgrade OSS version of things to what works
Michael to coordinate downgrade with Ajay, Ajay to test what works
Revised version of the proposal to follow
Will need a system for periodically updating the “downgraded version” to make sure it keeps advancing
Ajay, Michael to propose an upgrade schedule for that
Revision draft async, to present next time
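The flag-guarding idea from the discussion above could be sketched roughly as follows. This is only an illustration, not how TFF does it: the helper `python_at_least` and the flag `USE_NEWER_PYTHON_FEATURES` are hypothetical names, and the 3.10 threshold is taken from the versions mentioned in these notes.

```python
import sys


def python_at_least(major, minor, running=None):
    """Return True if the interpreter version is at least major.minor.

    `running` is a hypothetical override (a (major, minor) tuple) so the
    check can be exercised without switching interpreters.
    """
    running = sys.version_info[:2] if running is None else tuple(running)
    return running >= (major, minor)


# Hypothetical flag: code paths that need a newer Python check this flag
# instead of failing at import time on older interpreters.
USE_NEWER_PYTHON_FEATURES = python_at_least(3, 10)


def describe_runtime():
    """Report which code path would be taken on this interpreter."""
    if USE_NEWER_PYTHON_FEATURES:
        return "newer-feature path"
    return "fallback path"


print(describe_runtime())
```

A guard like this keeps the older baseline working while newer features are phased in, which fits the plan to advance the "downgraded version" on a schedule.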
[Tong Zhou et al.] Discussion of recent experiments/findings on scalability
[Tong] Question on expected length for TFF rounds
The extra time doesn’t seem to be spent in forward or backprop
Suspecting aggregation
Unsurprising that TFF and Keras performance match for a single round
Reading data not a factor
All time is TF time
Data ingestion a likely suspect, needs to be measured better
Overlapping data ingestion and processing is one of the factors
In general, missed opportunities for optimization when training rounds are O(seconds)
There’s support in TFF for prefetching/preprocessing data K rounds ahead of training
APIs used in tutorial synchronous, but async and pipelining are natively available under the hood in the TFF runtime
Relevant code in OSS, just not very well exposed for use
Looks like it could solve the problem - to try out
Action item: TFF team to follow up with links on how to set up ingestion and preprocessing K rounds ahead
Tong to follow up with new experiments
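To make the "K rounds ahead" idea above concrete, here is a minimal, generic sketch of overlapping data ingestion with training by using a bounded buffer and a background thread. It does not use the TFF runtime or any TFF API; `prefetch_rounds`, `load_round`, and `train_one_round` are hypothetical names, and the sleeps merely stand in for real ingestion and training work.

```python
import queue
import threading
import time


def prefetch_rounds(load_round, num_rounds, k):
    """Yield (round_index, data) while a background thread loads ahead.

    At most k prepared rounds wait in the buffer; the producer may hold one
    more finished round while it is blocked waiting for a free slot.
    """
    buffer = queue.Queue(maxsize=k)
    done = object()  # sentinel marking the end of the stream

    def producer():
        try:
            for r in range(num_rounds):
                buffer.put((r, load_round(r)))  # blocks when the buffer is full
        except Exception as exc:  # hand loader failures to the consumer
            buffer.put(exc)
            return
        buffer.put(done)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is done:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def load_round(r):
    """Hypothetical stand-in for data ingestion and preprocessing."""
    time.sleep(0.01)
    return [r] * 3


def train_one_round(state, data):
    """Hypothetical stand-in for one training round."""
    time.sleep(0.01)
    return state + sum(data)


state = 0
for r, data in prefetch_rounds(load_round, num_rounds=5, k=2):
    state = train_one_round(state, data)
print(state)
```

With rounds that last only seconds, the training loop then waits on ingestion only when the buffer is empty, so timing the `buffer.get()` call would be one way to measure how much of a round is spent stalled on data.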
Async instance of next meeting possibly in 1 week
To follow up interactively on Discord.