GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ja/federated/tutorials/random_noise_generation.ipynb

Kernel: Python 3

Copyright 2021 The TensorFlow Federated Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

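Random noise generation in TFF

This tutorial discusses the recommended best practices for generating random noise in TFF. Random noise generation is an important component of many privacy protection techniques in federated learning algorithms, such as differential privacy.

Before we begin

First, let us make sure the notebook is connected to a backend that has the relevant components compiled.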

In [ ]:

#@test {"skip": true}
!pip install --quiet --upgrade tensorflow-federated

In [1]:

import numpy as np
import tensorflow as tf
import tensorflow_federated as tff

Run the following "Hello World" example to make sure the TFF environment is set up correctly. If it doesn't work, refer to the Installation guide for instructions.

In [43]:

@tff.federated_computation
def hello_world():
  return 'Hello, World!'

hello_world()

Out[43]:

b'Hello, World!'

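Random noise on clients

The need for noise on clients generally falls into two cases: identical noise and i.i.d. noise.

For identical noise, the recommended pattern is to maintain a seed on the server, broadcast it to the clients, and use the tf.random.stateless functions to generate the noise.
For i.i.d. noise, use a tf.random.Generator initialized on the client with from_non_deterministic_state, in keeping with TF's recommendation to avoid the tf.random.<distribution> functions.

Client behavior differs from server behavior (clients do not suffer from the pitfalls discussed later) because each client builds its own computation graph and initializes its own default seed.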
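The two cases can be pictured with a plain-Python analogy before looking at the TFF versions. This is only a sketch under stated assumptions: it uses the standard library's random module rather than TF or TFF, the helper names identical_noise and independent_noise are hypothetical, and the "clients" are just list entries.

```python
import random

NUM_CLIENTS = 10  # assumed, mirroring the 10 clients used in this tutorial


def identical_noise(seed, num_clients=NUM_CLIENTS):
    """Every simulated client seeds its own generator with the broadcast seed."""
    return [random.Random(seed).gauss(0.0, 1.0) for _ in range(num_clients)]


def independent_noise(num_clients=NUM_CLIENTS):
    """Every simulated client draws from OS entropy, so no seed is shared."""
    return [random.SystemRandom().gauss(0.0, 1.0) for _ in range(num_clients)]


same = identical_noise(seed=1)
iid = independent_noise()
print('identical:', min(same) == max(same))
print('independent:', min(iid) != max(iid))
```

In the identical case the shared seed is the only source of randomness, so changing the seed on the server changes the value on every client at once.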

Identical noise on clients

In [5]:

# Set to use 10 clients.
tff.backends.native.set_sync_local_cpp_execution_context(default_num_clients=10)

@tff.tf_computation
def noise_from_seed(seed):
  return tf.random.stateless_normal((), seed=seed)

seed_type_at_server = tff.type_at_server(tff.to_type((tf.int64, [2])))

@tff.federated_computation(seed_type_at_server)
def get_random_min_and_max_deterministic(seed):
  # Broadcast seed to all clients.
  seed_on_clients = tff.federated_broadcast(seed)

  # Clients generate noise from seed deterministically.
  noise_on_clients = tff.federated_map(noise_from_seed, seed_on_clients)

  # Aggregate and return the min and max of the values generated on clients.
  min = tff.aggregators.federated_min(noise_on_clients)
  max = tff.aggregators.federated_max(noise_on_clients)
  return min, max

seed = tf.constant([1, 1], dtype=tf.int64)
min, max = get_random_min_and_max_deterministic(seed)
assert min == max
print(f'Seed: {seed.numpy()}. All clients sampled value {min:8.3f}.')

seed += 1
min, max = get_random_min_and_max_deterministic(seed)
assert min == max
print(f'Seed: {seed.numpy()}. All clients sampled value {min:8.3f}.')

Out[5]:

Seed: [1 1]. All clients sampled value    1.665.
Seed: [2 2]. All clients sampled value   -0.219.

Independent noise on clients

In [ ]:

@tff.tf_computation
def nondeterministic_noise():
  gen = tf.random.Generator.from_non_deterministic_state()
  return gen.normal(())

@tff.federated_computation
def get_random_min_and_max_nondeterministic():
  noise_on_clients = tff.federated_eval(nondeterministic_noise, tff.CLIENTS)
  min = tff.aggregators.federated_min(noise_on_clients)
  max = tff.aggregators.federated_max(noise_on_clients)
  return min, max

min, max = get_random_min_and_max_nondeterministic()
assert min != max
print(f'Values differ across clients. {min:8.3f},{max:8.3f}.')

new_min, new_max = get_random_min_and_max_nondeterministic()
assert new_min != new_max
assert new_min != min and new_max != max
print(f'Values differ across rounds.  {new_min:8.3f},{new_max:8.3f}.')

Values differ across clients.   -1.490,   1.172.
Values differ across rounds.    -1.358,   1.208.

Model initializer on clients

In [ ]:

def _keras_model():
  inputs = tf.keras.Input(shape=(1,))
  outputs = tf.keras.layers.Dense(1)(inputs)
  return tf.keras.Model(inputs=inputs, outputs=outputs)

@tff.tf_computation
def tff_return_model_init():
  model = _keras_model()
  # return the initialized single weight value of the dense layer
  return tf.reshape(
      tff.learning.models.ModelWeights.from_model(model).trainable[0], [-1])[0]

@tff.federated_computation
def get_random_min_and_max_nondeterministic():
  noise_on_clients = tff.federated_eval(tff_return_model_init, tff.CLIENTS)
  min = tff.aggregators.federated_min(noise_on_clients)
  max = tff.aggregators.federated_max(noise_on_clients)
  return min, max

min, max = get_random_min_and_max_nondeterministic()
assert min != max
print(f'Values differ across clients. {min:8.3f},{max:8.3f}.')

new_min, new_max = get_random_min_and_max_nondeterministic()
assert new_min != new_max
assert new_min != min and new_max != max
print(f'Values differ across rounds.  {new_min:8.3f},{new_max:8.3f}.')

Values differ across clients.   -1.022,   1.567.
Values differ across rounds.    -1.675,   1.550.

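Random noise on the server

Discouraged usage: directly using `tf.random.normal`

According to the TF tutorial on random noise generation, TF1.x-style APIs for random noise generation such as tf.random.normal are strongly discouraged in TF2. Surprising behavior can occur when these APIs are used together with tf.function and tf.random.set_seed. For example, the following code generates the same value on every call. This surprising behavior is expected in TF, and the explanation can be found in the documentation of tf.random.set_seed.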

In [ ]:

tf.random.set_seed(1)
 
@tf.function
def return_one_noise(_):
  return tf.random.normal([])

n1=return_one_noise(1)
n2=return_one_noise(2) 
assert n1 == n2
print(n1.numpy(), n2.numpy())

0.3052047 0.3052047

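Things are a bit different in TFF. If the noise generation is wrapped as a tff.tf_computation instead of a tf.function, non-deterministic random noise is generated: n1 and n2 differ, and running this code snippet multiple times produces a different pair (n1, n2) each time. There is no easy way to set a global random seed for TFF.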

In [ ]:

tf.random.set_seed(1)
 
@tff.tf_computation
def return_one_noise(_):
  return tf.random.normal([])

n1=return_one_noise(1)
n2=return_one_noise(2) 
assert n1 != n2
print(n1, n2)

0.11990704 1.9185987

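Moreover, deterministic noise can be generated in TFF without explicitly setting a seed. The function return_two_noise in the following code snippet returns two identical noise values. This is expected behavior because TFF builds the computation graph in advance, before execution. However, it shows that users need to pay attention when using tf.random.normal in TFF.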

Use with care: `tf.random.Generator`

We can use tf.random.Generator, as suggested in the TF tutorial.

In [ ]:

@tff.tf_computation
def tff_return_one_noise(i):
  g=tf.random.Generator.from_seed(i)
  @tf.function
  def tf_return_one_noise():
    return g.normal([])
  return tf_return_one_noise()

@tff.federated_computation
def return_two_noise():
  return (tff_return_one_noise(1), tff_return_one_noise(2))

n1, n2 = return_two_noise() 
assert n1 != n2
print(n1, n2)

0.3052047 -0.38260335

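However, you need to be careful when using it:

tf.random.Generator uses tf.Variable to maintain the state of the RNG algorithm. In TFF, it is recommended to construct the generator inside a tff.tf_computation; passing the generator and its state between tff.tf_computation functions is difficult.
The previous code snippet also relies on carefully setting the seed of each generator. Using tf.random.Generator.from_non_deterministic_state() instead may produce results that are expected but surprising (deterministic n1 == n2).

In general, TFF prefers functional operations. The next section introduces the use of the tf.random.stateless_* functions.

In TFF for federated learning, we often work with nested structures instead of scalars, and the previous code snippet extends naturally to nested structures.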

In [ ]:

@tff.tf_computation
def tff_return_one_noise(i):
  g=tf.random.Generator.from_seed(i)
  weights = [
         tf.ones([2, 2], dtype=tf.float32),
         tf.constant([2], dtype=tf.float32)
     ]
  @tf.function
  def tf_return_one_noise():
    return tf.nest.map_structure(lambda x: g.normal(tf.shape(x)), weights)
  return tf_return_one_noise()

@tff.federated_computation
def return_two_noise():
  return (tff_return_one_noise(1), tff_return_one_noise(2))

n1, n2 = return_two_noise() 
assert n1[1] != n2[1]
print('n1', n1)
print('n2', n2)

n1 [array([[0.3052047 , 0.5671378 ],
       [0.41852272, 0.2326421 ]], dtype=float32), array([1.1675092], dtype=float32)]
n2 [array([[-0.38260335, -0.4780486 ],
       [-0.5187485 , -1.8471988 ]], dtype=float32), array([-0.77835274], dtype=float32)]

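Recommended usage: `tf.random.stateless_*` with a helper

The general recommendation in TFF is to use the tf.random.stateless_* functions for random noise generation. These functions take seed (a Tensor with shape [2] or a tuple of two scalar tensors) as an explicit input argument for generating random noise. We first define a helper class that maintains the seed as pseudo state. The helper RandomSeedGenerator has functional operators in a state-in-state-out fashion. Using a counter as the pseudo state for tf.random.stateless_* is reasonable because these functions scramble the seed before using it, so that noise generated from correlated seeds is statistically uncorrelated.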

In [ ]:

def timestamp_seed():
  # tf.timestamp returns microseconds as decimal places, thus scaling by 1e6.
  return tf.cast(tf.timestamp() * 1e6, tf.int64)

class RandomSeedGenerator():

  def initialize(self, seed=None):
    if seed is None:
      return tf.stack([timestamp_seed(), 0])
    else:
      return tf.constant(seed, dtype=tf.int64, shape=(2,))

  def next(self, state):
    return state + tf.constant([0, 1], tf.int64)

  def structure_next(self, state, nest_structure):
    "Returns seed in nested structure and the next state seed."
    flat_structure = tf.nest.flatten(nest_structure)
    flat_seeds = [state + tf.constant([0, i], tf.int64) for
                  i in range(len(flat_structure))]
    nest_seeds = tf.nest.pack_sequence_as(nest_structure, flat_seeds)
    return nest_seeds, flat_seeds[-1] + tf.constant([0, 1], tf.int64)
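
The seed arithmetic in structure_next is easy to trace by hand. The sketch below mirrors it in plain Python, assuming a (base, counter) tuple in place of the shape-[2] tensor and a leaf count in place of the nested structure; the base value 42 is an arbitrary choice made for illustration. It is not part of the helper class.

```python
def structure_next(state, num_leaves):
    """Plain-Python mirror of RandomSeedGenerator.structure_next.

    `state` is a (base, counter) tuple standing in for the shape-[2] tensor.
    Returns one seed per leaf, plus the state to pass to the next call.
    """
    base, counter = state
    seeds = [(base, counter + i) for i in range(num_leaves)]
    return seeds, (base, counter + num_leaves)


state = (42, 0)  # assumed fixed base seed, chosen only for this illustration
round1_seeds, state = structure_next(state, num_leaves=2)
round2_seeds, state = structure_next(state, num_leaves=2)

print(round1_seeds)  # [(42, 0), (42, 1)]
print(round2_seeds)  # [(42, 2), (42, 3)]
print(state)         # (42, 4)
```

Each leaf gets its own seed and the returned state starts past the last seed handed out, so no seed is reused as long as the caller threads the state through every call.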

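Now let us use the helper class and tf.random.stateless_normal to generate (a nested structure of) random noise in TFF. The following code snippet looks a lot like a TFF iterative process; see simple_fedavg for an example of expressing a federated learning algorithm as a TFF iterative process. The pseudo seed state used here for random noise generation is a tf.Tensor, which can be passed easily between TFF and TF functions.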

In [ ]:

@tff.tf_computation
def tff_return_one_noise(seed_state):
  g=RandomSeedGenerator()
  weights = [
         tf.ones([2, 2], dtype=tf.float32),
         tf.constant([2], dtype=tf.float32)
     ]
  @tf.function
  def tf_return_one_noise():
    nest_seeds, updated_state = g.structure_next(seed_state, weights)
    nest_noise = tf.nest.map_structure(lambda x,s: tf.random.stateless_normal(
        shape=tf.shape(x), seed=s), weights, nest_seeds)
    return nest_noise, updated_state
  return tf_return_one_noise()

@tff.tf_computation
def tff_init_state():
  g=RandomSeedGenerator()
  return g.initialize()

@tff.federated_computation
def return_two_noise():
  seed_state = tff_init_state()
  n1, seed_state = tff_return_one_noise(seed_state)
  n2, seed_state = tff_return_one_noise(seed_state)
  return (n1, n2)

n1, n2 = return_two_noise() 
assert n1[1] != n2[1]
print('n1', n1)
print('n2', n2)

n1 [array([[ 0.86828816,  0.8535084 ],
       [ 1.0053564 , -0.42096713]], dtype=float32), array([0.18048067], dtype=float32)]
n2 [array([[-1.1973879 , -0.2974589 ],
       [ 1.8309833 ,  0.17024393]], dtype=float32), array([0.68991095], dtype=float32)]
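
The earlier remark that a counter is a reasonable pseudo state, because a stateless sampler scrambles its seed, can be illustrated with a toy model. The sketch below is an assumption-laden stand-in: toy_stateless_normal is a hypothetical helper that hashes the seed with SHA-256 and applies the Box-Muller transform. It is not the algorithm used by tf.random.stateless_normal and only shares the property that the output is a pure function of the seed.

```python
import hashlib
import math
import struct


def toy_stateless_normal(seed):
    """Toy stateless sampler: the output depends only on `seed`.

    `seed` is a (base, counter) pair of ints. The seed is scrambled with a
    hash and turned into a normal sample. This is NOT TF's algorithm.
    """
    digest = hashlib.sha256(struct.pack('<qq', *seed)).digest()
    a, b = struct.unpack('<QQ', digest[:16])
    u1 = (a + 1) / 2**64  # uniform in (0, 1], so log(u1) is defined
    u2 = b / 2**64        # uniform in [0, 1]
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


# Same seed, same noise: the function keeps no hidden state.
assert toy_stateless_normal((7, 0)) == toy_stateless_normal((7, 0))

# Consecutive counters are highly correlated as seeds, but after scrambling
# the outputs look like independent standard normal draws.
samples = [toy_stateless_normal((7, i)) for i in range(2000)]
mean = sum(samples) / len(samples)
lag1 = sum(x * y for x, y in zip(samples, samples[1:])) / (len(samples) - 1)
print(f'mean={mean:.3f}, lag-1 product mean={lag1:.3f}')
```

Both printed statistics should be close to zero, which is what one would expect if incrementing the counter behaved like drawing a fresh independent sample.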
