GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/federated/tutorials/random_noise_generation.ipynb
Kernel: Python 3

Copyright 2021 The TensorFlow Federated Authors.

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

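Random noise generation in TFF

This tutorial discusses the recommended best practices for random noise generation in TFF. Random noise generation is an important component of many privacy protection techniques in federated learning algorithms, for example differential privacy.

Before we begin

First, make sure the notebook is connected to a backend that has the relevant components compiled.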

In [ ]:

#@test {"skip": true}
!pip install --quiet --upgrade tensorflow-federated

In [1]:

import numpy as np
import tensorflow as tf
import tensorflow_federated as tff

Run the following "Hello World" example to make sure the TFF environment is set up correctly. If it does not work, refer to the Installation guide for instructions.

In [43]:

@tff.federated_computation
def hello_world():
  return 'Hello, World!'

hello_world()

Out[43]:

b'Hello, World!'

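Random noise on clients

The need for noise on clients generally falls into two cases: identical noise and i.i.d. noise.

For identical noise, the recommended pattern is to maintain a seed on the server, broadcast it to the clients, and use the tf.random.stateless functions to generate the noise.
For i.i.d. noise, use a tf.random.Generator initialized on the client with from_non_deterministic_state, in keeping with TF's recommendation to avoid the tf.random.<distribution> functions.

Client behavior differs from server behavior (it does not suffer from the pitfalls discussed later) because each client builds its own computation graph and initializes its own default seed.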
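
The two client-side cases can be sketched in plain Python before looking at the TFF versions. This is only a simulation under stated assumptions: Python's `random` module stands in for the TF samplers, and `NUM_CLIENTS`, `noise_from_seed`, and `nondeterministic_noise` are illustrative names, not TFF APIs.

```python
import random

NUM_CLIENTS = 10  # Hypothetical client count for this simulation.


def noise_from_seed(seed):
    """Stateless stand-in: the output depends only on the seed argument."""
    return random.Random(seed).gauss(0.0, 1.0)


def nondeterministic_noise():
    """Stand-in for a generator initialized from non-deterministic state."""
    return random.SystemRandom().gauss(0.0, 1.0)


# Identical noise: the server broadcasts one seed, so every client agrees.
broadcast_seed = 1
identical = [noise_from_seed(broadcast_seed) for _ in range(NUM_CLIENTS)]

# i.i.d. noise: each client draws from its own entropy source.
independent = [nondeterministic_noise() for _ in range(NUM_CLIENTS)]

print(min(identical) == max(identical))      # All clients share one value.
print(min(independent) != max(independent))  # Clients disagree.
```

Comparing the minimum and maximum across clients is the same check the TFF cells in this section use to tell the two cases apart.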

Identical noise on clients

In [5]:

# Set to use 10 clients.
tff.backends.native.set_sync_local_cpp_execution_context(default_num_clients=10)

@tff.tf_computation
def noise_from_seed(seed):
  return tf.random.stateless_normal((), seed=seed)

seed_type_at_server = tff.type_at_server(tff.to_type((tf.int64, [2])))

@tff.federated_computation(seed_type_at_server)
def get_random_min_and_max_deterministic(seed):
  # Broadcast seed to all clients.
  seed_on_clients = tff.federated_broadcast(seed)

  # Clients generate noise from the seed deterministically.
  noise_on_clients = tff.federated_map(noise_from_seed, seed_on_clients)

  # Aggregate and return the min and max of the values generated on clients.
  min = tff.aggregators.federated_min(noise_on_clients)
  max = tff.aggregators.federated_max(noise_on_clients)
  return min, max

seed = tf.constant([1, 1], dtype=tf.int64)
min, max = get_random_min_and_max_deterministic(seed)
assert min == max
print(f'Seed: {seed.numpy()}. All clients sampled value {min:8.3f}.')

seed += 1
min, max = get_random_min_and_max_deterministic(seed)
assert min == max
print(f'Seed: {seed.numpy()}. All clients sampled value {min:8.3f}.')

Out[5]:

Seed: [1 1]. All clients sampled value    1.665.
Seed: [2 2]. All clients sampled value   -0.219.

Independent noise on clients

In [ ]:

@tff.tf_computation
def nondeterministic_noise():
  gen = tf.random.Generator.from_non_deterministic_state()
  return gen.normal(())

@tff.federated_computation
def get_random_min_and_max_nondeterministic():
  noise_on_clients = tff.federated_eval(nondeterministic_noise, tff.CLIENTS)
  min = tff.aggregators.federated_min(noise_on_clients)
  max = tff.aggregators.federated_max(noise_on_clients)
  return min, max

min, max = get_random_min_and_max_nondeterministic()
assert min != max
print(f'Values differ across clients. {min:8.3f},{max:8.3f}.')

new_min, new_max = get_random_min_and_max_nondeterministic()
assert new_min != new_max
assert new_min != min and new_max != max
print(f'Values differ across rounds.  {new_min:8.3f},{new_max:8.3f}.')

Values differ across clients.   -1.490,   1.172.
Values differ across rounds.    -1.358,   1.208.

Model initializer on clients

In [ ]:

def _keras_model():
  inputs = tf.keras.Input(shape=(1,))
  outputs = tf.keras.layers.Dense(1)(inputs)
  return tf.keras.Model(inputs=inputs, outputs=outputs)

@tff.tf_computation
def tff_return_model_init():
  model = _keras_model()
  # return the initialized single weight value of the dense layer
  return tf.reshape(
      tff.learning.models.ModelWeights.from_model(model).trainable[0], [-1])[0]

@tff.federated_computation
def get_random_min_and_max_nondeterministic():
  noise_on_clients = tff.federated_eval(tff_return_model_init, tff.CLIENTS)
  min = tff.aggregators.federated_min(noise_on_clients)
  max = tff.aggregators.federated_max(noise_on_clients)
  return min, max

min, max = get_random_min_and_max_nondeterministic()
assert min != max
print(f'Values differ across clients. {min:8.3f},{max:8.3f}.')

new_min, new_max = get_random_min_and_max_nondeterministic()
assert new_min != new_max
assert new_min != min and new_max != max
print(f'Values differ across rounds.  {new_min:8.3f},{new_max:8.3f}.')

Values differ across clients.   -1.022,   1.567.
Values differ across rounds.    -1.675,   1.550.

Random noise on the server

Discouraged usage: directly using `tf.random.normal`

According to TF's random noise generation tutorial, TF1.x-style APIs such as tf.random.normal are strongly discouraged in TF2. Surprising behavior can occur when these APIs are combined with tf.function and tf.random.set_seed. For example, the following code generates the same value on every call. This behavior is expected in TF, and the tf.random.set_seed documentation explains it.

In [ ]:

tf.random.set_seed(1)
 
@tf.function
def return_one_noise(_):
  return tf.random.normal([])

n1=return_one_noise(1)
n2=return_one_noise(2) 
assert n1 == n2
print(n1.numpy(), n2.numpy())

0.3052047 0.3052047

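In TFF, things are slightly different. If the noise generation is wrapped in tff.tf_computation instead of tf.function, the generated noise is non-deterministic: the two calls return different values, and rerunning this code snippet produces a different (n1, n2) pair each time, even though tf.random.set_seed is called. There is no easy way to set a global random seed for TFF.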

In [ ]:

tf.random.set_seed(1)
 
@tff.tf_computation
def return_one_noise(_):
  return tf.random.normal([])

n1=return_one_noise(1)
n2=return_one_noise(2) 
assert n1 != n2
print(n1, n2)

0.11990704 1.9185987

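Moreover, deterministic noise can be generated in TFF without explicitly setting a seed. For example, a return_two_noise function that invokes the same tf.random.normal-based TFF computation twice can return two identical noise values. This is expected because TFF builds the computation graph in advance, before execution. It does mean, however, that users must be careful when using tf.random.normal in TFF.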

Usage with care: `tf.random.Generator`

A tf.random.Generator can be used, as suggested in the TF tutorial.

In [ ]:

@tff.tf_computation
def tff_return_one_noise(i):
  g=tf.random.Generator.from_seed(i)
  @tf.function
  def tf_return_one_noise():
    return g.normal([])
  return tf_return_one_noise()

@tff.federated_computation
def return_two_noise():
  return (tff_return_one_noise(1), tff_return_one_noise(2))

n1, n2 = return_two_noise() 
assert n1 != n2
print(n1, n2)

0.3052047 -0.38260335

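However, users may need to be careful with its usage:

tf.random.Generator uses tf.Variable to maintain the state of the RNG algorithm. In TFF, it is recommended to construct the generator inside a tff.tf_computation, and it is difficult to pass the generator and its state between tff.tf_computation functions.
The previous code snippet also relies on carefully setting the seeds in the generators. Using tf.random.Generator.from_non_deterministic_state() instead gives an expected but surprising result: deterministic n1==n2.

In general, TFF prefers functional operations, and the following section shows how to use the tf.random.stateless_* functions.

In TFF for federated learning, we often work with nested structures instead of scalars, and the previous code snippet extends naturally to nested structures.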

In [ ]:

@tff.tf_computation
def tff_return_one_noise(i):
  g=tf.random.Generator.from_seed(i)
  weights = [
         tf.ones([2, 2], dtype=tf.float32),
         tf.constant([2], dtype=tf.float32)
     ]
  @tf.function
  def tf_return_one_noise():
    return tf.nest.map_structure(lambda x: g.normal(tf.shape(x)), weights)
  return tf_return_one_noise()

@tff.federated_computation
def return_two_noise():
  return (tff_return_one_noise(1), tff_return_one_noise(2))

n1, n2 = return_two_noise() 
assert n1[1] != n2[1]
print('n1', n1)
print('n2', n2)

n1 [array([[0.3052047 , 0.5671378 ],
       [0.41852272, 0.2326421 ]], dtype=float32), array([1.1675092], dtype=float32)]
n2 [array([[-0.38260335, -0.4780486 ],
       [-0.5187485 , -1.8471988 ]], dtype=float32), array([-0.77835274], dtype=float32)]

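Recommended usage: `tf.random.stateless_*` with a helper

The general recommendation in TFF is to use the functional tf.random.stateless_* functions for random noise generation. These functions take a seed (a Tensor with shape [2] or a tuple of two scalar tensors) as an explicit input argument. We first define a helper class that maintains the seed as pseudo state. The helper RandomSeedGenerator has functional operators in a state-in-state-out fashion. Using a counter as the pseudo state for tf.random.stateless_* is reasonable because these functions scramble the seed before using it, so that noise generated from correlated seeds is statistically uncorrelated.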

In [ ]:

def timestamp_seed():
  # tf.timestamp returns microseconds as decimal places, thus scaling by 1e6.
  return tf.cast(tf.timestamp() * 1e6, tf.int64)

class RandomSeedGenerator():

  def initialize(self, seed=None):
    if seed is None:
      return tf.stack([timestamp_seed(), 0])
    else:
      # Use the `seed` argument; the class defines no `self.seed` attribute.
      return tf.constant(seed, dtype=tf.int64, shape=(2,))

  def next(self, state):
    return state + tf.constant([0, 1], tf.int64)

  def structure_next(self, state, nest_structure):
    "Returns seed in nested structure and the next state seed."
    flat_structure = tf.nest.flatten(nest_structure)
    flat_seeds = [state + tf.constant([0, i], tf.int64) for
                  i in range(len(flat_structure))]
    nest_seeds = tf.nest.pack_sequence_as(nest_structure, flat_seeds)
    return nest_seeds, flat_seeds[-1] + tf.constant([0, 1], tf.int64)
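
The counter arithmetic in the helper can be traced without TensorFlow. The sketch below mirrors `next` and `structure_next` on plain Python tuples; the `(base, counter)` pair, the value `12345`, and the leaf names are assumptions made for illustration, not part of the helper itself.

```python
def next_state(state):
    """Advance the counter half of a (base, counter) seed pair by one."""
    base, counter = state
    return (base, counter + 1)


def structure_next(state, leaves):
    """Give each leaf its own seed; return the seeds and the next state."""
    base, counter = state
    seeds = [(base, counter + i) for i in range(len(leaves))]
    # The next state starts right after the last seed handed out.
    return seeds, (base, counter + len(leaves))


state = (12345, 0)  # Hypothetical base seed with the counter at zero.
seeds, state = structure_next(state, ["kernel", "bias"])
print(seeds)  # [(12345, 0), (12345, 1)]
print(state)  # (12345, 2)

more_seeds, state = structure_next(state, ["kernel", "bias"])
print(more_seeds)  # [(12345, 2), (12345, 3)]
```

No seed is handed out twice as long as the returned state is threaded into the next call, which is what the state-in-state-out style is meant to enforce.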

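Now let us use the helper class and tf.random.stateless_normal to generate (a nested structure of) random noise in TFF. The following code snippet looks a lot like a TFF iterative process; see simple_fedavg for an example of expressing a federated learning algorithm as a TFF iterative process. The pseudo seed state for random noise generation here is a tf.Tensor, which is easy to pass between TFF and TF functions.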

In [ ]:

@tff.tf_computation
def tff_return_one_noise(seed_state):
  g=RandomSeedGenerator()
  weights = [
         tf.ones([2, 2], dtype=tf.float32),
         tf.constant([2], dtype=tf.float32)
     ]
  @tf.function
  def tf_return_one_noise():
    nest_seeds, updated_state = g.structure_next(seed_state, weights)
    nest_noise = tf.nest.map_structure(lambda x,s: tf.random.stateless_normal(
        shape=tf.shape(x), seed=s), weights, nest_seeds)
    return nest_noise, updated_state
  return tf_return_one_noise()

@tff.tf_computation
def tff_init_state():
  g=RandomSeedGenerator()
  return g.initialize()

@tff.federated_computation
def return_two_noise():
  seed_state = tff_init_state()
  n1, seed_state = tff_return_one_noise(seed_state)
  n2, seed_state = tff_return_one_noise(seed_state)
  return (n1, n2)

n1, n2 = return_two_noise() 
assert n1[1] != n2[1]
print('n1', n1)
print('n2', n2)

n1 [array([[ 0.86828816,  0.8535084 ],
       [ 1.0053564 , -0.42096713]], dtype=float32), array([0.18048067], dtype=float32)]
n2 [array([[-1.1973879 , -0.2974589 ],
       [ 1.8309833 ,  0.17024393]], dtype=float32), array([0.68991095], dtype=float32)]
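
The state-threading pattern above can also be written as a loop over rounds, which is closer to how an iterative process would use it. This is a plain-Python sketch under stated assumptions: `random.Random` seeded with a string stands in for tf.random.stateless_normal, and `stateless_normal`, `one_round`, `run`, and the seed `2021` are illustrative names and values.

```python
import random


def stateless_normal(seed):
    """Stand-in for a stateless sampler: a pure function of `seed`."""
    base, counter = seed
    # String seeds are hashed, so neighboring counters give different draws.
    return random.Random(f"{base}:{counter}").gauss(0.0, 1.0)


def one_round(seed_state):
    """State-in, state-out: return noise plus the advanced seed state."""
    base, counter = seed_state
    return stateless_normal(seed_state), (base, counter + 1)


def run(initial_state, num_rounds):
    """Thread the seed state through several rounds and collect the noise."""
    state, samples = initial_state, []
    for _ in range(num_rounds):
        noise, state = one_round(state)
        samples.append(noise)
    return samples, state


first, final_state = run((2021, 0), num_rounds=3)
second, _ = run((2021, 0), num_rounds=3)
print(first == second)       # The same initial state reproduces the run.
print(len(set(first)) == 3)  # Each round still sees different noise.
print(final_state)           # (2021, 3)
```

Starting from a fixed initial state makes a run reproducible, while starting from a timestamp, as `initialize()` does when no seed is given, makes each run different.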
