GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/probability/examples/Factorial_Mixture.ipynb

Kernel: Python 3

Copyright 2018 The TensorFlow Probability Authors.

Licensed under the Apache License, Version 2.0 (the "License");

In [ ]:

#@title Licensed under the Apache License, Version 2.0 (the "License"); { display-mode: "form" }
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Factorial Mixture

This notebook shows how to use TensorFlow Probability (TFP) to sample from a factorial mixture of Gaussians, defined as $p(x_1, ..., x_n) = \prod_i p_i(x_i)$, where $p_i(x_i) = \sum_{k=1}^K \pi_{ik}\,\text{Normal}\left(x_i;\ \text{loc}=\mu_{ik},\ \text{scale}=\sigma_{ik}\right)$ and $\sum_{k=1}^K \pi_{ik} = 1$ for all $i$.

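Each variable $x_i$ is modeled as a mixture of Gaussians, and the joint distribution over all $n$ variables is the product of these densities.

Given a dataset $x^{(1)}, ..., x^{(T)}$, we model each data point $x^{(j)}$ as a factorial mixture of Gaussians: $p(x^{(j)}) = \prod_i p_i (x_i^{(j)})$

Factorial mixtures are a simple way to build distributions with a small number of parameters and a large number of modes.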
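To make the definition concrete, here is a minimal NumPy sketch of the log-density, assuming each $p_i$ is a $K$-component Gaussian mixture whose weights sum to one. The function names and the parameter values are hypothetical and chosen only for illustration; the notebook itself builds the distribution with TFP.

```python
import numpy as np

def normal_logpdf(x, loc, scale):
    """Elementwise log-density of Normal(loc, scale)."""
    z = (x - loc) / scale
    return -0.5 * z**2 - np.log(scale) - 0.5 * np.log(2.0 * np.pi)

def factorial_mixture_logpdf(x, weights, loc, scale):
    """log p(x) = sum_i log sum_k weights[i, k] * Normal(x[i]; loc[i, k], scale[i, k]).

    x has shape [n]; weights, loc and scale have shape [n, K].
    """
    comp = np.log(weights) + normal_logpdf(x[:, None], loc, scale)  # [n, K]
    # Numerically stable log-sum-exp over the K components of each variable.
    m = comp.max(axis=1, keepdims=True)
    per_variable = m[:, 0] + np.log(np.exp(comp - m).sum(axis=1))   # [n]
    # The joint density is a product over variables, so log-densities add.
    return per_variable.sum()

# Hypothetical parameters: n = 2 variables, K = 3 components each.
n, K = 2, 3
weights = np.full((n, K), 1.0 / K)
loc = np.array([[0.1, 0.5, 0.9],
                [0.2, 0.4, 0.8]])
scale = np.full((n, K), 0.05)

x = np.array([0.5, 0.4])
logp = factorial_mixture_logpdf(x, weights, loc, scale)
```

The sum over variables in the last line of `factorial_mixture_logpdf` is the only place where the variables interact, which is what makes the model "factorial".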

In [ ]:

import tensorflow as tf
import numpy as np
import tensorflow_probability as tfp
import matplotlib.pyplot as plt
import seaborn as sns
tfd = tfp.distributions

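# Eager execution is the default in TF 2.x, where `tf.enable_eager_execution`
# no longer exists. The try/except keeps this cell re-runnable on TF 1.x too.
try:
  tf.enable_eager_execution()
except Exception:
  pass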

Build the Factorial Mixture of Gaussians using TFP

In [ ]:

num_vars = 2        # Number of variables (`n` in formula).
var_dim = 1         # Dimensionality of each variable `x[i]`.
num_components = 3  # Number of components for each mixture (`K` in formula).
sigma = 5e-2        # Fixed standard deviation of each component.

# Choose some random (component) modes.
component_mean = tfd.Uniform().sample([num_vars, num_components, var_dim])

factorial_mog = tfd.Independent(
   tfd.MixtureSameFamily(
       # Assume uniform weight on each component.
       mixture_distribution=tfd.Categorical(
           logits=tf.zeros([num_vars, num_components])),
       components_distribution=tfd.MultivariateNormalDiag(
           loc=component_mean, scale_diag=[sigma])),
   reinterpreted_batch_ndims=1)

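Note the use of tfd.Independent. This "meta-distribution" applies a reduce_sum in the log_prob calculation over the rightmost reinterpreted_batch_ndims batch dimensions. In our case, this sums out the variables dimension and leaves only the batch dimension when we compute log_prob. It does not affect sampling.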
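The reduction can be mimicked in plain NumPy. The sketch below is an illustration of the semantics only, not TFP's implementation: it uses random numbers as a stand-in for the per-variable log-densities $\log p_i(x_i)$ and sums over the rightmost `reinterpreted_batch_ndims` axes.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, num_vars = 4, 2

# Stand-in for per-variable log-densities log p_i(x_i), shape [batch, num_vars].
# (Hypothetical values; in the notebook these would come from the mixture.)
per_variable_log_prob = rng.normal(size=(batch, num_vars))

# Sum over the rightmost `reinterpreted_batch_ndims` axes.
reinterpreted_batch_ndims = 1
axes = tuple(range(-reinterpreted_batch_ndims, 0))
joint_log_prob = per_variable_log_prob.sum(axis=axes)  # shape [batch]
```

Summing log-densities over the variables axis is the same as multiplying the per-variable densities, which matches the product form of the joint distribution.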

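Plot the density

We compute the density on a grid of points and mark the locations of the modes with red stars. Each mode of the factorial mixture corresponds to a pair of modes, one from each of the underlying single-variable mixtures of Gaussians. The plot below shows 9 modes, yet we needed only 6 parameters (3 to specify the locations of the modes in $x_1$ and 3 to specify the locations of the modes in $x_2$). In contrast, a mixture of Gaussians in the 2d space $(x_1, x_2)$ would need 2 * 9 = 18 parameters to specify the 9 modes.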
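The counting argument can be checked with a few lines of Python. The mode locations below are hypothetical, and the sketch assumes well-separated components so that each mode sits approximately at a component mean.

```python
import itertools

# Hypothetical per-variable mode locations (one list per variable).
per_variable_modes = [
    [0.1, 0.5, 0.9],  # modes along x_1
    [0.2, 0.4, 0.8],  # modes along x_2
]

# Every joint mode pairs one x_1 mode with one x_2 mode.
joint_modes = list(itertools.product(*per_variable_modes))

num_vars = len(per_variable_modes)
# Factorial mixture: one location parameter per component per variable.
num_factorial_params = sum(len(m) for m in per_variable_modes)
# Full 2d mixture: one 2d mean per joint mode.
num_full_mixture_params = num_vars * len(joint_modes)
```

With $n$ variables and $K$ components each, the same reasoning gives $K^n$ joint modes from $nK$ location parameters, compared with $nK^n$ for a full mixture over the joint space.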

In [7]:

plt.figure(figsize=(6,5))

# Compute density.
nx = 250 # Number of bins per dimension.
x = np.linspace(-3 * sigma, 1 + 3 * sigma, nx).astype('float32')
vals = tf.reshape(tf.stack(np.meshgrid(x, x), axis=2), (-1, num_vars, var_dim))
probs = factorial_mog.prob(vals).numpy().reshape(nx, nx)

# Display as image.
from matplotlib.colors import ListedColormap
cmap = ListedColormap(sns.color_palette("Blues", 256))
p = plt.pcolor(x, x, probs, cmap=cmap)
ax = plt.axis('tight');

# Plot locations of means.
means_np = component_mean.numpy().squeeze()
for mu_x in means_np[0]:
  for mu_y in means_np[1]:
    plt.scatter(mu_x, mu_y, s=150, marker='*', c='r', edgecolor='none');
plt.axis(ax);

plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.title('Density of factorial mixture of Gaussians');

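Out[7]: [Figure: "Density of factorial mixture of Gaussians", axes $x_1$ and $x_2$, modes marked with red stars]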

Plot the samples and marginal density estimates

In [8]:

samples = factorial_mog.sample(1000).numpy()

g = sns.jointplot(
    x=samples[:, 0, 0],
    y=samples[:, 1, 0],
    kind="scatter",
    marginal_kws=dict(bins=50))
g.set_axis_labels("$x_1$", "$x_2$");

Out[8]: [Figure: joint scatter plot of the samples with marginal histograms, axes $x_1$ and $x_2$]
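Sampling from a factorial mixture is straightforward because the variables are independent: pick a component for each variable, then add Gaussian noise around its mean. The NumPy sketch below shows this ancestral sampling under the same assumptions as the model above (uniform component weights, a shared fixed `sigma`, scalar variables). It is an illustration of the idea, not a description of how `factorial_mog.sample` is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples, num_vars, num_components = 1000, 2, 3
sigma = 5e-2

# Random component means in [0, 1), one row per variable.
component_mean = rng.uniform(size=(num_vars, num_components))

# For every sample and every variable, choose a component uniformly at random.
k = rng.integers(num_components, size=(num_samples, num_vars))

# chosen_mean[s, i] = component_mean[i, k[s, i]]
chosen_mean = component_mean[np.arange(num_vars), k]

# Add independent Gaussian noise with standard deviation sigma.
samples = chosen_mean + sigma * rng.normal(size=(num_samples, num_vars))
```

Because each variable draws its component independently, the samples cluster around every combination of per-variable means.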
