GitHub Repository: tensorflow/docs-l10n
Path: blob/master/site/ko/guide/create_op.md

Create an op

Note: To guarantee that your C++ custom ops are ABI-compatible with TensorFlow's official pip packages, follow the guide in the custom op repository. It includes an end-to-end code example as well as Docker images for building and distributing your custom ops.

If you want to create an op that is not covered by the existing TensorFlow library, first try writing it in Python as a composition of existing Python ops or functions. If that is not possible, you can write a custom C++ op. There are several reasons to write a custom C++ op:

  • It is not easy or possible to express your operation as a composition of existing ops.

  • It is not efficient to express your operation as a composition of existing primitives.

  • You want to hand-fuse a composition of primitives that a future compiler would find difficult to fuse.

For example, suppose you want to implement something like "median pooling", similar to the "MaxPool" operator, but computing the median over each sliding window instead of the maximum. You may be able to do this with a composition of operations (for example, using ExtractImagePatches and TopK), but the result may be less performance- or memory-efficient than a native operation that can do something smarter in a single fused operation. As always, it is usually worth first trying to express what you want through operator composition, and adding a new operation only if that proves difficult or inefficient.

To incorporate your custom op, you need to:

  1. Register the new op in a C++ file. Op registration defines an interface (specification) for the op's functionality that is independent of the op's implementation. For example, op registration defines the op's name and its inputs and outputs. It also defines the shape function used for tensor shape inference.

  2. Implement the op in C++. The implementation of an op is called a kernel, and it is the concrete implementation of the specification you registered in step 1. There can be multiple kernels for different input/output types or architectures (for example, CPUs and GPUs).

  3. Create a Python wrapper (optional). This wrapper is the public API used to create the op in Python. A default wrapper is generated from the op registration; you can use it directly or add to it.

  4. Write a function that computes gradients for the op (optional).

  5. Test the op. This is usually done in Python for convenience, but you can also test the op in C++. If you define gradients, you can verify them with the Python tf.test.compute_gradient_error. See relu_op_test.py for an example that tests the forward functions of Relu-like operators and their gradients.

Prerequisites

  • Some familiarity with C++.

  • You must have installed the TensorFlow binary, or have downloaded the TensorFlow source and be able to build it.

Define the op interface

You define the interface of an op by registering it with the TensorFlow system. In the registration, you specify the name of your op, its inputs (types and names) and outputs (types and names), as well as docstrings and any attrs the op requires.

To see how this works, suppose you want to create an op that takes a tensor of int32s and outputs a copy of the tensor with all but the first element set to zero. To do this, create a file named zero_out.cc. Then add a call to the REGISTER_OP macro, which defines the interface for your op:

    #include "tensorflow/core/framework/op.h"
    #include "tensorflow/core/framework/shape_inference.h"

    using namespace tensorflow;

    REGISTER_OP("ZeroOut")
        .Input("to_zero: int32")
        .Output("zeroed: int32")
        .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
          c->set_output(0, c->input(0));
          return Status::OK();
        });

This ZeroOut op takes one tensor to_zero of 32-bit integers as input and outputs a tensor zeroed of 32-bit integers. The op also uses a shape function to ensure that the output tensor has the same shape as the input tensor. For example, if the input is a tensor of shape [10, 20], this shape function specifies that the output shape is also [10, 20].

Note: The op name must be in CamelCase, and it must be unique among all other ops registered in the binary.

op์˜ ์ปค๋„ ๊ตฌํ˜„ํ•˜๊ธฐ

์ธํ„ฐํŽ˜์ด์Šค๋ฅผ ์ •์˜ํ•œ ํ›„, ํ•˜๋‚˜ ์ด์ƒ์˜ op ๊ตฌํ˜„์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ์ด๋“ค ์ปค๋„ ์ค‘ ํ•˜๋‚˜๋ฅผ ์ž‘์„ฑํ•˜๋ ค๋ฉด, OpKernel์„ ํ™•์žฅํ•˜์—ฌ Compute ๋ฉ”์„œ๋“œ๋ฅผ ๋Œ€์ฒดํ•˜๋Š” ํด๋ž˜์Šค๋ฅผ ์ž‘์„ฑํ•ฉ๋‹ˆ๋‹ค. Compute ๋ฉ”์„œ๋“œ๋Š” ์œ ํ˜• OpKernelContext*์˜ context ์ธ์ˆ˜๋ฅผ ํ•˜๋‚˜ ์ œ๊ณตํ•˜๋ฉฐ, ์ด ์ธ์ˆ˜์—์„œ ์ž…๋ ฅ ๋ฐ ์ถœ๋ ฅ ํ…์„œ์™€ ๊ฐ™์€ ์œ ์šฉํ•œ ํ•ญ๋ชฉ์— ์•ก์„ธ์Šคํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์œ„์—์„œ ๋งŒ๋“  ํŒŒ์ผ์— ์ปค๋„์„ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค. ์ปค๋„์€ ๋‹ค์Œ๊ณผ ๊ฐ™์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

#include "tensorflow/core/framework/op_kernel.h" using namespace tensorflow; class ZeroOutOp : public OpKernel { public: explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {} void Compute(OpKernelContext* context) override { // Grab the input tensor const Tensor& input_tensor = context->input(0); auto input = input_tensor.flat<int32>(); // Create an output tensor Tensor* output_tensor = NULL; OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(), &output_tensor)); auto output_flat = output_tensor->flat<int32>(); // Set all but the first element of the output tensor to 0. const int N = input.size(); for (int i = 1; i < N; i++) { output_flat(i) = 0; } // Preserve the first input value if possible. if (N > 0) output_flat(0) = input(0); } };
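
To make the kernel's behavior concrete, here is a minimal pure-Python reference model of what Compute does to the flattened tensor. It is only a sketch for reasoning about expected outputs and is not part of TensorFlow; the function name zero_out_reference is hypothetical.

```python
def zero_out_reference(flat_values):
    """Model of ZeroOutOp::Compute on a flattened tensor (hypothetical helper).

    Returns a new list of the same length in which every element except
    the first is 0. An empty input yields an empty output.
    """
    output = [0] * len(flat_values)
    if flat_values:  # mirrors the `if (N > 0)` guard in the kernel
        output[0] = flat_values[0]
    return output


print(zero_out_reference([5, 4, 3, 2, 1]))  # [5, 0, 0, 0, 0]
```

A model like this can double as the source of expected values when you later write a test for the compiled op.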

์ปค๋„์„ ๊ตฌํ˜„ํ•œ ํ›„์—๋Š” TensorFlow ์‹œ์Šคํ…œ์— ์ปค๋„์„ ๋“ฑ๋กํ•ฉ๋‹ˆ๋‹ค. ๋“ฑ๋ก ์‹œ ์ด ์ปค๋„์ด ์‹คํ–‰๋  ๋‹ค๋ฅธ ์ œ์•ฝ ์กฐ๊ฑด์„ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, CPU์šฉ ์ปค๋„ ํ•˜๋‚˜์™€ GPU์šฉ ์ปค๋„ ํ•˜๋‚˜๊ฐ€ ์žˆ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

ZeroOut op์šฉ ์ปค๋„์„ ๊ตฌํ˜„ํ•˜๋ ค๋ฉด, zero_out.cc์— ๋‹ค์Œ์„ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);

์ค‘์š”: OpKernel ์ธ์Šคํ„ด์Šค์— ๋™์‹œ์— ์•ก์„ธ์Šคํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. Compute ๋ฉ”์„œ๋“œ๋Š” ์Šค๋ ˆ๋“œ๋กœ๋ถ€ํ„ฐ ์•ˆ์ „ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋ฎคํ…์Šค๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํด๋ž˜์Šค ๋ฉค๋ฒ„์— ๋Œ€ํ•œ ์•ก์„ธ์Šค๋ฅผ ๋ณดํ˜ธํ•˜์„ธ์š”. ๋˜๋Š” ๋” ๋‚˜์€ ๋ฐฉ๋ฒ•์œผ๋กœ, ํด๋ž˜์Šค ๋ฉค๋ฒ„๋ฅผ ํ†ตํ•ด ์ƒํƒœ๋ฅผ ๊ณต์œ ํ•˜์ง€ ๋งˆ์„ธ์š”! op ์ƒํƒœ๋ฅผ ์ถ”์ ํ•˜๊ธฐ ์œ„ํ•ด ResourceMgr๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด ์ข‹์Šต๋‹ˆ๋‹ค.
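
The thread-safety requirement can be illustrated outside TensorFlow. The sketch below is plain Python, not TensorFlow code: the class CountingKernel and its compute method are hypothetical stand-ins for an OpKernel whose Compute touches a class member, and threading.Lock plays the role of the mutex.

```python
import threading


class CountingKernel:
    """Hypothetical stand-in for a kernel that keeps state in a member."""

    def __init__(self):
        self._lock = threading.Lock()
        self.calls = 0  # shared state: every update goes through the lock

    def compute(self, values):
        # Per-call data stays in local variables, so it needs no protection.
        result = [0] * len(values)
        if values:
            result[0] = values[0]
        # Only the update of shared member state is guarded.
        with self._lock:
            self.calls += 1
        return result


kernel = CountingKernel()
threads = [
    threading.Thread(
        target=lambda: [kernel.compute([1, 2, 3]) for _ in range(1000)])
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(kernel.calls)  # 8000
```

Dropping the calls member entirely would remove the need for the lock, which is the design the note above recommends.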

Multi-threaded CPU kernels

To write a multi-threaded CPU kernel, you can use the Shard function in work_sharder.h. This function shards a computation function across the threads configured for intra-op threading (see intra_op_parallelism_threads in config.proto).
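
The idea behind sharding can be sketched in plain Python. This illustration splits an index range into contiguous [start, limit) blocks and runs a work function on each block in a thread pool. It is not the Shard API itself, and the helper name shard_range and its parameters are assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_range(total, num_workers, work):
    """Split [0, total) into contiguous blocks and run work(start, limit) on each."""
    if total <= 0:
        return
    num_shards = min(num_workers, total)
    block = -(-total // num_shards)  # ceiling division
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        futures = [
            pool.submit(work, start, min(start + block, total))
            for start in range(0, total, block)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker


data = list(range(10))
doubled = [0] * len(data)


def double_block(start, limit):
    # Each shard writes to a disjoint slice, so no locking is needed.
    for i in range(start, limit):
        doubled[i] = 2 * data[i]


shard_range(len(data), 4, double_block)
print(doubled)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Because the blocks are disjoint, the work function needs no synchronization as long as it writes only inside its own range.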

GPU kernels

A GPU kernel is implemented in two parts: the OpKernel, and the CUDA kernel together with its launch code.

Sometimes the OpKernel implementation is common to the CPU and GPU kernels, for example when inspecting inputs and allocating outputs. In that case, a suggested implementation is to:

  1. Define the OpKernel templated on the Device and on the primitive type of the tensor.

  2. Have the Compute function call a templated functor struct to do the actual computation of the output.

  3. Define the specialization of that functor for CPUDevice in the same file, and define the specialization for GPUDevice in a .cu.cc file, since it is compiled with the CUDA compiler.

Here is an example implementation.

    // kernel_example.h
    #ifndef KERNEL_EXAMPLE_H_
    #define KERNEL_EXAMPLE_H_

    #include <unsupported/Eigen/CXX11/Tensor>

    template <typename Device, typename T>
    struct ExampleFunctor {
      void operator()(const Device& d, int size, const T* in, T* out);
    };

    #if GOOGLE_CUDA
    // Partially specialize functor for GpuDevice.
    template <typename T>
    struct ExampleFunctor<Eigen::GpuDevice, T> {
      void operator()(const Eigen::GpuDevice& d, int size, const T* in, T* out);
    };
    #endif

    #endif  // KERNEL_EXAMPLE_H_

    // kernel_example.cc
    #include "kernel_example.h"

    #include "tensorflow/core/framework/op.h"
    #include "tensorflow/core/framework/shape_inference.h"
    #include "tensorflow/core/framework/op_kernel.h"

    using namespace tensorflow;

    using CPUDevice = Eigen::ThreadPoolDevice;
    using GPUDevice = Eigen::GpuDevice;

    REGISTER_OP("Example")
        .Attr("T: numbertype")
        .Input("input: T")
        .Output("input_times_two: T")
        .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
          c->set_output(0, c->input(0));
          return Status::OK();
        });

    // CPU specialization of actual computation.
    template <typename T>
    struct ExampleFunctor<CPUDevice, T> {
      void operator()(const CPUDevice& d, int size, const T* in, T* out) {
        for (int i = 0; i < size; ++i) {
          out[i] = 2 * in[i];
        }
      }
    };

    // OpKernel definition.
    // template parameter <T> is the datatype of the tensors.
    template <typename Device, typename T>
    class ExampleOp : public OpKernel {
     public:
      explicit ExampleOp(OpKernelConstruction* context) : OpKernel(context) {}

      void Compute(OpKernelContext* context) override {
        // Grab the input tensor
        const Tensor& input_tensor = context->input(0);

        // Create an output tensor
        Tensor* output_tensor = NULL;
        OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                         &output_tensor));

        // Do the computation.
        OP_REQUIRES(context, input_tensor.NumElements() <= tensorflow::kint32max,
                    errors::InvalidArgument("Too many elements in tensor"));
        ExampleFunctor<Device, T>()(
            context->eigen_device<Device>(),
            static_cast<int>(input_tensor.NumElements()),
            input_tensor.flat<T>().data(),
            output_tensor->flat<T>().data());
      }
    };

    // Register the CPU kernels.
    #define REGISTER_CPU(T)                                          \
      REGISTER_KERNEL_BUILDER(                                       \
          Name("Example").Device(DEVICE_CPU).TypeConstraint<T>("T"), \
          ExampleOp<CPUDevice, T>);
    REGISTER_CPU(float);
    REGISTER_CPU(int32);

    // Register the GPU kernels.
    #ifdef GOOGLE_CUDA
    #define REGISTER_GPU(T)                                          \
      /* Declare explicit instantiations in kernel_example.cu.cc. */ \
      extern template class ExampleFunctor<GPUDevice, T>;            \
      REGISTER_KERNEL_BUILDER(                                       \
          Name("Example").Device(DEVICE_GPU).TypeConstraint<T>("T"), \
          ExampleOp<GPUDevice, T>);
    REGISTER_GPU(float);
    REGISTER_GPU(int32);
    #endif  // GOOGLE_CUDA

    // kernel_example.cu.cc
    #ifdef GOOGLE_CUDA
    #define EIGEN_USE_GPU
    #include "kernel_example.h"
    #include "tensorflow/core/util/gpu_kernel_helper.h"

    using namespace tensorflow;

    using GPUDevice = Eigen::GpuDevice;

    // Define the CUDA kernel.
    template <typename T>
    __global__ void ExampleCudaKernel(const int size, const T* in, T* out) {
      for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < size;
           i += blockDim.x * gridDim.x) {
        out[i] = 2 * __ldg(in + i);
      }
    }

    // Define the GPU implementation that launches the CUDA kernel.
    template <typename T>
    void ExampleFunctor<GPUDevice, T>::operator()(
        const GPUDevice& d, int size, const T* in, T* out) {
      // Launch the cuda kernel.
      //
      // See core/util/gpu_kernel_helper.h for example of computing
      // block count and thread_per_block count.
      int block_count = 1024;
      int thread_per_block = 20;
      ExampleCudaKernel<T>
          <<<block_count, thread_per_block, 0, d.stream()>>>(size, in, out);
    }

    // Explicitly instantiate functors for the types of OpKernels registered.
    template struct ExampleFunctor<GPUDevice, float>;
    template struct ExampleFunctor<GPUDevice, int32>;

    #endif  // GOOGLE_CUDA
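
The loop in ExampleCudaKernel is a grid-stride loop: each thread starts at its global index and advances by the total number of threads in the grid, so any launch configuration covers all size elements. The pure-Python sketch below simulates that index arithmetic on the CPU. The function name grid_stride_indices is hypothetical, and nothing here runs on a GPU.

```python
def grid_stride_indices(block_idx, thread_idx, block_dim, grid_dim, size):
    """Indices one simulated thread visits in a grid-stride loop."""
    start = block_idx * block_dim + thread_idx
    stride = block_dim * grid_dim
    return list(range(start, size, stride))


# Simulate a small launch: 2 blocks of 3 threads over 10 elements.
grid_dim, block_dim, size = 2, 3, 10
visited = []
for block_idx in range(grid_dim):
    for thread_idx in range(block_dim):
        visited.extend(
            grid_stride_indices(block_idx, thread_idx, block_dim, grid_dim, size))

# Every element is handled exactly once, even though 6 threads < 10 elements.
print(sorted(visited) == list(range(size)))  # True
```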

op ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ๋นŒ๋“œํ•˜๊ธฐ

์‹œ์Šคํ…œ ์ปดํŒŒ์ผ๋Ÿฌ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ op ์ปดํŒŒ์ผํ•˜๊ธฐ(TensorFlow ๋ฐ”์ด๋„ˆ๋ฆฌ ์„ค์น˜)

์‹œ์Šคํ…œ์—์„œ ์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ g++ ๋˜๋Š” clang๊ณผ ๊ฐ™์€ C++ ์ปดํŒŒ์ผ๋Ÿฌ๋กœ zero_out.cc๋ฅผ ์ปดํŒŒ์ผํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด์ง„ PIP ํŒจํ‚ค์ง€๋Š” ์‹œ์Šคํ…œ์˜ ํŠน์ • ์œ„์น˜์— op๋ฅผ ์ปดํŒŒ์ผํ•˜๋Š” ๋ฐ ํ•„์š”ํ•œ ํ—ค๋” ํŒŒ์ผ๊ณผ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์„ค์น˜ํ•ฉ๋‹ˆ๋‹ค. ํ•˜์ง€๋งŒ, TensorFlow Python ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋Š” ํ—ค๋” ๋””๋ ‰ํ† ๋ฆฌ๋ฅผ ๊ฐ€์ ธ์˜ค๋Š” get_include ํ•จ์ˆ˜๋ฅผ ์ œ๊ณตํ•˜๋ฉฐ, get_lib ๋””๋ ‰ํ† ๋ฆฌ์—๋Š” ๋งํฌํ•  ๊ณต์œ  ๊ฐ์ฒด๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. Ubuntu ๋จธ์‹ ์—์„œ ์ด๋“ค ํ•จ์ˆ˜์˜ ์ถœ๋ ฅ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

$ python >>> import tensorflow as tf >>> tf.sysconfig.get_include() '/usr/local/lib/python3.6/site-packages/tensorflow/include' >>> tf.sysconfig.get_lib() '/usr/local/lib/python3.6/site-packages/tensorflow'

g++๋ฅผ ์„ค์น˜ํ–ˆ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๋ฉด, ๋‹ค์Œ์€ op๋ฅผ ๋™์  ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋กœ ์ปดํŒŒ์ผํ•˜๋Š” ๋ฐ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋ช…๋ น ์‹œํ€€์Šค์ž…๋‹ˆ๋‹ค.

TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') ) TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') ) g++ -std=c++14 -shared zero_out.cc -o zero_out.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2

macOS์—์„œ๋Š” .so ํŒŒ์ผ์„ ๋นŒ๋“œํ•  ๋•Œ ์ถ”๊ฐ€ ํ”Œ๋ž˜๊ทธ "-undefined dynamic_lookup"์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

gcc ๋ฒ„์ „ >=5์— ๋Œ€ํ•œ ์ฐธ๊ณ  ์‚ฌํ•ญ: gcc๋Š” ๋ฒ„์ „ 5๋ถ€ํ„ฐ ์ƒˆ๋กœ์šด C++ ABI๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. TensorFlow 2.8๊ณผ ์ด์ „ ๋ฒ„์ „์€ ๊ธฐ์กด ABI๋ฅผ ์‚ฌ์šฉํ•˜๋Š” gcc4๋กœ ๋นŒ๋“œ๋์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฒ„์ „์˜ TensorFlow๋ฅผ ์‚ฌ์šฉ ์ค‘์ด๊ณ  gcc>=5๋กœ ์—ฐ์‚ฐ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์ปดํŒŒ์ผํ•˜๋ ค๋Š” ๊ฒฝ์šฐ, ๋ช…๋ น์ค„์— -D_GLIBCXX_USE_CXX11_ABI=0์„ ์ถ”๊ฐ€ํ•˜์—ฌ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๊ฐ€ ๊ธฐ์กด ABI์™€ ํ˜ธํ™˜๋˜๋„๋ก ํ•ฉ๋‹ˆ๋‹ค. TensorFlow 2.9+ ํŒจํ‚ค์ง€๋Š” ๊ธฐ๋ณธ์ ์œผ๋กœ ์ตœ์‹  ABI์™€ ํ˜ธํ™˜๋ฉ๋‹ˆ๋‹ค.
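
The rule in the note above can be written as a small decision helper. This is a sketch that encodes only what the note states. The function name needs_old_abi_flag and its arguments are hypothetical, and version strings are assumed to start with "major.minor".

```python
def needs_old_abi_flag(tf_version, gcc_major):
    """Return True when -D_GLIBCXX_USE_CXX11_ABI=0 should be added.

    Encodes the note above: TensorFlow 2.8 and earlier use the old ABI,
    and gcc switched to the new ABI in version 5.
    """
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    tf_uses_old_abi = (major, minor) <= (2, 8)
    return tf_uses_old_abi and gcc_major >= 5


print(needs_old_abi_flag("2.8.0", 9))  # True
print(needs_old_abi_flag("2.9.1", 9))  # False
```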

Compile the op using bazel (TensorFlow source installation)

If you have the TensorFlow sources installed, you can use TensorFlow's build system to compile your op. Place a BUILD file with the following Bazel build rule in the tensorflow/core/user_ops directory:

    load("//tensorflow:tensorflow.bzl", "tf_custom_op_library")

    tf_custom_op_library(
        name = "zero_out.so",
        srcs = ["zero_out.cc"],
    )

Run the following command to build zero_out.so:

    $ bazel build --config opt //tensorflow/core/user_ops:zero_out.so

To compile the Example operation with the CUDA kernel, you need to use the gpu_srcs parameter of tf_custom_op_library. Place a BUILD file with the following Bazel build rule in a new folder inside the tensorflow/core/user_ops directory (for example, "example_gpu"):

    load("//tensorflow:tensorflow.bzl", "tf_custom_op_library")

    tf_custom_op_library(
        # kernel_example.cc  kernel_example.cu.cc  kernel_example.h
        name = "kernel_example.so",
        srcs = ["kernel_example.h", "kernel_example.cc"],
        gpu_srcs = ["kernel_example.cu.cc", "kernel_example.h"],
    )

Run the following command to build kernel_example.so:

    $ bazel build --config opt //tensorflow/core/user_ops/example_gpu:kernel_example.so

Note: As explained above, if you are compiling with gcc>=5, add --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" to the Bazel command-line arguments.

Note: Although you can create a shared library (a .so file) with the standard cc_library rule, using the tf_custom_op_library macro is recommended. It adds the required dependencies and checks that the shared library is compatible with TensorFlow's plugin loading mechanism.

Python์—์„œ op ์‚ฌ์šฉํ•˜๊ธฐ

TensorFlow Python API๋Š” tf.load_op_library ํ•จ์ˆ˜๋ฅผ ์ œ๊ณตํ•˜์—ฌ ๋™์  ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ๋กœ๋“œํ•˜๊ณ  TensorFlow ํ”„๋ ˆ์ž„์›Œํฌ์— op๋ฅผ ๋“ฑ๋กํ•ฉ๋‹ˆ๋‹ค. load_op_library๋Š” op ๋ฐ ์ปค๋„์— ๋Œ€ํ•œ Python ๋ž˜ํผ๊ฐ€ ํฌํ•จ๋œ Python ๋ชจ๋“ˆ์„ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ, ์ผ๋‹จ op๋ฅผ ๋นŒ๋“œํ•˜๋ฉด ๋‹ค์Œ์„ ์ˆ˜ํ–‰ํ•˜์—ฌ Python์—์„œ ์‹คํ–‰ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

import tensorflow as tf zero_out_module = tf.load_op_library('./zero_out.so') print(zero_out_module.zero_out([[1, 2], [3, 4]]).numpy()) # Prints array([[1, 0], [0, 0]], dtype=int32)

์ƒ์„ฑ๋œ ํ•จ์ˆ˜์—๋Š” snake_case ์ด๋ฆ„์ด ์ง€์ •๋ฉ๋‹ˆ๋‹ค(PEP8 ์ค€์ˆ˜). ๋”ฐ๋ผ์„œ, C++ ํŒŒ์ผ์—์„œ op์˜ ์ด๋ฆ„์ด ZeroOut์ธ ๊ฒฝ์šฐ, Python ํ•จ์ˆ˜์˜ ์ด๋ฆ„์€ zero_out์ž…๋‹ˆ๋‹ค.

To make the op available as a regular function that can be imported from a Python module, it can be useful to put the load_op_library call in a Python source file, as follows:

    import tensorflow as tf

    zero_out_module = tf.load_op_library('./zero_out.so')
    zero_out = zero_out_module.zero_out

Verify that the op works

A good way to verify that you have implemented your op successfully is to write a test for it. Create the file zero_out_op_test.py with the following contents:

    import tensorflow as tf

    class ZeroOutTest(tf.test.TestCase):
      def testZeroOut(self):
        zero_out_module = tf.load_op_library('./zero_out.so')
        result = zero_out_module.zero_out([5, 4, 3, 2, 1])
        # With TF 2.x eager execution, assertAllEqual accepts the tensor
        # directly, so no session or eval() call is needed.
        self.assertAllEqual(result, [5, 0, 0, 0, 0])

    if __name__ == "__main__":
      tf.test.main()

Then run your test (assuming you have tensorflow installed):

    $ python zero_out_op_test.py

Build advanced features into your op

Now that you know how to build a basic (and somewhat restricted) op and implementation, the following sections look at some of the more complicated things you will typically need to build into your op.

Conditional checks and validation

The example above assumed that the op applies to a tensor of any shape. What if it applies only to vectors? That means adding a check to the OpKernel implementation above:

    void Compute(OpKernelContext* context) override {
      // Grab the input tensor
      const Tensor& input_tensor = context->input(0);

      OP_REQUIRES(context, TensorShapeUtils::IsVector(input_tensor.shape()),
                  errors::InvalidArgument("ZeroOut expects a 1-D vector."));
      // ...
    }

This asserts that the input is a vector and, if it is not, sets the InvalidArgument status and returns. The OP_REQUIRES macro takes three arguments:

  • The context, which can be either an OpKernelContext or an OpKernelConstruction pointer (see tensorflow/core/framework/op_kernel.h); the macro uses its SetStatus() method.

  • The condition. For example, tensorflow/core/framework/tensor_shape.h contains functions for validating the shape of a tensor.

  • The error itself, which is represented by a Status object (see tensorflow/core/platform/status.h). A Status has both a type (frequently InvalidArgument, but see the list of types) and a message. Functions for constructing an error can be found in tensorflow/core/lib/core/errors.h.

Alternatively, if you want to test whether a Status object returned from some function is an error, and return it if so, use OP_REQUIRES_OK. Both of these macros return from the function on error.

Op registration

Attrs

Ops can have attrs, whose values are set when the op is added to a graph. These are used to configure the op, and their values can be accessed both within the kernel implementation and in the types of inputs and outputs in the op registration. Prefer an input over an attr when possible, since inputs are more flexible: attrs are constants and must be defined at graph construction time, whereas inputs are tensors whose values can be dynamic; that is, inputs can change every step, be set using a feed, and so on. Attrs are used for things that cannot be done with inputs: any configuration that affects the signature (the number or type of inputs or outputs) or that cannot change from step to step.

You define an attr when you register the op, by specifying its name and type using the Attr method, which expects a spec of the form:

    <name>: <attr-type-expr>

where <name> begins with a letter and can be composed of alphanumeric characters and underscores, and <attr-type-expr> is a type expression of the form described below.
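
As an illustration of the spec format, the sketch below splits a spec string into its two parts and checks the naming rule for <name>. It is a hypothetical helper written for this guide (parse_attr_spec is not a TensorFlow function), and it does not validate the type expression.

```python
import re

_ATTR_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def parse_attr_spec(spec):
    """Split '<name>: <attr-type-expr>' and validate <name>."""
    name, sep, type_expr = spec.partition(":")
    name, type_expr = name.strip(), type_expr.strip()
    if not sep or not type_expr:
        raise ValueError(f"expected '<name>: <attr-type-expr>', got {spec!r}")
    if not _ATTR_NAME.fullmatch(name):
        raise ValueError(f"invalid attr name {name!r}")
    return name, type_expr


print(parse_attr_spec("preserve_index: int"))  # ('preserve_index', 'int')
```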

For example, if you want the ZeroOut op to preserve a user-specified index instead of only the 0th element, you can register the op like this:

    REGISTER_OP("ZeroOut")
        .Attr("preserve_index: int")
        .Input("to_zero: int32")
        .Output("zeroed: int32");

(Note that the set of attribute types is different from the tf.DType used for inputs and outputs.)

Your kernel can then access this attr in its constructor through the context parameter:

    class ZeroOutOp : public OpKernel {
     public:
      explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
        // Get the index of the value to preserve
        OP_REQUIRES_OK(context,
                       context->GetAttr("preserve_index", &preserve_index_));
        // Check that preserve_index is non-negative
        OP_REQUIRES(context, preserve_index_ >= 0,
                    errors::InvalidArgument("Need preserve_index >= 0, got ",
                                            preserve_index_));
      }
      void Compute(OpKernelContext* context) override {
        // ...
      }
     private:
      int preserve_index_;
    };

The attr can then be used in the Compute method:

    void Compute(OpKernelContext* context) override {
      // ...

      // We're using saved attr to validate potentially dynamic input
      // So we check that preserve_index is in range
      OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                  errors::InvalidArgument("preserve_index out of range"));

      // Set all the elements of the output tensor to 0
      const int N = input.size();
      for (int i = 0; i < N; i++) {
        output_flat(i) = 0;
      }

      // Preserve the requested input value
      output_flat(preserve_index_) = input(preserve_index_);
    }
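
The two-stage validation can be mirrored in a plain-Python model: the attr is checked once at construction time, and the dynamic input is checked on every call. The class ZeroOutModel is a hypothetical stand-in, and ValueError stands in for the InvalidArgument status.

```python
class ZeroOutModel:
    """Hypothetical model of ZeroOutOp with a preserve_index attr."""

    def __init__(self, preserve_index):
        # Construction-time check: depends only on the attr.
        if preserve_index < 0:
            raise ValueError(f"Need preserve_index >= 0, got {preserve_index}")
        self.preserve_index = preserve_index

    def compute(self, values):
        # Run-time check: depends on the (possibly dynamic) input size.
        if self.preserve_index >= len(values):
            raise ValueError("preserve_index out of range")
        output = [0] * len(values)
        output[self.preserve_index] = values[self.preserve_index]
        return output


print(ZeroOutModel(2).compute([5, 4, 3, 2, 1]))  # [0, 0, 3, 0, 0]
```

Splitting the checks this way means an invalid attr is reported when the op is constructed, while an out-of-range index is reported only for the inputs that trigger it.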

Attr types

The following types are supported in an attr:

  • string: Any sequence of bytes (not required to be UTF8).

  • int: A signed integer.

  • float: A floating point number.

  • bool: True or false.

  • type: One of the (non-ref) values of DataType.

  • shape: A TensorShapeProto.

  • list(<type>): A list of <type>, where <type> is one of the above types. Note that list(list(<type>)) is invalid.

See op_def_builder.cc:FinalizeAttr for a definitive list.

Default values and constraints

Attrs can have default values, and some types of attrs can have constraints. To define an attr with constraints, you can use the following <attr-type-expr>s:

{'<string1>', '<string2>'}: The value must be a string that has either the value <string1> or <string2>. The type name, string, is implied when you use this syntax. This emulates an enum:

    REGISTER_OP("EnumExample")
        .Attr("e: {'apple', 'orange'}");

{<type1>, <type2>}: The value is of type type and must be one of <type1> or <type2>, where <type1> and <type2> are supported tf.DType values. You do not specify that the type of the attr is type; this is implied when you have a list of types in {...}. For example, in this case the attr t is a type that must be int32, float, or bool:

    REGISTER_OP("RestrictedTypeExample")
        .Attr("t: {int32, float, bool}");

There are shortcuts for common type constraints:

  • numbertype: Type type restricted to the numeric (non-string and non-bool) types.

  • realnumbertype: Like numbertype, but without the complex types.

  • quantizedtype: Like numbertype, but only the quantized number types.

The specific lists of types allowed by these constraints are defined by functions (such as NumberTypes()) in tensorflow/core/framework/types.h. In this example, the attr t must be one of the numeric types:

    REGISTER_OP("NumberType")
        .Attr("t: numbertype");

For this op:

    tf.number_type(t=tf.int32)  # Valid
    tf.number_type(t=tf.bool)   # Invalid

Lists can be combined with other lists and single types. The following op allows the attr t to be any of the numeric types or the bool type:

    REGISTER_OP("NumberOrBooleanType")
        .Attr("t: {numbertype, bool}");

For this op:

    tf.number_or_boolean_type(t=tf.int32)   # Valid
    tf.number_or_boolean_type(t=tf.bool)    # Valid
    tf.number_or_boolean_type(t=tf.string)  # Invalid

int >= <n>: The value must be an int greater than or equal to <n>, where <n> is a natural number. For example, the following op registration specifies that the attr a must have a value of at least 2:

    REGISTER_OP("MinIntExample")
        .Attr("a: int >= 2");

list(<type>) >= <n>: A list of type <type> whose length is greater than or equal to <n>. For example, the following op registration specifies that the attr a is a list of types (either int32 or float) and that there must be at least 3 of them:

    REGISTER_OP("TypeListExample")
        .Attr("a: list({int32, float}) >= 3");
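
The kinds of constraint described above can be mimicked with ordinary checks. The sketch below is plain Python with hypothetical helper names, and dtypes are represented as strings for simplicity. In TensorFlow the constraints are declared in the Attr spec string, as in the registrations above.

```python
def check_enum(value, allowed):
    """Model of {'<string1>', '<string2>'}: value must be one of the strings."""
    return value in allowed


def check_min_int(value, minimum):
    """Model of 'int >= <n>'. bool is excluded because it is a separate attr type."""
    return (isinstance(value, int) and not isinstance(value, bool)
            and value >= minimum)


def check_type_list(types, allowed, min_length):
    """Model of 'list({...}) >= <n>': allowed entries and a minimum length."""
    return len(types) >= min_length and all(t in allowed for t in types)


print(check_enum("apple", {"apple", "orange"}))  # True
print(check_min_int(1, 2))                       # False
print(check_type_list(["int32", "float", "int32"], {"int32", "float"}, 3))  # True
```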

attr์˜ ๊ธฐ๋ณธ๊ฐ’์„ ์„ค์ •ํ•˜๋ ค๋ฉด(์ƒ์„ฑ๋œ ์ฝ”๋“œ์—์„œ ์„ ํƒ ์‚ฌํ•ญ), ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋์— = <default>๋ฅผ ์ถ”๊ฐ€ํ•ฉ๋‹ˆ๋‹ค.

REGISTER_OP("AttrDefaultExample") .Attr("i: int = 0");

๋˜ํ•œ, ์ œ์•ฝ ์กฐ๊ฑด๊ณผ ๊ธฐ๋ณธ๊ฐ’์„ ๋ชจ๋‘ ์ง€์ •ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

REGISTER_OP("AttrConstraintAndDefaultExample") .Attr("i: int >= 1 = 1");

์ง€์›๋˜๋Š” ๊ธฐ๋ณธ๊ฐ’ ๊ตฌ๋ฌธ์€ ์ตœ์ข… GraphDef ์ •์˜์˜ ํ”„๋กœํ† ํƒ€์ž… ํ‘œํ˜„์— ์‚ฌ์šฉ๋˜๋Š” ๊ตฌ๋ฌธ์ž…๋‹ˆ๋‹ค.

๋‹ค์Œ์€ ๋ชจ๋“  ์œ ํ˜•์˜ ๊ธฐ๋ณธ๊ฐ’์„ ์ง€์ •ํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ์˜ˆ์ œ์ž…๋‹ˆ๋‹ค.

REGISTER_OP("AttrDefaultExampleForAllTypes") .Attr("s: string = 'foo'") .Attr("i: int = 0") .Attr("f: float = 1.0") .Attr("b: bool = true") .Attr("ty: type = DT_INT32") .Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }") .Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }") .Attr("l_empty: list(int) = []") .Attr("l_int: list(int) = [2, 3, 5, 7]");

ํŠนํžˆ, ์œ ํ˜• type์˜ ๊ฐ’์€ tf.DType์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

Polymorphism

Type polymorphism

For ops that can take different types as input or produce different output types, you can specify an attr in an input or output type in the op registration. Typically you then register an OpKernel for each supported type.

For instance, if you want the ZeroOut op to work on floats in addition to int32s, your op registration might look like this:

    REGISTER_OP("ZeroOut")
        .Attr("T: {float, int32}")
        .Input("to_zero: T")
        .Output("zeroed: T");

Your op registration now specifies that the input's type must be float or int32, and that the output has the same type, since both have type T.

Naming

Inputs, outputs, and attrs should generally be given snake_case names. The one exception is attrs that are used as the type of an input or in the type of an output. Those attrs can be inferred when the op is added to the graph, so they do not appear in the op's function. For example, this last definition of ZeroOut generates a Python function that looks like this:

    def zero_out(to_zero, name=None):
      """...
      Args:
        to_zero: A `Tensor`. Must be one of the following types:
            `float32`, `int32`.
        name: A name for the operation (optional).

      Returns:
        A `Tensor`. Has the same type as `to_zero`.
      """

If to_zero is passed an int32 tensor, then T is automatically set to int32 (actually, DT_INT32). Such inferred attrs are given capitalized or CamelCase names.

Compare this with an op that has a type attr that determines the output type:

    REGISTER_OP("StringToNumber")
        .Input("string_tensor: string")
        .Output("output: out_type")
        .Attr("out_type: {float, int32} = DT_FLOAT")
        .Doc(R"doc(
    Converts each string in the input Tensor to the specified numeric type.
    )doc");

In this case, the user has to specify the output type, as in the generated Python:

    def string_to_number(string_tensor, out_type=None, name=None):
      """Converts each string in the input Tensor to the specified numeric type.

      Args:
        string_tensor: A `Tensor` of type `string`.
        out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.
          Defaults to `tf.float32`.
        name: A name for the operation (optional).

      Returns:
        A `Tensor` of type `out_type`.
      """

Type polymorphism example

    #include "tensorflow/core/framework/op_kernel.h"

    class ZeroOutInt32Op : public OpKernel {
      // as before
    };

    class ZeroOutFloatOp : public OpKernel {
     public:
      explicit ZeroOutFloatOp(OpKernelConstruction* context)
          : OpKernel(context) {}

      void Compute(OpKernelContext* context) override {
        // Grab the input tensor
        const Tensor& input_tensor = context->input(0);
        auto input = input_tensor.flat<float>();

        // Create an output tensor
        Tensor* output = NULL;
        OP_REQUIRES_OK(context,
                       context->allocate_output(0, input_tensor.shape(), &output));
        auto output_flat = output->template flat<float>();

        // Set all the elements of the output tensor to 0
        const int N = input.size();
        for (int i = 0; i < N; i++) {
          output_flat(i) = 0;
        }

        // Preserve the first input value
        if (N > 0) output_flat(0) = input(0);
      }
    };

    // Note that TypeConstraint<int32>("T") means that attr "T" (defined
    // in the op registration above) must be "int32" to use this template
    // instantiation.
    REGISTER_KERNEL_BUILDER(
        Name("ZeroOut")
        .Device(DEVICE_CPU)
        .TypeConstraint<int32>("T"),
        ZeroOutInt32Op);
    REGISTER_KERNEL_BUILDER(
        Name("ZeroOut")
        .Device(DEVICE_CPU)
        .TypeConstraint<float>("T"),
        ZeroOutFloatOp);

To preserve backwards compatibility, you should specify a default value when adding an attr to an existing op:

    REGISTER_OP("ZeroOut")
        .Attr("T: {float, int32} = DT_INT32")
        .Input("to_zero: T")
        .Output("zeroed: T");

Suppose you want to add more types, say double:

    REGISTER_OP("ZeroOut")
        .Attr("T: {float, double, int32}")
        .Input("to_zero: T")
        .Output("zeroed: T");

Instead of writing another OpKernel with redundant code as above, you can often use a C++ template. You still have one kernel registration (REGISTER_KERNEL_BUILDER call) per overload.

    template <typename T>
    class ZeroOutOp : public OpKernel {
     public:
      explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

      void Compute(OpKernelContext* context) override {
        // Grab the input tensor
        const Tensor& input_tensor = context->input(0);
        auto input = input_tensor.flat<T>();

        // Create an output tensor
        Tensor* output = NULL;
        OP_REQUIRES_OK(context,
                       context->allocate_output(0, input_tensor.shape(), &output));
        auto output_flat = output->template flat<T>();

        // Set all the elements of the output tensor to 0
        const int N = input.size();
        for (int i = 0; i < N; i++) {
          output_flat(i) = 0;
        }

        // Preserve the first input value
        if (N > 0) output_flat(0) = input(0);
      }
    };

    // Note that TypeConstraint<int32>("T") means that attr "T" (defined
    // in the op registration above) must be "int32" to use this template
    // instantiation.
    REGISTER_KERNEL_BUILDER(
        Name("ZeroOut")
        .Device(DEVICE_CPU)
        .TypeConstraint<int32>("T"),
        ZeroOutOp<int32>);
    REGISTER_KERNEL_BUILDER(
        Name("ZeroOut")
        .Device(DEVICE_CPU)
        .TypeConstraint<float>("T"),
        ZeroOutOp<float>);
    REGISTER_KERNEL_BUILDER(
        Name("ZeroOut")
        .Device(DEVICE_CPU)
        .TypeConstraint<double>("T"),
        ZeroOutOp<double>);

If you have more than a couple of overloads, you can put the registration in a macro.

    #include "tensorflow/core/framework/op_kernel.h"

    #define REGISTER_KERNEL(type)                                       \
      REGISTER_KERNEL_BUILDER(                                          \
          Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
          ZeroOutOp<type>)

    REGISTER_KERNEL(int32);
    REGISTER_KERNEL(float);
    REGISTER_KERNEL(double);

    #undef REGISTER_KERNEL

Depending on the list of types you are registering the kernel for, you may be able to use a macro provided by tensorflow/core/framework/register_types.h:

    #include "tensorflow/core/framework/op_kernel.h"
    #include "tensorflow/core/framework/register_types.h"

    REGISTER_OP("ZeroOut")
        .Attr("T: realnumbertype")
        .Input("to_zero: T")
        .Output("zeroed: T");

    template <typename T>
    class ZeroOutOp : public OpKernel { ... };

    #define REGISTER_KERNEL(type)                                       \
      REGISTER_KERNEL_BUILDER(                                          \
          Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
          ZeroOutOp<type>)

    TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

    #undef REGISTER_KERNEL
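
Conceptually, each REGISTER_KERNEL_BUILDER call adds one entry keyed by op name, device, and type constraint, and the entry matching the inferred T is selected when the op runs. The following is a toy Python model of that lookup, written only to illustrate the idea. KERNELS, register_kernel, find_kernel, and make_zero_out are hypothetical names and do not reflect TensorFlow's internal registry.

```python
KERNELS = {}  # (op_name, device, dtype) -> kernel callable


def register_kernel(op_name, device, dtype, kernel):
    key = (op_name, device, dtype)
    if key in KERNELS:
        raise ValueError(f"duplicate kernel registration for {key}")
    KERNELS[key] = kernel


def find_kernel(op_name, device, dtype):
    try:
        return KERNELS[(op_name, device, dtype)]
    except KeyError:
        raise LookupError(
            f"no {device} kernel for {op_name} with T={dtype}") from None


def make_zero_out(zero):
    """One generic implementation, analogous to the ZeroOutOp<T> template."""
    def kernel(values):
        return [values[0]] + [zero] * (len(values) - 1) if values else []
    return kernel


# Analogous to REGISTER_KERNEL(int32); REGISTER_KERNEL(float); ...
for dtype, zero in [("int32", 0), ("float", 0.0), ("double", 0.0)]:
    register_kernel("ZeroOut", "CPU", dtype, make_zero_out(zero))

print(find_kernel("ZeroOut", "CPU", "float")([1.5, 2.5]))  # [1.5, 0.0]
```

The loop plays the role of the registration macro: one generic implementation, one registry entry per supported type.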
์ž…๋ ฅ ๋ฐ ์ถœ๋ ฅ ๋ชฉ๋ก

In addition to accepting or producing different types, ops can consume or produce a variable number of tensors.

In the next example, the attr T holds a list of types and is used as the type of both the input named in and the output named out. The input and output are lists of tensors of those types (the number and types of tensors in the output match the input, since both have type T).

REGISTER_OP("PolymorphicListExample")
    .Attr("T: list(type)")
    .Input("in: T")
    .Output("out: T");

You can also restrict which types may be specified in the list. In this next case, the input is a list of float and double tensors. The op accepts, for example, input types (float, double, float), and in that case the output types would also be (float, double, float).

REGISTER_OP("ListTypeRestrictionExample")
    .Attr("T: list({float, double})")
    .Input("in: T")
    .Output("out: T");

If you want all the tensors in a list to be of the same type, you might do something like this:

REGISTER_OP("IntListInputExample")
    .Attr("N: int")
    .Input("in: N * int32")
    .Output("out: int32");

This accepts a list of int32 tensors and uses an int attr N to specify the length of the list.

This can be made type polymorphic as well. In the next example, the input is a list of tensors (with length "N") of the same (but unspecified) type ("T"), and the output is a single tensor of matching type.

REGISTER_OP("SameListInputExample")
    .Attr("N: int")
    .Attr("T: type")
    .Input("in: N * T")
    .Output("out: T");

By default, tensor lists have a minimum length of 1. You can change that default using a ">=" constraint on the corresponding attr. In this next example, the input is a list of at least 2 int32 tensors.

REGISTER_OP("MinLengthIntListExample")
    .Attr("N: int >= 2")
    .Input("in: N * int32")
    .Output("out: int32");

The same syntax works with "list(type)" attrs.

REGISTER_OP("MinimumLengthPolymorphicListExample")
    .Attr("T: list(type) >= 3")
    .Input("in: T")
    .Output("out: T");

Inputs and outputs

To summarize the above, an op registration can have multiple inputs and outputs.

REGISTER_OP("MultipleInsAndOuts")
    .Input("y: int32")
    .Input("z: float")
    .Output("a: string")
    .Output("b: int32");

Each input or output spec has the following form.

<name>: <io-type-expr>

Here, <name> begins with a letter and can be composed of alphanumeric characters and underscores. <io-type-expr> is one of the following type expressions.

  • <type>, where <type> is a supported input type (e.g. float, int32, string). This specifies a single tensor of the given type.

    See tf.DType.

    REGISTER_OP("BuiltInTypesExample")
        .Input("integers: int32")
        .Input("complex_numbers: complex64");

  • <attr-type>, where <attr-type> is the name of an Attr with type type or list(type) (with a possible type restriction). This syntax allows for polymorphic ops.

    REGISTER_OP("PolymorphicSingleInput")
        .Attr("T: type")
        .Input("in: T");

    REGISTER_OP("RestrictedPolymorphicSingleInput")
        .Attr("T: {int32, int64}")
        .Input("in: T");

    Referencing an attr of type list(type) allows you to accept a sequence of tensors.

    REGISTER_OP("ArbitraryTensorSequenceExample")
        .Attr("T: list(type)")
        .Input("in: T")
        .Output("out: T");

    REGISTER_OP("RestrictedTensorSequenceExample")
        .Attr("T: list({int32, int64})")
        .Input("in: T")
        .Output("out: T");

    Note that the number and types of tensors in the output out are the same as in the input in, since both are of type T.

  • For a sequence of tensors with the same type: <number> * <type>, where <number> is the name of an Attr with type int. <type> can be either a tf.DType or the name of an attr with type type. As an example of the first, this op accepts a list of int32 tensors.

    REGISTER_OP("Int32SequenceExample")
        .Attr("NumTensors: int")
        .Input("in: NumTensors * int32");

    This op, in contrast, accepts a list of tensors of any type, as long as they are all the same type.

    REGISTER_OP("SameTypeSequenceExample")
        .Attr("NumTensors: int")
        .Attr("T: type")
        .Input("in: NumTensors * T");

  • For a reference to a tensor: Ref(<type>), where <type> is one of the previous types.

Any attr used in the type of an input will be inferred. By convention, those inferred attrs use capital names (like T or N). Otherwise, inputs, outputs, and attrs have names like function parameters (e.g. num_outputs). For more details, see the earlier section on naming.

For more details, see tensorflow/core/framework/op_def_builder.h.
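To illustrate what "inferred" means for a spec such as "in: N * T", the sketch below derives N and T from the tensors passed to the op and applies the kind of constraints shown earlier. It is a simplified pure-Python model, not TensorFlow's implementation: infer_n_and_t is a hypothetical helper, and plain dtype names stand in for tensors.

```python
def infer_n_and_t(input_dtypes, min_length=1, allowed=None):
    """Infer attrs N and T for an input declared as "in: N * T".

    input_dtypes: dtype names of the tensors passed to the op, in order.
    min_length:   models a constraint such as "N: int >= 2" (default 1).
    allowed:      models a restriction such as "T: {int32, int64}".
    """
    n = len(input_dtypes)
    if n < min_length:
        raise ValueError(f"expected at least {min_length} tensors, got {n}")
    distinct = set(input_dtypes)
    if len(distinct) != 1:
        raise TypeError(f"all tensors must share one type, got {sorted(distinct)}")
    (t,) = distinct
    if allowed is not None and t not in allowed:
        raise TypeError(f"type {t} not in allowed set {sorted(allowed)}")
    return {"N": n, "T": t}
```

The caller never passes N or T explicitly; both fall out of the inputs, which is why such attrs do not appear as arguments of the op.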

Backwards compatibility

Suppose you have written a nice custom op and shared it with others, so you have happy customers using your operation. However, you would now like to change the op in some way.

In general, changes to existing, checked-in specifications must be backwards-compatible: changing the specification of an op must not break prior serialized GraphDef protocol buffers constructed from older specifications. The details of GraphDef compatibility are described in the TensorFlow version compatibility guide.

There are several ways to preserve backwards compatibility.

  1. Any new attrs added to an operation must have default values defined, and with that default value the op must have the original behavior. To change an operation from non-polymorphic to polymorphic, you must give a default value to the new type attr so that the original signature is preserved by default. For example, if your operation was:

    REGISTER_OP("MyGeneralUnaryOp")
        .Input("in: float")
        .Output("out: float");

    you can make it polymorphic in a backwards-compatible way using:

    REGISTER_OP("MyGeneralUnaryOp")
        .Input("in: T")
        .Output("out: T")
        .Attr("T: numerictype = DT_FLOAT");

  2. You can safely make a constraint on an attr less restrictive. For example, you can change from {int32, int64} to {int32, int64, float} or type. Or you can change from {"apple", "orange"} to {"apple", "banana", "orange"} or string.

  3. You can change single inputs/outputs into list inputs/outputs, as long as the default for the list type matches the old signature.

  4. You can add a new list input/output if it defaults to empty.

  5. Namespace any new ops you create by prefixing the op names with something unique to your project. This keeps your op from colliding with any ops that might be included in future versions of TensorFlow.

  6. Plan ahead! Try to anticipate future uses for the op. Some signature changes can't be done in a compatible way (for example, making a list of the same type into a list of varying types).

The full list of safe and unsafe changes can be found in tensorflow/core/framework/op_compatibility_test.cc. If you cannot make your change to an operation backwards-compatible, create a new operation with a new name and the new semantics.

Also note that while these changes can maintain GraphDef compatibility, the generated Python code may change in a way that isn't compatible with old callers. The Python API can be kept compatible by careful changes in a hand-written Python wrapper, keeping the old signature except possibly for new optional arguments added at the end. Generally, incompatible changes may only be made when TensorFlow changes major versions, and they must conform to the GraphDef version semantics.
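The role of the default value in the first rule can be pictured with a small model. The sketch below is plain Python under simplifying assumptions, not TensorFlow's graph-loading logic: resolve_attrs and the dictionary layout are hypothetical, and a node is reduced to just its attr map. A node serialized before attr T existed carries no value for it, so the default is what lets it keep its original float behavior.

```python
# Attr declarations of the new, polymorphic op: name -> default (None = required).
# Mirrors .Attr("T: numerictype = DT_FLOAT"); the dict layout is hypothetical.
NEW_OP_ATTR_DEFAULTS = {"T": "DT_FLOAT"}


def resolve_attrs(node_attrs, attr_defaults):
    """Fill in defaults for attrs that a serialized node does not mention."""
    resolved = dict(node_attrs)
    for name, default in attr_defaults.items():
        if name not in resolved:
            if default is None:
                raise ValueError(f"node is missing required attr {name!r}")
            resolved[name] = default
    return resolved


old_node = {}                  # written before attr T existed
new_node = {"T": "DT_DOUBLE"}  # written against the new specification
```

In this model, resolving old_node yields T = DT_FLOAT, while a new attr declared without a default would make the same old node fail to load.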

GPU support

You can implement different OpKernels and register one for CPU and another for GPU, just as you can register kernels for different types. There are several examples of kernels with GPU support in tensorflow/core/kernels/. Notice that some kernels have a CPU version in a .cc file, a GPU version in a file ending in _gpu.cu.cc, and some code shared in common in a .h file.

For example, tf.pad has everything but the GPU kernel in tensorflow/core/kernels/pad_op.cc. The GPU kernel is in tensorflow/core/kernels/pad_op_gpu.cu.cc, and the shared code is a templated class defined in tensorflow/core/kernels/pad_op.h. The code is organized this way for two reasons: it lets you share common code between the CPU and GPU implementations, and it puts the GPU implementation into a separate file so that it can be compiled only by the GPU compiler.

Note that even when the GPU kernel version of pad is used, it still needs its "paddings" input in CPU memory. To mark that inputs or outputs are kept on the CPU, add a HostMemory() call to the kernel registration, for example:

#define REGISTER_GPU_KERNEL(T)                         \
  REGISTER_KERNEL_BUILDER(Name("Pad")                  \
                              .Device(DEVICE_GPU)      \
                              .TypeConstraint<T>("T")  \
                              .HostMemory("paddings"), \
                          PadOp<GPUDevice, T>)

Compiling the kernel for the GPU device

See cuda_op_kernel.cu.cc for an example that uses a CUDA kernel to implement an op. tf_custom_op_library accepts a gpu_srcs argument in which you can specify the list of source files containing the CUDA kernels (*.cu.cc files). For use with a binary installation of TensorFlow, the CUDA kernels have to be compiled with NVIDIA's nvcc compiler. Here is the sequence of commands you can use to compile cuda_op_kernel.cu.cc and cuda_op_kernel.cc into a single dynamically loadable library.

nvcc -std=c++14 -c -o cuda_op_kernel.cu.o cuda_op_kernel.cu.cc \
  ${TF_CFLAGS[@]} -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC

g++ -std=c++14 -shared -o cuda_op_kernel.so cuda_op_kernel.cc \
  cuda_op_kernel.cu.o ${TF_CFLAGS[@]} -fPIC -lcudart ${TF_LFLAGS[@]}

The cuda_op_kernel.so produced above can be loaded as usual in Python, using the tf.load_op_library function.

If your CUDA libraries are not installed in /usr/local/lib64, you need to specify the path explicitly in the second (g++) command above. For example, add -L /usr/local/cuda-8.0/lib64/ if your CUDA is installed in /usr/local/cuda-8.0.

Note: In some Linux settings, additional options are needed for the nvcc compile step. Add -D_MWAITXINTRIN_H_INCLUDED to the nvcc command line to avoid errors from mwaitxintrin.h.

Implement the gradient in Python

Given a graph of ops, TensorFlow uses automatic differentiation (backpropagation) to add new ops representing gradients with respect to the existing ops. To make automatic differentiation work for new ops, you must register a gradient function that computes gradients with respect to the op's inputs given gradients with respect to the op's outputs.

Mathematically, if an op computes \(y = f(x)\), the registered gradient op converts gradients \(\partial L/\partial y\) of the loss \(L\) with respect to \(y\) into gradients \(\partial L/\partial x\) with respect to \(x\) via the chain rule:

$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} \frac{\partial y}{\partial x} = \frac{\partial L}{\partial y} \frac{\partial f}{\partial x}.$$

In the case of ZeroOut, only one entry in the input affects the output, so the gradient with respect to the input is a sparse "one hot" tensor. This is expressed as follows.

from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops

@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.

  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can use
      to find the inputs and outputs of the original op.
    grad: Gradient with respect to the output of the `zero_out` op.

  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense([index], shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input
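The "one hot" claim can be checked numerically without TensorFlow. The sketch below is a pure-Python model on flat lists of floats, with hypothetical helper names (zero_out, zero_out_grad, numeric_grad). It compares the analytic gradient against a central-difference estimate for the surrogate loss L = sum(grad_y * zero_out(x)), whose gradient with respect to the output is exactly grad_y.

```python
def zero_out(x):
    """Forward pass on a flat list: keep x[0], zero everything else."""
    return [v if i == 0 else 0.0 for i, v in enumerate(x)]


def zero_out_grad(x, grad_y):
    """Backward pass: only x[0] influences the output, so only it gets gradient."""
    return [g if i == 0 else 0.0 for i, g in enumerate(grad_y)]


def numeric_grad(x, grad_y, eps=1e-6):
    """Central-difference estimate of dL/dx for L = sum(grad_y * zero_out(x))."""
    def loss(values):
        return sum(g * y for g, y in zip(grad_y, zero_out(values)))

    result = []
    for i in range(len(x)):
        plus = x[:i] + [x[i] + eps] + x[i + 1:]
        minus = x[:i] + [x[i] - eps] + x[i + 1:]
        result.append((loss(plus) - loss(minus)) / (2 * eps))
    return result
```

Comparing zero_out_grad(x, grad_y) with numeric_grad(x, grad_y) for a few inputs is a cheap sanity check to run before wiring a gradient into a real op.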

Details about registering gradient functions with tf.RegisterGradient:

  • For an op with one output, the gradient function takes a tf.Operation, op, and a tf.Tensor, grad, and builds new ops out of the tensors op.inputs[i], op.outputs[i], and grad. Information about any attrs can be found via tf.Operation.get_attr.

  • If the op has multiple outputs, the gradient function takes op and grads, where grads is a list of gradients with respect to each output. The result of the gradient function must be a list of Tensor objects representing the gradients with respect to each input.

  • If there is no well-defined gradient for some input, such as an integer input used as an index, the corresponding returned gradient should be None. For example, for an op taking a floating point tensor x and an integer index i, the gradient function would return [x_grad, None].

  • If there is no meaningful gradient for the op at all, you often do not have to register any gradient, and as long as the op's gradient is never needed, you will be fine. In some cases, an op has no well-defined gradient but can still be involved in the computation of a gradient. Here you can use ops.NotDifferentiable to automatically propagate zeros backwards.

Note that at the time the gradient function is called, only the data flow graph of ops is available, not the tensor data itself. Thus, all computation must be performed using other TensorFlow ops, to be run at graph execution time.

Add type hints when registering the custom gradient for an op type to make the code more readable, debuggable, easier to maintain, and more robust through data validation. For example, when a function takes op as a parameter, specify that the gradient function takes a tf.Operation as its parameter type.

Shape functions in C++

The TensorFlow API has a feature called "shape inference" that provides information about the shapes of tensors without having to execute the graph. Shape inference is supported by "shape functions" that are registered for each op type in the C++ REGISTER_OP declaration and perform two roles: asserting that the shapes of the inputs are compatible during graph construction, and specifying the shapes of the outputs.

Shape functions are defined as operations on the shape_inference::InferenceContext class. For example, in the shape function for ZeroOut:

    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      c->set_output(0, c->input(0));
      return Status::OK();
    });

c->set_output(0, c->input(0)); declares that the first output's shape should be set to the first input's shape. If the output is selected by its index, as in the example above, the second parameter of set_output should be a ShapeHandle object. You can create an empty ShapeHandle object with its default constructor. The ShapeHandle object for an input with index idx can be obtained with c->input(idx).

There are a number of common shape functions that apply to many ops, such as shape_inference::UnchangedShape. They can be found in common_shape_fns.h and are used as follows.

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32")
    .SetShapeFn(::tensorflow::shape_inference::UnchangedShape);

A shape function can also constrain the shape of an input. For the version of ZeroOut with a vector shape constraint, the shape function would be as follows.

    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      ::tensorflow::shape_inference::ShapeHandle input;
      TF_RETURN_IF_ERROR(c->WithRank(c->input(0), 1, &input));
      c->set_output(0, input);
      return Status::OK();
    });

The WithRank call validates that the input shape c->input(0) has exactly one dimension (or, if the input shape is unknown, the output shape will be a vector with one unknown dimension).

If your op is polymorphic with multiple inputs, you can use members of InferenceContext to determine the number of shapes to check, and Merge to validate that the shapes are all compatible (alternatively, access attributes that indicate the lengths with InferenceContext::GetAttr, which provides access to the attributes of the op).

    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      ::tensorflow::shape_inference::ShapeHandle input;
      ::tensorflow::shape_inference::ShapeHandle output;
      for (size_t i = 0; i < c->num_inputs(); ++i) {
        TF_RETURN_IF_ERROR(c->WithRank(c->input(i), 2, &input));
        TF_RETURN_IF_ERROR(c->Merge(output, input, &output));
      }
      c->set_output(0, output);
      return Status::OK();
    });

Since shape inference is an optional feature, and the shapes of tensors may vary dynamically, shape functions must be robust to incomplete shape information for any of the inputs. The Merge method in InferenceContext allows the caller to assert that two shapes are the same, even if either or both of them do not have complete information. Shape functions are defined for all of the core TensorFlow ops and provide many different usage examples.
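The idea of merging shapes that are only partially known can be modelled in a few lines. The sketch below is plain Python, not the InferenceContext API: merge_dims and merge_shapes are hypothetical helpers, a shape is a tuple, None stands for an unknown dimension, and a shape of None stands for an unknown rank.

```python
def merge_dims(a, b):
    """Merge two dimensions; None stands for an unknown size."""
    if a is None:
        return b
    if b is None or a == b:
        return a
    raise ValueError(f"dimensions {a} and {b} are incompatible")


def merge_shapes(s0, s1):
    """Merge two shapes; a shape of None means the rank itself is unknown."""
    if s0 is None:
        return s1
    if s1 is None:
        return s0
    if len(s0) != len(s1):
        raise ValueError(f"ranks {len(s0)} and {len(s1)} differ")
    return tuple(merge_dims(a, b) for a, b in zip(s0, s1))
```

Each side contributes whatever it knows, so merging (None, 3) with (2, None) gives (2, 3), while shapes that disagree on a known dimension or on rank are rejected.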

The InferenceContext class has a number of functions that can be used to define shape function manipulations. For example, you can validate that a particular dimension has a specific value using InferenceContext::Dim and InferenceContext::WithValue, and you can specify that an output dimension is the sum or product of two input dimensions using InferenceContext::Add and InferenceContext::Multiply. See the InferenceContext class for all of the shape manipulations you can specify. The following example sets the shape of the first output to (n, 3), where the first input has shape (n, ...).

    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      c->set_output(0, c->Matrix(c->Dim(c->input(0), 0), 3));
      return Status::OK();
    });

If you have a complicated shape function, consider adding a test that validates that various input shape combinations produce the expected output shape combinations. You can see examples of how to write these tests in some of the core ops tests. (The syntax of INFER_OK and INFER_ERROR is a little cryptic, but it aims to be compact in representing input and output shape specifications in tests. For now, see the surrounding comments in those tests to get a sense of the shape string specification.)

Build a pip package for your custom op

To build a pip package for your op, see the tensorflow/custom-op example. That guide shows how to build custom ops from the TensorFlow pip package instead of building TensorFlow from source.