<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>The Beautiful Future</title>
    <link>https://thebeautifulfuture.tistory.com/</link>
    <description></description>
    <language>ko</language>
    <pubDate>Fri, 29 May 2026 01:10:37 +0900</pubDate>
    <generator>TISTORY</generator>
    <ttl>100</ttl>
    <managingEditor>Small Octopus</managingEditor>
    <item>
      <title>Face2Face</title>
      <link>https://thebeautifulfuture.tistory.com/entry/Face2Face</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;Uses a multi-linear PCA model (built on the models below).&lt;br /&gt;[3] BASEL&lt;br /&gt;[1] The Digital Emily Project: photoreal facial modeling and animation.&lt;br /&gt;[9] FaceWarehouse&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;397&quot; data-origin-height=&quot;70&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/q5fEx/btr1ILO4jn7/7iUk8rChBRiLM89WVV4sO0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/q5fEx/btr1ILO4jn7/7iUk8rChBRiLM89WVV4sO0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/q5fEx/btr1ILO4jn7/7iUk8rChBRiLM89WVV4sO0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fq5fEx%2Fbtr1ILO4jn7%2F7iUk8rChBRiLM89WVV4sO0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;397&quot; height=&quot;70&quot; data-origin-width=&quot;397&quot; data-origin-height=&quot;70&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Eq. (1) describes the geometric shape.&lt;br /&gt;Eq. (2) describes the skin reflectance.&lt;br /&gt;a_id and a_alb are of size 3 x n; E_id is 3n x 80, E_exp is 3n x 76, and E_alb is 3n x 80.&lt;br /&gt;The mesh consists of 53,000 vertices and 106,000 faces.&lt;br /&gt;rigid transformation \( \Phi \), full perspective transformation \( \Pi \), illumination \( \gamma \).&lt;br /&gt;P = { alpha, beta, delta, R, t, k }&lt;br /&gt;Illumination is approximated by the first three bands of the Spherical Harmonics (SH) basis functions.&lt;br /&gt;Lambertian surfaces and smooth distant illumination are assumed, neglecting self-shadowing.&lt;br /&gt;[23] A signal-processing framework for inverse rendering, 2001.&lt;/p&gt;
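&lt;p data-ke-size=&quot;size16&quot;&gt;As a rough illustration of the SH lighting model noted above, the sketch below evaluates the nine real SH basis functions (bands 0 to 2) at a unit normal and combines them with nine coefficients for a single color channel. The function names, the ordering of the basis functions, and the per-channel treatment are assumptions made for this example, not details taken from the paper.&lt;/p&gt;
&lt;pre&gt;
```python
import math


def sh_basis(n):
    """Evaluate the 9 real SH basis functions (bands 0, 1, 2) at a unit normal n = (x, y, z)."""
    x, y, z = n
    c0 = 0.5 * math.sqrt(1.0 / math.pi)
    c1 = 0.5 * math.sqrt(3.0 / math.pi)
    c2 = 0.5 * math.sqrt(15.0 / math.pi)
    c3 = 0.25 * math.sqrt(5.0 / math.pi)
    return [
        c0,                           # band 0 (constant term)
        c1 * y, c1 * z, c1 * x,       # band 1 (linear in the normal)
        c2 * x * y, c2 * y * z,       # band 2 (quadratic in the normal)
        c3 * (3.0 * z * z - 1.0),
        c2 * x * z,
        0.5 * c2 * (x * x - y * y),
    ]


def sh_irradiance(n, gamma):
    """Weighted sum of the basis functions with 9 hypothetical coefficients gamma (one channel)."""
    return sum(g * b for g, b in zip(gamma, sh_basis(n)))


# Example: a normal tilted away from the z axis, lit by a constant term plus a little top light.
print(sh_irradiance((0.6, 0.0, 0.8), [1.0, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Under the Lambertian assumption, a shaded color would then be the albedo multiplied by this sum, evaluated once per color channel.&lt;/p&gt;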
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;442&quot; data-origin-height=&quot;89&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ufnvO/btr1IshvOLP/JMNaV75XF1u0jdo6etvLX0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ufnvO/btr1IshvOLP/JMNaV75XF1u0jdo6etvLX0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ufnvO/btr1IshvOLP/JMNaV75XF1u0jdo6etvLX0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FufnvO%2Fbtr1IshvOLP%2FJMNaV75XF1u0jdo6etvLX0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;442&quot; height=&quot;89&quot; data-origin-width=&quot;442&quot; data-origin-height=&quot;89&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;photo-consistency Ecol, facial feature alignment Elan, statistical regularizer Ereg.&lt;br /&gt;w_col = 1, w_lan = 10, w_reg = 2.5e-5&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;392&quot; data-origin-height=&quot;66&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bbOyNS/btr1UukFz8g/58aMW3QxwBZzv5oy8yFw2K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bbOyNS/btr1UukFz8g/58aMW3QxwBZzv5oy8yFw2K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bbOyNS/btr1UukFz8g/58aMW3QxwBZzv5oy8yFw2K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbbOyNS%2Fbtr1UukFz8g%2F58aMW3QxwBZzv5oy8yFw2K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;392&quot; height=&quot;66&quot; data-origin-width=&quot;392&quot; data-origin-height=&quot;66&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Photo-consistency: instead of least squares, the l2,1-norm [12] is used because it is robust to outliers.&lt;br /&gt;[12] R1-PCA: rotational invariant l1-norm principal component analysis for robust subspace factorization.&lt;br /&gt;The color distance is measured with the l2-norm, while the summation over all pixels is an l1-norm that enforces sparsity.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;425&quot; data-origin-height=&quot;59&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/lwV9I/btr1JsBR8RY/cDlMNG084nKMMxkLnSGAYK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/lwV9I/btr1JsBR8RY/cDlMNG084nKMMxkLnSGAYK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/lwV9I/btr1JsBR8RY/cDlMNG084nKMMxkLnSGAYK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FlwV9I%2Fbtr1JsBR8RY%2FcDlMNG084nKMMxkLnSGAYK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;425&quot; height=&quot;59&quot; data-origin-width=&quot;425&quot; data-origin-height=&quot;59&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Feature alignment relies on a state-of-the-art (SOTA) facial landmark tracking algorithm.&lt;br /&gt;[24] Deformable model fitting by regularized landmark mean-shift.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;425&quot; data-origin-height=&quot;59&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/daoIN2/btr1Kww1EX8/esGfj2mCo4Nz8QytSGek6K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/daoIN2/btr1Kww1EX8/esGfj2mCo4Nz8QytSGek6K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/daoIN2/btr1Kww1EX8/esGfj2mCo4Nz8QytSGek6K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdaoIN2%2Fbtr1Kww1EX8%2FesGfj2mCo4Nz8QytSGek6K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;425&quot; height=&quot;59&quot; data-origin-width=&quot;425&quot; data-origin-height=&quot;59&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The regularizer keeps the parameters statistically close to the mean.&lt;br /&gt;It prevents degenerations of the facial geometry and reflectance, and guides the optimization strategy out of local minima.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Data-parallel Optimization Strategy&lt;br /&gt;A data-parallel, GPU-based Iteratively Reweighted Least Squares (IRLS) solver is used.&lt;br /&gt;The key idea of IRLS is to transform the problem: in every iteration,&lt;br /&gt;the norm is split into two components, which turns the problem into a non-linear least-squares problem.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;326&quot; data-origin-height=&quot;61&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/etIsXX/btr1I8QNKfl/FjwKbFQzY9ziD0ST8eAsG0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/etIsXX/btr1I8QNKfl/FjwKbFQzY9ziD0ST8eAsG0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/etIsXX/btr1I8QNKfl/FjwKbFQzY9ziD0ST8eAsG0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FetIsXX%2Fbtr1I8QNKfl%2FFjwKbFQzY9ziD0ST8eAsG0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;326&quot; height=&quot;61&quot; data-origin-width=&quot;326&quot; data-origin-height=&quot;61&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;r is a general residual (presumably a term of the energy function defined above), and P_old is the result computed in the previous step.&lt;br /&gt;Gauss-Newton (GN): [29] Real-time expression transfer for facial reenactment, TOG 2015.&lt;br /&gt;A GN step is applied in every IRLS iteration by solving the system below.&lt;br /&gt;\( \textbf{J}^T\textbf{J} \delta^* = -\textbf{J}^T\textbf{F} \)&lt;br /&gt;The optimal linear parameter update delta* is obtained with PCG.&lt;br /&gt;As in [29], the Jacobian J and \( -\textbf{J}^T\textbf{F} \) are precomputed and stored.&lt;br /&gt;As proposed in [33] and [29], the multiplication of the old descent direction d with J^TJ inside the PCG solver is computed as successive matrix-vector products.&lt;/p&gt;
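&lt;p data-ke-size=&quot;size16&quot;&gt;The IRLS loop with a Gauss-Newton step solved by PCG can be sketched on a toy problem. This is a minimal plain-Python sketch and not the paper's GPU solver: it assumes linear residuals F = J p - b grouped in triples (one RGB residual per pixel), a Jacobi (diagonal) preconditioner, no all-zero column in J, and hypothetical helper names. The product of J^TJ with the descent direction is never formed explicitly; it is applied as two successive matrix-vector products.&lt;/p&gt;
&lt;pre&gt;
```python
import math


def mat_vec(J, x):
    """Return J x for a list-of-rows matrix J."""
    return [sum(a * b for a, b in zip(row, x)) for row in J]


def mat_t_vec(J, y):
    """Return J^T y without building the transpose."""
    return [sum(row[k] * yi for row, yi in zip(J, y)) for k in range(len(J[0]))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(J, rhs, diag, iters=50, tol=1e-30):
    """Jacobi-preconditioned conjugate gradient for (J^T J) x = rhs."""
    x = [0.0] * len(rhs)
    r = list(rhs)
    z = [ri / di for ri, di in zip(r, diag)]
    d = list(z)
    rz = dot(r, z)
    for _ in range(iters):
        if math.isclose(rz, 0.0, abs_tol=tol):
            break
        # Apply J, then J^T: two successive matrix-vector products instead of forming J^T J.
        Ad = mat_t_vec(J, mat_vec(J, d))
        alpha = rz / dot(d, Ad)
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ad)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        d = [zi + (rz_new / rz) * di for zi, di in zip(z, d)]
        rz = rz_new
    return x


def irls_gauss_newton(J, b, group=3, n_irls=25, eps=1e-8):
    """Minimise sum_i ||J_i p - b_i||_2, where every `group` rows form one pixel residual."""
    p = [0.0] * len(J[0])
    for _ in range(n_irls):
        F = [v - bi for v, bi in zip(mat_vec(J, p), b)]  # residuals at P_old
        sw = []
        for i in range(0, len(F), group):
            norm = math.sqrt(sum(f * f for f in F[i:i + group]))
            # IRLS weight 1 / ||r_i(P_old)||_2, clamped by eps; stored as its square root.
            sw += [math.sqrt(1.0 / max(norm, eps))] * group
        Jw = [[s * a for a in row] for s, row in zip(sw, J)]
        Fw = [s * f for s, f in zip(sw, F)]
        rhs = [-g for g in mat_t_vec(Jw, Fw)]  # right-hand side -J^T F of the reweighted problem
        diag = [sum(row[k] ** 2 for row in Jw) for k in range(len(p))]
        delta = pcg(Jw, rhs, diag)  # Gauss-Newton update
        p = [pi + di for pi, di in zip(p, delta)]
    return p


# Toy data: four pixels with value 1 and one outlier pixel with value 100, one unknown.
J = [[1.0] for _ in range(15)]
b = [1.0] * 12 + [100.0] * 3
print(irls_gauss_newton(J, b))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;On this toy data an ordinary least-squares fit returns the mean 20.8, while the reweighted iterations move toward the inlier value 1, which is the outlier robustness the l2,1-norm is chosen for.&lt;/p&gt;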
&lt;p data-ke-size=&quot;size16&quot;&gt;Supplemental Material&lt;br /&gt;preconditioned conjugate gradient (PCG) method&lt;br /&gt;A parallel prefix scan is used to gather the visible pixels of the synthesized image.&lt;br /&gt;The Jacobian of the residual vector F and the gradient of the energy, J^TF, are computed in parallel on the GPU.&lt;br /&gt;Parallelization is possible because all partial derivatives and gradient entries can be computed independently.&lt;br /&gt;The entries of the Jacobian are computed and stored in global memory.&lt;br /&gt;A two-stage reduction is used to sum up all local per-pixel gradients.&lt;br /&gt;To update the parameters by delta with the PCG method,&lt;br /&gt;the system \( \textbf{J}^T\textbf{J} \delta^* = -\textbf{J}^T\textbf{F} \) is solved using the Jacobian and the gradient.&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;</description>
      <category>Papers</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/204</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/Face2Face#entry204comment</comments>
      <pubDate>Fri, 3 Mar 2023 15:35:05 +0900</pubDate>
    </item>
    <item>
      <title>Denoising Diffusion Probabilistic Models(DDPM)</title>
      <link>https://thebeautifulfuture.tistory.com/entry/Denoising-Diffusion-Probabilistic-ModelsDDPM</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=_JQSMhqXw-4&quot;&gt;https://www.youtube.com/watch?v=_JQSMhqXw-4&lt;/a&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Korea University, Department of Industrial and Management Engineering, 김정섭&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;text to image generation&lt;br /&gt;EBMs, Flow-based models, GANs, VAEs&lt;br /&gt;DALL-E (VAE-based, OpenAI, January 2021)&lt;br /&gt;GLIDE (diffusion, OpenAI, December 2021)&lt;br /&gt;DALL-E 2 (diffusion, OpenAI, April 2022)&lt;br /&gt;Imagen (diffusion, Google Brain, May 2022)&lt;br /&gt;&lt;br /&gt;What is diffusion?&lt;br /&gt;Physics, statistical dynamics: thermodynamics&lt;br /&gt;Deep Unsupervised Learning using Nonequilibrium Thermodynamics, ICML 2015 (the origin).&lt;br /&gt;Diffusion process&lt;br /&gt;&lt;br /&gt;Markov Chain&lt;br /&gt;Markov property: the probability of the state at t+1 depends only on the state at t.&lt;br /&gt;&lt;br /&gt;Normalizing Flow&lt;br /&gt;An MLP-based, latent-variable-based probabilistic generative model that uses the change-of-variables formula to obtain z.&lt;br /&gt;$ p_x(x) = p_z(z) \vert \frac{dz}{dx} \vert $&lt;br /&gt;&lt;br /&gt;Overview of generative models&lt;br /&gt;GAN: Adversarial training&lt;br /&gt;VAE: maximize variational lower bound&lt;br /&gt;Flow-based models: Invertible transform of distributions&lt;br /&gt;Diffusion models: Gradually add Gaussian noise and then reverse&lt;br /&gt;Diffusion models resemble flow-based models in that they apply iterative transformations.&lt;br /&gt;They resemble VAEs in that training proceeds through variational inference on distributions.&lt;br /&gt;Recently, adversarial training has also been used to train diffusion models (Diffusion-GAN, 2022).&lt;br /&gt;&lt;br /&gt;Latent variable model&lt;br /&gt;simple distribution (tractable Gaussian) &amp;rarr; complex distribution (visual/audio pattern)&lt;br /&gt;Ultimately, what we want from a generative model is to convert (mapping, transformation, sampling) a simple distribution z into a distribution with a specific pattern.&lt;br /&gt;&lt;br /&gt;VAE&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;
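&lt;p data-ke-size=&quot;size16&quot;&gt;The change-of-variables formula from the normalizing flow note can be checked on the simplest invertible transform. The sketch below assumes an affine map x = scale * z + shift with z drawn from a standard normal; the function names and the chosen scale and shift are illustrative only. The density obtained from the formula should match the closed-form normal density with mean shift and standard deviation scale.&lt;/p&gt;
&lt;pre&gt;
```python
import math


def standard_normal_pdf(z):
    """Density p_z(z) of the simple base distribution N(0, 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def flow_density(x, scale=2.0, shift=1.0):
    """Density of x = scale * z + shift via p_x(x) = p_z(z) |dz/dx|."""
    z = (x - shift) / scale   # invert the transform to recover z
    dz_dx = 1.0 / scale       # derivative of the inverse transform
    return standard_normal_pdf(z) * abs(dz_dx)


def normal_pdf(x, mean, std):
    """Closed-form normal density, used only as a reference."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


for x in (-3.0, 0.0, 1.0, 2.5):
    print(x, flow_density(x), normal_pdf(x, 1.0, 2.0))
```
&lt;/pre&gt;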
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>DNN/diffusion</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/203</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/Denoising-Diffusion-Probabilistic-ModelsDDPM#entry203comment</comments>
      <pubDate>Tue, 28 Feb 2023 23:55:25 +0900</pubDate>
    </item>
    <item>
      <title>taobao Text/Speech-Driven Full-Body Animation</title>
      <link>https://thebeautifulfuture.tistory.com/entry/taobao-textspeech-driven</link>
      <description>&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;904&quot; data-origin-height=&quot;594&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/AX5HX/btr2xTer0e3/EZO7xlQqX527ZWS0kT65P1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/AX5HX/btr2xTer0e3/EZO7xlQqX527ZWS0kT65P1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/AX5HX/btr2xTer0e3/EZO7xlQqX527ZWS0kT65P1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FAX5HX%2Fbtr2xTer0e3%2FEZO7xlQqX527ZWS0kT65P1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;904&quot; height=&quot;594&quot; data-origin-width=&quot;904&quot; data-origin-height=&quot;594&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;divided into two major parts including lip movements in the lower face and diverse expressions in the upper face.&lt;br /&gt;multi-pathway framework to generate movements of two facial parts respectively.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;use a facial motion capture device to collect 3 hours of human talking data with diverse expressions.&lt;br /&gt;record both video data and 3D face parameter sequences under the ARKit definition with 52 blendshapes.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Lip movement generation&lt;/b&gt;&lt;br /&gt;&lt;b&gt;cross-modal transformer encoder&lt;/b&gt; to utilize both speech and textual information.&lt;br /&gt;for the modeling of textual information, we extract phoneme alignment annotation according to&lt;br /&gt;the speech and textual scripts by a time alignment analyzer such as the &lt;b&gt;Montreal Forced Aligner toolkit&lt;/b&gt;.&lt;br /&gt;Ph = {pht}, t=1,...,T.&lt;br /&gt;concatenates &lt;b&gt;MFCCs&lt;/b&gt; and &lt;b&gt;MFB&lt;/b&gt; features, denoted as Au = {aut}, t=1,...,T.&lt;br /&gt;the transformer encoder takes a sequence of concatenated phoneme embeddings and audio features as input&lt;br /&gt;with a window size of 25 frames (1 second at 25 fps).&lt;br /&gt;the transformer encoder can effectively model the temporal context information&lt;br /&gt;with a multi-head self-attention mechanism across different modalities.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Objective for Lip consists of two terms, including a shape term and motion term.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;스크린샷 2023-02-20 오전 11.52.37.png&quot; data-origin-width=&quot;758&quot; data-origin-height=&quot;118&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b6msQX/btrZQcHa1rZ/vpuHWcVFpjCtKgZxT5qqw1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b6msQX/btrZQcHa1rZ/vpuHWcVFpjCtKgZxT5qqw1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b6msQX/btrZQcHa1rZ/vpuHWcVFpjCtKgZxT5qqw1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb6msQX%2FbtrZQcHa1rZ%2FvpuHWcVFpjCtKgZxT5qqw1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;533&quot; height=&quot;83&quot; data-filename=&quot;스크린샷 2023-02-20 오전 11.52.37.png&quot; data-origin-width=&quot;758&quot; data-origin-height=&quot;118&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;bt denotes the 3D facial parameters.&lt;br /&gt;articulation correction by the given phoneme label: the mouth should be closed during the pronunciation of b/p/m.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Expression generation&lt;/b&gt;&lt;br /&gt;the facial expression in &lt;b&gt;the upper face&lt;/b&gt; mainly lies in the movements of the eyes and eyebrows,&lt;br /&gt;which are related to &lt;u&gt;speech rhythm&lt;/u&gt; and &lt;u&gt;intention of the speaker&lt;/u&gt; with &lt;b&gt;longer-time dependencies&lt;/b&gt;.&lt;br /&gt;- rhythmic expression movement by a learning-based framework.&lt;br /&gt;an &lt;b&gt;audio encoder&lt;/b&gt; to model the &lt;b&gt;current speech signals&lt;/b&gt; as well as&lt;br /&gt;a &lt;b&gt;motion encoder&lt;/b&gt; to model the &lt;b&gt;history expressions&lt;/b&gt;.&lt;br /&gt;a &lt;b&gt;transformer decoder&lt;/b&gt; is adopted to predict the final expression movements according to &lt;b&gt;audio and history motion info&lt;/b&gt;.&lt;br /&gt;since synthesizing expression is a one-to-many mapping, an SSIM loss is used to explore the structural similarity between the predicted expression and the ground truth.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;340&quot; data-origin-height=&quot;68&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/xCFZL/btrZ4cmzh8Z/Kxk28su9LmuC4u1zFKW57k/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/xCFZL/btrZ4cmzh8Z/Kxk28su9LmuC4u1zFKW57k/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/xCFZL/btrZ4cmzh8Z/Kxk28su9LmuC4u1zFKW57k/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FxCFZL%2FbtrZ4cmzh8Z%2FKxk28su9LmuC4u1zFKW57k%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;405&quot; height=&quot;81&quot; data-origin-width=&quot;340&quot; data-origin-height=&quot;68&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;mu and sigma are the mean and standard deviation of the generated 3D facial parameter sequence.&lt;br /&gt;cov is the covariance.&lt;br /&gt;- intention-driven facial expression based on the semantic tags&lt;br /&gt;semantic tags are extracted from textual scripts via sentiment analysis.&lt;br /&gt;semantic tags include happiness, sadness, emphasis, fear, etc.&lt;br /&gt;Actors are asked to perform more than 50 intention-based expressions according to the semantic tags.&lt;br /&gt;fusing the generated rhythmic expression with the proper intention-driven expression triggered by the semantic tags.&lt;br /&gt;integrate the lip movements to form the final expressive and diverse facial animation.&lt;/p&gt;
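&lt;p data-ke-size=&quot;size16&quot;&gt;The SSIM quantity built from the mean, standard deviation, and covariance noted above can be sketched for two parameter sequences. This is an assumption-laden sketch: it uses the standard SSIM definition with global statistics, the common stabilizing constants c1 and c2 for values in the range 0 to 1, and a loss of the form 1 - SSIM. The paper's exact windowing, constants, and loss form are not given in these notes, and the function names are hypothetical.&lt;/p&gt;
&lt;pre&gt;
```python
def ssim_1d(x, y, c1=1e-4, c2=9e-4):
    """SSIM between two equally long sequences, computed from global statistics."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n   # sigma_x squared
    var_y = sum((b - mu_y) ** 2 for b in y) / n   # sigma_y squared
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    mean_term = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    structure_term = (2 * cov + c2) / (var_x + var_y + c2)
    return mean_term * structure_term


def ssim_loss(pred, target):
    """Assumed loss form: 0 when the sequences are identical, larger as they differ."""
    return 1.0 - ssim_1d(pred, target)


predicted = [0.2, 0.1, 0.5, 0.4]      # hypothetical blendshape weights over 4 frames
ground_truth = [0.1, 0.4, 0.3, 0.8]
print(ssim_loss(predicted, ground_truth))
```
&lt;/pre&gt;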
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Body Animation&lt;/b&gt;&lt;br /&gt;a graph based on an existing motion database.&lt;br /&gt;motion segments are retrieved according to the features of the given text/speech.&lt;br /&gt;&lt;b&gt;Motion graph construction&lt;/b&gt;&lt;br /&gt;semantic motions: 24 kinds of actions (such as numbers, orientations, and special semantics)&lt;br /&gt;non-semantic motions: declarative actions (upper body movements of standing, body center shifting movements, foot stepping movements)&lt;br /&gt;node denotes &lt;u&gt;a motion segment&lt;/u&gt;, edge denotes &lt;u&gt;the cost of transition&lt;/u&gt; between two nodes.&lt;br /&gt;&lt;b&gt;obtain graph nodes&lt;/b&gt;&lt;br /&gt;dividing each long sequence in the database to obtain many small motion segments&lt;br /&gt;dividing points: &lt;u&gt;local minima of the motion strength&lt;/u&gt;.&lt;br /&gt;&lt;b&gt;build graph edges&lt;/b&gt;&lt;br /&gt;connection relationship between motion segments&lt;br /&gt;&lt;u&gt;transition cost&lt;/u&gt; based on the distances between &lt;u&gt;salient joint positions&lt;/u&gt; and &lt;u&gt;movement speeds&lt;/u&gt;.&lt;br /&gt;a graph edge can be created if the transition cost between adjacent nodes is below a threshold sigma.&lt;br /&gt;semantic motions in the motion graph need to be obtained manually.&lt;br /&gt;&lt;b&gt;graph-based retrieval and optimization&lt;/b&gt;&lt;br /&gt;rules: special semantic text and phonetic rhythm&lt;br /&gt;given a section of text/speech P, analyze the input and divide it into many &lt;u&gt;phrases&lt;/u&gt; (Pi, i=1, ..., n)&lt;br /&gt;according to &lt;u&gt;text structure&lt;/u&gt;, and find the &lt;u&gt;special semantic text&lt;/u&gt; in the section.&lt;br /&gt;meaningful motion segments are chosen by the &lt;u&gt;semantic text&lt;/u&gt; and the &lt;u&gt;similarity of rhythm&lt;/u&gt; between motion segment and speech phrase.&lt;br /&gt;motion rhythm is obtained from the &lt;u&gt;motion strength&lt;/u&gt;, and &lt;u&gt;phonetic rhythm&lt;/u&gt; is obtained with librosa.&lt;br /&gt;motion strength: &lt;u&gt;Dancing to music. Advances in Neural Information Processing Systems, 32, 2019.&lt;/u&gt;&lt;br /&gt;phonetic rhythm: librosa (Audio and music signal analysis in python. In Proceedings of the 14th python in science conference, volume 8, pages 18&amp;ndash;25. Citeseer, 2015.)&lt;br /&gt;&lt;br /&gt;the goal is to assign a motion node to each text/speech phrase so that the cost is minimized.&lt;/p&gt;
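&lt;p data-ke-size=&quot;size16&quot;&gt;One way to picture the assignment is a dynamic program over the phrases. The sketch below is an assumption, not the paper's solver: it supposes the total cost is the sum of a per-phrase cost for the chosen node plus a transition cost between the nodes of consecutive phrases, and all names and numbers are hypothetical. A missing graph edge could be modeled by setting its transition cost to float(&quot;inf&quot;).&lt;/p&gt;
&lt;pre&gt;
```python
def assign_motion_nodes(phrase_cost, transition_cost):
    """Pick one motion node per phrase with minimal summed cost.

    phrase_cost[i][a]: cost of using node a for phrase i.
    transition_cost[a][b]: cost of moving from node a to node b.
    Returns (total cost, list of chosen node indices).
    """
    n_nodes = len(phrase_cost[0])
    best = list(phrase_cost[0])   # best[a]: cheapest cost of a path ending in node a
    back = []                     # back-pointers, one list per later phrase
    for costs in phrase_cost[1:]:
        prev = [min(range(n_nodes), key=lambda a: best[a] + transition_cost[a][b])
                for b in range(n_nodes)]
        best = [best[prev[b]] + transition_cost[prev[b]][b] + costs[b]
                for b in range(n_nodes)]
        back.append(prev)
    node = min(range(n_nodes), key=lambda a: best[a])
    total = best[node]
    path = [node]
    for prev in reversed(back):   # walk the back-pointers to recover the assignment
        node = prev[node]
        path.append(node)
    return total, path[::-1]


# Three phrases, two candidate nodes; switching nodes costs 1.
phrase_cost = [[1, 5], [5, 1], [1, 5]]
transition_cost = [[0, 1], [1, 0]]
print(assign_motion_nodes(phrase_cost, transition_cost))
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;With cheap transitions the cheapest node is picked for every phrase; raising the switching cost makes the program prefer staying on one node.&lt;/p&gt;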
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;397&quot; data-origin-height=&quot;119&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b4Rsvi/btrZ2zCNlci/bknDb0BKjxiX0VwRMiMEkk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b4Rsvi/btrZ2zCNlci/bknDb0BKjxiX0VwRMiMEkk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b4Rsvi/btrZ2zCNlci/bknDb0BKjxiX0VwRMiMEkk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb4Rsvi%2FbtrZ2zCNlci%2FbknDb0BKjxiX0VwRMiMEkk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;397&quot; height=&quot;119&quot; data-origin-width=&quot;397&quot; data-origin-height=&quot;119&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Ct(i, i + 1) is the transition cost between adjacent nodes.&lt;br /&gt;Cp(i) accounts for the loss of Cs and Cr.&lt;br /&gt;Cs(i): special semantic text.&lt;br /&gt;Cr(i): phonetic rhythm.&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;</description>
      <category>Papers</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/202</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/taobao-textspeech-driven#entry202comment</comments>
      <pubDate>Mon, 20 Feb 2023 11:58:39 +0900</pubDate>
    </item>
    <item>
      <title>Probability Basics</title>
      <link>https://thebeautifulfuture.tistory.com/entry/%ED%99%95%EB%A5%A0-%EA%B8%B0%EC%B4%88</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Reference: Pattern Recognition (패턴인식), 오일석&lt;/b&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;terms&lt;/b&gt;&lt;br /&gt;probability, random variable, probability density function, conditional probability, joint probability&lt;br /&gt;marginal probability, prior probability, likelihood, posterior probability, bayes rule, confidence.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Probability depends on how the set of events is defined.&lt;/b&gt;&lt;br /&gt;Each event has a probability. Every probability is non-negative, and the probabilities of all cases sum to 1.&lt;br /&gt;1. Coin toss -&amp;gt; heads, tails -&amp;gt; 0.5, 0.5&lt;br /&gt;2. Die roll -&amp;gt; 1, 2, 3, 4, 5, 6 -&amp;gt; 1/6, ..., 1/6&lt;br /&gt;3. Weather -&amp;gt; sunny, rain, snow&lt;br /&gt;4. Lotto -&amp;gt; 6 numbers out of 1~45, no repeats, order ignored&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;A random variable represents the events with a single variable.&lt;/b&gt;&lt;br /&gt;Coin toss: \( X \in \{\text{heads}, \text{tails}\} \)&lt;br /&gt;$ P(X=\text{heads}) = 0.5, P(X=\text{tails}) = 0.5 $&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;The case of two random variables&lt;/b&gt;&lt;br /&gt;The figure below shows one pocket and two baskets.&lt;br /&gt;The experiment consists of an event that selects a basket from the pocket, followed by an event that selects a ball from the chosen basket.&lt;br /&gt;In other words, this system has an order.&lt;br /&gt;There are two random variables, X and Y, as follows.&lt;br /&gt;$ X \in \{A,B\}, Y \in \{\text{white}, \text{blue}\} $&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignLeft&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;759&quot; data-origin-height=&quot;351&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cnhB3P/btrXyp82oD9/upckFs4YwcoQcLoBjPhSuK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cnhB3P/btrXyp82oD9/upckFs4YwcoQcLoBjPhSuK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cnhB3P/btrXyp82oD9/upckFs4YwcoQcLoBjPhSuK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcnhB3P%2FbtrXyp82oD9%2FupckFs4YwcoQcLoBjPhSuK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;467&quot; height=&quot;351&quot; data-origin-width=&quot;759&quot; data-origin-height=&quot;351&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
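&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;prior probability&lt;/b&gt;&lt;br /&gt;Of the two sequential events, selecting a basket from the pocket happens before the second event,&lt;br /&gt;so its probability is the prior probability.&lt;br /&gt;Probability that A is drawn from the pocket: \( P(X=A) = P(A) = 7 \div 10\)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;conditional probability&lt;/b&gt;&lt;br /&gt;The probability that a white ball is drawn from basket A, given that A has already been selected from the pocket, is a conditional probability.&lt;br /&gt;The notation only makes the condition explicit; the value itself is simply the proportion of white balls in basket A.&lt;br /&gt;$ P(\text{white}|A) = 2 \div 10 $&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;joint probability&lt;/b&gt;&lt;br /&gt;The probability that A is drawn from the pocket and a white ball is then drawn from basket A.&lt;br /&gt;We need the probability that the two events occur in sequence, so the probabilities are multiplied.&lt;br /&gt;$ P(X=A, Y=\text{white}) = P(A, \text{white}) = P(A)P(\text{white}|A) = 7 \div 10 \times 2 \div 10 = 7\div50 $&lt;br /&gt;For the joint probability, \( P(X, Y) = P(Y, X) \) holds. (product rule)&lt;br /&gt;$ P(X=A, Y=\text{white}) = P(A)P(\text{white}|A) = P(Y=\text{white}, X=A) = P(\text{white})P(A|\text{white})$&lt;/p&gt;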
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;marginal probability&lt;/b&gt;&lt;br /&gt;The probability that a white ball is drawn in the end.&lt;br /&gt;A white ball can be drawn in the end through the two joint events below.&lt;br /&gt;1. A is drawn from the pocket and a white ball is drawn from basket A&lt;br /&gt;2. B is drawn from the pocket and a white ball is drawn from basket B&lt;br /&gt;These two cases do not happen in sequence; they are alternative ways of ending up with a white ball.&lt;br /&gt;So the two joint probabilities are added.&lt;br /&gt;$ P(Y=\text{white}) = P(A, \text{white}) + P(B, \text{white}) $&lt;br /&gt;$ P(B, \text{white}) = P(B)P(\text{white}|B) = 3 \div 10 \times 9 \div 15 = 9\div50 $&lt;br /&gt;$ P(Y=\text{white}) = 7\div50 + 9\div50 = 8\div25$&lt;br /&gt;The probability that a blue ball is drawn in the end.&lt;br /&gt;$ P(Y=\text{blue}) = P(A, \text{blue}) + P(B, \text{blue}) = 1 - P(Y=\text{white}) = 17\div25 $&lt;br /&gt;$ P(A)P(\text{blue}|A) + P(B)P(\text{blue}|B) = 7 \div 10 \times 8 \div 10 + 3 \div 10 \times 6 \div 15 = 17\div25 $&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;independent&lt;/b&gt;&lt;br /&gt;The case where the random variables cannot influence each other.&lt;br /&gt;\( P(X,Y) = P(X)P(Y) \) must hold.&lt;br /&gt;In the example above, to state the conclusion first, the basket choice (\(X\)) affects the ball color (\(Y\)), so they are not independent.&lt;br /&gt;\( P(X,Y) \) is the joint probability.&lt;br /&gt;\( P(X) \) is the probability of the basket choice, and \( P(Y) \) is computed as a marginal probability.&lt;br /&gt;$ P(X=A, Y=\text{white}) = 7\div50 $&lt;br /&gt;$ P(X=A)P(Y=\text{white}) = 7 \div 10 \times 8 \div 25 = 28 \div 125$&lt;br /&gt;\( P(X=A, Y=\text{white}) \neq P(X=A)P(Y=\text{white}) \), so they are not independent.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;likelihood&lt;/b&gt;&lt;br /&gt;Likelihood takes the viewpoint of asking about the basket rather than about the ball color.&lt;br /&gt;That is, a white ball came out, so which basket did it come from?&lt;br /&gt;Ignoring the pocket, we can pick the basket under which drawing a white ball is more probable.&lt;br /&gt;The likelihood has the same value as the conditional probability, but over the baskets it does not sum to 1, so it is called a likelihood function.&lt;br /&gt;$ P(\text{white}|A) = 2 \div 10, P(\text{white}|B) = 9 \div 15, P(\text{white}|A) +P(\text{white}|B) = 12 \div 15 $&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;posterior probability&lt;/b&gt;&lt;br /&gt;&quot;A white ball came out, so which basket did it come from?&quot;&lt;br /&gt;Considering only the pocket, \( P(A) = 7 \div 10 &amp;gt; P(B) = 3 \div 10 \), so one might guess basket A.&lt;br /&gt;However, both the pocket and the baskets must be taken into account.&lt;br /&gt;Changing the viewpoint, the condition &quot;white ball&quot; is given and the task is to find the probability of each basket.&lt;br /&gt;Because the probability is computed after the fact, it is called the posterior probability.&lt;br /&gt;$ P(A|\text{white}), P(B|\text{white}), P(X|Y)$&lt;br /&gt;We select the basket with the larger of the two values. How do we compute them?&lt;br /&gt;Bayes rule follows from the product rule of the joint probability and solves the problem.&lt;br /&gt;$ P(X, Y) = P(Y, X) $&lt;br /&gt;$ P(X)P(Y|X) = P(Y)P(X|Y) $&lt;br /&gt;$ P(X)P(Y|X) \div P(Y) = P(X|Y) $&lt;br /&gt;Here \( P(Y) \) is the marginal probability computed above (the probability of drawing a white ball).&lt;br /&gt;\( P(X) \) is the prior probability (the probability of selecting each basket from the pocket).&lt;br /&gt;Finally, \( P(Y|X) \) is the likelihood (the probability of the ball color in each basket).&lt;br /&gt;$ P(X|Y) = \frac{ likelihood \times prior \ probability}{marginal \ probability} $&lt;br /&gt;$ P(A|\text{white}) = \frac{ P(\text{white}|A)P(A) }{P(\text{white})} = \frac{( 2 \div 10 )( 7 \div 10 )}{ 8 \div 25} = 0.4375 $&lt;br /&gt;$ P(B|\text{white}) = \frac{ P(\text{white}|B)P(B) }{P(\text{white})} = \frac{( 9 \div 15 )( 3 \div 10 )}{ 8 \div 25} = 0.5625 $&lt;br /&gt;We can say that the white ball came from basket B with a confidence of 0.5625.&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;
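&lt;p data-ke-size=&quot;size16&quot;&gt;The whole chain from prior to posterior can be re-checked with exact fractions. The sketch below only reuses the numbers from the worked example (7/10 and 3/10 for the baskets, 2/10 and 9/15 for a white ball); the variable names are chosen for this illustration.&lt;/p&gt;
&lt;pre&gt;
```python
from fractions import Fraction

# Numbers taken from the worked example.
prior = {"A": Fraction(7, 10), "B": Fraction(3, 10)}        # P(X)
likelihood = {"A": Fraction(2, 10), "B": Fraction(9, 15)}   # P(white | X)

# Product rule: joint probability P(X, white) = P(X) P(white | X)
joint = {x: prior[x] * likelihood[x] for x in prior}

# Sum rule: marginal probability P(white) = sum over X of P(X, white)
marginal = sum(joint.values())

# Bayes rule: posterior P(X | white) = P(white | X) P(X) / P(white)
posterior = {x: joint[x] / marginal for x in prior}

for x in prior:
    print(x, "joint:", joint[x], "posterior:", posterior[x], float(posterior[x]))
print("marginal:", marginal)
```
&lt;/pre&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The posteriors come out as 7/16 and 9/16, which are the decimal values 0.4375 and 0.5625, and they sum to 1 even though the likelihoods do not.&lt;/p&gt;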
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>Math</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/201</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/%ED%99%95%EB%A5%A0-%EA%B8%B0%EC%B4%88#entry201comment</comments>
      <pubDate>Mon, 30 Jan 2023 10:29:12 +0900</pubDate>
    </item>
    <item>
      <title>Facial Expression Retargeting from Human to Avatar Made Easy</title>
      <link>https://thebeautifulfuture.tistory.com/entry/Facial-Expression-Retargeting-from-Human-toAvatar-Made-Easy</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;IEEE Transactions on Visualization and Computer Graphics, 2020&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;484&quot; data-origin-height=&quot;249&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dzsObr/btrXa0bv7d1/vjLHEU4nHRnAkJOtycsGck/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dzsObr/btrXa0bv7d1/vjLHEU4nHRnAkJOtycsGck/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dzsObr/btrXa0bv7d1/vjLHEU4nHRnAkJOtycsGck/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdzsObr%2FbtrXa0bv7d1%2FvjLHEU4nHRnAkJOtycsGck%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;484&quot; height=&quot;249&quot; data-origin-width=&quot;484&quot; data-origin-height=&quot;249&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;DT: [7] Deformation transfer for triangle meshes, TOG 2004.&lt;br /&gt;BS: [2] Facial retargeting with automatic range of motion alignment, TOG 2017.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Facial expression retargeting is a cross-domain problem.&lt;br /&gt;blendshapes are not orthogonal: [6] Practice and theory of blendshape facial models, EuroG 2014.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;two drawbacks of blendshape-based representation.&lt;br /&gt;1. difficulty in representing expressions outside of the linear span.(exaggerated or unseen expressions)&lt;br /&gt;2. creating the blendshapes for avatars still requires a tedious and time-consuming modeling process.&lt;br /&gt;[17] Facewarehouse: A 3d facial expression database for visual computing, VCG 2013.&lt;br /&gt;[32] Direct manipulation blendshapes, CGA 2010.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;data-driven shape analysis&amp;nbsp;methods&lt;br /&gt;[43] Variational autoencoders for deforming 3d mesh models, CVPR 2018.&lt;br /&gt;[44] Automatic unpaired shape deformation transfer, TOG 2018.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Characters used&lt;br /&gt;Mery, Chubby, Conan : Face&amp;nbsp;rigs:&amp;nbsp;meryprojet.com,&amp;nbsp;Tri&amp;nbsp;Nguyen,&amp;nbsp;&lt;a href=&quot;http://www.highend3d.com&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;www.highend3d.com&lt;/a&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;Network architecture&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-origin-width=&quot;996&quot; data-origin-height=&quot;399&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/uISZD/btrXc9eDL3H/aSBlqDMH0gFI350oM2rkQK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/uISZD/btrXc9eDL3H/aSBlqDMH0gFI350oM2rkQK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/uISZD/btrXc9eDL3H/aSBlqDMH0gFI350oM2rkQK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FuISZD%2FbtrXc9eDL3H%2FaSBlqDMH0gFI350oM2rkQK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;996&quot; height=&quot;399&quot; data-origin-width=&quot;996&quot; data-origin-height=&quot;399&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;</description>
      <category>논문</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/198</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/Facial-Expression-Retargeting-from-Human-toAvatar-Made-Easy#entry198comment</comments>
      <pubDate>Thu, 26 Jan 2023 12:45:30 +0900</pubDate>
    </item>
    <item>
      <title>Attention Mesh: High-fidelity Face Mesh Prediction in Real-time</title>
      <link>https://thebeautifulfuture.tistory.com/entry/Attention-Mesh-High-fidelity-Face-Mesh-Prediction-in-Real-time</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #000000;&quot;&gt;Grishchenko, I., Ablavatski, A., Kartynnik, Y., Raveendran, K., Grundmann, M.: Attention Mesh: High-fidelity Face Mesh Prediction in Real-time. In: CVPR Workshops (2020)&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style=&quot;background-color: #ffffff; color: #000000;&quot;&gt;Google Research&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #000000;&quot;&gt;50 FPS on a Pixel2 phone.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #000000;&quot;&gt;puppeteering (puppet show)&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;span style=&quot;background-color: #ffffff; color: #000000;&quot;&gt;Contribution: about 30% faster while matching the accuracy of the multi-stage cascade approach.&lt;/span&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Coordinates are predicted directly, without using a 3D face model.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Builds on [5], a two-stage pipeline: face detection followed by regression.&lt;br /&gt;[5] Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs, 2019.&lt;br /&gt;However, regressing the whole face at once degrades quality in the perceptually important face regions.&lt;br /&gt;The usual remedy is to crop each face region and run a specialized network on it.&lt;br /&gt;But training and running several models, &lt;b&gt;each on its own image input&lt;/b&gt;, is inefficient. &lt;b&gt;Sharing at the feature level saves time.&lt;/b&gt;&lt;br /&gt;Using several models also requires &lt;b&gt;CPU-GPU synchronization&lt;/b&gt; to pass data between them, which is another cost.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The region-specific heads transform the feature maps with spatial transformers [4].&lt;br /&gt;[4] Spatial transformer networks. NIPS 2015.&lt;br /&gt;With this design, a single model matches the accuracy of the cascaded approach while running faster.&lt;br /&gt;The architecture is named attention mesh.&lt;br /&gt;An additional advantage over separate models is that the parts are internally consistent, which makes the model easier to train.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The model architecture is similar to [7].&lt;br /&gt;[7] A deep regression architecture with two-stage re-initialization for high performance facial landmark detection. CVPR 2017.&lt;br /&gt;[7] uses spatial transformers to build a network that is robust to the initializations provided by different face detectors.&lt;br /&gt;Although the goals of the two papers differ, a single model improves accuracy on the salient face regions.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-12-28 14-15-22.png&quot; data-origin-width=&quot;1375&quot; data-origin-height=&quot;702&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/dbD0qN/btrUNVDuEsf/0BZJ8qwBD0XPl2vUlCPeJ0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/dbD0qN/btrUNVDuEsf/0BZJ8qwBD0XPl2vUlCPeJ0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/dbD0qN/btrUNVDuEsf/0BZJ8qwBD0XPl2vUlCPeJ0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FdbD0qN%2FbtrUNVDuEsf%2F0BZJ8qwBD0XPl2vUlCPeJ0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;1375&quot; height=&quot;702&quot; data-filename=&quot;Screenshot from 2022-12-28 14-15-22.png&quot; data-origin-width=&quot;1375&quot; data-origin-height=&quot;702&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Input: a detected or tracked 256x256 face image. Output: a 64x64x32 face feature map.&lt;br /&gt;The feature map is split into four branches for inference: left eye, right eye, mouth, and whole face.&lt;br /&gt;The whole-face branch takes the full 64x64x32 feature map; the other branches run on feature maps cropped to 24x24x32.&lt;br /&gt;The whole-face module predicts 478 3D landmarks and defines the ROI of each submodule.&lt;br /&gt;The eye and mouth modules predict from the 24x24x32 feature maps obtained by the attention mechanism.&lt;br /&gt;The eye modules predict the pupil (iris) separately, after the feature map has been reduced to 6x6. This reuses the eye features while letting the pupil landmarks move more dynamically than the relatively static eye landmarks.&lt;br /&gt;&lt;br /&gt;Each submodule dedicates network capacity to its own face region, which improves quality.&lt;br /&gt;For a further quality gain, the eyes and mouth are aligned horizontally and normalized to a fixed size.&lt;br /&gt;The attention mesh network is trained in two phases.&lt;br /&gt;1. Each submodule is trained independently, based on the ground-truth landmark coordinates.&lt;br /&gt;2. The submodules are retrained, based on the landmark coordinates predicted by the model itself.&lt;br /&gt;&lt;br /&gt;Attention mechanism&lt;br /&gt;[2] Draw: A recurrent neural network for image generation, 2015.&lt;br /&gt;[4] Spatial transformer networks. NIPS 2015.&lt;br /&gt;These attention methods sample a grid of 2D points in feature space.&lt;br /&gt;Features are then extracted under the sampled points in a differentiable way, via a 2D Gaussian kernel or an affine transformation with interpolation. This lets the network be trained end-to-end and enriches the features used by the attention mechanism. We use the transformer module of [4].&lt;br /&gt;It is defined by an affine transform, so the sampled grid of points can be zoomed, rotated, translated, and skewed.&lt;br /&gt;This affine transform can either be predicted with supervision or computed from the output of the face mesh submodel.&lt;/p&gt;
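&lt;p data-ke-size=&quot;size16&quot;&gt;To make the affine sampling grid concrete, here is a minimal pure-Python sketch of a spatial-transformer style crop. It is not the paper's implementation: it handles a single-channel feature map instead of 64x64x32, the names affine_grid, bilinear_sample and crop_with_affine are hypothetical, and coordinates are assumed to be normalized to the range from -1 to 1.&lt;/p&gt;

```python
import math


def affine_grid(theta, out_h, out_w):
    """Map every output pixel to a source location with a 2x3 affine matrix.

    Output and source coordinates are both normalized to [-1, 1].
    out_h and out_w are assumed to be at least 2.
    """
    grid = []
    for i in range(out_h):
        y = -1.0 + 2.0 * i / (out_h - 1)
        row = []
        for j in range(out_w):
            x = -1.0 + 2.0 * j / (out_w - 1)
            xs = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            ys = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((xs, ys))
        grid.append(row)
    return grid


def bilinear_sample(feature, xs, ys):
    """Bilinearly interpolate a 2D feature map at normalized location (xs, ys)."""
    h, w = len(feature), len(feature[0])
    # Convert to pixel coordinates and clamp to the border of the map.
    px = min(max((xs + 1.0) * 0.5 * (w - 1), 0.0), w - 1.0)
    py = min(max((ys + 1.0) * 0.5 * (h - 1), 0.0), h - 1.0)
    x0, y0 = int(math.floor(px)), int(math.floor(py))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = px - x0, py - y0
    top = feature[y0][x0] * (1 - fx) + feature[y0][x1] * fx
    bottom = feature[y1][x0] * (1 - fx) + feature[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def crop_with_affine(feature, theta, out_h, out_w):
    """Sample an out_h x out_w crop of `feature` through the affine grid."""
    grid = affine_grid(theta, out_h, out_w)
    return [[bilinear_sample(feature, xs, ys) for (xs, ys) in row] for row in grid]


# Toy 5x5 feature map whose value encodes its position (10 * row + column).
feat = [[10 * r + c for c in range(5)] for r in range(5)]
# Zoom by a factor of two around the center: scale 0.5, no rotation or shift.
zoom = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]]
crop = crop_with_affine(feat, zoom, 3, 3)
print(crop)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;With the zoom matrix the 3x3 output picks up the central 3x3 block of the toy map. Because every step is a weighted sum, the same operation is differentiable with respect to both the features and the affine parameters, which is what allows end-to-end training.&lt;/p&gt;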
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-12-28 14-14-31.png&quot; data-origin-width=&quot;620&quot; data-origin-height=&quot;643&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bydCjD/btrUJj6DDkC/jud7L4Xkacq47gz1VJpfxk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bydCjD/btrUJj6DDkC/jud7L4Xkacq47gz1VJpfxk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bydCjD/btrUJj6DDkC/jud7L4Xkacq47gz1VJpfxk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FbydCjD%2FbtrUJj6DDkC%2Fjud7L4Xkacq47gz1VJpfxk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;422&quot; height=&quot;438&quot; data-filename=&quot;Screenshot from 2022-12-28 14-14-31.png&quot; data-origin-width=&quot;620&quot; data-origin-height=&quot;643&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
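&lt;p data-ke-size=&quot;size16&quot;&gt;2D landmarks were manually annotated on 30,000 images, and the z values were approximated using a synthetic model.&lt;br /&gt;&lt;br /&gt;For evaluation, the model is compared with models trained per face part, which run sequentially as a cascade: base mesh, eyes, then mouth.&lt;br /&gt;&lt;br /&gt;Performance (speed comparison)&lt;/p&gt;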
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-12-28 14-28-10.png&quot; data-origin-width=&quot;589&quot; data-origin-height=&quot;269&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/n9wPH/btrUNro2VyR/hlaKA8pCnpQSQ0OK5F5uxk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/n9wPH/btrUNro2VyR/hlaKA8pCnpQSQ0OK5F5uxk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/n9wPH/btrUNro2VyR/hlaKA8pCnpQSQ0OK5F5uxk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fn9wPH%2FbtrUNro2VyR%2FhlaKA8pCnpQSQ0OK5F5uxk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;315&quot; height=&quot;144&quot; data-filename=&quot;Screenshot from 2022-12-28 14-28-10.png&quot; data-origin-width=&quot;589&quot; data-origin-height=&quot;269&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The attention mesh model is 25% faster than the cascade of separate face and region models.&lt;br /&gt;TFLite GPU inference time was measured using [6].&lt;br /&gt;[6] On-device neural net inference with mobile gpus. 2019.&lt;br /&gt;Separate models: 8.82 + 4.18 + 4.7 = 17.7 ms, to which about 4.7 ms spent on CPU-GPU synchronization must be added.&lt;br /&gt;&lt;br /&gt;Mesh quality&lt;br /&gt;Errors are normalized by the 3D interocular distance.&lt;br /&gt;The attention mesh model outperforms the cascade of models on the eye regions and is comparable on the mouth region.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-12-28 14-42-44.png&quot; data-origin-width=&quot;481&quot; data-origin-height=&quot;184&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/4dX6a/btrUIZ8dmyu/MONo8LaA8GEn542z3oepz0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/4dX6a/btrUIZ8dmyu/MONo8LaA8GEn542z3oepz0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/4dX6a/btrUIZ8dmyu/MONo8LaA8GEn542z3oepz0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F4dX6a%2FbtrUIZ8dmyu%2FMONo8LaA8GEn542z3oepz0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;288&quot; height=&quot;110&quot; data-filename=&quot;Screenshot from 2022-12-28 14-42-44.png&quot; data-origin-width=&quot;481&quot; data-origin-height=&quot;184&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;AR Makeup&lt;br /&gt;Misplaced landmarks easily push the result into the uncanny valley.&lt;br /&gt;The base mesh is compared with the attention mesh with submodules.&lt;br /&gt;In an A/B test with 80 participants on 10 images, 46% of the AR samples were classified as real lipstick images and 38% of the real images were classified as AR.&lt;br /&gt;&lt;br /&gt;Puppeteering&lt;br /&gt;The mesh can drive a puppet or serve as a trigger.&lt;br /&gt;[3] Dual laplacian morphing for triangular meshes. Computer Animation and Virtual Worlds 2007.&lt;br /&gt;Laplacian mesh editing is used to morph a canonical mesh into the predicted mesh.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Conclusion&lt;br /&gt;A unified model for accurate, real-time face mesh prediction.&lt;br /&gt;Differentiable attention mechanism.&lt;br /&gt;Instead of running independent region-specific models, computation is dedicated to each salient face region within a single model.&lt;br /&gt;&lt;br /&gt;StackOverflow&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Q.&lt;br /&gt;How to convert Mediapipe Face Mesh to Blendshape weight (&lt;a href=&quot;https://stackoverflow.com/questions/68169684/how-to-convert-mediapipe-face-mesh-to-blendshape-weight&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;https://stackoverflow.com/questions/68169684/how-to-convert-mediapipe-face-mesh-to-blendshape-weight&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;A1.&lt;br /&gt;(There are two directions.)&lt;br /&gt;Blendshape generation can be divided into two methods:&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;(Computing the weights directly from the landmarks)&lt;br /&gt;Direct math from mesh landmarks:&lt;/p&gt;
&lt;ol style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;kalidokit,&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;https://github.com/yeemachine/kalidokit&quot;&gt;https://github.com/yeemachine/kalidokit&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Mefamo,&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;https://github.com/JimWest/MeFaMo&quot;&gt;https://github.com/JimWest/MeFaMo&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;(Using a network)&lt;br /&gt;AI model:&lt;/p&gt;
&lt;ol style=&quot;list-style-type: disc;&quot; data-ke-list-type=&quot;disc&quot;&gt;
&lt;li&gt;mocap4face,&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;https://github.com/facemoji/mocap4face&quot;&gt;https://github.com/facemoji/mocap4face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;AvatarWebKit,&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;https://github.com/Hallway-Inc/AvatarWebKit&quot;&gt;https://github.com/Hallway-Inc/AvatarWebKit&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;(Build a paired dataset.)&lt;br /&gt;With the rapid development of supervised learning, collecting face and 52-bs paired datasets seems the best way to solve this problem.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;==== update 2022.11.21 =====&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;NVIDIA has released&lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;a href=&quot;https://github.com/NVIDIA/MAXINE-AR-SDK&quot;&gt;maxine-ar-sdk&lt;/a&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;to compute face blendshapes. The predicted blendshapes are slightly different from ARKit 52. I have successfully compiled it and run it well on Windows with RTX-20 or RTX-30 cards.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;If anyone really needs a mediapipe-based solution, just comment. I can contribute to labeling CC face datasets for fine-tuning your own models with MAXINE-AR-SDK.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;A2.&lt;br /&gt;My approach: First sample many &lt;b&gt;pairs of random blendshapes&lt;/b&gt; -&amp;gt; face mesh (detecting face mesh on 3D model), and then learning an &lt;b&gt;inverse model&lt;/b&gt; from that. (A simple neuronet would do)&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Therefore you end up with a model that can give blendshapes given a face mesh.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The catch, which is also mentioned in the above blurb, is that you wanna handle different face mesh inputs. In the above blurb it seems that they sample the 3D model but &lt;b&gt;transform the sampled mesh into the canonical face mesh&lt;/b&gt;, and hence end up with a canonical inverse model. At inference you transform a given mesh into the canonical face mesh as well.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Another solution might be to directly transform your different people's face meshes into the 3D model's mesh.&lt;/p&gt;</description>
      <category>논문</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/197</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/Attention-Mesh-High-fidelity-Face-Mesh-Prediction-in-Real-time#entry197comment</comments>
      <pubDate>Wed, 28 Dec 2022 11:11:06 +0900</pubDate>
    </item>
    <item>
      <title>COMA</title>
      <link>https://thebeautifulfuture.tistory.com/entry/COMA</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;Generating 3D faces using Convolutional Mesh Autoencoders ECCV 2018.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Abstract&lt;/p&gt;
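&lt;p data-ke-size=&quot;size16&quot;&gt;Previous methods use linear subspaces or higher-order tensor generalizations. Because of this linearity, they cannot capture extreme deformations and non-linear expressions.&lt;br /&gt;The paper therefore proposes a model that represents faces non-linearly, by applying spectral convolutions on the mesh surface.&lt;br /&gt;Mesh sampling operations, which enable a hierarchical mesh representation, capture non-linear variations in shape and &lt;span&gt;expression &lt;/span&gt;at multiple scales.&lt;br /&gt;In a variational setting, the model can sample diverse realistic 3D faces from a multivariate Gaussian distribution.&lt;br /&gt;The training set consists of 20,466 meshes of 12 subjects. Despite the limited training data, the model achieves 50% lower reconstruction error while using 75% fewer parameters than state-of-the-art face models.&amp;nbsp;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;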
&lt;h3 data-ke-size=&quot;size23&quot;&gt;3. Mesh Operators&lt;/h3&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;F = (V, A), V &amp;isin; R^{n&amp;times;3}, A &amp;isin; {0,1}^{n&amp;times;n}&lt;br /&gt;A_ij = 1 if an edge connects vertices i and j, A_ij = 0 otherwise.&lt;br /&gt;non-normalized Laplacian: L = D - A, D_ii = sum_j A_ij&lt;br /&gt;[15] Spectral graph theory. No. 92, American Mathematical Soc. (1997) &amp;lt;-- graph Fourier transform&lt;br /&gt;diagonalized Laplacian: L = U &amp;Lambda; U^T&lt;br /&gt;Fourier basis: U &amp;isin; R^{n&amp;times;n}, U = [u_0, u_1, ..., u_{n-1}]&lt;br /&gt;eigenvectors of L: u_i&lt;br /&gt;&amp;Lambda; = diag([&amp;lambda;_0, ..., &amp;lambda;_{n-1}]) &amp;isin; R^{n&amp;times;n}&lt;br /&gt;mesh vertices: x &amp;isin; R^{n&amp;times;3}&lt;br /&gt;graph Fourier transform: x_w = U^T x&lt;br /&gt;inverse graph Fourier transform: x = U x_w&lt;br /&gt;&lt;br /&gt;3.1 Fast spectral convolutions&lt;br /&gt;Convolution is defined as a Hadamard product in Fourier space: x &amp;lowast; y = U((U^T x) ⊙ (U^T y))&lt;br /&gt;Because U is not sparse, this is computationally expensive. Instead, the mesh filtering kernel g_&amp;theta; can be defined with recursive Chebyshev polynomials [17, 23].&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-10-06 15-18-57.png&quot; data-origin-width=&quot;256&quot; data-origin-height=&quot;101&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/T64Uo/btrNWqX417m/Nw3VyYgz3RyQCASYR8QHBK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/T64Uo/btrNWqX417m/Nw3VyYgz3RyQCASYR8QHBK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/T64Uo/btrNWqX417m/Nw3VyYgz3RyQCASYR8QHBK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FT64Uo%2FbtrNWqX417m%2FNw3VyYgz3RyQCASYR8QHBK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;256&quot; height=&quot;101&quot; data-filename=&quot;Screenshot from 2022-10-06 15-18-57.png&quot; data-origin-width=&quot;256&quot; data-origin-height=&quot;101&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Scaled Laplacian: L̃ = 2L/&amp;lambda;_max &amp;minus; I_n&lt;br /&gt;&amp;theta; &amp;isin; R^K is a vector of Chebyshev coefficients.&lt;br /&gt;T_k &amp;isin; R^{n&amp;times;n} is the Chebyshev polynomial of order k, which can be computed recursively as T_k(x) = 2x T_{k&amp;minus;1}(x) &amp;minus; T_{k&amp;minus;2}(x), with T_0 = 1 and T_1 = x.&lt;br /&gt;The spectral convolution can then be defined as in [17]:&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-10-06 15-20-58.png&quot; data-origin-width=&quot;299&quot; data-origin-height=&quot;96&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/L86nn/btrNTAtP0v6/hjB79FAHK4TQAsAZXUAxKk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/L86nn/btrNTAtP0v6/hjB79FAHK4TQAsAZXUAxKk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/L86nn/btrNTAtP0v6/hjB79FAHK4TQAsAZXUAxKk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FL86nn%2FbtrNTAtP0v6%2FhjB79FAHK4TQAsAZXUAxKk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;299&quot; height=&quot;96&quot; data-filename=&quot;Screenshot from 2022-10-06 15-20-58.png&quot; data-origin-width=&quot;299&quot; data-origin-height=&quot;96&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
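&lt;p data-ke-size=&quot;size16&quot;&gt;The Chebyshev recurrence means the filter can be applied with repeated multiplications by the scaled Laplacian, without ever computing the Fourier basis U. The sketch below illustrates this under simplifying assumptions that are not from the paper: one input and one output channel, dense Python lists instead of sparse matrices, and a largest eigenvalue lambda_max supplied by the caller. The function names are hypothetical.&lt;/p&gt;

```python
def laplacian(adj):
    """Non-normalized graph Laplacian L = D - A for a dense adjacency matrix."""
    n = len(adj)
    return [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)]
            for i in range(n)]


def matvec(mat, vec):
    """Dense matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def chebyshev_filter(adj, x, theta, lambda_max):
    """Apply sum_k theta[k] * T_k(L_scaled) to a single-channel signal x."""
    n = len(adj)
    lap = laplacian(adj)
    # Scaled Laplacian 2 L / lambda_max - I, whose spectrum lies in [-1, 1].
    scaled = [[2.0 * lap[i][j] / lambda_max - (1.0 if i == j else 0.0)
               for j in range(n)] for i in range(n)]
    t_prev = list(x)                       # T_0 x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = matvec(scaled, x)             # T_1 x = L_scaled x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        # Recurrence: T_k x = 2 L_scaled T_{k-1} x - T_{k-2} x
        t_next = [2.0 * a - b for a, b in zip(matvec(scaled, t_curr), t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Path graph 0 - 1 - 2; its Laplacian eigenvalues are 0, 1 and 3.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
y = chebyshev_filter(adj, [1.0, 1.0, 1.0], [0.5, 0.25, 0.125], lambda_max=3.0)
print(y)
```

&lt;p data-ke-size=&quot;size16&quot;&gt;A constant signal is the eigenvector with eigenvalue 0, so the scaled Laplacian maps it to its negative and every output entry should be close to 0.5 - 0.25 + 0.125 = 0.375. In the paper's setting each layer holds F_in x F_out such coefficient vectors and K = 6.&lt;/p&gt;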
&lt;p data-ke-size=&quot;size16&quot;&gt;y_j is the j-th feature of the output y &amp;isin; R^{n&amp;times;F_out}.&lt;br /&gt;The input x &amp;isin; R^{n&amp;times;F_in} has F_in features.&lt;br /&gt;For the input face mesh, F_in = 3, corresponding to the 3D vertex positions.&lt;br /&gt;Each convolutional layer has F_in &amp;times; F_out vectors of Chebyshev coefficients, &amp;theta;_{i,j} &amp;isin; R^K, as trainable parameters.&lt;br /&gt;&lt;br /&gt;3.2 Mesh Sampling&lt;br /&gt;A hierarchical multi-scale representation is used to capture both local and global context.&lt;br /&gt;Local context is captured in the shallow layers and global context in the deep layers.&lt;br /&gt;A mesh can be viewed as an n&amp;times;3 tensor, but the dimensions change once convolutions are applied.&lt;br /&gt;The mesh sampling operations preserve the context of neighboring vertices.&lt;br /&gt;[20] Surface simplification using quadric error metrics. Computer graphics and interactive techniques, 1997 (quadric matrices).&lt;br /&gt;&lt;br /&gt;Suppose a mesh with m vertices is down-sampled.&lt;br /&gt;Down-sampling transform matrix Q_d &amp;isin; {0,1}^{n&amp;times;m}, up-sampling transform matrix Q_u &amp;isin; R^{m&amp;times;n}, m &amp;gt; n.&lt;br /&gt;Down-sampling is obtained by iteratively contracting vertex pairs, using quadric matrices [20] so that the surface error approximation is maintained.&lt;br /&gt;In figure (a) below, the red vertices are contracted away. The remaining blue vertices are a subset of the original mesh vertices, V_d &amp;sub; V.&lt;br /&gt;q: index of an original vertex (m in total)&lt;br /&gt;p: index of a down-sampled vertex (n in total)&lt;br /&gt;Q_d(p, q) &amp;isin; {0, 1} indicates whether the q-th vertex is kept (1) or discarded (0) during down-sampling.&lt;br /&gt;Since loss-less down-sampling and up-sampling is not feasible for general surfaces, the up-sampling matrix is built during down-sampling.&lt;br /&gt;Convolution is applied to V_d (b -&amp;gt; c). 
The vertices retained in (c) are kept during up-sampling (c -&amp;gt; d).&lt;br /&gt;The red vertices discarded during down-sampling are mapped back onto the down-sampled mesh surface using barycentric coordinates.&lt;br /&gt;Each red vertex v_q discarded in (b) is projected onto the closest triangle (i, j, k) of the down-sampled mesh, and the projection is expressed with barycentric weights as w_i v_i + w_j v_j + w_k v_k.&lt;br /&gt;These weights are written into Q_u: Q_u(q, i) = w_i, Q_u(q, j) = w_j, Q_u(q, k) = w_k, and Q_u(q, l) = 0 otherwise.&lt;br /&gt;V_u = Q_u V_d.&lt;/p&gt;
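&lt;p data-ke-size=&quot;size16&quot;&gt;The construction of Q_u can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the closest triangle of every discarded vertex is assumed to be known already, the projection is assumed to fall inside that triangle, and the names barycentric_weights, build_upsampling_matrix and upsample are hypothetical.&lt;/p&gt;

```python
def sub(a, b):
    return [p - q for p, q in zip(a, b)]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def barycentric_weights(point, tri):
    """Barycentric weights of the projection of `point` onto the plane of `tri`."""
    a, b, c = tri
    v0, v1, v2 = sub(b, a), sub(c, a), sub(point, a)
    d00, d01, d11 = dot(v0, v0), dot(v0, v1), dot(v1, v1)
    d20, d21 = dot(v2, v0), dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    wj = (d11 * d20 - d01 * d21) / denom
    wk = (d00 * d21 - d01 * d20) / denom
    return [1.0 - wj - wk, wj, wk]


def build_upsampling_matrix(num_original, kept, discarded, coarse_vertices):
    """Build Q_u with one row per original vertex q and one column per coarse vertex.

    kept:      {q: p} for vertices that survive down-sampling
    discarded: {q: ((i, j, k), position)} with the closest coarse triangle
    """
    n = len(coarse_vertices)
    q_u = [[0.0] * n for _ in range(num_original)]
    for q, p in kept.items():
        q_u[q][p] = 1.0
    for q, (tri_idx, position) in discarded.items():
        tri = [coarse_vertices[t] for t in tri_idx]
        for t, w in zip(tri_idx, barycentric_weights(position, tri)):
            q_u[q][t] = w
    return q_u


def upsample(q_u, coarse_vertices):
    """V_u = Q_u V_d for 3D vertex positions."""
    return [[sum(w * v[d] for w, v in zip(row, coarse_vertices)) for d in range(3)]
            for row in q_u]


# One coarse triangle; original vertex 3 was discarded during down-sampling.
v_d = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
kept = {0: 0, 1: 1, 2: 2}
discarded = {3: ((0, 1, 2), [0.25, 0.25, 0.5])}
q_u = build_upsampling_matrix(4, kept, discarded, v_d)
v_u = upsample(q_u, v_d)
print(q_u[3], v_u[3])
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Note that the up-sampled vertex lands on the coarse surface, not at its original position: the offset along the triangle normal is lost, which is why the operation is not loss-less. Because Q_u is fixed after construction, up-sampling inside the network is a single sparse matrix product.&lt;/p&gt;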
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-09-23 18-57-41.png&quot; data-origin-width=&quot;944&quot; data-origin-height=&quot;423&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/bn1COB/btrMTEXqgy9/4Ee2iSkYftC7QaSPUKAjik/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/bn1COB/btrMTEXqgy9/4Ee2iSkYftC7QaSPUKAjik/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/bn1COB/btrMTEXqgy9/4Ee2iSkYftC7QaSPUKAjik/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbn1COB%2FbtrMTEXqgy9%2F4Ee2iSkYftC7QaSPUKAjik%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;944&quot; height=&quot;423&quot; data-filename=&quot;Screenshot from 2022-09-23 18-57-41.png&quot; data-origin-width=&quot;944&quot; data-origin-height=&quot;423&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Chebyshev convolutional filters with K = 6 Chebyshev polynomials.&lt;br /&gt;Biased ReLU [21].&lt;br /&gt;[21] Deep sparse rectifier neural networks, Artificial Intelligence and Statistics (2011)&lt;/p&gt;</description>
      <category>논문</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/196</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/COMA#entry196comment</comments>
      <pubDate>Fri, 23 Sep 2022 15:58:51 +0900</pubDate>
    </item>
    <item>
      <title>3D Shape Regression for Real-time Facial Animation TOG2013</title>
      <link>https://thebeautifulfuture.tistory.com/entry/3D-Shape-Regression-for-Real-time-Facial-Animation-TOG2013</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;The algorithm is designed so that it can be trained and used without a face detector, assuming the face appears within a certain size range in the given camera.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User-specific face model&lt;/b&gt;&lt;br /&gt;-- 15 rigid motions&lt;br /&gt;yaw: -90, -60, -30, 0, 30, 60, 90&lt;br /&gt;pitch: -30, -15, 15, 30&lt;br /&gt;roll: -30, -15, 15, 30&lt;br /&gt;-- 45 non-rigid motions (15 expressions at 3 yaw angles)&lt;br /&gt;yaw: -30, 0, 30&lt;br /&gt;mouth stretch, smile, brow raise, disgust, anger,&lt;br /&gt;squeeze left/right eye, jaw left/right, grin,&lt;br /&gt;chin raise, lip pucker, lip funnel, cheek blowing, eye closed.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;User-specific Blendshape Generation&lt;/b&gt;&lt;br /&gt;FaceWarehouse contains 150 individuals with 46 FACS blendshapes each.&lt;br /&gt;11K mesh vertices x 50 identity knobs x 47 expression knobs.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 21-20-15.png&quot; data-origin-width=&quot;190&quot; data-origin-height=&quot;35&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/3GLKB/btrKHUAVzFL/MjOW4OqluLb3Rq8rUIqZok/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/3GLKB/btrKHUAVzFL/MjOW4OqluLb3Rq8rUIqZok/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/3GLKB/btrKHUAVzFL/MjOW4OqluLb3Rq8rUIqZok/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F3GLKB%2FbtrKHUAVzFL%2FMjOW4OqluLb3Rq8rUIqZok%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;190&quot; height=&quot;35&quot; data-filename=&quot;Screenshot from 2022-08-27 21-20-15.png&quot; data-origin-width=&quot;190&quot; data-origin-height=&quot;35&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The camera intrinsics are assumed to be known. Given the user-specific face images,&lt;br /&gt;the distance between the projected 3D model vertices and the 2D image landmarks is minimized with a coordinate-descent method.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 21-21-40.png&quot; data-origin-width=&quot;446&quot; data-origin-height=&quot;60&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/xgolj/btrKF0JaNg5/zPg0sLyezcKOi0t6M3IJS0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/xgolj/btrKF0JaNg5/zPg0sLyezcKOi0t6M3IJS0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/xgolj/btrKF0JaNg5/zPg0sLyezcKOi0t6M3IJS0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fxgolj%2FbtrKF0JaNg5%2FzPg0sLyezcKOi0t6M3IJS0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;446&quot; height=&quot;60&quot; data-filename=&quot;Screenshot from 2022-08-27 21-21-40.png&quot; data-origin-width=&quot;446&quot; data-origin-height=&quot;60&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;1. For each input image i, find M_i, w_id,i and w_exp,i.&lt;br /&gt;2. Refine w_id, which should be the same for all images (fixing M_i and w_exp,i).&lt;br /&gt;The objective for finding the single w_id shared by all images of one person:&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 21-51-30.png&quot; data-origin-width=&quot;485&quot; data-origin-height=&quot;64&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/nGxzF/btrKF8gd32F/PP5r3HTEjW6ceVPe2gxJzK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/nGxzF/btrKF8gd32F/PP5r3HTEjW6ceVPe2gxJzK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/nGxzF/btrKF8gd32F/PP5r3HTEjW6ceVPe2gxJzK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FnGxzF%2FbtrKF8gd32F%2FPP5r3HTEjW6ceVPe2gxJzK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;485&quot; height=&quot;64&quot; data-filename=&quot;Screenshot from 2022-08-27 21-51-30.png&quot; data-origin-width=&quot;485&quot; data-origin-height=&quot;64&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
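&lt;p data-ke-size=&quot;size16&quot;&gt;The two steps above are repeated until convergence (about three iterations are enough). The vertex indices are updated accordingly using the algorithm of Yang et al. 2011. Once Wid is obtained, the expression blendshapes can be constructed. 47 of the expression modes in FaceWarehouse are used. di is a one-hot vector whose i-th entry is 1.&lt;/p&gt;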
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 22-02-54.png&quot; data-origin-width=&quot;317&quot; data-origin-height=&quot;32&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cq54aJ/btrKHxlI56l/KOMDk2ywNO6KbuLU4PfsFk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cq54aJ/btrKHxlI56l/KOMDk2ywNO6KbuLU4PfsFk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cq54aJ/btrKHxlI56l/KOMDk2ywNO6KbuLU4PfsFk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fcq54aJ%2FbtrKHxlI56l%2FKOMDk2ywNO6KbuLU4PfsFk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;317&quot; height=&quot;32&quot; data-filename=&quot;Screenshot from 2022-08-27 22-02-54.png&quot; data-origin-width=&quot;317&quot; data-origin-height=&quot;32&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;br /&gt;&lt;b&gt;Training Set Construction&lt;/b&gt;&lt;br /&gt;Training the 3D shape regressor requires a training set of 3D landmarks.&lt;br /&gt;The training set is built through an optimization. The problem has now been reduced to finding the blendshape alpha values (expression coefficients).&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 22-06-00.png&quot; data-origin-width=&quot;347&quot; data-origin-height=&quot;68&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ckeI9h/btrKGSjA4qv/CD30kUndVk6AwQnVHTVFK1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ckeI9h/btrKGSjA4qv/CD30kUndVk6AwQnVHTVFK1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ckeI9h/btrKGSjA4qv/CD30kUndVk6AwQnVHTVFK1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FckeI9h%2FbtrKGSjA4qv%2FCD30kUndVk6AwQnVHTVFK1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;347&quot; height=&quot;68&quot; data-filename=&quot;Screenshot from 2022-08-27 22-06-00.png&quot; data-origin-width=&quot;347&quot; data-origin-height=&quot;68&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Regularization term: for the predefined expressions, an approximate ground truth can be assumed.&lt;br /&gt;The coefficients should stay close to the a* values used in Li et al. 2010.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 22-08-08.png&quot; data-origin-width=&quot;146&quot; data-origin-height=&quot;35&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/rZIon/btrKLhoCMwD/uEjMGPQKCWZ4Xetu5AAUP0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/rZIon/btrKLhoCMwD/uEjMGPQKCWZ4Xetu5AAUP0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/rZIon/btrKLhoCMwD/uEjMGPQKCWZ4Xetu5AAUP0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FrZIon%2FbtrKLhoCMwD%2FuEjMGPQKCWZ4Xetu5AAUP0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;146&quot; height=&quot;35&quot; data-filename=&quot;Screenshot from 2022-08-27 22-08-08.png&quot; data-origin-width=&quot;146&quot; data-origin-height=&quot;35&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Combining the two terms above gives the following problem to solve.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 22-08-37.png&quot; data-origin-width=&quot;182&quot; data-origin-height=&quot;40&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/ecQ3zp/btrKGP8cNhK/rNxG6Mq6TgWeOPh2DkDeX1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/ecQ3zp/btrKGP8cNhK/rNxG6Mq6TgWeOPh2DkDeX1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/ecQ3zp/btrKGP8cNhK/rNxG6Mq6TgWeOPh2DkDeX1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FecQ3zp%2FbtrKGP8cNhK%2FrNxG6Mq6TgWeOPh2DkDeX1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;182&quot; height=&quot;40&quot; data-filename=&quot;Screenshot from 2022-08-27 22-08-37.png&quot; data-origin-width=&quot;182&quot; data-origin-height=&quot;40&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;This problem is optimized iteratively with a coordinate-descent method, alternately fixing one of the two parameters (M and a) while solving for the other.&lt;br /&gt;a is initialized to a*.&lt;br /&gt;M is computed with the POSIT algorithm.&lt;br /&gt;a is computed with a gradient projection algorithm based on a BFGS solver (constrained to the range 0 to 1).&lt;br /&gt;Wreg is fixed to 10.&lt;br /&gt;The vertex indices are updated at every optimization iteration.&lt;br /&gt;The 3D mesh in camera coordinates can be computed with the expression below, and the 3D landmarks {S_i^o} are then extracted from it.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 22-15-34.png&quot; data-origin-width=&quot;158&quot; data-origin-height=&quot;22&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/uVDxk/btrKHV0X0JO/rxMlAwmcEutrqZYST3twv1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/uVDxk/btrKHV0X0JO/rxMlAwmcEutrqZYST3twv1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/uVDxk/btrKHV0X0JO/rxMlAwmcEutrqZYST3twv1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FuVDxk%2FbtrKHV0X0JO%2FrxMlAwmcEutrqZYST3twv1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;158&quot; height=&quot;22&quot; data-filename=&quot;Screenshot from 2022-08-27 22-15-34.png&quot; data-origin-width=&quot;158&quot; data-origin-height=&quot;22&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
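&lt;p data-ke-size=&quot;size16&quot;&gt;A minimal sketch of the a-update with M held fixed, under simplifying assumptions that are not from the paper: the landmark residual is taken to be linear in a (a hypothetical matrix A and target vector b), and plain projected gradient descent stands in for the BFGS-based gradient projection solver. The function and variable names are illustrative only.&lt;/p&gt;

```python
import numpy as np

def fit_expression(A, b, a_star, w_reg=10.0, iters=500):
    # Minimize ||A a - b||^2 + w_reg * ||a - a_star||^2 with every entry of a kept in [0, 1].
    # A, b are hypothetical stand-ins for the linearized landmark term (M fixed).
    a = np.clip(np.asarray(a_star, dtype=float), 0.0, 1.0)  # initialize a with a*
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    L = 2.0 * (np.linalg.norm(A, 2) ** 2 + w_reg)
    for _ in range(iters):
        grad = 2.0 * A.T @ (A @ a - b) + 2.0 * w_reg * (a - a_star)
        a = np.clip(a - grad / L, 0.0, 1.0)  # gradient step, then project onto the box
    return a
```

&lt;p data-ke-size=&quot;size16&quot;&gt;In a full coordinate-descent loop this update would alternate with re-estimating M (POSIT in the paper) and refreshing the vertex indices.&lt;/p&gt;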
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;Data Augmentation&lt;/b&gt;&lt;br /&gt;The 3D shape is translated along x, y, and z in camera coordinates, giving m-1 additional shapes per image,&lt;br /&gt;so each image has m shapes including the original (Sij, 1 &amp;lt;= j &amp;lt;= m). S_i^o = Si0.&lt;br /&gt;Instead of transforming the image itself for training, the change is stored in the transformation matrix M so that the original can be recovered.&lt;br /&gt;That is, each translated 3D shape is stored together with its corresponding M, and the corresponding image is left unchanged.&lt;br /&gt;&amp;nbsp;&lt;br /&gt;Temporal Initial Shape Dataset&lt;br /&gt;At run time the initial 3D shape can be taken from the previous frame, so in the training set each shape is likewise&lt;br /&gt;paired with a 3D shape that mimics one computed in the previous frame (S_ij^c).&lt;br /&gt;From the 60 original 3D shapes, the G most similar shapes are selected (Sig, 1 &amp;lt;= g &amp;lt;= G),&lt;br /&gt;and for each of them H shapes are randomly chosen from those produced in the Data Augmentation step (Sigjh, 1 &amp;lt;= h &amp;lt;= H).&lt;br /&gt;This yields G*H initial 3D shapes in total for each Sij. Each training sample is written as in the expression below.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 22-42-01.png&quot; data-origin-width=&quot;144&quot; data-origin-height=&quot;23&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/8Xvxe/btrKHZI30lG/JrCoZJynZRWKqkuXxA91Lk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/8Xvxe/btrKHZI30lG/JrCoZJynZRWKqkuXxA91Lk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/8Xvxe/btrKHZI30lG/JrCoZJynZRWKqkuXxA91Lk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2F8Xvxe%2FbtrKHZI30lG%2FJrCoZJynZRWKqkuXxA91Lk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;144&quot; height=&quot;23&quot; data-filename=&quot;Screenshot from 2022-08-27 22-42-01.png&quot; data-origin-width=&quot;144&quot; data-origin-height=&quot;23&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The values actually used are n = 60, m = 9, G = 5, H = 4.&lt;br /&gt;&lt;br /&gt;Camera Calibration&lt;br /&gt;Instead of a conventional calibration procedure, the camera is calibrated from the user's setup images.&lt;br /&gt;The simplest pinhole model is assumed: fx=fy=f, cx=c_imgx, cy=c_imgy (the image center), shear=0, so only f needs to be found.&lt;br /&gt;f is varied while fitting with the User-specific Blendshape Generation procedure, and the f that gives the smallest fitting error is used.&lt;br /&gt;The paper reports satisfactory results with this approach.&lt;br /&gt;&lt;br /&gt;Face Tracking&lt;br /&gt;How to extract the transformation M and the expression coefficients from the 3D shape regression result.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 23-20-46.png&quot; data-origin-width=&quot;322&quot; data-origin-height=&quot;71&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/b2Y9du/btrKMEKFYvg/DO60JGP9Y1a7pnQdn2TEqK/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/b2Y9du/btrKMEKFYvg/DO60JGP9Y1a7pnQdn2TEqK/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/b2Y9du/btrKMEKFYvg/DO60JGP9Y1a7pnQdn2TEqK/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fb2Y9du%2FbtrKMEKFYvg%2FDO60JGP9Y1a7pnQdn2TEqK%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;322&quot; height=&quot;71&quot; data-filename=&quot;Screenshot from 2022-08-27 23-20-46.png&quot; data-origin-width=&quot;322&quot; data-origin-height=&quot;71&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;The vertices do not need to be updated in this step, because the 3D shape regression result can be treated as actual 3D coordinates&lt;br /&gt;rather than coordinates observed on the image. Accordingly, the intrinsic matrix (Q) is not applied, and in vk the index k is fixed.&lt;br /&gt;An animation prior (GMM) is used for temporal coherence in tracking, in the same way as Weise et al. 2011.&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 23-26-33.png&quot; data-origin-width=&quot;187&quot; data-origin-height=&quot;19&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/c8agvh/btrKMGIuLnZ/ZbcHZgJ5OvUrRE59d2HQ1K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/c8agvh/btrKMGIuLnZ/ZbcHZgJ5OvUrRE59d2HQ1K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/c8agvh/btrKMGIuLnZ/ZbcHZgJ5OvUrRE59d2HQ1K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fc8agvh%2FbtrKMGIuLnZ%2FZbcHZgJ5OvUrRE59d2HQ1K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;187&quot; height=&quot;19&quot; data-filename=&quot;Screenshot from 2022-08-27 23-26-33.png&quot; data-origin-width=&quot;187&quot; data-origin-height=&quot;19&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 23-26-49.png&quot; data-origin-width=&quot;282&quot; data-origin-height=&quot;59&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/cnJnJs/btrKGQ0reuo/GBYQFzcmKBFS1dOuYKu9g0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/cnJnJs/btrKGQ0reuo/GBYQFzcmKBFS1dOuYKu9g0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/cnJnJs/btrKGQ0reuo/GBYQFzcmKBFS1dOuYKu9g0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FcnJnJs%2FbtrKGQ0reuo%2FGBYQFzcmKBFS1dOuYKu9g0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;282&quot; height=&quot;59&quot; data-filename=&quot;Screenshot from 2022-08-27 23-26-49.png&quot; data-origin-width=&quot;282&quot; data-origin-height=&quot;59&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 23-27-02.png&quot; data-origin-width=&quot;173&quot; data-origin-height=&quot;36&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/upTWm/btrKHcvo5Os/5t8IRlHMx2RXMXq1uck7U1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/upTWm/btrKHcvo5Os/5t8IRlHMx2RXMXq1uck7U1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/upTWm/btrKHcvo5Os/5t8IRlHMx2RXMXq1uck7U1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FupTWm%2FbtrKHcvo5Os%2F5t8IRlHMx2RXMXq1uck7U1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;173&quot; height=&quot;36&quot; data-filename=&quot;Screenshot from 2022-08-27 23-27-02.png&quot; data-origin-width=&quot;173&quot; data-origin-height=&quot;36&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-27 23-27-08.png&quot; data-origin-width=&quot;212&quot; data-origin-height=&quot;44&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/UY4Sa/btrKIsEfCxQ/3nK8dUtk70t28mzvay2d3K/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/UY4Sa/btrKIsEfCxQ/3nK8dUtk70t28mzvay2d3K/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/UY4Sa/btrKIsEfCxQ/3nK8dUtk70t28mzvay2d3K/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FUY4Sa%2FbtrKIsEfCxQ%2F3nK8dUtk70t28mzvay2d3K%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;212&quot; height=&quot;44&quot; data-filename=&quot;Screenshot from 2022-08-27 23-27-08.png&quot; data-origin-width=&quot;212&quot; data-origin-height=&quot;44&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;Wprior is set to 1.&lt;br /&gt;1. Using the a computed in the previous frame as the initial value, compute the M between the regressed S and the blendshape S.&lt;br /&gt;This is a 3D registration problem and is solved with SVD on the cross-covariance matrix [Besl and McKay 1992].&lt;br /&gt;2. With M fixed, optimize the expression coefficients with an iterative gradient solver.&lt;br /&gt;The gradient of Eprior is derived in advance. The solve uses a BFGS solver based on the gradient projection algorithm, with a constrained to the range 0 to 1.&lt;br /&gt;&lt;br /&gt;These two steps are repeated until convergence; two iterations were enough.&lt;/p&gt;
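&lt;p data-ke-size=&quot;size16&quot;&gt;Step 1 above can be illustrated with the standard SVD-based rigid alignment of two corresponding 3D point sets (often called the Kabsch algorithm). This is only a sketch under the assumption that M is a pure rotation plus translation; the function name and array layout are hypothetical, not taken from the paper.&lt;/p&gt;

```python
import numpy as np

def rigid_align(src, dst):
    # src, dst: (N, 3) arrays of corresponding 3D points.
    # Returns R (3x3) and t (3,) minimizing the sum of ||R @ src_k + t - dst_k||^2.
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)      # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against a reflection
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t
```

&lt;p data-ke-size=&quot;size16&quot;&gt;Here src would hold the landmark positions of the blendshape mesh evaluated with the previous frame's a, and dst the regressed 3D shape.&lt;/p&gt;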
&lt;p data-ke-size=&quot;size16&quot;&gt;Evaluation and Comparison&lt;br /&gt;The results are compared against 3D positions obtained from a Kinect at manually annotated 2D locations, using the Kinect depth and projection matrix.&lt;br /&gt;The error was below 1 cm.&lt;br /&gt;For the 2D regression results, the method is compared qualitatively with two others: Face alignment by explicit shape regression, and the optical-flow-based Face transfer with multilinear models.&lt;/p&gt;</description>
      <category>논문</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/195</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/3D-Shape-Regression-for-Real-time-Facial-Animation-TOG2013#entry195comment</comments>
      <pubDate>Sat, 27 Aug 2022 21:26:06 +0900</pubDate>
    </item>
    <item>
      <title>Facial Retargeting with Automatic Range of Motion Alignment</title>
      <link>https://thebeautifulfuture.tistory.com/entry/Facial-Retargeting-with-Automatic-Range-of-Motion-Alignment</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;TOG 2017&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;b&gt;INTRODUCTION&lt;/b&gt;&lt;br /&gt;Facial animation retargeting addresses the general problem of animation transfer between virtual characters, with the transfer of performance capture to virtual characters being the main application.&lt;br /&gt;Recent developments in vision- and depth-sensor-based facial motion capture&lt;br /&gt;--- Cao et al. 2014;&lt;br /&gt;Displaced Dynamic Expression Regression for Real-time Facial Tracking and Animation, TOG 2014.&lt;br /&gt;--- Ichim et al. 2015;&lt;br /&gt;Dynamic 3D Avatar Creation from Hand-held Video Input, TOG 2015.&lt;br /&gt;--- Li et al. 2013;&lt;br /&gt;Realtime Facial Animation with On-the-fly Correctives, TOG 2013.&lt;br /&gt;--- Thies et al. 2016;&lt;br /&gt;Face2Face: Real-Time Face Capture and Reenactment of RGB Videos, CVPR 2016.&lt;br /&gt;--- Weise et al. 2011;&lt;br /&gt;Realtime Performance-based Facial Animation, TOG 2011.&lt;br /&gt;have made accurate capture of an actor, traditionally limited to big film or game studios, affordable to a much broader audience.&lt;br /&gt;Current real-time capture systems typically adapt a realistic generic blendshape model to the actor.&lt;br /&gt;&lt;b&gt;Since the modified and the original character have semantically equivalent blendshapes, the captured actor performance is then transferred between the characters by directly mapping the blendshape weights&lt;/b&gt;.&lt;br /&gt;The special case of equivalent blendshapes between two characters is often called &lt;u&gt;parallel parameterization&lt;/u&gt; in the retargeting context.&lt;br /&gt;In practice, it is uncommon to encounter facial rigs with a complete set of semantically equivalent blendshapes.&lt;br /&gt;Creating facial rigs for animation is time consuming and requires highly skilled artists.&lt;br /&gt;Therefore, a rig is carefully designed to fit the animation needs, only modeling the necessary expressions. 
&lt;br /&gt;In addition, expressive digital characters are often stylized and exaggerate the facial proportions of humans.&lt;br /&gt;An effective retargeting method must either transfer animation from &lt;u&gt;facial motion capture markers to a blendshape rig&lt;/u&gt; or between &lt;u&gt;faces with different blendshape sets&lt;/u&gt;.&lt;br /&gt;Several retargeting approaches generate their own parallel parameterization by transferring the blendshapes of the character face rig to align with the actor's proportions.&lt;br /&gt;However, especially for stylized characters, this step often fails due to differences in &lt;u&gt;range of motion&lt;/u&gt; or the shortcomings of current methods.&lt;br /&gt;The subsequent blendshape estimation becomes erroneous, which has been addressed so far by incorporating additional priors.&lt;br /&gt;The paper proposes a method that generates actor-specific blendshapes, semantically corresponding to the character's face rig, from a training sequence of the actor's facial motion.&lt;br /&gt;In an unsupervised manner, it shows that the training sequence can sufficiently represent the actor's range of motion.&lt;br /&gt;It shows that parallel parameterization is possible even when the face rig and the actor have very different facial proportions.&lt;br /&gt;The key observation is that facial motion is similar in terms of FACS even across different levels of stylization.&lt;br /&gt;FACS describes facial expressions in terms of the underlying facial muscles.&lt;br /&gt;This system is commonly used as a reference when creating blendshapes for stylized or realistic characters.&lt;br /&gt;Based on a new manifold alignment approach and a new energy-based similarity measure, the range of motion is successfully aligned between the actor and the character rig, which in turn leads to accurate retargeting.&lt;br /&gt;The second contribution is a prior energy based on physically inspired deformations, which can be computed efficiently even in a real-time setting.&lt;br /&gt;This geometric prior resolves several artifacts that remain even with an accurate parallel parameterization.&lt;br /&gt;The results are on par with or better than the current state-of-the-art offline method&lt;br /&gt;--- Seol et al. 2012&lt;br /&gt;Spacetime Expression Cloning for Blendshapes. 
TOG 2012.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;RELATED WORK&lt;/b&gt;&lt;br /&gt;As a key element of human-centered applications, research on virtual faces and face animation has been an active field for decades, resulting in a wide range of publications on this topic. For a general overview we recommend the following:&lt;br /&gt;--- Parke and Waters 2008, book.&lt;br /&gt;Computer Facial Animation. AK Peters Ltd.&lt;br /&gt;--- Orvalho et al. 2012, survey focusing on rigging.&lt;br /&gt;A Facial Rigging Survey. In Eurographics State of the Art Reports.&lt;br /&gt;--- Lewis et al. 2014, blendshape animation.&lt;br /&gt;Practice and Theory of Blendshape Facial Models. In Eurographics State of the Art Reports.&lt;br /&gt;&lt;br /&gt;Cross-Mapping&lt;br /&gt;Directly learns the mapping between semantically corresponding captured facial expressions and the target rig.&lt;br /&gt;New poses are then synthesized in an example-based manner.&lt;br /&gt;--- Buck et al. 2000, piece-wise linear mapping.&lt;br /&gt;--- Wang et al. 2004, locally linear embedding.&lt;br /&gt;--- Deng et al. 2006, RBFs.&lt;br /&gt;--- Song et al. 2011, kCCA.&lt;br /&gt;--- Kholgade et al. 2011, simplicial basis.&lt;br /&gt;--- Bouaziz and Pauly 2014, Gaussian Process Latent Variable Models.&lt;br /&gt;The advantage of cross-mapping is that it applies to any character, even one with a different number of eyes.&lt;br /&gt;Its drawback is that performance depends on the quality and number of the training examples provided.&lt;br /&gt;Often 15-20 corresponding examples are needed for adequate results.&lt;br /&gt;For 40 blendshapes, 600-800 parameters must be defined manually.&lt;br /&gt;Even when the training examples are consistent, the resulting expression remains a complex interpolation of the training data (?).&lt;br /&gt;This often gives inaccurate results that differ greatly from the training examples.&lt;br /&gt;&lt;br /&gt;Parallel Parameterization&lt;br /&gt;By building semantically equivalent facial rigs, animation can be transferred to another character in a simple way.&lt;br /&gt;This is labor-intensive work. It requires excellent modeling skills and anatomical knowledge of the face. 
It is also time consuming.&lt;br /&gt;&lt;br /&gt;To automate this process, several approaches proposed transferring blendshapes from a generic face model to a neutral target face.&lt;br /&gt;--- Noh and Neumann 2001, dense correspondences, transfer per-vertex displacements for each expression.&lt;br /&gt;--- Sumner and Popović 2004, deformation gradients.&lt;br /&gt;--- Orvalho et al. 2008; Seol et al. 2012, 2011, Radial Basis Function.&lt;br /&gt;--- Li et al. 2010, ranging from incorporating examples.&lt;br /&gt;--- Saito 2013, contact constraints.&lt;br /&gt;--- Xu et al. 2014, interactive editing.&lt;br /&gt;--- Bouaziz et al. 2013; Ichim et al. 2015; Seol et al. 2016, iterative refinement schemes for real humans.&lt;br /&gt;These fail when the source and target geometry differ a lot: the transferred expressions are either exaggerated or too weak.&lt;br /&gt;--- Seol et al. [2012] tried to address the proportional mismatch problem using velocities.&lt;br /&gt;This paper improves performance by automatically adapting to the actor's range of motion.&lt;br /&gt;Given a motion sequence of the actor and sparse correspondences, the method automatically transfers the face rig's blendshapes into the actor's space.&lt;br /&gt;&lt;br /&gt;Manifold-based Techniques.&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&amp;nbsp;&lt;/p&gt;</description>
      <category>논문</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/194</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/Facial-Retargeting-with-Automatic-Range-of-Motion-Alignment#entry194comment</comments>
      <pubDate>Sat, 20 Aug 2022 00:09:21 +0900</pubDate>
    </item>
    <item>
      <title>Realtime Performance-Based Facial Animation</title>
      <link>https://thebeautifulfuture.tistory.com/entry/Realtime-Performance-Based-Facial-Animation</link>
      <description>&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;ACM transactions on graphics (TOG) 2011&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Abstract&lt;br /&gt;A technique that lets a user drive performance-based character animation on a character in real time using a Kinect.&lt;br /&gt;Kinect data is noisy. To efficiently turn low-resolution images and noisy 3D data into realistic expressions,&lt;br /&gt;geometry and texture registration is combined with pre-recorded animation priors in a single optimization.&lt;br /&gt;This is formulated as a maximum a posteriori estimation in a reduced parameter space.&lt;br /&gt;The paper shows that compelling 3D facial dynamics can be reconstructed without markers, intrusive lighting, or scanning hardware.&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Overview&lt;br /&gt;Track the rigid and non-rigid motion of the user's face.&lt;br /&gt;Map the extracted tracking parameters to suitable animation controls.&lt;br /&gt;Solve for the parameters of the user-specific expression model given the observed 2D and 3D data.&lt;br /&gt;Use a suitable probabilistic prior, built from prerecorded animation sequences, that defines the space of realistic facial expressions.&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Blendshape Representation&lt;br /&gt;The underlying hypothesis is that a person's blendshape weights provide enough abstraction to be transferred between different characters.&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Acquisition Hardware&lt;br /&gt;A Kinect is used; the low resolution and high noise level of the input data are the primary challenge.&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Realtime Tracking&lt;br /&gt;Rigid motion and non-rigid motion are handled separately.&lt;/i&gt;&lt;/p&gt;
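&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Rigid Tracking&lt;br /&gt;The mesh from the previous frame is aligned to the depth map using ICP with point-plane constraints.&lt;br /&gt;To stabilize the alignment, a pre-segmented template is used that keeps only the region above the cheeks and excludes the strongly deforming parts of the face.&lt;br /&gt;A filter is applied to the translation and rotation to suppress high-frequency flickering.&lt;/i&gt;&lt;/p&gt;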
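&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Non-rigid Tracking&lt;br /&gt;The goal is to produce a blendshape configuration that matches the user's expression as closely as possible while staying within the space of realistic human expressions.&lt;br /&gt;Blendshape parameters by themselves do not distinguish realistic expressions and can easily produce meaningless shapes.&lt;br /&gt;Fitting with geometric and texture constraints alone rarely gives satisfactory results because of the noise.&lt;/i&gt;&lt;/p&gt;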
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Statistical Model&lt;br /&gt;The blendshape weights are regularized to prevent unrealistic face poses.&lt;br /&gt;The dynamic expression prior is computed from a set of existing blendshape animations $$ \textbf{A}=\{A_1, ..., A_l\} $$&lt;br /&gt;$$ A_j = \{a^1_j, ..., a^k_j\}, a^i_j \in \mathbb{R}^m $$&lt;br /&gt;in the m-dimensional blendshape space. Temporal coherence is exploited by considering consecutive frames within a window of size n.&lt;br /&gt;This is effective for both the geometric structure and the motion of the face.&lt;/i&gt;&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;MAP Estimation&lt;br /&gt;$$ \text{input data: } D_i = (G_i, I_i), \quad \text{depth map: } G_i, \quad \text{color image: } I_i $$&lt;br /&gt;$$ \text{blendshape weights: } \textbf{x}_i \in \mathbb{R}^m $$&lt;br /&gt;$$ \text{previous blendshape weights: } X_n^i = \{ \textbf{x}_{i-1}, ..., \textbf{x}_{i-n} \} $$&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-15 00-10-06.png&quot; data-origin-width=&quot;202&quot; data-origin-height=&quot;38&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/CGfKO/btrJItcPtb3/P5qNrXtHXmYiC5sYb7w3L1/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/CGfKO/btrJItcPtb3/P5qNrXtHXmYiC5sYb7w3L1/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/CGfKO/btrJItcPtb3/P5qNrXtHXmYiC5sYb7w3L1/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FCGfKO%2FbtrJItcPtb3%2FP5qNrXtHXmYiC5sYb7w3L1%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;202&quot; height=&quot;38&quot; data-filename=&quot;Screenshot from 2022-08-15 00-10-06.png&quot; data-origin-width=&quot;202&quot; data-origin-height=&quot;38&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Using Bayes' rule&lt;br /&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-15 00-11-53.png&quot; data-origin-width=&quot;262&quot; data-origin-height=&quot;32&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/oYp8a/btrJE18xE6q/nUNXu9AvamMxT2gkfuTpvk/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/oYp8a/btrJE18xE6q/nUNXu9AvamMxT2gkfuTpvk/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/oYp8a/btrJE18xE6q/nUNXu9AvamMxT2gkfuTpvk/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2FoYp8a%2FbtrJE18xE6q%2FnUNXu9AvamMxT2gkfuTpvk%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;262&quot; height=&quot;32&quot; data-filename=&quot;Screenshot from 2022-08-15 00-11-53.png&quot; data-origin-width=&quot;262&quot; data-origin-height=&quot;32&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;Assuming that D is conditionally independent of Xn given x&lt;br /&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;figure class=&quot;imageblock alignCenter&quot; data-ke-mobileStyle=&quot;widthOrigin&quot; data-filename=&quot;Screenshot from 2022-08-15 00-12-37.png&quot; data-origin-width=&quot;239&quot; data-origin-height=&quot;57&quot;&gt;&lt;span data-url=&quot;https://blog.kakaocdn.net/dn/buqb84/btrJCoqyPoU/suw1HPaQCZcZ17CUKucuC0/img.png&quot; data-phocus=&quot;https://blog.kakaocdn.net/dn/buqb84/btrJCoqyPoU/suw1HPaQCZcZ17CUKucuC0/img.png&quot;&gt;&lt;img src=&quot;https://blog.kakaocdn.net/dn/buqb84/btrJCoqyPoU/suw1HPaQCZcZ17CUKucuC0/img.png&quot; srcset=&quot;https://img1.daumcdn.net/thumb/R1280x0/?scode=mtistory2&amp;fname=https%3A%2F%2Fblog.kakaocdn.net%2Fdn%2Fbuqb84%2FbtrJCoqyPoU%2Fsuw1HPaQCZcZ17CUKucuC0%2Fimg.png&quot; onerror=&quot;this.onerror=null; this.src='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png'; this.srcset='//t1.daumcdn.net/tistory_admin/static/images/no-image-v1.png';&quot; loading=&quot;lazy&quot; width=&quot;239&quot; height=&quot;57&quot; data-filename=&quot;Screenshot from 2022-08-15 00-12-37.png&quot; data-origin-width=&quot;239&quot; data-origin-height=&quot;57&quot;/&gt;&lt;/span&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p data-ke-size=&quot;size16&quot;&gt;&lt;i&gt;&amp;nbsp;&lt;/i&gt;&lt;/p&gt;</description>
      <category>논문</category>
      <author>Small Octopus</author>
      <guid isPermaLink="true">https://thebeautifulfuture.tistory.com/193</guid>
      <comments>https://thebeautifulfuture.tistory.com/entry/Realtime-Performance-Based-Facial-Animation#entry193comment</comments>
      <pubDate>Sun, 14 Aug 2022 10:39:44 +0900</pubDate>
    </item>
  </channel>
</rss>