Look at these images. Now, tell me which Obama here is real.
看看这些图像。 现在，告诉我哪个是真的奥巴马。
(Video) Barack Obama: To help families refinance their homes, to invest in things like high-tech manufacturing, clean energy and the infrastructure that creates good new jobs.
巴拉克·奥巴马：帮助家庭对他们的房屋重做贷款， 投资高科技制造业， 清洁能源 和带来良好就业机会的基础设施。
Supasorn Suwajanakorn: Anyone? The answer is none of them.
有人知道吗？ 答案是：都不是。
(Laughter)
（笑声）
None of these is actually real. So let me tell you how we got here. My inspiration for this work was a project meant to preserve our last chance for learning about the Holocaust from the survivors. It's called New Dimensions in Testimony, and it allows you to have interactive conversations with a hologram of a real Holocaust survivor.
这些都不是真的。 那让我来告诉你们是怎么回事。 我这个工作的灵感来自于 一个项目， 它旨在保存我们向幸存者 了解大屠杀的最后机会。 这个项目叫做证词新维度 (New Dimensions in Testimony)， 它可以让你与真实大屠杀幸存者的全息图 进行互动对话。
(Video) Man: How did you survive the Holocaust?
你是怎么在大屠杀中幸存下来的？
(Video) Hologram: How did I survive? I survived, I believe, because providence watched over me.
我怎么幸存下来？ 我幸存下来， 我相信， 是因为上帝眷顾我。
SS: Turns out these answers were prerecorded in a studio. Yet the effect is astounding. You feel so connected to his story and to him as a person. I think there's something special about human interaction that makes it much more profound and personal than what books or lectures or movies could ever teach us.
原来这些答案是预先在工作室录制的。 但效果令人吃惊。 你会对他的故事， 他这个人感同身受。 我想人类互动的特别之处 让它比图书，演讲或电影 告诉我们的 要更加深刻和真实。
So I saw this and began to wonder, can we create a model like this for anyone? A model that looks, talks and acts just like them? So I set out to see if this could be done and eventually came up with a new solution that can build a model of a person using nothing but these: existing photos and videos of a person. If you can leverage this kind of passive information, just photos and video that are out there, that's the key to scaling to anyone.
所以我就开始想， 我们能不能为每个人做个模型？ 这个模型的样子， 谈话和举止就跟真人无异。 于是我开始探索这个能不能搞定， 并最终找到了一个新的解决方案， 只需使用下面这些东西就能构建人的模型： 个人现存的照片和视频。 如果你能利用这种被动信息， 只需公开的照片和视频， 这是扩展到其他人的关键。
By the way, here's Richard Feynman, who in addition to being a Nobel Prize winner in physics was also known as a legendary teacher. Wouldn't it be great if we could bring him back to give his lectures and inspire millions of kids, perhaps not just in English but in any language? Or if you could ask our grandparents for advice and hear those comforting words even if they're no longer with us? Or maybe using this tool, book authors, alive or not, could read aloud all of their books for anyone interested.
顺便说一句，这是理查德·费曼， 他除了是诺贝尔物理学奖得主 也是位传奇教师。 这岂不是很棒？ 如果能够把他带回来 讲课并激励成千上万的小孩， 用英语或者其他任何语言？ 或者你也可以征求祖父母的意见， 听听那些让人宽慰的言语， 即便他们已经离开我们了。 或者使用这个工具，图书的作者， 不管是活着的还是去世的， 可以为任何有兴趣的人朗读他们的书本。
The creative possibilities here are endless, and to me, that's very exciting. And here's how it's working so far.
这里的创意可能是无限的， 对我而言，这非常让人兴奋。 这是目前它的工作原理。
First, we introduce a new technique that can reconstruct a high-detailed 3D face model from any image without ever 3D-scanning the person. And here's the same output model from different views. This also works on videos, by running the same algorithm on each video frame and generating a moving 3D model. And here's the same output model from different angles.
首先我们引入一种新的技术 可以从任何图像中 重建一个高细节的3D人脸模型， 而且无需对真人进行3D扫描。 这是不同视角下的同一输出模型。 这也可以应用于视频， 通过对每一帧视频 使用同样的算法 产生移动的3D模型。 这是不同角度下的同一输出模型。
It turns out this problem is very challenging, but the key trick is that we are going to analyze a large photo collection of the person beforehand. For George W. Bush, we can just search on Google, and from that, we are able to build an average model, an iterative, refined model to recover the expression in fine details, like creases and wrinkles. What's fascinating about this is that the photo collection can come from your typical photos. It doesn't really matter what expression you're making or where you took those photos. What matters is that there are a lot of them. And we are still missing color here, so next, we develop a new blending technique that improves upon a simple averaging method and produces sharp facial textures and colors. And this can be done for any expression.
这个问题非常有挑战性， 但关键技巧在于我们需要提前 分析一个人的大量照片集。 对乔治·沃克·布什， 我们只需要搜索谷歌， 这样，我们就能建立一个平均模型， 一个迭代，精炼的模型来恢复表情的细节， 比如折痕和皱纹。 迷人的是 照片集可以来自你的普通照片。 你做何表情或者你在哪里拍照 并不那么关键。 关键的是数量要足够多。 这里我们仍然缺少颜色， 所以下一步， 我们开发了一种新的混合技术 改进了简单的平均方法， 并产生清晰的面部纹理和颜色。 这可以用于做任何表情。
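The contrast between simple averaging and a sharper blend can be illustrated with a toy sketch. The talk does not describe the actual blending technique, so everything here is an assumption made for illustration: photos are flat lists of already-aligned pixel intensities, each photo carries a hypothetical scalar expression score, and the "blend" is just a weighted mean that favors photos whose score is close to the target expression.

```python
import math

def simple_average(photos):
    """Per-pixel mean over aligned photos (each a flat list of intensities)."""
    n = len(photos)
    return [sum(p[i] for p in photos) / n for i in range(len(photos[0]))]

def weighted_blend(photos, expressions, target, sharpness=10.0):
    """Per-pixel weighted mean (an assumed stand-in, not the talk's method).

    Photos whose hypothetical expression score is close to `target`
    receive exponentially larger weights.
    """
    weights = [math.exp(-sharpness * abs(e - target)) for e in expressions]
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, photos)) / total
        for i in range(len(photos[0]))
    ]

# Three aligned 4-pixel "photos": two neutral faces, one with a dark crease at pixel 2.
photos = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.1, 0.5],
]
expressions = [0.0, 0.0, 1.0]  # hypothetical expression score per photo

avg = simple_average(photos)                             # crease is washed out
blend = weighted_blend(photos, expressions, target=1.0)  # crease is preserved
```

In this toy case the plain mean pulls the crease pixel toward the neutral value, while the weighted version keeps it close to the value seen in the matching photo, which is the kind of detail loss that averaging alone would cause.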
Now we have a control of a model of a person, and the way it's controlled now is by a sequence of static photos. Notice how the wrinkles come and go, depending on the expression. We can also use a video to drive the model.
现在我们可以 对一个人的模型进行控制， 它现在被控制的方式是 一系列静态的照片。 注意皱纹是如何产生和消失的， 这取决于你的表情。 我们也可以使用视频来驱动模型。
(Video) Daniel Craig: Right, but somehow, we've managed to attract some more amazing people.
丹尼尔·克雷格：没错，但不管怎样， 我们能够吸引到更多优秀的人才。
SS: And here's another fun demo. So what you see here are controllable models of people I built from their internet photos. Now, if you transfer the motion from the input video, we can actually drive the entire party.
这是另一个有趣的演示。 所以你们看到的是 我使用人们的互联网图像 建立的个人控制模型。 现在，如果你从视频中传递表情动作， 我们可以让整个派对动起来。
(Video) George W. Bush: It's a difficult bill to pass, because there's a lot of moving parts, and the legislative processes can be ugly.
布什：这是个难以通过的法案， 因为有太多可供商榷的部分， 立法过程可能让人崩溃。
(Applause)
（掌声）
SS: So coming back a little bit, our ultimate goal, rather, is to capture their mannerisms or the unique way each of these people talks and smiles. So to do that, can we actually teach the computer to imitate the way someone talks by only showing it video footage of the person? And what I did exactly was, I let a computer watch 14 hours of pure Barack Obama giving addresses. And here's what we can produce given only his audio.
那么回到正题， 我们的最终目标， 不如说，是捕捉他们的言谈举止， 或者每一个人交谈或微笑的独特之处。 所以这样， 我们能不能只向电脑展示这个人的录像 就能教会电脑 去模仿人们谈话的方式？ 而我做的事情是，我让电脑 看了14个小时的奥巴马演讲。 这是我们只通过他的音频生产出来的内容。
(Video) BO: The results are clear. America's businesses have created 14.5 million new jobs over 75 straight months.
结果非常明显。 在过去75个月中，美国企业已经创造了 1450万新的工作机会。
SS: So what's being synthesized here is only the mouth region, and here's how we do it. Our pipeline uses a neural network to convert an input audio into these mouth points.
所以这里合成的只是嘴巴部分， 这是我们做的方法。 我们的处理系统使用神经网络 将输入音频转换为这些嘴部的点。
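As a rough picture of the audio-to-mouth-points step, here is a minimal sketch of the data flow only. The talk says a neural network is used but gives no architecture, so the recurrent design, the 13 audio features per frame, the 18 mouth points, and the class name `TinyRecurrentNet` are all assumptions, and the weights are random rather than trained.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

class TinyRecurrentNet:
    """Hypothetical network: per-frame audio features -> 2D mouth points."""

    def __init__(self, n_features, n_hidden, n_points, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_features, rng)
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)
        self.w_out = make_matrix(2 * n_points, n_hidden, rng)
        self.n_hidden = n_hidden

    def predict(self, audio_frames):
        hidden = [0.0] * self.n_hidden  # carries context from earlier frames
        outputs = []
        for frame in audio_frames:
            pre = [a + b for a, b in zip(matvec(self.w_in, frame),
                                         matvec(self.w_rec, hidden))]
            hidden = [math.tanh(x) for x in pre]
            flat = matvec(self.w_out, hidden)
            # Group the flat output into (x, y) pairs, one per mouth point.
            outputs.append([(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)])
        return outputs

# Five frames of 13 made-up audio features each.
rng = random.Random(1)
audio = [[rng.random() for _ in range(13)] for _ in range(5)]
net = TinyRecurrentNet(n_features=13, n_hidden=8, n_points=18)
mouth_points = net.predict(audio)  # one list of 18 (x, y) points per frame
```

A recurrent layer is used here because mouth shape at a given moment plausibly depends on the sounds just before it; a real system would learn the weights from many hours of footage, as described in the talk.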
(Video) BO: We get it through our job or through Medicare or Medicaid.
我们通过我们的工作或者医疗保险 或补助来实现这一目标。
SS: Then we synthesize the texture, enhance details and teeth, and blend it into the head and background from a source video.
然后我们合成纹理， 增强细节和牙齿， 并将其与源视频中的 头部和背景混合在一起。
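The final blending step can be pictured as ordinary alpha compositing. This is a minimal sketch under stated assumptions, not the method used in the talk: frames are small grayscale grids of floats, and the soft mask (1.0 inside the mouth region, 0.0 outside, fractional at the edge) is supplied by hand.

```python
def composite(target, synthesized, mask):
    """Blend a synthesized region into a target frame.

    Each output pixel is mask * synthesized + (1 - mask) * target,
    so a fractional mask value gives a soft seam between the two.
    """
    return [
        [m * s + (1.0 - m) * t for t, s, m in zip(t_row, s_row, m_row)]
        for t_row, s_row, m_row in zip(target, synthesized, mask)
    ]

# 2x3 toy frames: the source video frame and the synthesized mouth texture.
target = [[0.2, 0.2, 0.2],
          [0.2, 0.2, 0.2]]
synthesized = [[0.8, 0.8, 0.8],
               [0.8, 0.8, 0.8]]
# Hypothetical soft mask: fully synthesized at the right of row 0, feathered in the middle.
mask = [[0.0, 0.5, 1.0],
        [0.0, 0.0, 0.0]]

frame = composite(target, synthesized, mask)
```

Pixels where the mask is zero are copied from the source frame unchanged, which matches the idea that only the mouth region is synthesized while the head and background come from existing video.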
(Video) BO: Women can get free checkups, and you can't get charged more just for being a woman. Young people can stay on a parent's plan until they turn 26.
女性可以获得免费的检查， 你不会因为是女性而需要支付更高的费用。 年轻人可以在父母计划中呆到26岁。
SS: I think these results seem very realistic and intriguing, but at the same time frightening, even to me. Our goal was to build an accurate model of a person, not to misrepresent them. But one thing that concerns me is its potential for misuse. People have been thinking about this problem for a long time, since the days when Photoshop first hit the market. As a researcher, I'm also working on countermeasure technology, and I'm part of an ongoing effort at AI Foundation, which uses a combination of machine learning and human moderators to detect fake images and videos, fighting against my own work. And one of the tools we plan to release is called Reality Defender, which is a web-browser plug-in that can flag potentially fake content automatically, right in the browser.
我觉得这些结果看起来非常真实和有趣， 但同时也令人恐惧，即便对我而言。 我们的目标是构建人的精准模型， 而非歪曲他们。 但让我担忧的是它被错误使用的可能。 人们思考这个问题很长时间了， 从Photoshop进入市场那天就开始了。 作为一名研究人员， 我也在研究对抗技术， 我是人工智能基金会 (AI Foundation) 持续努力的一份子， 它结合了机器学习和人工审核员 来识别假图像和视频， 与我自己的工作做斗争。 我们打算发布的一个工具叫做真相卫士 (Reality Defender)， 是个浏览器插件 可以用来自动标记潜在假内容， 在浏览器中就可以使用。
(Applause)
（掌声）
Despite all this, though, fake videos could do a lot of damage, even before anyone has a chance to verify, so it's very important that we make everyone aware of what's currently possible so we can have the right assumption and be critical about what we see.
尽管如此， 假视频仍可能造成很大危害， 甚至在人们有机会验证它之前， 所以让每个人都了解目前技术能做到什么 非常重要， 这样我们才能有正确的预设， 并对看到的内容保持审慎。
There's still a long way to go before we can fully model individual people and before we can ensure the safety of this technology. But I'm excited and hopeful, because if we use it right and carefully, this tool can allow any individual's positive impact on the world to be massively scaled and really help shape our future the way we want it to be.
在对个人进行完整建模 以及确保这项技术的安全性方面， 仍有很长的路要走。 但我兴奋且充满希望， 因为如果我们正确且谨慎地使用它， 这个工具可以让 每个人对世界积极的影响 得到大规模的扩展 并真正帮助塑造我们想要的未来。
Thank you.
谢谢。
(Applause)
（掌声）