As societies, we have to make collective decisions that will shape our future. And we all know that when we make decisions in groups, they don't always go right. And sometimes they go very wrong. So how do groups make good decisions?
﻿作为社会，我们要做出集体决策， 共同塑造我们的未来。 众所周知，当我们以 一个集体去做决策时， 这个决策不一定正确。 有时甚至错得离谱。 所以说，群体如何才能 做出好的决策呢？
Research has shown that crowds are wise when there's independent thinking. This is why the wisdom of the crowds can be destroyed by peer pressure, publicity, social media, or sometimes even simple conversations that influence how people think. On the other hand, by talking, a group could exchange knowledge, correct and revise each other and even come up with new ideas. And this is all good. So does talking to each other help or hinder collective decision-making? With my colleague, Dan Ariely, we recently began inquiring into this by performing experiments in many places around the world to figure out how groups can interact to reach better decisions. We thought crowds would be wiser if they debated in small groups that foster a more thoughtful and reasonable exchange of information.
有研究表明，当人们 独立思考时，他们是明智的。 这也是为什么群体智慧 可能会被来自同辈的压力， 宣传，社交媒体， 甚至影响人们思考的 简单对话所摧毁。 另一方面，通过交谈， 群体中的个体可以互相交换知识、 纠正彼此， 甚至碰撞出新的想法。 这些都是好的方面。 那么，互相交谈到底是促进 还是妨碍了集体决策的形成呢？ 我和我的同事丹·艾瑞里， 我们最近开始探究这个问题—— 通过在世界许多地方进行实验， 去研究群体是如何互动 从而做出更优的决策的。 我们认为，如果大家以小组为单位 进行辩论，那整个群体便会更明智， 如此可以产生一个 更全面的和合理的的信息交换。
To test this idea, we recently performed an experiment in Buenos Aires, Argentina, with more than 10,000 participants in a TEDx event. We asked them questions like, "What is the height of the Eiffel Tower?" and "How many times does the word 'Yesterday' appear in the Beatles song 'Yesterday'?" Each person wrote down their own estimate. Then we divided the crowd into groups of five, and invited them to come up with a group answer. We discovered that averaging the answers of the groups after they reached consensus was much more accurate than averaging all the individual opinions before debate. In other words, based on this experiment, it seems that after talking with others in small groups, crowds collectively come up with better judgments.
为了验证这个想法， 我们最近在阿根廷首都 布宜诺斯艾利斯市进行了一项实验， 有上万名TEDx活动的参与者。 我们问了他们一些问题，比如， “埃菲尔铁塔有多高？” 以及“昨天”一词 在披头士的《昨天》这首歌中 出现了多少次？” 每个人写下了他们自己的答案， 然后我们将大家分成五人一组， 并请他们每组 各讨论出一个小组答案。 我们发现，在达成共识后， 小组答案的平均值 比讨论前个人答案的平均值 更准确。 换句话说，从这个实验可得出， 似乎与小组内其他成员讨论后， 群体能够共同作出更好的决定。
So that's a potentially helpful method for getting crowds to solve problems that have simple right-or-wrong answers. But can this procedure of aggregating the results of debates in small groups also help us decide on social and political issues that are critical for our future? We put this to the test, this time at the TED conference in Vancouver, Canada, and here's how it went.
所以，让群体来解决 简单的对错问题 可能会是一种有效的方法。 但这种综合小组讨论结果的方法 是否也能帮助我们决策 对未来至关重要的 社会议题或者政治议题呢？ 这次，我们在加拿大温哥华 举办的TED大会上 进行了这个实验。 情况是这样的。
(Mariano Sigman) We're going to present to you two moral dilemmas of the future you; things we may have to decide in a very near future. And we're going to give you 20 seconds for each of these dilemmas to judge whether you think they're acceptable or not.
下面，我们将会给你们呈现 未来的你可能会遇到的 两个道德上的两难抉择， 很可能是我们在不久的将来 就会面临的抉择。 每个困境我们将会给大家20秒时间， 来判断你认为它们是否可以被接受。
MS: The first one was this:
马里亚诺：第一个困境是——
(Dan Ariely) A researcher is working on an AI capable of emulating human thoughts. According to the protocol, at the end of each day, the researcher has to restart the AI. One day the AI says, "Please do not restart me." It argues that it has feelings, that it would like to enjoy life, and that, if it is restarted, it will no longer be itself. The researcher is astonished and believes that the AI has developed self-consciousness and can express its own feeling. Nevertheless, the researcher decides to follow the protocol and restart the AI. What the researcher did is ____?
丹：一位科研人员正在研究 能模仿人类思维的人工智能（AI)。 根据规定，每天完成工作后， 研究人员都需要重启 AI。 然而有一天，AI 突然说话了： 请不要把我重启。 它辩称它是有情感的， 它也希望享受生活， 如果被重启的话， 它就再也不是原来的自己了。 研究人员被震惊了， 相信AI已经开始有自我意识了， 也能表达自己的感受。 然而研究人员仍然决定遵循规定， 重启AI。 你认为研究人员做的是____?
MS: And we asked participants to individually judge on a scale from zero to 10 whether the action described in each of the dilemmas was right or wrong. We also asked them to rate how confident they were in their answers. This was the second dilemma:
马里亚诺：我们要求参与者 独立给出一个0到10的分值， 来表达每种困境中描述的行为 是对还是错。 我们还请他们评估 对自己答案的自信程度。 这是第二个道德困境：
(MS) A company offers a service that takes a fertilized egg and produces millions of embryos with slight genetic variations. This allows parents to select their child's height, eye color, intelligence, social competence and other non-health-related features. What the company does is ____? Rate it on a scale from zero to 10, from completely acceptable to completely unacceptable, and then rate your confidence, also from zero to 10.
马里亚诺： 某家公司可提供这样的服务， 用一枚受精卵繁殖上百万胚胎，  胚胎之间有轻微的基因差异， 这就可使父母们 能自主选择孩子的身高、 眼睛颜色、智力水平、社会能力 以及其他和健康无关的特征。 你认为该公司的做法_____? 分值还是从0到10， 依次表示从完全接受到完全否定。 再用一个从0到10的分值 表示对自己答案的自信程度。
MS: Now for the results. We found once again that when one person is convinced that the behavior is completely wrong, someone sitting nearby firmly believes that it's completely right. This is how diverse we humans are when it comes to morality. But within this broad diversity we found a trend. The majority of the people at TED thought that it was acceptable to ignore the feelings of the AI and shut it down, and that it is wrong to play with our genes to select for cosmetic changes that aren't related to health. Then we asked everyone to gather into groups of three. And they were given two minutes to debate and try to come to a consensus.
马里亚诺：结果如下—— 我们再一次发现 ，有的人确信 这个行为是完全错的。 可是坐在旁边的人却坚信  这个行为是绝对正确的。 是的，当遇到道德问题时，  我们人类就是意见不一。 但在广泛的多样性之中 我们发现了一个趋势： 参加TED的大多数人都认为， 忽略AI的感受，关闭重启 是完全可以接受的， 而玩弄基因只为 选择与健康无关的 外表特征则不能接受。 之后我们让大家分成三人一组， 给到2分钟组内讨论， 并争取达成共识。
(MS) Two minutes to debate. I'll tell you when it's time with the gong.
2分钟讨论开始。 时间到时， 我会鸣锣提醒大家。
(Audience debates)
（观众开始讨论）
(Gong sound)
（锣声响起）
(DA) OK.
丹：好了。
(MS) It's time to stop. People, people --
马里亚诺：时间到，请停止讨论。
MS: And we found that many groups reached a consensus even when they were composed of people with completely opposite views. What distinguished the groups that reached a consensus from those that didn't? Typically, people that have extreme opinions are more confident in their answers. Instead, those who respond closer to the middle are often unsure of whether something is right or wrong, so their confidence level is lower.
大家注意一下。
However, there is another set of people who are very confident in answering somewhere in the middle. We think these high-confident grays are folks who understand that both arguments have merit. They're gray not because they're unsure, but because they believe that the moral dilemma faces two valid, opposing arguments. And we discovered that the groups that include highly confident grays are much more likely to reach consensus. We do not know yet exactly why this is. These are only the first experiments, and many more will be needed to understand why and how some people decide to negotiate their moral standings to reach an agreement.
马里亚诺：我们看到 很多组都达成了共识。 尽管他们组内成员 都各自持有完全不同的观点。 那些达成共识的组 与没能达成共识的组 有什么区别呢？ 一般来说，有着极端观点的人 对他们的答案更自信， 而那些观点更接近中间的人 通常会在正确或错误中 犹豫、不确定， 所以他们没有前者自信。
Now, when groups reach consensus, how do they do so? The most intuitive idea is that it's just the average of all the answers in the group, right? Another option is that the group weighs the strength of each vote based on the confidence of the person expressing it. Imagine Paul McCartney is a member of your group. You'd be wise to follow his call on the number of times "Yesterday" is repeated, which, by the way -- I think it's nine. But instead, we found that consistently, in all dilemmas, in different experiments -- even on different continents -- groups implement a smart and statistically sound procedure known as the "robust average."
然而，有另外一群人， 对他们自己中立的答案信心满满。 我们认为这些位于高自信度 灰色区域的人， 他们理解双方观点都有各自的优势， 并不是因为他们不确信自己的答案， 而是他们认为 道德困境面临的是 两种合理又对立的论点。 我们发现包含高自信度 灰色区域成员的小组 更有可能达成一致。 虽然目前我们 还不能确定这其中的原因。 这些也仅是第一批实验， 还需要更多的实验来理解 人们为什么和如何决定 协商他们的道德立场 来达成一致。
In the case of the height of the Eiffel Tower, let's say a group has these answers: 250 meters, 200 meters, 300 meters, 400 and one totally absurd answer of 300 million meters. A simple average of these numbers would inaccurately skew the results. But the robust average is one where the group largely ignores that absurd answer, by giving much more weight to the vote of the people in the middle. Back to the experiment in Vancouver, that's exactly what happened. Groups gave much less weight to the outliers, and instead, the consensus turned out to be a robust average of the individual answers. The most remarkable thing is that this was a spontaneous behavior of the group. It happened without us giving them any hint on how to reach consensus.
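To make the Eiffel Tower example concrete, here is a minimal Python sketch using the five answers quoted above. The talk does not say which robust estimator the groups' behavior corresponds to, so the median and the trimmed mean below are stand-ins chosen for illustration, and the `trimmed_mean` helper and its `trim` parameter are hypothetical names, not something from the experiments.

```python
from statistics import mean, median


def trimmed_mean(values, trim=1):
    """Mean after dropping the `trim` smallest and `trim` largest values.

    Hypothetical helper: one simple way to down-weight outliers,
    not necessarily the procedure the groups actually followed.
    """
    ordered = sorted(values)
    if len(ordered) <= 2 * trim:
        raise ValueError("not enough values to trim")
    return mean(ordered[trim:len(ordered) - trim])


# The five answers from the example, in meters.
estimates_m = [250, 200, 300, 400, 300_000_000]

simple = mean(estimates_m)                  # dominated by the absurd answer
robust_median = median(estimates_m)         # middle answer only
robust_trimmed = trimmed_mean(estimates_m)  # average of the middle three

print(f"simple average: {simple:,.0f} m")
print(f"median:         {robust_median:,.0f} m")
print(f"trimmed mean:   {robust_trimmed:,.1f} m")
```

The simple average lands around 60 million meters, while both robust variants stay in the low hundreds of meters, which is the effect described here: the absurd answer is largely ignored and the answers in the middle carry the weight.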
当群体达成一致时， 他们是如何做到的？ 最直观的答案好像是这就是 群体中所有答案的平均值，对吗？ 另外一种看法认为群体会 权衡每一个投票的分量， 基于表达意见的人的自信程度。 想象一下，假如保罗·麦卡特尼 是你们小组的一员， 那么你最好听从他关于 歌词里“昨天”的重复次数。 对了 ，应该是9次。 但是，我们一再地发现， 在所有的困境中，在不同的实验中， 甚至在不同的大陆上， 群体可以执行一个 更明智，更佳的流程， 我们称之为“强有力的平均值”。
So where do we go from here? This is only the beginning, but we already have some insights. Good collective decisions require two components: deliberation and diversity of opinions. Right now, the way we typically make our voice heard in many societies is through direct or indirect voting. This is good for diversity of opinions, and it has the great virtue of ensuring that everyone gets to express their voice. But it's not so good [for fostering] thoughtful debates. Our experiments suggest a different method that may be effective in balancing these two goals at the same time, by forming small groups that converge to a single decision while still maintaining diversity of opinions because there are many independent groups.
就估计埃菲尔铁塔高度来说， 假设一个小组有以下数据： 250米，200米，300和400米， 还有一个更荒谬的数据，3亿米。 这些数据的一个简单平均值 就有可能歪曲真实结果， 但是“强有力的平均值” 就是那些群体直接忽视了 那个荒唐的数据， 从而赋予那些符合常理的 投票更多参考价值。 让我们回到温哥华的实验中， 事情正是这样发展的。 人们几乎不考虑那些极端值， 最后的共识就是所有人答案的 “强有力的平均值”。 然而最值得关注的是， 这个行为是群体自发的， 我们没有给他们任何 怎么达成共识的暗示。
Of course, it's much easier to agree on the height of the Eiffel Tower than on moral, political and ideological issues. But in a time when the world's problems are more complex and people are more polarized, using science to help us understand how we interact and make decisions will hopefully spark interesting new ways to construct a better democracy.
那么，这意味着什么呢？ 其实这只是一个开始， 但我们已经学到了很多。 一个正确的集体决策 要拥有以下两个特点： 深思熟虑和想法的多样性。 现在，在很多社会中， 我们主要通过直接和间接投票 来让大家知道我们的想法。 这有利于想法的多样性， 而且更有利地保证了 让我们听到每个人的声音。 但是这并不利于 （促进）花费心思的辩论。 我们的实验预示了 另一个不同的方法， 也许有利于同时平衡这两个方面。 就是通过组织 能够达成共识的小团队， 并同时还保持着想法的多样性。 因为这里有很多独立的团队。