The Obstacle Tower Challenge is live!

Three weeks ago we announced the release of the Obstacle Tower Environment, a new benchmark for Artificial Intelligence (AI) research built using our ML-Agents toolkit. One week ago we followed that up with the launch of the Obstacle Tower Challenge, a contest that offers researchers and developers the chance to compete to train the best-performing agents on this new task. The reception so far from the community has been great. I wanted to take the time to talk a little more about our motivation for the challenge, and what we hope it will promote.

The idea for the Obstacle Tower came from looking at the benchmarks used in Artificial Intelligence research today. Despite the great theoretical and engineering work going into new algorithms, many researchers were still focused on decades-old home console games such as Pong, Breakout, or Ms. Pac-Man. Aside from having crude graphics and gameplay mechanics, these games are also completely deterministic, meaning that a player (or computer) could memorize a series of button presses and solve them even blindfolded. Given these drawbacks, we wanted to start from scratch and build a procedurally generated environment that we believe can serve as a benchmark that pushes modern AI algorithms to their limits. Specifically, we wanted to focus on AI agents' vision, control, planning, and generalization abilities.
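To make the memorization point concrete, here is a minimal toy sketch in Python. It is not the Obstacle Tower itself: the action set, `generate_level`, and `run_blind` are hypothetical names made up for illustration. It assumes a "level" is just a list of required actions and that the agent replays a memorized sequence without ever looking at the level.

```python
import random

# Hypothetical action set for a toy level; not the Obstacle Tower action space.
ACTIONS = ("left", "right", "jump")


def generate_level(seed, length=8):
    """Build a toy level: the action required at each step, derived from a seed."""
    rng = random.Random(seed)
    return [rng.choice(ACTIONS) for _ in range(length)]


def run_blind(actions, level):
    """Replay a fixed action sequence without observing the level.

    Returns the number of steps cleared before the first wrong action.
    """
    cleared = 0
    for action, required in zip(actions, level):
        if action != required:
            break
        cleared += 1
    return cleared


# The "blindfolded" player memorizes the solution to one specific level.
memorized = generate_level(seed=0)

# Deterministic game: every episode is the same level, so memorization always works.
fixed_scores = [run_blind(memorized, generate_level(seed=0)) for _ in range(100)]

# Procedurally generated game: every episode has a new layout.
varied_scores = [run_blind(memorized, generate_level(seed=s)) for s in range(1, 101)]

print("deterministic, steps cleared (min):", min(fixed_scores))
print("procedural, steps cleared (mean):", sum(varied_scores) / len(varied_scores))
```

With a fixed seed the memorized sequence clears all 8 steps every time. Once the layout changes per episode, the same sequence fails early, so the agent has to actually observe the level and generalize.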

We believe that the Obstacle Tower has the potential to contribute to research into AI, specifically a sub-field called Deep Reinforcement Learning (Deep RL), which focuses on agents that learn from trial-and-error experience. Our own internal tests have shown that even the current state-of-the-art algorithms in Deep RL can solve, on average, only a few test floors of Obstacle Tower. The graph below is taken from our paper. It shows that the top Deep RL algorithms (PPO and Rainbow) are still nowhere near the average human player when learning to play a deterministic version of the game (No Generalization), let alone a version where things look and play differently from what they were trained on (Weak and Strong Generalization).

At Unity, we think that the research being conducted on AI benefits not only the broader technology community but also game developers and players. Smarter AI means better NPCs, more thorough playtesting, and ultimately more engaging player experiences. That is why we decided to launch the Obstacle Tower Challenge: to invite the best minds in Deep RL research and beyond to try to solve the tower, and to have those insights contribute to a wider world.

To help us evaluate entries, we have teamed up with AICrowd, a platform for hosting Machine Learning challenges. The challenge takes place in multiple rounds, with a Round 1 submission deadline of March 31st. Participants in the contest will submit trained agents, which will be evaluated on a special test set of Obstacle Tower levels. To enter the contest, learn more about the process, and get started, go here.

We are happy to share that Google Cloud Platform (GCP) is a prize sponsor of the contest. On top of the cash prizes and travel grants provided by Unity, winning participants will also receive GCP credits. These prizes are collectively valued at over $100K! Using GCP, it is possible to train agents remotely in the cloud rather than on desktop resources. This can both speed up training and make it simpler to run multiple concurrent experiments. Users who sign up for a new GCP account get $300 in free credit. On top of this, the first 50 participants who pass Round 1 of the Obstacle Tower Challenge will receive an additional $1100 in credits. The top three winners from Round 2 will receive an additional $5000 in credits.

For those new to training agents, or those wanting an easy way to get started, we have written a guide on training an agent on Google Cloud Platform. The guide walks through setting up a cloud computing instance and using a state-of-the-art algorithm provided by Google Dopamine to train an agent to progress in the Obstacle Tower. You can read the guide here.

If you have any questions about the contest, including support on submitting entries, please see the discussion forum here. For general issues or discussion of the environment itself, see our GitHub repo here. To learn more about the environment, read our research paper. We look forward to seeing the creative solutions to the challenge that the community comes up with!

Translated from: https://blogs.unity3d.com/2019/02/18/the-obstacle-tower-challenge-is-live/
