We live in an age in which big data has become an asset, sometimes called an organization's unrefined gold, for organizations and individuals alike. Data science has been a hot topic among organizations that aim to collect meaningful data to drive business growth. Until 2010, organizations focused on building infrastructure that could process, store and access data, so that they could make sense of consumer data, analyze it, base decisions on it and gain business insight.

Because of the impact data can have on an organization, and given the rapid advance of technology, companies that process consumer data have turned to tools such as the Hadoop framework, business intelligence (BI) software, artificial neural networks and machine learning algorithms to process and understand data. To use these tools efficiently, an organization needs to employ a data scientist who has a solid understanding of how to analyze data with them and extract the insight needed to make algorithm-based decisions.

When analyzing this data, data scientists face ethical challenges such as data collection bias, algorithmic bias, the explainability of results, privacy and so on. Formulating morally good solutions involves examining each step of the work, from data generation, creation and processing through to the algorithm and any discrimination it may introduce. To build such an ethical system, there are four guidelines to consider:

  1. Do good
  2. Minimize harm
  3. Be just and fair
  4. Respect privacy

** The terms "Data Scientist" and "Data Practitioner" are used interchangeably.

Who is a Data Scientist?

Before we get to who a data scientist is, let's consider what data science is. Data science is a vast field that blends mathematics, statistics, programming, computer science and more. It applies scientific methods, processes and algorithms to extract insight from both structured and unstructured data. The term can be traced back to 1974, when Peter Naur proposed it as an alternative name for computer science. The professional title "data scientist", however, has been attributed to DJ Patil and Jeff Hammerbacher. To date, there is still no consensus among scientists on the definition of data science, and some still consider it a buzzword.

A data scientist, on the other hand, is someone who harnesses and processes huge volumes of data to extract insight, interprets that data effectively and can present the results in non-technical terms. A data scientist is also able to take a large amount of data (usually consumer data collected or stored by an organization) and gain meaningful insight from it by combining elements of mathematics, statistics and computer science with analytical tools such as machine learning techniques and BI software.

You might wonder why ethics and laws matter to the advancement of AI. One answer is that laws cannot move faster than technology and innovation. Rather than waiting for the law to catch up, we work with the technological innovations and problems we encounter right now to mitigate future impacts. Laws and ethics are not meant to constrain AI; they are meant to encourage more innovative and creative work that prepares us for the unknown and helps us do good.

How Far Are We from Building an Ethical Environment?

If you have ever googled what ethics means, you will have seen many definitions pop up, but they all boil down to one idea: ethics is concerned with human well-being, and in particular the well-being of others.

Data ethics, on the other hand, is a new branch of ethics that studies and evaluates moral problems related to data in order to formulate and support morally good solutions. Data plays an important role in investigating, accessing and understanding previously unknown human and consumer behavior. Because of this value, data has given organizations a competitive edge in their marketing strategy.

With great power comes great responsibility.

However, these great opportunities bring ethical and moral challenges for data practitioners who deal with consumers' data. Data has made markets more competitive and has advanced the development of intelligent products and services. At the same time, the use of AI in those products and services poses a threat to human privacy, and that privacy is very important.

Over the last few years we have seen various examples of data breaches and of consumer data being used without consent to develop advanced AI products. A popular recent example is a tech company called Clearview AI. The company devised a groundbreaking facial recognition app: you take a picture of a person, upload it and get to see public photos of that person, along with links to where those photos appeared. The system's backbone is a database of more than three billion images that Clearview claims to have scraped from Facebook, YouTube, Venmo and millions of other websites (New York Times, Jan. 18, 2020). The software is powerful and could help solve cases of shoplifting, identity theft, murder, child sexual exploitation and so on, but all of this comes at the expense of eroding privacy.

Big tech companies such as Google have stepped back from this kind of work. In 2018 the company put the kibosh on Project Maven, a contract awarded by the US Pentagon, choosing not to continue once the contract expired. The company said the project was too unethical, and about 12 employees left Google because of it. The aim of the project was to support the advanced development of human-identifying drone technology by analyzing drone footage with AI trained on billions of data sets derived from the company's other products. Not long afterwards, a company named Palantir took over the project. Another recent example is Cambridge Analytica.

These examples show that the future is bright for AI, provided we take the ethical and moral side of these advances into account.

Dr. Ewa Luger (Chancellor's Fellow, Digital Arts and Humanities, University of Edinburgh) said the most common recurring ethical problems faced by data scientists are:

  1. Algorithm
  2. Prejudice/Bias
  3. Explainable AI (XAI)
  4. Privacy

What should a Data Practitioner look at to inspire him/her to work ethically?

As every revolution has its good and bad sides, the data revolution will inflict harm, whether intended or not, as the Clearview case shows. To avoid exacerbating the harm the data revolution will bring, it is important for data scientists to be ethical when handling consumer data.

How, then, can we make a data scientist more ethical? What could inspire a data scientist to do ethical work, and what ethical or moral laws and rules have been laid down to foster an ethical environment?

To date, there has not been a law or rule that fosters an ethical environment for data scientists. To encourage one, however, Ben Olsen, a Sr. Content Developer at Microsoft, drafted a data oath modeled on the Hippocratic Oath. He proposed what a modern data oath might look like:

I, a Data Practitioner, will promote the well-being of others and myself while striving to do no harm with data through:

a. Professional application of analytical technique

b. Humility in analytical claims

c. Anticipation of legal and regulatory scenarios

d. Transparency in computation and documentation

e. Fidelity to this oath beyond the bottom line.

A data practitioner can also foster an ethical environment by asking critical questions of him/herself when handling consumer data. These questions include, but are not limited to:

  1. Is the data biased in terms of gender, prejudice, etc.?
  2. How much relative importance should be given to the data?
  3. Can the process of getting the result be explained?
  4. Is the algorithm biased? What basis or intuition is the algorithm built on?
  5. What factors or features did my algorithm consider to reach this conclusion?
  6. Will my result inflict harm or do good, and how much weight should be given to each?
  7. What laws and regulations should be considered when handling this data?
  8. What consumer rights might I impinge on while handling the data?
  9. Does having been given consent mean I no longer need to respect privacy?
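The first and fourth questions, on bias in the data and in the algorithm, can be made concrete with a simple numeric check. The sketch below is only an illustration and is not taken from the original article: it compares how often each group receives a positive outcome and reports the ratio of the lowest rate to the highest. The function names, the field names (`gender`, `approved`) and the toy records are all hypothetical, and a single ratio like this is at best a first warning sign, not proof that a system is fair or unfair.

```python
from collections import defaultdict


def selection_rates(records, group_key, outcome_key):
    """Return the share of positive outcomes for each group in the records."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[outcome_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest to the highest selection rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a positive outcome, so rates are equal
    return min(rates.values()) / highest


# Hypothetical toy data: eight made-up decisions, not real consumer records.
decisions = [
    {"gender": "female", "approved": True},
    {"gender": "female", "approved": False},
    {"gender": "female", "approved": False},
    {"gender": "female", "approved": False},
    {"gender": "male", "approved": True},
    {"gender": "male", "approved": True},
    {"gender": "male", "approved": True},
    {"gender": "male", "approved": False},
]

rates = selection_rates(decisions, "gender", "approved")
ratio = disparate_impact_ratio(rates)
print(rates)            # per-group approval rates
print(round(ratio, 2))  # lowest rate divided by highest rate
```

In this toy data the approval rate is 0.25 for one group and 0.75 for the other, so the ratio is about 0.33. A commonly cited rule of thumb, the "four-fifths rule", treats ratios below roughly 0.8 as worth investigating, but the threshold that matters in practice depends on the context and on the applicable regulation.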

The list of questions goes on, and asking them helps to create an ethical environment. As the saying goes, with great power comes great responsibility, and being an ethical data practitioner will go a long way toward paving the way for the safe and responsible social impact and integration of AI.

Translated from: https://medium.com/swlh/does-a-data-scientist-have-to-be-ethical-d444b39d8445

