Why I think machine learning-enhanced software systems are the future

by Rodrigo Araújo

I have been brewing the idea of using machine learning to improve software systems since 2016. It was pretty vague and broad, without an actionable plan. I just had the intuition that software configuration and tuning, especially after the adoption of microservices, was getting too complex.

The increasing complexity of configuring and tuning systems

If you have enough experience in the software industry, then it’s very likely that you’ve struggled with either a configuration problem or a tuning problem.

Configuration and tuning problems are pretty common and can lead to really bad outages. They often occur when:

Some parts of the system are poorly or wrongly configured, or
A configuration that worked before now doesn’t work because the context of the system has changed.

Think of the number of database replicas and their write schemes. Or, in PostgreSQL, think of the shared buffers, the effective cache size, and the minimum and maximum WAL size (the shared_buffers, effective_cache_size, min_wal_size, and max_wal_size settings).

If it’s wrongly configured from the start, it won’t work in the given context, plain and simple. What’s more interesting, though, is that a correct configuration might work only for a while. As the context changes (system workload, system resource usage, overall system architecture), the system will start to behave poorly. Or, even worse, an outage might happen.

This will, inevitably, lead to manually-performed operations and the creation of heuristics. In other words, it will lead to:

Oh, we should set X to A when workload is T, but it should be A+10 when workload is T+100 and we have system resource usage above 80%… I guess. Or maybe let’s just put a queue in front of this component, queues solve everything, right?
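
To make that kind of rule concrete, here is a minimal sketch of what it tends to look like once it lands in code. The function name, the baseline workload T, and the baseline value A are all made up for illustration; only the shape of the rule comes from the quote above.

```python
# Hypothetical constants: "T" and "A" from the quote, with invented values.
BASE_WORKLOAD = 1000   # T, e.g. requests per second
BASE_SETTING = 20      # A, the value of some tunable setting X


def setting_x_heuristic(workload: int, resource_usage: float) -> int:
    """Return a value for setting X using a hand-written rule.

    resource_usage is a fraction between 0.0 and 1.0.
    """
    # "A+10 when workload is T+100 and resource usage is above 80%"
    if workload >= BASE_WORKLOAD + 100 and resource_usage > 0.80:
        return BASE_SETTING + 10
    # "set X to A when workload is T"
    return BASE_SETTING
```

Every threshold in a rule like this encodes a guess about a context that will eventually change, and someone has to remember to revisit it.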

Now multiply this scenario by tens or hundreds of services. Think for a second about the cognitive burden resulting from these configurations.

This is not a new concern. In 2003, Ganek and Corbi discussed the need for autonomic computing to handle the complexity of managing software systems. They noted that managing complex systems became too costly, labor-intensive, and prone to error due to the pressure engineers felt while maintaining them. This increased the potential of system outages with a concurrent impact on business.

Even nowadays, most configuration and tuning of systems is performed manually, often at run time, which is known to be a very time-consuming and risky practice.

The need for autonomic computing

Most decisions to configure and tune a system are made based on context: there are many different variables, such as workload, the number of instances of some services, resource usage, and more. So why not delegate these tasks to something that excels at exactly that? Machine learning sounds like a feasible tool for the job.

After starting my master’s at the University of British Columbia, I kept working on this idea. It seemed interesting although quite weird, and, sometimes, impractical and impossible to implement.

To my surprise, I realized I wasn’t alone. Some very interesting people were working on these ideas, so it might not be that weird, impractical, and impossible after all.

Recently, Jeff Dean, a man I admire a lot, gave a talk at NIPS 2017 about machine learning for systems, in which he stated:

Learning should be used throughout our computing systems. Traditional low-level systems code (operating systems, compilers, storage systems) does not make extensive use of machine learning today. This should change!

Computer Systems are filled with heuristics: compilers, networking code, operating systems. Heuristics have to work well “in general case”. [They] generally don’t adapt to actual pattern of usage and don’t take into account available context

Learning in the core of all of our computer systems will make them better/more adaptive.

I was in complete awe when I read this. One of the engineers I admire the most was talking about the very same ideas I’ve been thinking about and working on.

This led me to think that it’s not only interesting but natural to think about enhancing software systems with machine learning. Throughout the whole software stack, we have many heuristics that, although they work well, could be improved by machine learning.

Is it challenging and potentially risky? Yes, most definitely. Especially given that interpretability, apparently, has become a secondary goal in the machine learning community. How can we interpret and explain the decisions made by neural nets?

That said, these obstacles shouldn’t hinder scientific and technological progress. Yes, we should question old paradigms and try to improve things.

Towards machine learning-enhanced software systems

As Jeff Dean pointed out: we need to find practical ways to make systems data-aware. We need systems that collect metrics and metadata about themselves. To achieve this, we could learn a thing or two from the ideas in systems observability and instrumentation. We have been instrumenting systems for decades, and the data is already there.
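
As a small illustration of a system collecting metrics about itself, here is a minimal sketch using only the Python standard library. The MetricsRegistry class, the metric name, and the handle_request function are hypothetical; a real system would more likely reuse whatever instrumentation it already has.

```python
import time
from collections import defaultdict
from functools import wraps


class MetricsRegistry:
    """Keeps raw samples per metric name, in memory (illustrative only)."""

    def __init__(self):
        self.samples = defaultdict(list)

    def record(self, name, value):
        self.samples[name].append(value)

    def timed(self, name):
        """Decorator that records how long each call to a function takes."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                finally:
                    # Record the duration even if the call raised.
                    self.record(name, time.perf_counter() - start)
            return wrapper
        return decorator


registry = MetricsRegistry()


@registry.timed("handle_request.seconds")
def handle_request(payload: str) -> str:
    # Stand-in for real work done by a service.
    return payload.upper()
```

Each call to handle_request appends one latency sample to registry.samples, which is the kind of data a learned model could later be trained on.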

We also need to find practical and clean ways to integrate machine learning components into software systems, making learning a first-class citizen in the system. This will lead to systems that learn how to improve themselves, beating heuristics and manually-performed operations. Think about this for a second. It does sound cool and feasible.

I would also add that we need practical and clean ways to propagate the decisions made by the learned models to the rest of the system. This would allow the system to have self-adaptive capabilities. Here, we could learn something from the control theory community.

The general idea is fairly simple: make a system learn about its behavior by training a model on its context. Then allow it to change its structures and configurations in order to optimize for a certain scenario. Now implement this idea in such a way that it could be possible to integrate it into many kinds of systems.
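
The loop described above could be sketched roughly as follows. This is only an illustration under strong assumptions: the context is reduced to a single workload number, there is a single tunable setting, and a deliberately simple nearest-neighbour predictor stands in for whatever model a real system would train. All class names, fields, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    workload: float   # context, e.g. requests per second
    setting: int      # value of the tunable configuration at that time
    latency: float    # observed outcome; lower is better


class NearestNeighborTuner:
    """Recommends the setting with the lowest predicted latency."""

    def __init__(self, candidates, k=3):
        self.candidates = list(candidates)
        self.k = k
        self.history = []

    def observe(self, observation):
        # "Training" here is just remembering what was observed.
        self.history.append(observation)

    def predict_latency(self, workload, setting):
        samples = [o for o in self.history if o.setting == setting]
        if not samples:
            return None  # nothing known about this setting yet
        nearest = sorted(samples, key=lambda o: abs(o.workload - workload))
        nearest = nearest[: self.k]
        return sum(o.latency for o in nearest) / len(nearest)

    def recommend(self, workload, default):
        scored = []
        for setting in self.candidates:
            predicted = self.predict_latency(workload, setting)
            if predicted is not None:
                scored.append((predicted, setting))
        # Fall back to the current default when there is no data at all.
        return min(scored)[1] if scored else default


tuner = NearestNeighborTuner(candidates=[20, 30], k=1)
for obs in [
    Observation(workload=1000, setting=20, latency=0.10),
    Observation(workload=1000, setting=30, latency=0.12),
    Observation(workload=2000, setting=20, latency=0.40),
    Observation(workload=2000, setting=30, latency=0.25),
]:
    tuner.observe(obs)

low_load_choice = tuner.recommend(workload=1050, default=20)
high_load_choice = tuner.recommend(workload=1900, default=20)
```

With this made-up history the tuner keeps the smaller setting under low load and switches to the larger one under high load. Applying the recommendation safely to a running system is the harder part and is not shown here.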

Summary

The most interesting questions I have in mind are:

Can self-adaptation by learned models lead to more stable, faster, safer software systems? Can it reduce the need for manually configuring and tuning systems, allowing engineers to focus on more important tasks?
Can this be easily integrated into software systems, requiring only small changes to the codebase?
Can this work with low overhead?

It is worth noting that this would not replace good engineers, but would rather free the engineers’ cognitive abilities to focus on what matters.

I genuinely believe that this will become a trend in the next few years. I myself am working on these ideas as part of my graduate studies, and I will be posting the results of my research, so stay tuned.

Source: https://www.freecodecamp.org/news/why-i-think-machine-learning-enhanced-software-systems-are-the-future-5a1c978486b4/
