Original source: http://www.lornajane.net/posts/2014/working-with-php-and-beanstalkd

Working with PHP and Beanstalkd

I have just introduced Beanstalkd into my current PHP project; it was super-easy so I thought I'd share some examples and my thoughts on how a job queue fits in with a PHP web application.

The Scenario

I have an API backend and a web frontend on this project (there may be apps later. It's a startup, there could be anything later). Both front and back ends are PHP Slim Framework applications, and there's a sort of JSON-RPC going on in between the two.

The job queue will handle a few things we don't want to do in real time on the application, such as:

  • updating counts of things like comments; when a comment is made, a job gets created and we can return to the user. At some point the job will get processed, updating the counts of how many comments are on that thing and how many comments the user made, adding to a news feed of activities ... you get the idea.
  • cleaning up; we have had a few cron jobs running to clean up old data but now those cron jobs put jobs into beanstalkd which gives us a bit more visibility and control of them, and also means that those big jobs aren't running on the web servers (we have a separate worker server)
  • other periodic things like updating incoming data/content feeds or talking to some of the 3rd party APIs we use like Mailchimp and Bit.ly

Adding Jobs to the Queue

There are two ends to this process; let's start by adding jobs to the queue. Anything you don't want to make a user wait for is a good candidate for a job. As I mentioned, some of our jobs are created periodically by cron, but since they are just beanstalkd jobs I can easily add an admin interface to trigger them manually as well. In this case, I'm just making a job to process the things we update when a user makes a comment.

A good job is very self-contained; a bit like a stateless web request, it should contain everything that is needed to process it and not rely on anything that went before. On a live platform you would typically have many workers all consuming jobs from a single queue, so there are no guarantees that one job will be completed before the next one begins to be processed! You can put any data you like into a job; for example, you could include all the data fields needed to fill in and send an email template.

In this example I need to talk to the database anyway so I'm just storing information about which task should be done and including the comment ID with it.
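To make the "self-contained job" idea concrete, here is a minimal sketch of a payload that carries only an action name and the identifiers the worker needs. The helper names build_job_payload() and parse_job_payload() are hypothetical, not part of any library or of the project described here.

```php
<?php
// Hypothetical helpers for illustration: build and read a self-contained
// job payload made of an action name plus a small array of identifiers.

function build_job_payload($action, array $data = array())
{
    if (!is_string($action) || $action === '') {
        throw new InvalidArgumentException('A job needs a non-empty action name');
    }
    // Store identifiers rather than whole records; the worker looks up
    // the current data itself when it runs.
    return json_encode(array('action' => $action, 'data' => $data));
}

function parse_job_payload($json)
{
    $job = json_decode($json, true);
    if (!is_array($job) || !isset($job['action'])) {
        throw new UnexpectedValueException('Malformed job payload');
    }
    if (!isset($job['data']) || !is_array($job['data'])) {
        $job['data'] = array(); // tolerate jobs that carry no data
    }
    return $job;
}

$payload = build_job_payload('comment_added', array('comment_id' => 42));
$job     = parse_job_payload($payload);
```

Keeping the payload this small means a job never goes stale: whatever the comment looks like by the time a worker gets to it, the worker reads the current row.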

I'm using an excellent library called Pheanstalk which is well-documented and available via Composer. The lines I added to my composer.json:

  "require": {
      "pda/pheanstalk": "2.1.0"
  }

I start by creating an object which connects to the job server and allows me to put jobs on the queue:

    $queue = new Pheanstalk_Pheanstalk($config['beanstalkd']['host'] . ":" . $config['beanstalkd']['port']);

The config settings there will change between platforms but for my development version of this project, beanstalkd is just running on my laptop so my settings are the defaults:

[beanstalkd]
host=127.0.0.1
port=11300

Once you have the object created, $queue in my example, we can easily add jobs with the put() command - but first you specify which "tube" to use. The tubes would be queues in another tool, just a way of putting jobs into different areas, and it is possible to ask the workers to listen on specific tubes so you can have specialised workers if needed. Beanstalkd also supports adding jobs with different priorities.

Here's adding the simple job to the queue; the data is just a string so I'm using json_encode to wrap up a couple of fields:

  $job = array(
      "action" => "comment_added",
      "data" => array("comment_id" => $comment_id)
  );
  $queue->useTube('mytube')->put(json_encode($job));
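Priorities (and delays) are extra arguments to put(). The sketch below assumes a Pheanstalk 2.x object, where put() takes the data, a priority, a delay in seconds and a time-to-run; the function name enqueue_comment_job() and its parameters are made up for illustration.

```php
<?php
// Sketch only: $queue is assumed to be a Pheanstalk 2.x object (or anything
// exposing the same useTube()/put() methods). enqueue_comment_job() is a
// hypothetical helper, not part of Pheanstalk.

function enqueue_comment_job($queue, $commentId, $urgent = false, $delaySeconds = 0)
{
    $payload = json_encode(array(
        'action' => 'comment_added',
        'data'   => array('comment_id' => $commentId),
    ));

    // beanstalkd treats 0 as the most urgent priority; bigger numbers are
    // less urgent. 1024 is Pheanstalk's default priority.
    $priority = $urgent ? 0 : 1024;

    // Arguments: data, priority, delay before the job becomes ready (seconds),
    // and time-to-run (seconds a worker may hold the job; 60 is the default).
    return $queue->useTube('mytube')->put($payload, $priority, $delaySeconds, 60);
}
```

A call such as enqueue_comment_job($queue, $comment_id, true) would then jump ahead of normal-priority jobs waiting in the same tube.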

I wrote a bit in a previous post about how to check the current number of jobs on beanstalkd, so you can use those instructions to check that you have jobs stacking up. To consume those jobs, we'll need to write a worker.
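For a quick check from PHP itself, the job counts for a tube can also be read through the client. This is a rough sketch assuming Pheanstalk 2.x, whose statsTube() returns an array-like response keyed by the field names from the beanstalkd protocol; tube_job_counts() is a hypothetical helper name.

```php
<?php
// Sketch only: $queue is assumed to be a Pheanstalk 2.x object.
// tube_job_counts() is a made-up helper, not part of the library.

function tube_job_counts($queue, $tube)
{
    // stats-tube fields are named as in the beanstalkd protocol
    $stats = $queue->statsTube($tube);

    return array(
        'ready'    => (int) $stats['current-jobs-ready'],
        'reserved' => (int) $stats['current-jobs-reserved'],
        'delayed'  => (int) $stats['current-jobs-delayed'],
        'buried'   => (int) $stats['current-jobs-buried'],
    );
}
```

With no worker running yet, the 'ready' count should grow each time the application adds a job.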

Taking Jobs Off The Queue

The main application and the worker scripts don't need to be in the same technology stack since beanstalkd is very lightweight and technology agnostic. I'm working with an entirely PHP team though so both the application and the workers are PHP in this instance. The workers are simply command-line PHP scripts that run for a long time, picking up jobs when they become available.

For my workers I have added the Pheanstalk libraries via Composer again and then my basic worker script looks like this:

require("vendor/autoload.php");

$queue = new Pheanstalk_Pheanstalk($config['beanstalkd']['host'] . ":" . $config['beanstalkd']['port']);
$worker = new Worker($config);

// Set which queues to bind to
$queue->watch("mytube");

// pick a job and process it
while($job = $queue->reserve()) {
    $received = json_decode($job->getData(), true);
    $action   = $received['action'];
    if(isset($received['data'])) {
        $data = $received['data'];
    } else {
        $data = array();
    }

    echo "Received a $action (" . current($data) . ") ...";
    if(method_exists($worker, $action)) {
        $outcome = $worker->$action($data);

        // how did it go?
        if($outcome) {
            echo "done \n";
            $queue->delete($job);
        } else {
            echo "failed \n";
            $queue->bury($job);
        }
    } else {
        echo "action not found\n";
        $queue->bury($job);
    }
}

Here you can see the Pheanstalk object again, but this time we use some different commands:

  • reserve() picks up a job from the queue and marks it as reserved so that no other workers will pick it up
  • delete() removes the job from the queue when it has been successfully completed
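  • bury() marks the job as failed and sets it aside; no worker will pick it up again unless it is explicitly kicked back onto the queue.

The other possible outcome is to finish without deleting or burying the job. Once the job's time-to-run (TTR) expires, beanstalkd puts it back on the queue and it will be retried later.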
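Pheanstalk also has a release() call, which hands a reserved job straight back to the queue, optionally with a delay. Here is a minimal sketch of how the three outcomes might be told apart, assuming a Pheanstalk 2.x object; settle_job() and the outcome strings 'done' and 'retry' are hypothetical names.

```php
<?php
// Sketch only: $queue is assumed to be a Pheanstalk 2.x object. settle_job()
// and the outcome strings are made-up names for illustration.

function settle_job($queue, $job, $outcome, $retryDelay = 30)
{
    switch ($outcome) {
        case 'done':
            // finished: remove the job for good
            $queue->delete($job);
            return 'deleted';

        case 'retry':
            // Hand the job back now rather than waiting for its time-to-run
            // to expire; the delay (seconds) spaces out the retries.
            // 1024 is Pheanstalk's default priority.
            $queue->release($job, 1024, $retryDelay);
            return 'released';

        default:
            // anything else: set the job aside for a human to inspect
            $queue->bury($job);
            return 'buried';
    }
}
```

Having the worker methods return something richer than true or false is a design choice; it lets a temporary problem (a third-party API being down, say) be retried instead of buried.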

Once one job has been processed, the worker will pick up another, and so on. With multiple workers running, they will all just pick up jobs in turn until the queue is empty again.

The Worker class really doesn't have much that is beanstalkd-specific. The constructor connects to MySQL and also instantiates a Guzzle client, which is used to hit the backend API of the application for tasks that really need the full application framework and config. We create endpoints for those tasks, and the worker has an access token so it can make the requests. Here's a snippet from the Worker class:

class Worker
{
    protected $config;
    protected $db;
    protected $client;

    public function __construct($config) {
        $this->config = $config;

        // connect to mysql
        $dsn = 'mysql:host=' . $config['db']['host'] . ';dbname=' . $config['db']['database'];
        $username = $config['db']['username'];
        $password = $config['db']['password'];
        $this->db = new \PDO($dsn, $username, $password,
            array(\PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"));

        $this->client = new \Guzzle\Http\Client($config['api']['url']);
    }

    public function comment_added($data) {
        $comment_sql = "select * from comments where comment_id = :comment_id";
        $comment_stmt = $this->db->prepare($comment_sql);
        $comment_stmt->execute(array("comment_id" => $data['comment_id']));
        $comment = $comment_stmt->fetch(\PDO::FETCH_ASSOC);
        if($comment) {
            // more SQL to update various counts
        }
        return true;
    }

    // ... more task methods ...
}

There are various different tasks here that call out to either our own API backend, or to MySQL as shown here, or to something else.
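One thing to watch with dispatching on the action name: a plain method_exists() check accepts any method on the class, including the constructor. A possible refinement, sketched here with a hypothetical ExampleWorker class, is to keep an explicit list of the actions a job may trigger.

```php
<?php
// Hypothetical sketch: only action names on an explicit list can be
// triggered from a job payload. ExampleWorker is not the real Worker class.

class ExampleWorker
{
    // the only method names a job is allowed to call
    protected $allowed = array('comment_added');

    public function comment_added(array $data)
    {
        // stand-in for the real database work
        return isset($data['comment_id']);
    }

    public function handle($action, array $data)
    {
        if (!in_array($action, $this->allowed, true)) {
            return null; // unknown action: the caller can bury the job
        }
        return (bool) $this->$action($data);
    }
}

$worker  = new ExampleWorker();
$ok      = $worker->handle('comment_added', array('comment_id' => 42));
$unknown = $worker->handle('__construct', array());
```

Returning null for an unknown action keeps it distinct from a task that ran and failed, so the worker script can log the two cases differently.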

Other Things You Should Probably Know

When working with workers, I often do one of these:

  1. forget to start the worker and then wonder why nothing is working
  2. forget to restart the worker when I deploy new code and then wonder why nothing is working

Beanstalkd doesn't really have access control so you will want to lock down what can talk to your server on the port it listens on. It's a deliberately lightweight protocol and I like it, but do double check that it isn't open to the internet or something!

Long-running PHP scripts aren't the most robust thing in the world. I recommend running them under the tender loving care of supervisord (which I wrote about previously) - this has the added advantage of a really easy way to restart your workers and good logging. You should probably also include a lot more error handling than I have in the scripts here; I abbreviated to keep things readable.
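One pattern that pairs well with a process supervisor is a worker that exits on its own after a fixed number of jobs or a time limit, so that it gets restarted with fresh code and memory. This is a minimal sketch under that assumption, again using Pheanstalk 2.x method names; run_worker() and its parameters are hypothetical.

```php
<?php
// Sketch only: $queue is assumed to be a Pheanstalk 2.x object. run_worker()
// is a made-up helper; $handler gets the decoded job array and returns true
// on success.

function run_worker($queue, $handler, $maxJobs = 100, $maxSeconds = 3600)
{
    $started   = time();
    $processed = 0;

    while ($processed < $maxJobs && (time() - $started) < $maxSeconds) {
        // With a timeout, reserve() returns false if no job arrives in time,
        // which gives the loop a chance to re-check its limits.
        $job = $queue->reserve(5);
        if (!$job) {
            continue;
        }

        try {
            $ok = call_user_func($handler, json_decode($job->getData(), true));
        } catch (Exception $e) {
            $ok = false; // a crashing task should not take the worker down
        }

        if ($ok) {
            $queue->delete($job);
        } else {
            $queue->bury($job);
        }
        $processed++;
    }

    return $processed; // the script ends here; the supervisor starts a new one
}
```

The limits are arbitrary; the point is only that the process ends cleanly between jobs rather than in the middle of one.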

What did I miss? If you're working with Beanstalkd and PHP and there's something I should have mentioned, please share it in the comments. This was my first beanstalkd implementation but I think it's the first of many - it was super-easy to get started!

Reposted from: https://www.cnblogs.com/argb/p/3729087.html
