Async Python - Short Gevent Introduction
Python provides a variety of methods to handle asynchronous programming. One of them is a Gevent, a concurrency library which provides greenlets a thread-like abstraction. Greenlets wrap up an event loop and allow an asynchronous code execution.
What is Greenlet?
A greenlet is implemented as a coroutine and it represents a task scheduled to run cooperatively inside of the OS process. Some people favour greenlets over nesting callback, arguing that linear code composition is simpler to understand. Also, because of GIL, threads in Python don't provide a real concurrency
Only one greenlet can run at any given time. We talk about a context switch, when execution flow changes from one greenlet to another. In other words, a greenlet only « yields » to another greenlet only if a blocking function is reached e.g.
gevent.sleep() - this is known as cooperative multitasking. Traditional threads may yield to other threads at any time, whenever OS decides to switch the context - it is called preemptive multitasking.
Greenlets over threads
Greenlets are also « cheaper » to create than threads; it makes them more adapted in situations when it is necessary to handle large number of persistent TCP connections, e.g. a server that pushes data to many clients. In that scenario spawning a lightweight greenlet per connection is more efficient than forking a thread. It also seems wasteful to allocate a thread pool of certain size beforehand in that case.
Gevent also provides « monkey-patched » versions of modules that act in a cooperative way for most of the blocking system calls in the standard library (including those in
select modules). By invoking
monkey.patch_all() we make sure that all these modules will become compatible with Gevent at runtime.
Following code adapted from gevent examples asynchronously executes
print_head on URLs from given list.
import gevent import urllib2 from gevent import monkey monkey.patch_all() urls = ['http://www.google.com', 'http://www.example.com', 'http://www.python.org'] def print_head(url): print 'Starting %s' % url data = urllib2.urlopen(url).read() print '%s: %s bytes: %r' % (url, len(data), data[:50]) jobs = [gevent.spawn(print_head, url) for url in urls] gevent.joinall(jobs, timeout=2)