Cache¶
A MySQL-specific backend for Django’s cache framework.
MySQLCache¶
An efficient cache backend using a MySQL table, an alternative to Django’s
database-agnostic DatabaseCache. It has the following advantages:

- Each operation uses only one query, including the *_many methods. This is
  unlike DatabaseCache, which uses multiple queries for nearly every
  operation.
- Automatic client-side zlib compression for objects larger than a given
  threshold. It is also easy to subclass and add your own serialization or
  compression schemes.
- Faster, probabilistic culling behaviour during write operations, which you
  can also turn off and execute in a background task. Culling can be a
  bottleneck with Django’s DatabaseCache, since it culls on every write
  operation, executing a SELECT COUNT(*) which requires a full table scan.
- Integer counters with atomic incr() and decr() operations, like the
  MemcachedCache backend.
Usage¶
To use, add an entry to your CACHES setting with:

- BACKEND set to django_mysql.cache.MySQLCache
- LOCATION set to tablename, the name of the table to use. This name can be
  whatever you want, as long as it’s a valid table name that’s not already
  being used in your database.
For example:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "my_super_cache",
    }
}
You then need to make the table. The schema is not compatible with that of
DatabaseCache, so if you are switching, you will need to create a fresh
table.
Use the management command mysql_cache_migration
to output a migration that
creates tables for all the MySQLCache
instances you have configured. For
example:
$ python manage.py mysql_cache_migration
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        # Add a dependency in here on an existing migration in the app you
        # put this migration in, for example:
        # ('myapp', '0001_initial'),
    ]

    operations = [
        migrations.RunSQL(
            """
            CREATE TABLE `my_super_cache` (
                cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
                    NOT NULL PRIMARY KEY,
                value longblob NOT NULL,
                value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
                    NOT NULL DEFAULT 'p',
                expires BIGINT UNSIGNED NOT NULL
            );
            """,
            "DROP TABLE `my_super_cache`",
        ),
    ]
Save this to a file in the migrations directory of one of your project’s
apps, and add one of your existing migrations to the file’s dependencies.
You might want to customize the SQL at this time, for example switching the
table to use the MEMORY
storage engine.
Django requires you to install
sqlparse
to run the RunSQL
operation in the migration, so make sure it is installed.
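If it is not already installed, you can add it with pip:
$ python -m pip install sqlparse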
Once the migration has run, the cache is ready to work!
Multiple Databases¶
If you use this with multiple databases, you’ll also need to set up routing
instructions for the cache table. This can be done with the same method that
is described for DatabaseCache in the Django manual, except that the
application name is django_mysql.
Note
Even if you aren’t using multiple MySQL databases, it may be worth using routing anyway to put all your cache operations on a second connection - this way they won’t be affected by transactions your main code runs.
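For example, a minimal router following the pattern from the Django manual
might look like the sketch below. The "cache_db" alias is an assumed second
entry in your DATABASES setting; use whatever alias you have configured.

class CacheRouter:
    """Route django_mysql's cache table to a dedicated database connection."""

    def db_for_read(self, model, **hints):
        if model._meta.app_label == "django_mysql":
            return "cache_db"  # assumed alias for the cache database
        return None

    def db_for_write(self, model, **hints):
        if model._meta.app_label == "django_mysql":
            return "cache_db"
        return None

Add the router’s dotted path to your DATABASE_ROUTERS setting to activate it.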
Extra Details¶
MySQLCache
is fully compatible with Django’s cache API, but it also extends
it and there are, of course, a few details to be aware of.
incr/decr¶
Like MemcachedCache (and unlike DatabaseCache), incr and decr are atomic
operations, and can only be used with int values. They have the range of
MySQL’s SIGNED BIGINT (-9223372036854775808 to 9223372036854775807).
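For example, with the standard cache API:

from django.core.cache import cache

cache.set("hits", 0)
cache.incr("hits")      # returns 1, performed as a single query
cache.incr("hits", 10)  # returns 11
cache.decr("hits", 2)   # returns 9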
max_allowed_packet¶
MySQL has a setting called max_allowed_packet, which is the maximum size of a
query, including data. This therefore constrains the size of a cached value,
but you’re more likely to run up against it first with the
get_many/set_many operations.
The MySQL 8.0 default is 4MB, and the MariaDB 10.2 default is 16MB. Most applications should be fine with these limits. You can tweak the setting as high as 1GB - if this isn’t enough, you should probably be considering another solution!
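You can check the value your server is currently using from Django, for
example:

from django.db import connection

with connection.cursor() as cursor:
    cursor.execute("SELECT @@max_allowed_packet")
    (packet_size,) = cursor.fetchone()

print(packet_size)  # value in bytes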
culling¶
MySQL is designed to store data forever, and thus doesn’t have a direct way
of making expired rows disappear. The expiration of old keys and the limiting
of rows to MAX_ENTRIES is therefore done by the cache backend, which runs a
cull operation when appropriate. This deletes expired keys first, then, if
more than MAX_ENTRIES keys remain, it deletes 1 / CULL_FREQUENCY of them -
for example, with the default CULL_FREQUENCY of 3, a third of the remaining
keys. The options and strategy are described in more detail in the Django
manual.
Django’s DatabaseCache performs a cull check on every write operation. This
runs a SELECT COUNT(*) on the table, which means a full-table scan.
Naturally, this takes a bit of time and becomes a bottleneck for medium or
large cache tables. MySQLCache helps you solve this in two ways:
1. The cull-on-write behaviour is probabilistic, by default running on 1% of
   writes. This is set with the CULL_PROBABILITY option, which should be a
   number between 0 and 1. For example, if you want to use the same
   cull-on-every-write behaviour as used by DatabaseCache (you probably
   don’t), set CULL_PROBABILITY to 1.0:

       CACHES = {
           "default": {
               "BACKEND": "django_mysql.cache.MySQLCache",
               "LOCATION": "some_table_name",
               "OPTIONS": {"CULL_PROBABILITY": 1.0},
           }
       }
2. The cull() method is available as a public method, so you can set up your
   own culling schedule in background processing, never affecting any
   user-facing web requests. Set CULL_PROBABILITY to 0, and then set up your
   task. For example, if you are using Celery you could use a task like this
   (a scheduling sketch follows this list):

       from celery import shared_task
       from django.core.cache import caches


       @shared_task
       def clear_caches():
           caches["default"].cull()
           caches["other_cache"].cull()
   This functionality is also available as the management command
   cull_mysql_caches, which you might run as a cron job. It performs cull()
   on all of your MySQLCache instances, or you can give it names to just cull
   those. For example, this:

       $ python manage.py cull_mysql_caches default other_cache

   …will call caches['default'].cull() and caches['other_cache'].cull().
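As a sketch of the background scheduling mentioned in the second point,
assuming the clear_caches task above lives in myapp/tasks.py and your Celery
app reads Django settings with the CELERY namespace, you could run it hourly
with:

CELERY_BEAT_SCHEDULE = {
    "cull-caches-hourly": {
        "task": "myapp.tasks.clear_caches",  # hypothetical task path
        "schedule": 60 * 60,  # seconds
    },
}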
You can also disable the MAX_ENTRIES
behaviour, which avoids the SELECT
COUNT(*)
entirely, and makes cull()
only delete expired keys. To do this,
set MAX_ENTRIES
to -1:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "OPTIONS": {"MAX_ENTRIES": -1},
    }
}
Note that you should then, of course, monitor the size of your cache table closely, since nothing bounds its growth.
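For example, a minimal check you might run from a periodic task, assuming the
table name some_table_name from the example above:

from django.db import connection


def cache_table_row_count() -> int:
    # An occasional COUNT(*) is fine for monitoring, even though it is the
    # same full-table scan you want to avoid on every write.
    with connection.cursor() as cursor:
        cursor.execute("SELECT COUNT(*) FROM some_table_name")
        return cursor.fetchone()[0]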
compression¶
Like the other Django cache backends, stored objects are serialized with
pickle
(except from integers, which are stored as integers so that the
incr()
and decr()
operations will work). If pickled data has has a size
in bytes equal to or greater than the threshold defined by the option
COMPRESS_MIN_LENGTH
, it will be compressed with zlib
in Python before
being stored, reducing the on-disk size in MySQL and network costs for storage
and retrieval. The zlib level is set by the option COMPRESS_LEVEL
.
COMPRESS_MIN_LENGTH
defaults to 5000, and COMPRESS_LEVEL
defaults to
the zlib
default of 6. You can tune these options - for example, to
compress all objects >= 100 bytes at the maximum level of 9, pass the options
like so:
CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "OPTIONS": {"COMPRESS_MIN_LENGTH": 100, "COMPRESS_LEVEL": 9},
    }
}
To turn compression off, set COMPRESS_MIN_LENGTH
to 0. The options only
affect new writes - any compressed values already in the table will remain
readable.
custom serialization¶
You can implement your own serialization by subclassing MySQLCache
. It uses
two methods that you should override.
Values are stored in the table with two columns - value
, which is the blob
of binary data, and value_type
, a single latin1 character that specifies
the type of data in value
. MySQLCache by default uses three codes for
value_type
:
- i - The blob is an integer. This is used so that counters can be
  deserialized by MySQL during the atomic incr() and decr() operations.
- p - The blob is a pickled Python object.
- z - The blob is a zlib-compressed pickled Python object.
For future compatibility, MySQLCache
reserves all lower-case letters. For
custom types you can use upper-case letters.
The methods you need to override (and probably call super()
from) are:
- django_mysql.cache.encode(obj)¶
  Takes an object and returns a tuple (value, value_type), ready to be
  inserted as parameters into the SQL query.
- django_mysql.cache.decode(value, value_type)¶
  Takes the pair of (value, value_type) as stored in the table and returns
  the deserialized object.
Studying the source of MySQLCache
will probably give you the best way to
extend these methods for your use case.
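As an illustration, here is a minimal sketch of a subclass (the class name
and the "J" code are made up for the example) that stores dicts and lists as
JSON, and falls back to the default behaviour for everything else:

import json

from django_mysql.cache import MySQLCache


class JSONMySQLCache(MySQLCache):
    JSON_TYPE = "J"  # upper-case codes are free for custom types

    def encode(self, obj):
        # Store plain dicts/lists as UTF-8 JSON under our custom code.
        if isinstance(obj, (dict, list)):
            return json.dumps(obj).encode("utf-8"), self.JSON_TYPE
        return super().encode(obj)

    def decode(self, value, value_type):
        if value_type == self.JSON_TYPE:
            return json.loads(value)
        return super().decode(value, value_type)

Point BACKEND at this class in your CACHES setting to use it.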
prefix methods¶
Three extension methods are available to work with sets of keys sharing a common prefix. Whilst these would not be efficient on other cache backends such as memcached, in an InnoDB table the keys are stored in order so range scans are easy.
To use these methods, it must be possible to reverse-map the “full” key
stored in the database to the key you would provide to cache.get, via a
‘reverse key function’. If you have not set KEY_FUNCTION, MySQLCache will use
Django’s default key function, and can therefore default the reverse key
function too, so you will not need to add anything.
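For reference, Django’s default key function simply joins the prefix,
version, and key with colons:

from django.core.cache.backends.base import default_key_func

default_key_func("mykey", "prefix", 2)  # 'prefix:2:mykey'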
However, if you have set KEY_FUNCTION, you will also need to supply
REVERSE_KEY_FUNCTION before the prefix methods can work. For example, with a
simple custom key function that ignores key_prefix and version, you might do
this:
def my_key_func(key, key_prefix, version):
    return key  # Ignore key_prefix and version


def my_reverse_key_func(full_key):
    # key_prefix and version still need to be returned
    key_prefix = None
    version = None
    return full_key, key_prefix, version


CACHES = {
    "default": {
        "BACKEND": "django_mysql.cache.MySQLCache",
        "LOCATION": "some_table_name",
        "KEY_FUNCTION": my_key_func,
        "REVERSE_KEY_FUNCTION": my_reverse_key_func,
    }
}
Once you’re set up, the following prefix methods can be used:
- django_mysql.cache.delete_with_prefix(prefix, version=None)¶
  Deletes all keys that start with the string prefix. If version is not
  provided, it will default to the VERSION setting. Returns the number of
  keys that were deleted. For example:

      >>> cache.set_many({"Car1": "Blue", "Car4": "Red", "Truck3": "Yellow"})
      >>> cache.delete_with_prefix("Truck")
      1
      >>> cache.get("Truck3")
      None
Note
This method does not require you to set the reverse key function.
- django_mysql.cache.get_with_prefix(prefix, version=None)¶
  Like get_many, returns a dict of key to value for all keys that start with
  the string prefix. If version is not provided, it will default to the
  VERSION setting. For example:

      >>> cache.set_many({"Car1": "Blue", "Car4": "Red", "Truck3": "Yellow"})
      >>> cache.get_with_prefix("Truck")
      {'Truck3': 'Yellow'}
      >>> cache.get_with_prefix("Ca")
      {'Car1': 'Blue', 'Car4': 'Red'}
      >>> cache.get_with_prefix("")
      {'Car1': 'Blue', 'Car4': 'Red', 'Truck3': 'Yellow'}
- django_mysql.cache.keys_with_prefix(prefix, version=None)¶
  Returns a set of all the keys that start with the string prefix. If version
  is not provided, it will default to the VERSION setting. For example:

      >>> cache.set_many({"Car1": "Blue", "Car4": "Red", "Truck3": "Yellow"})
      >>> cache.keys_with_prefix("Car")
      {'Car1', 'Car4'}
Changes¶
Versions 0.1.10 -> 0.2.0¶
Initially, in Django-MySQL version 0.1.10, MySQLCache did not force the
columns to use case-sensitive collations; in version 0.2.0 this was fixed.
You can upgrade by adding a migration with the following SQL, substituting
your table’s name for yourtablename:
ALTER TABLE yourtablename
    MODIFY cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
        NOT NULL,
    MODIFY value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
        NOT NULL DEFAULT 'p';
Or as a reversible migration:
from django.db import migrations


class Migration(migrations.Migration):

    dependencies = []

    operations = [
        migrations.RunSQL(
            """
            ALTER TABLE yourtablename
                MODIFY cache_key varchar(255) CHARACTER SET utf8 COLLATE utf8_bin
                    NOT NULL,
                MODIFY value_type char(1) CHARACTER SET latin1 COLLATE latin1_bin
                    NOT NULL DEFAULT 'p'
            """,
            """
            ALTER TABLE yourtablename
                MODIFY cache_key varchar(255) CHARACTER SET utf8 NOT NULL,
                MODIFY value_type char(1) CHARACTER SET latin1 NOT NULL DEFAULT 'p'
            """,
        )
    ]