Suppose the combined listing of weather records and city location is of particular interest to your application, but you do not want to type the query each time you need it. You can create a view over the query, which gives a name to the query that you can refer to like an ordinary table:
Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces. Views can be used in almost any place a real table can be used. Building views upon other views is not uncommon.
This is just a very handy method not to write complicated queries. It does not help with performance – the complicated query is still executed every time.
Materialized views in PostgreSQL use the rule system like views do, but persist the results in a table-like form. The main differences between: CREATEMATERIALIZED VIEWmymatview ASSELECT* FROMmytab; and: CREATETABLEmymatview ASSELECT* FROMmytab; are that the materialized view cannot subsequently be directly updated and that the query used to create the materialized view is stored in exactly the same way that a view’s query is stored, so that fresh data can be generated for the materialized view with: REFRESH MATERIALIZED VIEWmymatview;
Simply speaking, the results of the view query are stored in the database – just like any other table. The only difference is that we can not update the view directly but it can be refreshed using records from its source tables.
This way, instead of executing expensive queries, we can use data organized in a more simple way in a materialized view.
Here you can find a very simple application I have created for the purpose of the blog post. It is an example of a real-life problem I have stumbled upon and extracted the essence into the repository.
The idea is simple. Users have permissions. However, permissions are not assigned directly to users. Users belong to groups and the groups have multiple permissions.
It is very probable you have already seen a similar pattern – grouping records with a similar purpose. There is nothing complicated with this design. However, getting information can require a few SQL joins.
To have some data to work on and test results I have prepared seeds.rb. It creates all permissions, 100 users, each belonging to 3 permission groups. There are 10 permission groups and each of them has 5 permissions.
Just run rails db:seed to get your database filled with data.
Solution #1 – naive approach
Suppose we want to check whether a given user has :a and :b permissions:
This translates to the following SQL query:
What do we get with EXPLAIN?
As we can see most scans are using indices so there is not much we can do to make it faster.
I have written a simple benchmark as a rake task. Using the seeded data I’m checking 1000 times whether a user has 1 permission, then 2 random permissions, 3, 4 and finally 5.
user system total real joins - 1 permission 1.750000 0.050000 1.800000 ( 2.266476) joins - 2 permissions 1.840000 0.060000 1.900000 ( 3.427442) joins - 3 permissions 1.860000 0.070000 1.930000 ( 3.497956) joins - 4 permissions 1.880000 0.060000 1.940000 ( 3.568906)
Ok, it does not tell us much. Can we make it faster?
Solution #2 – materialized views approach
This is the moment when it gets interesting. Guys from thoughtbot have created a very nice gem called scenic. It makes managing database views in Rails easy. I recommend you try it. I won’t get into details of installation and how it works. There is documentation for that.
For our problem we could create the following solution:
This generates two files. One is db/views/permissions_check_results_v01.sql, which I have filled with the code below:
What’s happening here? We are creating a simple view with two columns: user_idand permission_name (note that we are using enum – permission_name is stored as an integer).
In the example above user with id 1 has two permissions: 1 and 2, user with id 2 has only one permission: 3. Instead of joining multiple tables and checking for permissions through groups we now have a very simple, two-column table (the view).
The other file generated by the scenic task is a database migration: db/migrate/[TIMESTAMP]_create_permissions_check_results.rb
The important part is adding the materialized: true option. We also need to create a model.
It’s time to compare our previous benchmark results with new ones. Using the code from the beginning of the article and adding a few new lines:
user system total real joins - 1 permission 1.840000 0.050000 1.890000 ( 2.328792) joins - 2 permissions 1.860000 0.070000 1.930000 ( 3.452119) joins - 3 permissions 1.880000 0.060000 1.940000 ( 3.519821) joins - 4 permissions 1.950000 0.070000 2.020000 ( 3.664984)
The new results are 3x times faster. That’s a pretty good result. Some of you might have doubts:
Hey, this is only reading. What about updating records in database?
You are right – I have not mentioned this nor have I tested it. The reason is simple – I don’t care. The permissions in my case are updated very rarely. On the other hand, I need to get them very often. I care much more about reading than updating time.
If you need to write a lot of joins to get information from your database you may consider creating a materialized view.
You can think of database views as kind of an API or public methods. You get a consistent way of accessing your records. The view remains the same while the source tables may change.