Drupal projects architecture
This blog post is an overview of what I talked about in my workshop at DrupalDay Bilbao 2014. Slides: https://rawgit.com/DavidHernandez/reveal.js/dday2014/index.html
Before starting, what do I mean by "Drupal project architecture"? A software project's architecture covers everything from the folder structure to the workflows and processes: how we work and which tools we use. There is no single right solution. This is just one approach that I have found works well with most of the Drupal projects I do.
Every project architecture should pursue several goals:
- Improve teamwork: help team members get started more easily and quickly, avoiding conflicts and other collisions.
- Better processes: deployments should be easy and, ideally, automated. Setting up a new environment should take minutes. We need to understand where each part of the project belongs.
- Improve confidence: lose the fear of doing a deployment. Be sure that a change is not going to break the site.
Let's start from the very beginning. How do we organize our Drupal code? Regardless of what we are talking about (themes, modules...) or how we work (installation profiles, sites/all, multisites...), all of our code should be organized so that we can easily identify what every piece of code is. The most common approach, and the one I suggest, is to create the following folder structure inside our themes and modules folders:
- contrib: where we put every contributed project we download.
- custom: where we put the code we write.
- features: when using the Features module to export configuration to code, we put the features here (this folder is not needed inside the themes folder).
- hacked: when we hack a contributed project and have no way to track the patches, we can move it to this folder. It can also be used for forks of existing contrib modules.
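As a quick sketch, this structure can be created with a few commands (the sites/all path is an assumption; with an installation profile the same layout goes inside the profile folder):

```shell
# Create the suggested split inside the modules folder.
mkdir -p sites/all/modules/contrib \
         sites/all/modules/custom \
         sites/all/modules/features \
         sites/all/modules/hacked
# Themes get the same split, minus the features folder.
mkdir -p sites/all/themes/contrib \
         sites/all/themes/custom \
         sites/all/themes/hacked
```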
Note: when creating custom code, never put it all in one super-module. For maintainability, it is better to create multiple modules.
Second note: group your custom modules based on the functionality they provide. This will help you identify modules that can be contributed back.
Instead of using the sites/all structure, this approach uses installation profiles.
Putting all the code in sites/all has been the most common way to work with Drupal for the last few years. For the few people who don't know how this method works: the only thing you have to do is drop the code inside the respective folder in sites/all. Those projects will then be available to every site installed on that platform.
The structure of an installation profile is mostly the same, with some additions. You still have the modules, themes and libraries folders, so the change will not be big.
But what advantages does an installation profile give us over just using the sites folder?
- It's common for a project to accumulate a lot of update hooks across different modules. In an installation profile, those updates belong to the profile instead of being scattered through many different modules.
- It's easier to handle the dependencies of the project. Instead of one module that lists all the dependencies (which will probably hit a timeout or memory limit error when you install it), specifying the dependencies in the profile's info file helps us in two ways: the dependencies live in a place that makes more sense, and enabling them during installation runs as a batch process.
- It's easier to replicate the project. If you follow all the good practices, you can install the distribution and get a copy of the project without the content.
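For illustration, a minimal Drupal 7 profile .info file could look like this (the profile name and module list are invented for the example):

```
; custom_profile.info (example name)
name = Custom Profile
description = Installation profile for our project.
core = 7.x

; Each dependency is enabled in a batch during installation,
; instead of one giant module hitting a timeout or memory limit.
dependencies[] = views
dependencies[] = features
dependencies[] = strongarm
```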
Drush make is another piece of the puzzle. It helps us solve some of the problems that are not covered by a good code structure or an installation profile alone. Here are some reasons to use make files:
- Keeping track of hacked projects: if we apply a patch to a module, a theme or core, we can keep track of the patch inside our make file.
- Keeping control of the updates: when you have custom code that requires a specific version of a contrib module, or you have patched contributed modules, you don't want to update those projects accidentally.
- As we list all the modules in the make file, we don't need to keep all the contrib modules under version control, just the make file.
- A make file describes more than just the list of modules and themes; it also knows the folder structure.
- We can create a clone of the project with a single command.
As make files allow us to include other make files, I suggest following the file structure used by drupal.org distributions:
- drupal-org.make: contains the list of contrib modules.
- drupal-org-core.make: contains the information of the core project (core version and patches).
- profile.make: contains the information about our installation profile.
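As a sketch, the contents of these files could look like this (the versions and the patch URL are illustrative, not real recommendations):

```
; drupal-org-core.make: core version and core patches.
api = 2
core = 7.x
projects[drupal][version] = 7.34

; drupal-org.make: the contributed projects.
api = 2
core = 7.x
projects[views][version] = 3.8
projects[features][version] = 2.2
; A patched project: drush make applies the patch after downloading,
; so the hack is documented and reproducible.
projects[ctools][version] = 1.4
projects[ctools][patch][] = "http://drupal.org/files/issues/example.patch"
```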
Once we know all of this, the next step is to track all our changes in the version control system we are using. If we have all the contrib modules in make files, and the make files inside a profile, that is what we put under version control: just our custom code, our features, the make files and the code of the profile.
Here is an example of the folder structure of our repository:
- custom_profile: The main folder that contains the rest of the code.
- modules: With all the modules we have. Including custom, features, hacked modules...
- custom: The custom code we have created for the project.
- features: Our configuration, exported to features.
- hacked: The hacked modules that we can't keep track with Drush Make.
- themes: Themes required for the project.
- custom: Our custom themes.
- custom_profile.profile: Mostly customization of the installation process.
- custom_profile.install: All the update hooks that belong to the whole project rather than to a single module.
- custom_profile.info: Lists all the dependencies and requirements of the installation profile, to be enabled when installing the site.
- custom_profile.make: Information about our custom profile, custom modules to download from other repositories...
- drupal-org.make: list of contributed projects to be downloaded.
- drupal-org-core.make: Information of the Drupal core version.
In the following GitHub repository you can find a sample structure with a skeleton of the files you can use: https://github.com/DavidHernandez/profile-boilerplate/tree/master/profile
Note: I have only talked about what we store in our repository. I haven't talked about branching approaches or different workflows. That could be content for another blog post.
Configuration to code
To convert all the configuration into code, I prefer to use the Features module, as I think it is the most complete solution until we have Drupal 8. Features is not perfect: it can create a lot of problems, and we have to understand it well in order to work with it successfully. As Features doesn't cover all the modules, we need other modules in order to get the best solution possible. When we can't achieve something with Features and its extensions, we have to use the Drupal API to cover what is left.
Some modules I recommend using with Features:
- Strongarm: Exports system variables to Features.
- Context (or Panels): the Drupal block system is flawed and a nightmare for deployments, because it mixes content (the content of the blocks) with configuration (the placement and other settings of the block). So we replace the configuration part with Context (or Panels).
- Beans or Boxes are also good replacements for the block system.
But now, how do we organize our features?
Probably the most common approach is to export features based on functionality; that is, grouping all the related parts into one feature. The Features module was designed to be used this way. One of the biggest advantages is that you can export a feature to a different project that requires the same functionality. But it also has some drawbacks: you have to be very careful when exporting features so you don't export the same component twice, and the dependency tree can become complicated and unintuitive.
I propose a different solution. Features groups the exportables into component types: field instances, field bases, content types, views... So, instead of grouping different components into different features, just group all the components of the same type into one feature! If you do it this way you will hardly be able to reuse a feature in a different project, but be honest: how often does that happen? In exchange, we can automate the generation of the features, the dependency tree will always be the same, and it is really easy to know which feature each component goes in.
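For example, with one feature per component type, exporting with drush could look like the following sketch (the feature and component names are invented; the commands assume the Features 2.x drush integration on an installed site):

```
# Export each component type into its own feature, enabling each one
# before generating the next to avoid duplicated exports.
drush features-export -y project_content_types node:article node:page
drush pm-enable -y project_content_types
drush features-export -y project_field_bases field_base:field_image
drush pm-enable -y project_field_bases
drush features-export -y project_views views_view:frontpage
drush pm-enable -y project_views
```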
Note: try to avoid core's custom blocks, as that system puts content into the configuration: it needs the ID of the block (content) to specify the configuration of the block.
Handling the updates
When something is not covered by Features, we have to cover it using the Drupal API. That is easily achievable with three hooks: hook_update_N, hook_install and hook_uninstall!
- hook_install includes all the internal jobs that need to be done to set up a new module or environment.
- hook_update_N are all the updates between versions of the project.
- hook_uninstall are all the jobs that need to be done to delete the configuration and content of the module.
The first two are used in installation profiles; the last one is not necessary there, as "uninstalling" a profile means destroying the database.
hook_install should contain all the changes that were also done in the update hooks, in case the module or profile is reinstalled from scratch, something that usually happens when re-installing a local environment or when someone new joins the project.
We also use hook_update_N to automatically enable or disable modules in a profile. The goal is to be able to deploy with zero clicks through the web interface. All the configuration should be handled through Features and update hooks.
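A sketch of what this looks like in the profile's .install file, using the Drupal 7 API (the module and feature names are invented for the example):

```php
/**
 * Enable the new contrib module and apply the exported configuration.
 */
function custom_profile_update_7001() {
  // Enable the contributed module plus the feature holding its settings.
  module_enable(array('pathauto', 'custom_profile_variables'));
  // Revert the feature so the exported variables take effect.
  features_revert(array('custom_profile_variables' => array('variable')));
}

/**
 * Implements hook_install().
 * Replays the updates so a fresh install matches the deployed site.
 */
function custom_profile_install() {
  custom_profile_update_7001();
}
```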
Note: hook_update_N supports batch operations, but these will not run when the update is called from hook_install. So calling the update hooks from hook_install to replicate the latest version of the database may not always work.
What kinds of things are not covered by Features? Migrating data between content types or from external sites, configuring modules that don't implement the ctools exportables API, configuring our own custom modules...
Automatization of tasks
When a new environment has to be set up (a new machine, a new team member, a new computer...), we download the code we keep under version control. But this code is not a valid Drupal installation, as it doesn't contain Drupal core or the contributed modules and themes. However, since we are using Drush make, if the files include each other correctly we can prepare an installation with a single command: "drush make file.make --destination=/path/to/installation".
But instead of using this directly, we can follow the way drupal.org builds the downloadable packages of the distributions. Drupal.org uses a script to build each distribution, and you can find these scripts in most distributions (see Commerce Kickstart, for example). Usually those scripts are prepared to work only with the profile you downloaded, but you can easily rewrite them for your own profile, or you can use the script that I created, which works for every profile (Drupal 7 sites only!).
As you start working on your site, a moment will arrive when a new contributed project is added. Since contrib code is not versioned, is there an easy way to download the contrib modules without doing a full rebuild of the site? Yes, there is! As we have all the contrib modules in a separate make file, we can use drush to rebuild just that part, downloading only the contrib modules. All the parameters can be hard to remember, so you can create a drush alias for them or use another script that just launches that command. You can find that script at the previous GitHub link.
But usually a deploy is more than adding a contributed module: we have to enable it, add configuration... Rebuilding the contributed modules is not enough. What steps do we have to cover to fully update the site without a full rebuild? Well, as our profile is versioned, first we do a git pull (svn update, or whatever command your version control system uses); then we rebuild the contrib modules; then we update the database to enable modules and run other updates; and we finish by reverting the features to load the new configuration and clearing the cache, to be sure the changes appear as soon as possible. Those commands can easily be added to a script too, and you can find one in the same repository as before.
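Those steps fit into a short script. A sketch could look like this (it assumes git, drush and the make file layout described earlier; the contrib destination is an assumption):

```
#!/bin/sh
# Update the versioned code: profile, custom modules, features, make files.
git pull
# Rebuild only the contributed projects listed in the make file.
drush make --no-core --contrib-destination=. drupal-org.make -y
# Run pending hook_update_N implementations (module enabling, etc.).
drush updb -y
# Revert all features so the exported configuration takes effect.
drush fra -y
# Clear all caches so the changes are visible immediately.
drush cc all
```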
Another part we can automate, thanks to the integration of the Features module with drush, is generating the features we need for the project. But only if we follow the pattern of one feature per component type. To avoid conflicts we have to be careful: after generating a feature, we have to enable it before generating the next one. Also, be sure to generate them in the correct order; otherwise you will end up with features containing mixed component types. With a bit of "bash magic" we can use the same script to generate new features when needed, or to recreate the existing ones with new changes. As before, that script can be found in the GitHub repository; it covers some of the most common contributed modules. When adding a new module that is not covered by the script, you will have to carefully add it in the correct position (and don't forget to create a pull request!).
To deploy our changes between environments, there are several approaches we can follow. I follow the "Capistrano way", without using Capistrano itself. I will briefly introduce it here; if you are interested in other deployment processes, you can watch the session about Drupal deployments I gave at DrupalCamp Spain 2014, or just check the slides.
How does Capistrano do a deployment? The process proposed by Capistrano consists of creating a new full installation for each deployment. This can be slower than just updating the modified files, but it is safer, and the user will not notice any downtime: you build the new site in the background, and when it is ready you switch the active folder and the new site is immediately available.
But we have to be careful with a couple of things: what happens with the files folder? What happens with the configuration files? A good solution is to create a folder structure outside the Drupal installation. We create one folder for the files (plus another one if we have a private file system) and another folder for the configuration files (usually just settings.php). We create yet another folder where we deploy the new releases, and the active release is linked from a fixed place, which is where the virtual host always points.
When we want to deploy, we create a release in the releases folder using our previous build script. We copy (or link) the configuration files to where they belong, and we link the files folder. We remove the active link and point it to the new release. Then we can execute the database updates and revert the features. Clear the caches, and the site has been deployed! Of course, you can automate all of that too with a simple script.
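The release-switching part can be sketched with plain shell commands (the folder names are invented; a real deployment would build the release with the build script first):

```shell
# One folder per release, plus shared folders that survive deployments.
mkdir -p releases/20141201/sites/default shared/files shared/config
touch shared/config/settings.php

# Link the user files and copy the configuration into the new release.
ln -s ../../../../shared/files releases/20141201/sites/default/files
cp shared/config/settings.php releases/20141201/sites/default/

# Atomically switch the active site: the virtual host always points
# at "current", so the new release goes live in a single operation.
ln -sfn releases/20141201 current
```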
All of the previous GitHub links are part of a Boilerplate that implements all of this in this GitHub repository: https://github.com/DavidHernandez/profile-boilerplate.
We have now seen everything necessary for a good process for working with Drupal. Next, let's see an example of how we normally work with everything covered so far.
We have been working all day on a new user story and we are ready to deploy it. We need a new contributed module, which we add to the make file. We configure it and export the settings to Features. We create a new hook_update_N, where we enable the contributed module and the feature. And to finish, we add the contributed module and the new feature to the info file, in case a new site needs to be installed. With this, all of our work in Drupal is finished.
We add the changed files (make, info and install files) and the new ones (the feature) and commit. We push the changes and we are ready to deploy: SSH to the server and execute the deploy script. In a few seconds the changes will be there, available for everybody.
Note: we are not considering the different environments: live, stage, development... Each of these can have a more complex deployment process, like syncing the database from live to stage, from stage to dev, or from dev to local. In dev maybe we don't need to create new releases, just use the update script...
Second note: I haven't mentioned backups. You have to take periodic backups, not only before a deploy.
Third note: if something goes wrong, we usually keep multiple releases, so we always have copies of previous code states. Plus, if our backups are working, we can restore the site quickly.
A next step to improve our systems can be to add testing. Up to Drupal 7 we had SimpleTest (contrib in D6, in core since 7). In Drupal 8 we also have PHPUnit for testing. Apart from that, we can add BDD with Behat, or functional testing with CasperJS or Selenium.
There are multiple ways to do testing; find the one that works best for you and don't hesitate to try it.
If we have testing, we can then add continuous integration, with a previous step for the deployments: if the tests are green, proceed with the deploy; if not, send it back to the developer.