Drupal Text Formats 102: Developing custom filters

As I've argued before, input filters and text formats are an unglamorous but vital part of a Drupal site. There are a great many filters available as contributed modules which you can add to your site to increase its functionality - I'm rather partial to one called Pathologic, which was my first contributed Drupal module, and is still going strong - but what if you can't find one which suits your needs and want to write one of your own? Creating your own input filter for Drupal is actually a pretty straightforward affair, as far as Drupal modules go. As a proudly self-proclaimed expert on all things input filter-related, allow me to walk you through the process.

In this tutorial, we'll create a filter which will replace the word "dog" in content with another word. (Of course, the goal here is to learn the process, not to create an actual unique or practical module.) I'll assume that you're already familiar with the basics of module writing for Drupal 7, including how the hook system works on a conceptual level. (If you're not quite at that level yet, I suggest you give the book Pro Drupal Development a look; it was invaluable to me when I was a budding Drupal hacker.)

A finished copy of the module we're going to write is available to download, if you get stumped while following along and want to see what the end result should look like.

First steps

Since our module is going to remove "dog" from content, let's give it the oh-so-clever name of "Doggone." In your Drupal site's modules directory, create a directory named "doggone".

As always when starting a new Drupal module, it's a good idea to write the .info file first. Create the "doggone.info" file and give it the following contents:

name = Doggone
description = Removes the word "dog" from filtered content.
package = "Input filters"
dependencies[] = filter
core = 7.x

Note that we're using "Input filters" as the package value. (If you've forgotten, the package value describes the section on the module list page where the module will appear.) As with other package values, this isn't strictly enforced, but it is customary for modules whose primary purpose is to provide an input filter.

Also note that we've added the "filter" module to the dependencies array; the core "filter" module is what handles all things related to the input filter/text format system. As it's a non-optional core module, adding it to the dependencies array may not seem to do much, but I still like to do it just to make it obvious to any humans or machines investigating the module dependency tree that our module's functionality is tied to the "filter" module's.

Save and close the .info file; we won't need to deal with it further.

Hooking in

Okay, now time to write some code. Create the "doggone.module" file.

The hook that we implement to tell Drupal about the filter we provide is hook_filter_info(). Like most other _info() hook implementations, your implementation will return an array of data about the filters your module provides. If you look at the documentation, there's a lot of stuff an implementation can return, but let's start with a very basic implementation for now. Type (or copy-and-paste, if you're lazy) the following into the module file:

<?php

/**
 * Implements hook_filter_info().
 */
function doggone_filter_info() {
  $filters = array();
  $filters['doggone'] = array(
    'title' => t('Remove the word "dog" from filtered text'),
    'process callback' => '_doggone_filter',
  );
  return $filters;
}
?>

So in our implementation, we're telling Drupal about one filter, which will have the system name of "doggone" (as defined by its key in the returned array). Its array has two values. The 'title' value isn't so much a title as it is a brief description of what the filter does; the user will see this on the text format editing form. (Note that it's customary to leave off the period at the end of the sentence here.) As you can probably guess, the 'process callback' is the name of the function in our module which will actually take the text to be filtered and return the altered text. Its name (and the name of the settings callback function, which we'll deal with later) begins with an underscore, as is convention for functions which we don't intend to be called by code other than our own; this is not technically required, but it is customary.

Speaking of which, let's go ahead and implement our process callback function. Process callbacks (as documented by hook_filter_FILTER_process()) get passed a whole bunch of parameters, but the only one we care about for now is the first one, $text, which will contain the text that we need to alter. Our callback will return the altered text.

<?php
/**
 * Filter callback for our doggone filter.
 */
function _doggone_filter($text, $filter, $format, $langcode, $cache, $cache_id) {
  return str_ireplace('dog', 'cat', $text);
}
?>
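
One thing worth noticing about str_ireplace() is that it matches substrings, so "dogma" would come out as "catma." If that matters for your filter, a whole-word variant might look something like the sketch below. The helper name _doggone_replace_whole_word() is hypothetical and isn't part of the example module, and the anonymous function assumes PHP 5.3 or later.

```php
<?php
/**
 * Hypothetical helper: replaces whole-word, case-insensitive matches only.
 */
function _doggone_replace_whole_word($text, $search, $replacement) {
  // preg_quote() escapes any regex metacharacters in the search word.
  $pattern = '/\b' . preg_quote($search, '/') . '\b/i';
  // Using a callback keeps "$1" or "\1" in the replacement text from being
  // interpreted as a backreference.
  return preg_replace_callback($pattern, function ($matches) use ($replacement) {
    return $replacement;
  }, $text);
}

// "dogma" is left alone because "dog" is not a whole word there.
echo _doggone_replace_whole_word('Every Dog has its dogma', 'dog', 'cat');
```

Our process callback could return the result of this helper instead of calling str_ireplace() directly; for the rest of this tutorial we'll stick with the simpler version.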

Testing

At this point, you have a functional module. Go ahead and try enabling it on your site. After doing that, if you go to create or edit a text format, you should see the filter in the list of filters you can enable for the format. Go ahead and do so now.

Now, go to create a new node of some type. In the "Body" field (or another filtered text field if the node type you're using doesn't have a "Body" field), make sure the format you've added the filter to is selected, then enter something like "Every dog has its day" in the field. Save the node, and if all goes well, the output for the field should appear as "Every cat has its day" instead. Our filter is working!

(Or is it? If it doesn't seem to be working for you, ensure that you've enabled the filter for the input format the field is using to filter the text.)

But what if you don't want to replace "dog" with "cat"? Let's alter our code to replace it with "bear" instead.

<?php
/**
 * Filter callback for our doggone filter.
 */
function _doggone_filter($text, $filter, $format, $langcode, $cache, $cache_id) {
  return str_ireplace('dog', 'bear', $text);
}
?>

Now if you go back to the node you've just saved and reload its page, you'll see that the output is still "Every cat has its day." Hmm. Okay, so let's try re-saving the node; go to the node's edit page, then click the Save button without changing the contents of the "Body" field. When you see the node view page again, you'll see that the output has changed to… oh, nope, the output is still "Every cat has its day." What gives? Where's our "bear?"

Caching caveat

Perhaps you weren't expecting Drupal to notice that you'd changed your module code and re-filter the text on its own, but why didn't it re-filter the text when you re-saved the node? The truth is that Drupal's filter system quite strictly caches the result of every bit of text which is run through a text format. When you re-saved the node without changing the contents of the "Body" field, Drupal saw that it had already filtered that exact text with that exact text format, so it grabbed the result of that filtering from its cache and used it again. If we want the text to be re-filtered, we need to change it a little. Edit the node again, and change the content of the "Body" field somehow; good old "asdf" works fine. After saving the node, you should see that "Every dog has its dayasdf" has now become "Every bear has its dayasdf" as expected.
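
To make that concrete, here's a simplified, standalone sketch of the idea. It is not Drupal's actual code: the function name sketch_check_markup() and the in-memory array are stand-ins, and it assumes the cache key is built from the format, the language, and a hash of the unfiltered text, which is roughly what core does.

```php
<?php
/**
 * Simplified stand-in for a filter cache; NOT Drupal's real implementation.
 */
function sketch_check_markup($text, $format_id, $langcode, $process_callback) {
  // Hypothetical in-memory cache; Drupal uses a persistent cache instead.
  static $cache = array();

  // The key depends only on the format, language, and unfiltered text, so
  // changing the filter's code does not change the key.
  $cache_id = $format_id . ':' . $langcode . ':' . hash('sha256', $text);
  if (isset($cache[$cache_id])) {
    return $cache[$cache_id];
  }
  $cache[$cache_id] = call_user_func($process_callback, $text);
  return $cache[$cache_id];
}

function sketch_dog_to_cat($text) {
  return str_ireplace('dog', 'cat', $text);
}

function sketch_dog_to_bear($text) {
  return str_ireplace('dog', 'bear', $text);
}

// First run: the text is filtered and the result is cached.
echo sketch_check_markup('Every dog has its day', 'filtered_html', 'und', 'sketch_dog_to_cat'), "\n";
// Same text, new code: the cached "cat" version comes back.
echo sketch_check_markup('Every dog has its day', 'filtered_html', 'und', 'sketch_dog_to_bear'), "\n";
// Changed text means a new key, so the new code finally runs.
echo sketch_check_markup('Every dog has its dayasdf', 'filtered_html', 'und', 'sketch_dog_to_bear'), "\n";
```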

You need to be aware of this behavior of Drupal's core as you're testing any input filters you're developing. It's tempting to just go to the node editing form and use repeated clicks of the "Preview" button to see your filter code in action as you hack on it, but that's not going to work unless you remember to alter the "Body" field in between clicks.
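
If you'd rather not keep editing the field, hook_filter_info() also accepts a 'cache' value; setting it to FALSE asks Drupal not to cache the output of any text format which uses the filter. Here's a sketch of what that might look like while developing, assuming the code lives in doggone.module so that Drupal's t() is available. You may need to re-save the text format and clear Drupal's caches before the change takes effect, and since it makes the whole format uncacheable, you'd want to take it out again before the module goes anywhere near a production site.

```php
<?php
/**
 * Implements hook_filter_info().
 *
 * Development-only variant that asks Drupal not to cache filtered output.
 */
function doggone_filter_info() {
  $filters = array();
  $filters['doggone'] = array(
    'title' => t('Remove the word "dog" from filtered text'),
    'process callback' => '_doggone_filter',
    // Defaults to TRUE. FALSE makes every text format using this filter
    // uncacheable, so treat it as a temporary development aid.
    'cache' => FALSE,
  );
  return $filters;
}
```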


[Screenshot: the filter tester module, which lets you enter some text and quickly see the result of that text after filtering, both as formatted and raw HTML.]


To that end, I've created a little module called Devel Input Filter to aid in input filter development. It allows you to enter some text and quickly see the result of that text after filtering, both as formatted and raw HTML - this screenshot can probably explain it better than I can. Give it a download if you want to develop input filters without any cache-related surprises. (The rest of this tutorial will assume that you're either using Devel Input Filter for testing, or are at least aware of the caching caveats when using filtered text fields on nodes.)

Add customizability with a settings form

So now we know how to filter content to replace "dog" with something else. What that something else is can be changed by altering the filter callback code. But let's spice things up a bit by allowing end users to specify what "dog" gets changed to without having to alter the code. To do that, we'll add a settings form to our filter, and also pass along some default settings.

Let's start by editing our hook_filter_info() implementation and adding two new values to our array for the 'doggone' filter: 'settings callback', which will return some form elements for editing settings for our filter (we'll worry about actually implementing this callback later), and 'default settings', which will hold some default values for our settings if they haven't been manually defined yet. We'll use one setting, 'replacement', which holds the text we want "dog" to be replaced with.

<?php
/**
 * Implements hook_filter_info().
 */
function doggone_filter_info() {
  $filters = array();
  $filters['doggone'] = array(
    'title' => t('Remove the word "dog" from filtered text'),
    'process callback' => '_doggone_filter',
    'settings callback' => '_doggone_settings',
    'default settings' => array(
      'replacement' => 'cat',
    ),
  );
  return $filters;
}
?>

We'll also alter our process callback to use the setting value in its behavior. I mentioned that callback implementations have a whole bunch of parameters, most of which we can ignore. You already saw that the first one, $text, is important, but the second one, $filter, is also of value when settings come into play. $filter is an object with a bunch of metadata about the filter and the format it's in, most of which is boring, but its settings property is an array which holds the values of the current settings. Note that we don't have to worry about whether the user has set a custom settings value, or fall back to a default value if they haven't; the values in this array will have already taken that into account.

<?php
/**
 * Filter callback for our doggone filter.
 */
function _doggone_filter($text, $filter, $format, $langcode, $cache, $cache_id) {
  return str_ireplace('dog', $filter->settings['replacement'], $text);
}
?>
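
Conceptually, the settings your callback receives are the saved values with the defaults filled in for anything that's missing. The standalone sketch below illustrates that idea with PHP's array union operator; the function name sketch_effective_settings() is made up for illustration, and this isn't a copy of what the filter module does internally.

```php
<?php
/**
 * Hypothetical illustration: saved settings win, defaults fill the gaps.
 */
function sketch_effective_settings(array $saved, array $defaults) {
  // The + operator keeps every key from the left array and only adds keys
  // from the right array that are missing on the left.
  return $saved + $defaults;
}

$defaults = array('replacement' => 'cat');

// Nothing saved yet, so the default applies.
print_r(sketch_effective_settings(array(), $defaults));
// A saved value takes precedence over the default.
print_r(sketch_effective_settings(array('replacement' => 'bear'), $defaults));
```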

Test the filter. You should see it unceremoniously replacing "dog" with "cat." If you change the 'replacement' value in our filter's 'default settings' array to "bear," you'll see our filter dutifully replacing "dog" with "bear" instead.

Okay, we're halfway there; let's implement the actual settings form. hook_filter_FILTER_settings() documents this. Settings forms will return fields as per the standard Drupal form API, but it's important to note that we don't want to return an actual, full form; just elements relevant to the settings we're using. Those elements will be added by the filter module to the actual, full filter settings form array.

Our settings form will just have a single field for specifying the replacement text. Note that we must give it the same name that we use in our 'default settings' array and in our process callback; 'replacement', in this case. Also note that implementations of this hook get a $filter parameter which contains the current settings for the filter; if the user is editing an existing format and there's no value for this setting yet, the default value will automatically be used, just as in the filtering callback function. However, if the user is creating a new format, these values won't be set yet, so you'll want to use isset() to check for that and manually retrieve the default value from the $defaults parameter if that's the case.

<?php
/**
 * Settings form callback for our doggone filter.
 */
function _doggone_settings($form, &$form_state, $filter, $format, $defaults, $filters) {
  $elements = array();
  $elements['replacement'] = array(
    '#type' => 'textfield',
    '#title' => t('Replacement text'),
    '#default_value' => isset($filter->settings['replacement']) ? $filter->settings['replacement'] : $defaults['replacement'],
  );
  return $elements;
}
?>


[Screenshot: the "Filter settings" section at the bottom of the text format configuration page, showing the form for altering our filter's replacement text.]


Now go back to the configuration form for the text format you've added your filter to. At the bottom of the page, in the "Filter settings" section, you should now see a form for altering our filter's replacement text. Give it a try; try changing it to your animal of choice, then filtering some text.

Up for a challenge? Try adding another setting to your filter which lets the user specify what word should be replaced, if they want to replace a word other than "dog." Remember to add a default setting to your hook_filter_info() implementation.
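
If you'd like something to compare your attempt against, here's one way it might go. Treat it as a sketch rather than the official answer: the setting name 'search' is just a choice made for this example, the downloadable module may not include it, and the code assumes it replaces the contents of doggone.module so that Drupal's t() is available.

```php
<?php
/**
 * Implements hook_filter_info().
 */
function doggone_filter_info() {
  $filters = array();
  $filters['doggone'] = array(
    'title' => t('Replace one word with another in filtered text'),
    'process callback' => '_doggone_filter',
    'settings callback' => '_doggone_settings',
    'default settings' => array(
      // 'search' is a hypothetical setting name chosen for this sketch.
      'search' => 'dog',
      'replacement' => 'cat',
    ),
  );
  return $filters;
}

/**
 * Filter callback for our doggone filter.
 */
function _doggone_filter($text, $filter, $format, $langcode, $cache, $cache_id) {
  // With nothing to search for, hand the text back untouched.
  if ($filter->settings['search'] === '') {
    return $text;
  }
  return str_ireplace($filter->settings['search'], $filter->settings['replacement'], $text);
}

/**
 * Settings form callback for our doggone filter.
 */
function _doggone_settings($form, &$form_state, $filter, $format, $defaults, $filters) {
  $elements = array();
  $elements['search'] = array(
    '#type' => 'textfield',
    '#title' => t('Word to replace'),
    '#default_value' => isset($filter->settings['search']) ? $filter->settings['search'] : $defaults['search'],
    '#required' => TRUE,
  );
  $elements['replacement'] = array(
    '#type' => 'textfield',
    '#title' => t('Replacement text'),
    '#default_value' => isset($filter->settings['replacement']) ? $filter->settings['replacement'] : $defaults['replacement'],
  );
  return $elements;
}
```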

That's how you do it

That's the basics of how you write a text filter. If you want to explore, there are a lot of other values you can add to your filter definition in your hook_filter_info() implementation, as well as many other parameters passed along to your process callback function, but in all but the cornerest of corner cases, you can ignore all that stuff. Even Pathologic, after over four years of development, is, at its core, the same thing we've just made here: a hook_filter_info() implementation; a process callback which returns text altered according to user settings; and a settings callback which returns form elements to allow the user to alter those settings.
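
As one example of those extra values, 'tips callback' names a function which returns a short bit of help text for users writing content in the format. Here's a minimal sketch, assuming it replaces the hook_filter_info() implementation in doggone.module; the function name _doggone_tips() and the wording of the tip are made up for illustration.

```php
<?php
/**
 * Implements hook_filter_info().
 */
function doggone_filter_info() {
  $filters = array();
  $filters['doggone'] = array(
    'title' => t('Remove the word "dog" from filtered text'),
    'process callback' => '_doggone_filter',
    'settings callback' => '_doggone_settings',
    'default settings' => array(
      'replacement' => 'cat',
    ),
    // Hypothetical callback name; see the function below.
    'tips callback' => '_doggone_tips',
  );
  return $filters;
}

/**
 * Tips callback for our doggone filter.
 */
function _doggone_tips($filter, $format, $long = FALSE) {
  // The @replacement placeholder is escaped by t() before display.
  return t('The word "dog" will be replaced with "@replacement".', array(
    '@replacement' => $filter->settings['replacement'],
  ));
}
```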

Now go out there and filter something!

File attachments:
Completed example module (801 bytes)