Refactoring code is when you restructure existing code without changing its external behavior. Basically your aim is to make “bad” code better without changing the underlying functionality, and one of the key ways you can do this is by making sure human beings can understand it. In this article, we’ll look at some of the refactoring techniques you can use to make sure your code is clear, human-readable, and simpler to maintain.
It should be noted that there are cases when the time spent refactoring would be better spent rewriting the code from scratch. Minor tweaks and refactors can improve maintainable code, but gutting over 50% of your code may mean a rewrite is more cost-effective. Refactoring incrementally over time avoids crossing this threshold and makes that part of the code easier to maintain in the future. A previous version of this article was written by Gilbert Pellegrom and published in 2016. This version largely retains his original code examples.
Refactoring for Simplicity
There are a few principles you’ll hear about again and again when it comes to refactoring. The YAGNI (“You Aren’t Gonna Need It”) principle suggests focusing on implementing only current requirements without anticipating hypothetical future needs that may never materialize. Closely related is KISS (“Keep It Simple, Stupid”), which emphasizes designing systems simply and intuitively. These are handy principles, but we’ll start by keeping things DRY.
Don’t Repeat Yourself (DRY)
Probably the most popular programming principle you’re likely to hear is “Don’t Repeat Yourself” (DRY). If you find yourself duplicating the same code a couple of times, then you should probably encapsulate the functionality in its own class or function and use said class or function to avoid repetition. This means you only need to fix the code in one place when the inevitable bug arises months down the line.
A good example of this is when different but similar classes require some of the same functionality. The example below defines two classes, AwesomeAddon
and EvenMoreAwesomeAddon
, with the code duplicated in each class.
<?php
class AwesomeAddon {
private $settings;
public function __construct( $settings ) {
$this->set_settings( $settings );
}
protected function set_settings( $settings ) {
if ( ! is_array( $settings ) ) {
throw new \Exception( 'Invalid settings' );
}
$this->settings = $settings;
}
protected function do_something_awesome() {
//...
}
}
class EvenMoreAwesomeAddon {
private $settings;
public function __construct( $settings ) {
$this->set_settings( $settings );
}
protected function set_settings( $settings ) {
if ( ! is_array( $settings ) ) {
throw new \Exception( 'Invalid settings' );
}
$this->settings = $settings;
}
protected function do_something_even_more_awesome() {
//...
}
}
Instead of duplicating the code across two different classes, you could create a parent or abstract class which holds the common functionality, and then make the AwesomeAddon
and EvenMoreAwesomeAddon
classes to inherit from it:
<?php
abstract class Addon {
protected $settings;
protected function set_settings( $settings ) {
if ( ! is_array( $settings ) ) {
throw new \Exception( 'Invalid settings' );
}
$this->settings = $settings;
}
}
class AwesomeAddon extends Addon {
public function __construct( $settings ) {
$this->set_settings( $settings );
}
protected function do_something_awesome() {
//...
}
}
class EvenMoreAwesomeAddon extends Addon {
public function __construct( $settings ) {
$this->set_settings( $settings );
}
protected function do_something_even_more_awesome() {
//...
}
}
First, we declare an abstract Addon
class, which captures the common code for $settings
in AwesomeAddon
and EvenMoreAwesomeAddon
. Next, we added array
type before $settings
in __construct
and set_settings
methods, and void
return type in setSettings
method. This ensures we pass an array of settings when we create an instance and that function doesn’t return anything. We’ve also removed the exception for when $settings
is not an array, because the array type hint on set_settings
method will throw an exception automatically now if anything other than an array is passed to it.
Splitting up Complex Functions
Complex functions and methods can also lead to technical debt and hard-to-read code. One way to deal with this is to break up complicated functions into smaller chunks that are easy to understand (and test).
“Programs must be written for people to read, and only incidentally for machines to execute.” – Harold Abelson
Here’s an example of a complex function, a complete method for uploading a file from a WordPress site to S3 and handling pre-upload conditions, post-upload operations, and potential error situations.
<?php
function upload_attachment_to_s3( $post_id, $data = null, $file_path = null, $force_new_s3_client = false, $remove_local_files = true ) {
$return_metadata = null;
if ( is_null( $data ) ) {
$data = wp_get_attachment_metadata( $post_id, true );
} else {
// As we have passed in the meta, return it later
$return_metadata = $data;
}
if ( is_wp_error( $data ) ) {
return $data;
}
// Allow S3 upload to be hijacked/canceled for any reason
$pre = apply_filters( 'as3cf_pre_upload_attachment', false, $post_id, $data );
if ( false !== $pre ) {
if ( ! is_null( $return_metadata ) ) {
// If the attachment metadata is supplied, return it
return $data;
}
$error_msg = is_string( $pre ) ? $pre : __( 'Upload aborted by filter \'as3cf_pre_upload_attachment\'', 'amazon-s3-and-cloudfront' );
return $this->return_upload_error( $error_msg );
}
if ( is_null( $file_path ) ) {
$file_path = get_attached_file( $post_id, true );
}
// Check file exists locally before attempting upload
if ( ! file_exists( $file_path ) ) {
$error_msg = sprintf( __( 'File %s does not exist', 'amazon-s3-and-cloudfront' ), $file_path );
return $this->return_upload_error( $error_msg, $return_metadata );
}
$file_name = basename( $file_path );
$type = get_post_mime_type( $post_id );
$allowed_types = $this->get_allowed_mime_types();
// Check mime type of file is in allowed S3 mime types
if ( ! in_array( $type, $allowed_types ) ) {
$error_msg = sprintf( __( 'Mime type %s is not allowed', 'amazon-s3-and-cloudfront' ), $type );
return $this->return_upload_error( $error_msg, $return_metadata );
}
$acl = self::DEFAULT_ACL;
// Check the attachment already exists in S3, eg., edit or restore image
if ( ( $old_s3object = $this->get_attachment_s3_info( $post_id ) ) ) {
// use existing non default ACL if attachment already exists
if ( isset( $old_s3object['acl'] ) ) {
$acl = $old_s3object['acl'];
}
// Use existing prefix
$prefix = dirname( $old_s3object['key'] );
$prefix = ( '.' === $prefix ) ? '' : $prefix . '/';
// Use existing bucket
$bucket = $old_s3object['bucket'];
// Get existing region
if ( isset( $old_s3object['region'] ) ) {
$region = $old_s3object['region'];
};
} else {
// Derive prefix from various settings
if ( isset( $data['file'] ) ) {
$time = $this->get_folder_time_from_url( $data['file'] );
} else {
$time = $this->get_attachment_folder_time( $post_id );
$time = date( 'Y/m', $time );
}
$prefix = $this->get_file_prefix( $time );
// Use bucket from settings
$bucket = $this->get_setting( 'bucket' );
$region = $this->get_setting( 'region' );
if ( is_wp_error( $region ) ) {
$region = '';
}
}
$acl = apply_filters( 'as3cf_upload_acl', $acl, $data, $post_id );
$s3object = array(
'bucket' => $bucket,
'key' => $prefix . $file_name,
'region' => $region,
);
// Store acl if not default
if ( $acl != self::DEFAULT_ACL ) {
$s3object['acl'] = $acl;
}
$s3client = $this->get_s3client( $region, $force_new_s3_client );
$args = array(
'Bucket' => $bucket,
'Key' => $prefix . $file_name,
'SourceFile' => $file_path,
'ACL' => $acl,
'ContentType' => $type,
'CacheControl' => 'max-age=31536000',
'Expires' => date( 'D, d M Y H:i:s O', time() + 31536000 ),
);
$args = apply_filters( 'as3cf_object_meta', $args, $post_id );
$files_to_remove = array();
if ( file_exists( $file_path ) ) {
$files_to_remove[] = $file_path;
try {
$s3client->putObject( $args );
} catch ( Exception $e ) {
$error_msg = sprintf( __( 'Error uploading %s to S3: %s', 'amazon-s3-and-cloudfront' ), $file_path, $e->getMessage() );
return $this->return_upload_error( $error_msg, $return_metadata );
}
}
delete_post_meta( $post_id, 'amazonS3_info' );
add_post_meta( $post_id, 'amazonS3_info', $s3object );
$file_paths = $this->get_attachment_file_paths( $post_id, true, $data );
$additional_images = array();
$filesize_total = 0;
$remove_local_files_setting = $this->get_setting( 'remove-local-file' );
if ( $remove_local_files_setting ) {
$bytes = filesize( $file_path );
if ( false !== $bytes ) {
// Store in the attachment metadata for use by WP
$data['filesize'] = $bytes;
if ( is_null( $return_metadata ) ) {
// Update metadata with filesize
update_post_meta( $post_id, '_wp_attachment_metadata', $data );
}
// Add to the file size total
$filesize_total += $bytes;
}
}
foreach ( $file_paths as $file_path ) {
if ( ! in_array( $file_path, $files_to_remove ) ) {
$additional_images[] = array(
'Key' => $prefix . basename( $file_path ),
'SourceFile' => $file_path,
);
$files_to_remove[] = $file_path;
if ( $remove_local_files_setting ) {
// Record the file size for the additional image
$bytes = filesize( $file_path );
if ( false !== $bytes ) {
$filesize_total += $bytes;
}
}
}
}
if ( $remove_local_files ) {
if ( $remove_local_files_setting ) {
// Allow other functions to remove files after they have processed
$files_to_remove = apply_filters( 'as3cf_upload_attachment_local_files_to_remove', $files_to_remove, $post_id, $file_path );
// Remove duplicates
$files_to_remove = array_unique( $files_to_remove );
// Delete the files
$this->remove_local_files( $files_to_remove );
}
}
// Store the file size in the attachment meta if we are removing local file
if ( $remove_local_files_setting ) {
if ( $filesize_total > 0 ) {
// Add the total file size for all image sizes
update_post_meta( $post_id, 'wpos3_filesize_total', $filesize_total );
}
} else {
if ( isset( $data['filesize'] ) ) {
// Make sure we don't have a cached file sizes in the meta
unset( $data['filesize'] );
if ( is_null( $return_metadata ) ) {
// Remove the filesize from the metadata
update_post_meta( $post_id, '_wp_attachment_metadata', $data );
}
delete_post_meta( $post_id, 'wpos3_filesize_total' );
}
}
if ( ! is_null( $return_metadata ) ) {
// If the attachment metadata is supplied, return it
return $data;
}
return $s3object;
}
This function is quite long. The code would be cleaner, more readable, and easier to maintain if we refactored to take a more object-oriented approach where the functionalities are abstracted into methods of a class. We can make the primary function, upload_attachment_to_s3
, much easier to read by abstracting every component of the task into its own method:
<?php
function upload_attachment_to_s3( $post_id, $data = null, $file_path = null, $force_new_s3_client = false, $remove_local_files = true ) {
$return_metadata = $this->get_attachment_metadata( $post_id );
if ( is_wp_error( $return_metadata ) ) {
return $return_metadata;
}
// Allow S3 upload to be hijacked/canceled for any reason
$pre = apply_filters( 'as3cf_pre_upload_attachment', false, $post_id, $data );
if ( $this->upload_should_be_cancelled( $pre ) ) {
return $pre;
}
// Check file exists locally before attempting upload
if ( ! $this->local_file_exists() ) {
$error_msg = sprintf( __( 'File %s does not exist', 'amazon-s3-and-cloudfront' ), $file_path );
return $this->return_upload_error( $error_msg, $return_metadata );
}
// Check mime type of file is in allowed S3 mime types
if ( ! $this->is_valid_mime_type() ) {
$error_msg = sprintf( __( 'Mime type %s is not allowed', 'amazon-s3-and-cloudfront' ), $type );
return $this->return_upload_error( $error_msg, $return_metadata );
}
$s3object = $this->get_attachment_s3_info( $post_id );
$acl = $this->get_s3object_acl( $s3object );
$s3client = $this->get_s3client( $region, $force_new_s3_client );
$args = array(
'Bucket' => $s3object['bucket'],
'Key' => $s3object['key'],
'SourceFile' => $s3object['source_file'],
'ACL' => $acl,
'ContentType' => $s3object['mime_type'],
'CacheControl' => 'max-age=31536000',
'Expires' => date( 'D, d M Y H:i:s O', time() + 31536000 ),
);
$s3client->putObject( $args );
$this->maybe_remove_files( $args, $s3object );
return $s3object;
}
This is much easier to read and understand. It’s amazing the difference simply splitting up large, complex bits of code can make to a codebase. Another thing to note here is that you should use descriptive function names when splitting up complex functions like this. Remember, the aim is human readability. Trying to be too concise with your function names can actually make your code harder to understand. For example:
$this->get_att_inf( $post_id );
Is harder to understand than:
$this->get_attachment_s3_info( $post_id );
Splitting Up Complex Conditionals
Have you ever seen big conditionals that look something like:
<?php
if ( isset( $settings['wp-uploads'] ) && $settings['wp-uploads'] && in_array( $key, array( 'copy-to-s3', 'serve-from-s3' ) ) ) {
return '1';
}
Long pieces of code with conditionals are hard to read and understand. A simple solution to this is to extract the conditional code into clearly named methods. For example:
<?php
if ( upload_is_valid( $settings, $key ) ) {
return '1';
}
function upload_is_valid( $settings, $key ) {
return isset( $settings['wp-uploads'] ) && $settings['wp-uploads'] && in_array( $key, array( 'copy-to-s3', 'serve-from-s3' ) );
}
This makes your code far easier to understand for a future maintainer. As long as the conditional method is clearly named, it should be easy to understand what it does without inspecting the actual method code. This practice is known as declarative programming.
Replacing Nested Conditionals with Guard Clauses
Another way to refactor complex conditionals is to use guard clauses. Guard clauses extract all conditionals that lead to calling an exception or immediate return of a value from the method, and place it at the beginning of the method. Let’s take a look at the complex function below, created without using guard clauses.
<?php
function get_setting( $key, $default = '' ) {
$settings = $this->get_settings();
// If legacy setting set, migrate settings
if ( isset( $settings['wp-uploads'] ) && $settings['wp-uploads'] && in_array( $key, array( 'copy-to-s3', 'serve-from-s3' ) ) ) {
return $default;
} else {
// Turn on object versioning by default
if ( 'object-versioning' == $key && ! isset( $settings['object-versioning'] ) ) {
return $default;
} else {
// Default object prefix
if ( 'object-prefix' == $key && ! isset( $settings['object-prefix'] ) ) {
return $this->get_default_object_prefix();
} else {
if ( 'use-yearmonth-folders' == $key && ! isset( $settings['use-yearmonth-folders'] ) ) {
return get_option( 'uploads_use_yearmonth_folders' );
} else {
$value = parent::get_setting( $key, $default );
return apply_filters( 'as3cf_setting_' . $key, $value );
}
}
}
}
return $default;
}
Here you can see how you could quickly end up in “conditional hell” if the method got any more complex. However, when we refactor to use guard clauses, it looks like this:
<?php
function get_setting( $key, $default = '' ) {
$settings = $this->get_settings();
// If legacy setting set, migrate settings
if ( isset( $settings['wp-uploads'] ) && $settings['wp-uploads'] && in_array( $key, array( 'copy-to-s3', 'serve-from-s3' ) ) ) {
return $default;
}
// Turn on object versioning by default
if ( 'object-versioning' == $key && ! isset( $settings['object-versioning'] ) ) {
return $default;
}
// Default object prefix
if ( 'object-prefix' == $key && ! isset( $settings['object-prefix'] ) ) {
return $this->get_default_object_prefix();
}
// Default use year and month folders
if ( 'use-yearmonth-folders' == $key && ! isset( $settings['use-yearmonth-folders'] ) ) {
return get_option( 'uploads_use_yearmonth_folders' );
}
$value = parent::get_setting( $key, $default );
return apply_filters( 'as3cf_setting_' . $key, $value );
}
Now even if the method gets more complex, it’s not going to be a maintenance headache down the road.
Refactoring Loops and Conditionals Using Functional Methods
This is a slightly more advanced type of refactoring that is used more heavily in functional programming languages and libraries such as JavaScript, but can also be applied to PHP. Functions like map
and reduce
can dramatically improve the readability of your code.
The examples below were inspired by Adam Wathan’s great screencast on the subject and is based on using Laravel Collections. However, the example has been adapted to work with standard PHP functions. Speaking of Adam, his book “Refactoring to Collections” is a great guide to writing clean and maintainable PHP.
Let’s look at two common scenarios and see how they can be improved using functional methods. The example below fetches a bunch of $events
from an API and calculates a score based on the event type:
<?php
$events = file_get_contents( 'https://someapi.com/events' );
$types = array();
foreach ( $events as $event ) {
$types[] = $event->type;
}
$score = 0;
foreach ( $types as $type ) {
switch ( $type ) {
case 'type1':
$score += 2;
break;
case 'type2':
$score += 5;
break;
case 'type3':
$score += 10;
break;
default:
$score += 1;
break;
}
}
The first thing we can improve is replacing the foreach
loop with a map
function. A map function can be used when you want to create a new array from an existing array. In our example we’re creating a $types
array from the $events
array. PHP has an array_map function that will let us write the first foreach
in the above code like this:
<?php
$types = array_map( function( $event ) {
return $event->type;
}, $events );
The second thing we can do is break down the big switch
statement so it’s a bit easier to process.
<?php
$scores = array(
'type1' => 2,
'type2' => 5,
'type3' => 10,
);
$score = 0;
foreach ( $types as $type ) {
$score += isset( $scores[$type] ) ? $scores[$type] : 1;
}
But we can actually go one step further here and use PHP’s array_reduce function to calculate the $score
in a single function. A reduce function takes an array of values and reduces them to a single value. We’re going to do that here to calculate our $score
.
<?php
$scores = array(
'type1' => 2,
'type2' => 5,
'type3' => 10,
);
$score = array_reduce( $types, function( $result, $type ) use ( $scores ) {
return $result += isset( $scores[$type] ) ? $scores[$type] : 1;
} );
Putting it all together we now have:
<?php
$events = file_get_contents( 'https://someapi.com/events' );
$types = array_map( function( $event ) {
return $event->type;
}, $events );
$scores = array(
'type1' => 2,
'type2' => 5,
'type3' => 10,
);
$score = array_reduce( $types, function( $result, $type ) use ( $scores ) {
return $result += isset( $scores[$type] ) ? $scores[$type] : 1;
} );
Much better. No loops or conditionals in sight. You could imagine how, if more complexity was required to find $types
or calculate the $score
down the line, it would be easy to refactor the map
and reduce
function calls into separate methods. This is now possible because we’ve already reduced the complexity down to a function.
Further Reading
This article just scrapes the surface of refactoring, concentrating on the human readability aspects. For a deeper dive, check out PHP The Right Way or SourceMaking Refactoring guide.
How highly do you value the human readability aspects of your code? Let us know in the comments.