2019 / 06 / 01 - article (#php, #bad-idea-good-idea)

htmlspecialchars()

According to the PHP documentation, htmlspecialchars() converts special characters to HTML entities.

Bad idea: Using htmlspecialchars for “cleaning” input

This function is meant to transform HTML-related characters into their HTML entity counterparts, not to “clean” data before a save operation, e.g. a SQL INSERT (1).

We see a lot of htmlspecialchars usage for saving data into a database, which is definitely not a good thing.

<?php
// Example: what NOT to do
$username = htmlspecialchars($_POST['username']);
$db->query("SELECT * FROM users WHERE username = '$username';");

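Not only does this fail to prevent SQL injection (htmlspecialchars knows nothing about SQL syntax, and before PHP 8.1 its default flags did not even touch single quotes), but you also end up storing a modified copy of the data. Getting the original back means decoding every value with exactly the same flags and knowing it was encoded exactly once, which is hard to guarantee in practice.

This means that, by using htmlspecialchars here, you can’t offer a dependable “edit” feature, because you no longer have the original message to hand back to the user.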
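To make the injection point concrete, here is a minimal sketch. The input value is a hypothetical attacker-supplied username, and ENT_COMPAT is passed explicitly to reproduce the default behaviour of PHP versions before 8.1.

```php
<?php
// Hypothetical attacker-controlled value for $_POST['username'].
$input = "' OR '1'='1";

// ENT_COMPAT converts double quotes only; single quotes pass through.
$username = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');
$sql = "SELECT * FROM users WHERE username = '$username';";

echo $sql, PHP_EOL;
// SELECT * FROM users WHERE username = '' OR '1'='1';
```

On PHP 8.1 and later the default flags do convert single quotes, but the function still ignores characters that matter to SQL, such as backslashes, so it remains the wrong tool for this job.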
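A common way to avoid both problems is to store the raw value through a prepared statement and keep htmlspecialchars for output. The sketch below is only an illustration: it assumes PDO with the SQLite driver and a hypothetical users table, neither of which appears in the original example.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)');

// Hypothetical hostile input, as it might arrive in $_POST['username'].
$username = "O'Reilly <b>admin</b>' OR '1'='1";

// Store the raw value; the placeholder keeps it out of the SQL text.
$insert = $db->prepare('INSERT INTO users (username) VALUES (:username)');
$insert->execute([':username' => $username]);

// Look it up the same way and read back exactly what was stored.
$select = $db->prepare('SELECT username FROM users WHERE username = :username');
$select->execute([':username' => $username]);
$stored = $select->fetchColumn();

// Total number of rows, to confirm the input did not alter the statement.
$rowCount = (int) $db->query('SELECT COUNT(*) FROM users')->fetchColumn();
```

Because the database holds the untouched string, an “edit” form can show the user exactly what they typed, and escaping is applied only at display time.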

Good idea: Using htmlspecialchars to escape user-generated content on output

This function is made to be used when outputting content to a page. It replaces each HTML-special character (&, <, > and, depending on the flags, quotes) with its HTML entity counterpart, so the browser displays the text instead of interpreting it as markup.

For example, if you have a forum or a comment section, you can use this function to prevent XSS when displaying what users wrote.

<?php
// Example
$comment = 'This is a comment <script src="badstuff.js"></script> to test XSS';
// ...
?>
<article><?= htmlspecialchars($comment); ?></article>
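For reference, this is roughly what ends up in the page source. The flags and charset are passed explicitly here as an assumption, so the result does not depend on the PHP version's defaults.

```php
<?php
$comment = 'This is a comment <script src="badstuff.js"></script> to test XSS';

// ENT_QUOTES escapes both quote styles; ENT_SUBSTITUTE replaces invalid
// byte sequences instead of returning an empty string.
$escaped = htmlspecialchars($comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo $escaped, PHP_EOL;
// This is a comment &lt;script src=&quot;badstuff.js&quot;&gt;&lt;/script&gt; to test XSS
```

The browser renders those entities as visible text, so the script tag is shown to the reader rather than executed.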