Re: SMF 2.0.7 and PHP 5.2.17

samedi 24 mai 2014

Interesting. Here's my take on the same function.function xss_safe_htmlspecialchars($data)

{

static $quoting = null;

if ($quoting === null)

$quoting = ENT_QUOTES | (defined('ENT_HTML401') ? ENT_HTML401 : 0);

if (is_array($data))

{

return array_map('xss_safe_htmlspecialchars', $data);

}


return htmlspecialchars($data, $quoting);

}




Disclaimer: I have not benchmarked it, but let me explain the rationale behind the different things.

1. There's no need to pass $charset to htmlspecialchars when otherwise you're passing the default value. Save the zvals!


2. I have no idea why they're doing this whole ENT_HTML401 thing, to be honest. Virtually everything these days is XHTML or HTML5 compliant, but in practical terms the difference isn't usually significant. But since we're doing it, let's do it and minimise the effort we have to put in. We don't need to version check it, we just need to identify whether it is defined, but there's no point in doing that every iteration - since we can expect to call this frequently. In that respect, making a static value of the quoting rules makes sense since === null is about the cheapest we can realistically do in an if. Same deal if we were actually caring about the $charset; we'd keep that and not re-establish its value each iteration.


SMF's function doesn't have to work with HTML 4.01 seeing how that SMF itself has never used it. 1.0 through 2.0 use XHTML 1.0 Transitional, 2.1 uses HTML5. Most of the time ENT_COMPAT is suitable, it's only for bbc-parsed content where ENT_QUOTES should be used.


3. There's no point doing in userland code something we can do in an in-built function. Since we're talking about applying the same function to every member of an array, and just passing it back through a recursive, we might as well just use array_map to deal with it. Bonus: we don't have to remember to unset any local variables, as both $key and $arg would have to be unset afterwards which has interesting consequences for the stack. We generally should not try to pre-empt PHP's userland GC; once we leave the function they will fall out of scope and be GC'd when the pass next runs. Trying to pre-empt it means increased memory fragmentation and while you'd get lower overall use, you do it at the cost of longer runtime overall with stack thrashing. As a bonus we even get to retain the array key associations.


Additionally note that keeping the static variables does have a memory cost but *much* lower stack thrashing because the zval doesn't have to be created and released each iteration.


Of course, if the subject and body has values that cannot be tampered with, santising is a bit much. Sanitising $_SERVER['HTTP_HOST'] would usually not be necessary anyway because it usually won't get to the page in question if not defined properly :P The URI on the other hand... that's a bit more tricky...





0 commentaires:

Enregistrer un commentaire

 

Lorem

Ipsum

Dolor