HTTP provides an efficient mechanism for taking advantage of the Web browser’s cache and of other client capabilities. It reduces the required bandwidth and processor time, and improves response time.
When a client asks for a document the first time, the document is transmitted. When the same client asks for this document again, the document might have been modified in the meantime, so the client sends along with its request the date and the identifier of the last version it received. The server sends back the document only if it has been modified since the client’s version; otherwise it sends a not-modified response. In all cases, the client also announces its capabilities, and the communication is optimised accordingly, with compression and persistent connections.
Summary: I propose a free library (only one function) to handle the different kinds of conditional requests (304, 412), HEAD requests, cache management at client and proxy level, and compression of data. In RSS/Atom mode, it allows filtering the articles by date on the server side, in order to transfer only the new articles to the client. It includes basic support for sessions. No modification of the PHP or HTTP server configuration is needed, and there is no need to add any software on the client or server side. In order to use it, the library has to be included at the top of the PHP script with a require() and one function has to be called.
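Last-Modified: Thu, 08 Jul 2004 17:33:54 GMT
Server header giving the date of the last modification of the document.
Etag: "82e81980-27f3-4a6ae480"
Server header giving an identifier of the current version of the document. It can be used together with Last-Modified or alone (HTTP/1.1).
Because the Etag is an encoded value, it can hide the last modification date if needed.
See reference for ETag format.
If-Modified-Since: Thu, 08 Jul 2004 17:33:54 GMT
Client header repeating the value of the Last-Modified server header.
If-None-Match: "82e81980-27f3-4a6ae480"
Client header repeating the value of the Etag server header.
Cache-Control: private, max-age=0, must-revalidate
Server header controlling how the client and the proxies may cache the document.
Accept-Encoding: gzip,deflate
Client header listing the compression formats that the client can handle.
Content-Encoding: gzip
Server header giving the compression applied to the response, chosen among the formats listed in Accept-Encoding.
Content-Length: 3495
Size of the body of the response, in bytes.
Connection: keep-alive
Asks for a persistent connection, which requires Content-Length to be set.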
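As an illustration of the Accept-Encoding / Content-Encoding negotiation, here is a minimal sketch assuming PHP 7 or later. The function chooseEncoding() is a hypothetical helper that is not part of the library, and it deliberately ignores quality values such as "gzip;q=0".

```php
<?php
// Hypothetical helper (not part of http-conditional.php):
// picks a Content-Encoding from the client's Accept-Encoding header.
// Simplification: quality values such as ";q=0.5" are ignored.
function chooseEncoding(string $acceptEncoding): string
{
    $offered = array_map(function ($token) {
        // Keep only the coding name, dropping any ";q=..." parameter.
        return strtolower(trim(explode(';', $token)[0]));
    }, explode(',', $acceptEncoding));

    foreach (['gzip', 'deflate'] as $coding) {
        if (in_array($coding, $offered, true)) {
            return $coding;
        }
    }
    return 'identity'; // No compression
}

// Example with the header shown above
echo chooseEncoding('gzip,deflate'), "\n";
```

In the library itself this negotiation is delegated to ob_gzhandler(); the sketch only shows the principle.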
See my documentation about HTTP and HTML redirections for some comments about HTTP headers.
In the following example, only headers relevant for conditional requests are shown.
1. The client asks for the document the first time
GET /test.php HTTP/1.1

2. The server sends the document
HTTP/1.x 200 OK
Date: Thu, 08 Jul 2004 17:42:26 GMT
Last-Modified: Thu, 08 Jul 2004 17:33:54 GMT
Etag: "82e81980-27f3-4a6ae480"

<html> ... </html>

3. The client asks for the same document a second time and gives the references of the version it has (in its cache)
GET /test.php HTTP/1.1
If-Modified-Since: Thu, 08 Jul 2004 17:33:54 GMT
If-None-Match: "82e81980-27f3-4a6ae480"

4. The server replies with a not-modified header since the document has not been modified since the client’s version
HTTP/1.x 304 Not Modified
Date: Thu, 08 Jul 2004 17:46:31 GMT
Etag: "82e81980-27f3-4a6ae480"

5. The client asks for the same document a third time
GET /test.php HTTP/1.1
If-Modified-Since: Thu, 08 Jul 2004 17:33:54 GMT
If-None-Match: "82e81980-27f3-4a6ae480"

6. The server provides the newest version because the document has been modified since the client’s version
HTTP/1.x 200 OK
Date: Thu, 08 Jul 2004 17:48:54 GMT
Last-Modified: Thu, 08 Jul 2004 17:48:52 GMT
Etag: "82e81980-2bf2-7ff14900"

<html> ... </html>
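The decision taken by the server in steps 2, 4 and 6 can be sketched as follows, assuming PHP 7 or later. The function conditionalStatus() is a hypothetical, simplified helper: it only compares the two validators exactly and ignores If-Match, If-Unmodified-Since and lists of entity tags.

```php
<?php
// Hypothetical helper (not part of http-conditional.php).
// Returns 304 when every validator sent by the client matches the server's
// current values, and 200 otherwise.
function conditionalStatus(array $server, string $lastModified, string $etag): int
{
    $conditions = 0;
    $fresh = true;
    if (isset($server['HTTP_IF_MODIFIED_SINCE'])) {
        $conditions++;
        $fresh = $fresh && ($server['HTTP_IF_MODIFIED_SINCE'] === $lastModified);
    }
    if (isset($server['HTTP_IF_NONE_MATCH'])) {
        $conditions++;
        $fresh = $fresh && ($server['HTTP_IF_NONE_MATCH'] === $etag);
    }
    // At least one condition is needed before a 304 may be returned.
    return ($fresh && $conditions > 0) ? 304 : 200;
}

$lastModified = 'Thu, 08 Jul 2004 17:33:54 GMT';
$etag = '"82e81980-27f3-4a6ae480"';

// Step 1: first visit, no validator is sent
echo conditionalStatus([], $lastModified, $etag), "\n";

// Step 3: the client repeats the validators it received
echo conditionalStatus([
    'HTTP_IF_MODIFIED_SINCE' => $lastModified,
    'HTTP_IF_NONE_MATCH' => $etag,
], $lastModified, $etag), "\n";
```

Requiring at least one condition avoids answering 304 to a client that has nothing in its cache.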
This mechanism is normally handled automatically by HTTP servers (Apache, IIS, ...) for static documents such as HTML pages, JPEG pictures, etc., but it is the programmer’s responsibility to manage it for dynamic documents like PHP, CGI, etc.
No additional software is needed, on the server or on the client side. Most current servers and browsers are natively compatible with this technique. If the client is not HTTP/1.1 compliant, the optimisation no longer applies but the communication still works normally.
I propose a module, to be included at the top of your PHP pages, that automatically manages those conditional requests in order to save processor time and bandwidth, and to allow faster navigation for the client. It can control the cache mechanism in the client and in proxies, and can compress data. It also has a special RSS/ATOM feed feature to send only the new articles to the client. Basic support for sessions is included.
This module takes care of the different conditional requests, uses the last modification date of the script itself, and handles HEAD requests.
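Before sending any text to the client, you just have to call the function httpConditional() with:
$UnixTimeStamp (required)
Date of the last modification of the data, as a Unix timestamp. httpConditional() takes care of the modification date of the calling script itself.
$cacheSeconds=0 (implied)
Lifetime of the cached copy, in seconds. With 0, a conditional revalidation against the server is requested at each access to the document; with a negative value, caching is disabled.
$cachePrivacy=0 (implied)
Cache policy: 0 for private, 2 for public; with any other value (for example 1), no privacy directive is sent.
$feedMode=false (implied)
Special mode for RSS/ATOM feeds. When true, the global variable $clientCacheDate will contain the date of the client’s cache version, the cache policy is forced to private, the connection is closed quickly since the client usually takes only one file, and the last modification of the script is not taken into account.
$compression=false (implied)
Enables compression of the data when the client can handle it. The level of compression can be adjusted with zlib.output_compression_level in php.ini or with ini_set('zlib.output_compression_level',7) for example (1..9).
$session=false (implied)
When true, the library checks whether $_SESSION has been modified since the last generation of the document. This requires disabling the automatic generation of headers with session_cache_limiter(''); and/or session.cache_limiter='' in php.ini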
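To make the effect of $cacheSeconds and $cachePrivacy more concrete, the following sketch (assuming PHP 7 or later) reproduces how the Cache-Control value is derived from these two parameters. The function cacheControlValue() is a hypothetical name used for illustration; in the library this logic lives inside httpConditional().

```php
<?php
// Hypothetical helper mirroring how httpConditional() derives Cache-Control
// from $cacheSeconds and $cachePrivacy (0 = private, 2 = public).
function cacheControlValue(int $cacheSeconds = 0, int $cachePrivacy = 0): string
{
    if ($cacheSeconds < 0) {
        // Caching disabled
        return 'private, no-cache, no-store, must-revalidate';
    }
    if ($cacheSeconds === 0) {
        $cache = 'private, must-revalidate, '; // Revalidation at each access
    } elseif ($cachePrivacy === 0) {
        $cache = 'private, ';
    } elseif ($cachePrivacy === 2) {
        $cache = 'public, ';
    } else {
        $cache = ''; // No privacy directive
    }
    return $cache . 'max-age=' . $cacheSeconds;
}

echo cacheControlValue(), "\n";       // private, must-revalidate, max-age=0
echo cacheControlValue(180, 2), "\n"; // public, max-age=180
```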
Basic usage:
example.php
<?php
require_once('http-conditional.php');
//Date of the last modification of the content (Unix Timestamp format)
//Example: request from a database
$dateLastModification=...;
if (httpConditional($dateLastModification))
{//No modification since the client’s version
 ... //Close the database, and other cleaning
 exit(); //No need to send anything else
}
//!\ Do not send any text to the client before this line
... //The rest of the script, just like if this first part was not used
?>
http-conditional.php
<?php
/*Optimisation: Enable support for HTTP/1.x conditional requests in PHP.*/

//In RSS/ATOM feedMode, contains the date of the client’s last update.
$clientCacheDate=0; //Global variable because PHP4 does not allow optional arguments by reference
$_sessionMode=false; //Global private variable

function httpConditional($UnixTimeStamp,$cacheSeconds=0,$cachePrivacy=0,$feedMode=false,$compression=false,$session=false)
{//Credits: https://alexandre.alapetite.fr/doc-alex/php-http-304/
 //RFC2616 HTTP/1.1: http://www.w3.org/Protocols/rfc2616/rfc2616.html
 //RFC1945 HTTP/1.0: http://www.w3.org/Protocols/rfc1945/rfc1945.txt

 //If HTTP headers are already sent, too late, nothing to do.
 if (headers_sent()) return false;

 if (isset($_SERVER['SCRIPT_FILENAME'])) $scriptName=$_SERVER['SCRIPT_FILENAME'];
 elseif (isset($_SERVER['PATH_TRANSLATED'])) $scriptName=$_SERVER['PATH_TRANSLATED'];
 else return false;

 if ((!$feedMode)&&(($modifScript=filemtime($scriptName))>$UnixTimeStamp))
  $UnixTimeStamp=$modifScript; //Last modification date, of the data and of the script itself
 //The modification date must not be newer than the current time on the server
 $UnixTimeStamp=min($UnixTimeStamp,time());

 //If the conditional request allows to use the client’s cache
 $is304=true;
 //If the conditions are refused
 $is412=false;
 //There is a need for at least one condition to allow a 304 Not Modified response
 $nbCond=0;

 /*
 Date format: http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3.1
 The lowest common denominator between the different standards that have been used is like:
 Mon, 28 Jun 2004 18:31:54 GMT
 It is compatible HTTP/1.1 (RFC2616,RFC822,RFC1123,RFC733) and HTTP/1.0 (Usenet getdate(3),RFC850,RFC1036).
 */
 $dateLastModif=gmdate('D, d M Y H:i:s \G\M\T',$UnixTimeStamp);
 $dateCacheClient='Tue, 10 Jan 1980 20:30:40 GMT';

 //Entity tag (Etag) of the returned document.
 //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.19
 //Must be modified if the filename or the content have been changed
 if (isset($_SERVER['QUERY_STRING'])) $myQuery='?'.$_SERVER['QUERY_STRING'];
 else $myQuery='';
 if ($session&&isset($_SESSION))
 {//In the case of sessions, integrate the variables of $_SESSION in the ETag calculation
  global $_sessionMode;
  $_sessionMode=$session;
  $myQuery.=print_r($_SESSION,true).session_name().'='.session_id();
 }
 $etagServer='"'.md5($scriptName.$myQuery.'#'.$dateLastModif).'"'; //='"0123456789abcdef0123456789abcdef"'

 if ((!$is412)&&isset($_SERVER['HTTP_IF_MATCH']))
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24
  $etagsClient=stripslashes($_SERVER['HTTP_IF_MATCH']);
  //Compare the current Etag with the ones provided by the client
  $etagsClient=str_ireplace('-gzip','',$etagsClient);
  $is412=(($etagsClient!=='*')&&(strpos($etagsClient,$etagServer)===false));
 }
 if ($is304&&isset($_SERVER['HTTP_IF_MODIFIED_SINCE']))
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.25
  //http://www.w3.org/Protocols/rfc1945/rfc1945.txt
  //Get the date of the client’s cache version
  //No need to check for consistency, since a string comparison will be made.
  $nbCond++;
  $dateCacheClient=$_SERVER['HTTP_IF_MODIFIED_SINCE'];
  $p=strpos($dateCacheClient,';'); //Internet Explorer is not standard compliant
  if ($p!==false) //IE6 might give "Sat, 26 Feb 2005 20:57:12 GMT; length=134"
   $dateCacheClient=substr($dateCacheClient,0,$p); //Removes the information after the date added by IE
  //Compare the current document’s date with the date provided by the client.
  //Must be identical to return a 304 Not Modified
  $is304=($dateCacheClient==$dateLastModif);
 }
 if ($is304&&isset($_SERVER['HTTP_IF_NONE_MATCH']))
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
  //Compare Etags to check if the client has already the current version
  $nbCond++;
  $etagClient=stripslashes($_SERVER['HTTP_IF_NONE_MATCH']);
  $etagClient=str_ireplace('-gzip','',$etagClient);
  $is304=(($etagClient===$etagServer)||($etagClient==='*'));
 }
 //$_SERVER['HTTP_IF_RANGE']
 //This library does not handle this condition. Returns a normal 200 in all the cases.
 //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.27
 if ((!$is412)&&isset($_SERVER['HTTP_IF_UNMODIFIED_SINCE']))
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.28
  $dateCacheClient=$_SERVER['HTTP_IF_UNMODIFIED_SINCE'];
  $p=strpos($dateCacheClient,';');
  if ($p!==false)
   $dateCacheClient=substr($dateCacheClient,0,$p);
  $is412=($dateCacheClient!==$dateLastModif);
 }
 if ($feedMode)
 {//Special RSS
  global $clientCacheDate;
  $clientCacheDate=@strtotime($dateCacheClient);
  $cachePrivacy=0;
 }

 if ($is412)
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4.13
  header('HTTP/1.1 412 Precondition Failed');
  header('Content-Type: text/plain');
  header('Cache-Control: private, max-age=0, must-revalidate');
  echo "HTTP/1.1 Error 412 Precondition Failed: Precondition request failed positive evaluation\n";
  //The response is finished; the request has been aborted
  //because the document has been modified since the client has decided to do an action
  //(avoid edition conflicts for example)
  return true;
 }
 elseif ($is304&&($nbCond>0))
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.3.5
  header('HTTP/1.0 304 Not Modified');
  header('Etag: '.$etagServer);
  if ($feedMode) header('Connection: close'); //You should comment this line when running IIS
  return true; //The response is over, the client will use the version in its cache
 }
 else //The request will be handled normally, without condition
 {//http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.2.1
  //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3
  if ($compression&&isset($_SERVER['HTTP_ACCEPT_ENCODING'])&&
   extension_loaded('zlib')&&(!ini_get('zlib.output_compression')))
   ob_start('_httpConditionalCallBack'); //Use compression.
  //ob_gzhandler() will check HTTP_ACCEPT_ENCODING and put correct headers
  //header('HTTP/1.0 200 OK'); //By default in PHP
  //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
  if ($cacheSeconds<0)
  {
   $cache='private, no-cache, no-store, must-revalidate';
   header('Pragma: no-cache');
  }
  else
  {
   if ($cacheSeconds==0) $cache='private, must-revalidate, ';
   elseif ($cachePrivacy==0) $cache='private, ';
   elseif ($cachePrivacy==2) $cache='public, ';
   else $cache='';
   $cache.='max-age='.floor($cacheSeconds);
  }
  header('Cache-Control: '.$cache);
  header('Last-Modified: '.$dateLastModif);
  header('Etag: '.$etagServer);
  //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.10
  //No need to keep a connection opened for RSS/ATOM feeds
  //since most of the time clients take only one file
  if ($feedMode) header('Connection: close'); //You should comment this line when running IIS
  //http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.4
  //In the case of a HEAD request,
  //the same headers as a GET request must be returned,
  //but the script does not need to calculate any content
  return $_SERVER['REQUEST_METHOD']=='HEAD';
 }
}

function _httpConditionalCallBack(&$buffer,$mode=5)
{//Private function automatically called at the end of the script when compression is enabled
 //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11
 //You can adjust the level of compression with zlib.output_compression_level in php.ini
 if (extension_loaded('zlib')&&(!ini_get('zlib.output_compression')))
 {
  $buffer2=ob_gzhandler($buffer,$mode); //Will check HTTP_ACCEPT_ENCODING and put correct headers
  if (strlen($buffer2)>1) //When ob_gzhandler succeeded
   $buffer=$buffer2;
 }
 header('Content-Length: '.strlen($buffer)); //Allows persistent connections
 //http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.13
 return $buffer;
}

function httpConditionalRefresh($UnixTimeStamp)
{//Update HTTP headers if the content has just been modified by the client’s request
 //See an example on https://alexandre.alapetite.fr/doc-alex/compteur/
 if (headers_sent()) return false;

 if (isset($_SERVER['SCRIPT_FILENAME'])) $scriptName=$_SERVER['SCRIPT_FILENAME'];
 elseif (isset($_SERVER['PATH_TRANSLATED'])) $scriptName=$_SERVER['PATH_TRANSLATED'];
 else return false;

 $dateLastModif=gmdate('D, d M Y H:i:s \G\M\T',$UnixTimeStamp);

 if (isset($_SERVER['QUERY_STRING'])) $myQuery='?'.$_SERVER['QUERY_STRING'];
 else $myQuery='';
 global $_sessionMode;
 if ($_sessionMode&&isset($_SESSION))
  $myQuery.=print_r($_SESSION,true).session_name().'='.session_id();
 $etagServer='"'.md5($scriptName.$myQuery.'#'.$dateLastModif).'"';

 header('Last-Modified: '.$dateLastModif);
 header('Etag: '.$etagServer);
}
?>
With $dateLastModification being your variable containing the date of the last modification of the data in the UNIX format, here are the recommended arguments to pass to the function in different cases:
httpConditional($dateLastModification,18000,1,false,true)
httpConditional($dateLastModification,3600,0,false,true)
httpConditional($dateLastModification,5,0,false,true)
httpConditional($dateLastModification,3600,0,true,true)
In this last case (feed mode), use $clientCacheDate to filter the articles by date in your SQL query.
Some useful functions to manage dates and give them to httpConditional() in the UNIX format:
time()
strtotime()
getlastmod()
filemtime()
MySQL UNIX_TIMESTAMP()
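As a small illustration of these date functions, the sketch below converts an HTTP date into a Unix timestamp with strtotime() and back with gmdate(), using the same format string as the library. The variable names are only for illustration.

```php
<?php
// HTTP date format used in Last-Modified and If-Modified-Since (always GMT)
$httpDateFormat = 'D, d M Y H:i:s \G\M\T';

// From an HTTP date to a Unix timestamp, suitable for httpConditional()
$timestamp = strtotime('Thu, 08 Jul 2004 17:33:54 GMT');

// And back to the HTTP date format
$httpDate = gmdate($httpDateFormat, $timestamp);
echo $httpDate, "\n";

// The current time is also a valid Unix timestamp
echo gmdate($httpDateFormat, time()), "\n";
```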
This module has been tested for example with PHP/4/5/7 under Apache/1.3/2.0 and IIS/5.1. Not all the clients are able to handle all the optimisations, but the communication has been perfect with all the tested clients, like InternetExplorer/5.0/5.5/6.0, Netscape/1.22/2.02/3.04/4.8/Mozilla, Opera/7, SharpReader/0.9.5, RSSreader/1.0.88...
Now, here are some examples using this library.
In order to get the full power from conditional requests, the last modification date must be quickly and easily accessible. Also, an optimisation of the database is sometimes a good idea. For example, a table containing the main needed modification dates can be very efficient.
This simple case uses PHP to display an article that is stored in a MySQL database.
The table that contains articles has a field called “modified”, which contains a date in MySQL format.
We want to retrieve with SQL the date of the last modification of this article as a UNIX timestamp.
article.php
<?php
require_once('http-conditional.php');
//Note: the mysql_* functions were removed in PHP 7; use mysqli or PDO with recent versions
if (isset($_GET['id'])) $num=(int)$_GET['id']; //Reference of the article, cast to an integer before being used in SQL
else $num=0;
if (($connect=mysql_connect('server','user','password'))&&mysql_select_db('mybase'))
{
 $query='SELECT UNIX_TIMESTAMP(ar.modified) AS lastmod FROM articles ar WHERE ar.id='.$num;
 if (($result=mysql_query($query))&&($row=mysql_fetch_assoc($result)))
 {
  $dateLastModification=$row['lastmod'];
  mysql_free_result($result);
  if (httpConditional($dateLastModification,0,0,false,true)) //Private policy, compression
  {//No modification since the client’s last update
   mysql_close($connect);
   exit();
  }
 }
}
else $connect=false;
?>
<html>
...
<?php
if ($connect)
{
 ... //Other requests to the database
 mysql_close($connect);
}
?>
...
</html>
In this example, we have used compression, and a private cache policy, with a lifetime of 0. If this article is public, with no access condition, it is possible to save resources by activating the cache and with a public policy, as we will see in the next example about the dynamic PNG picture.
In this case, we want to generate a dynamic PNG picture.
The picture is just a label, and the title comes from the label.txt text file. So the date of the last modification of the picture is the date of the last modification of this text file, which contains the real data.
This public picture is accessed very frequently. We wish to keep copies of it in clients’ Web browsers and in intermediate caches, such as proxies, Internet providers, etc. We choose a lifetime of 180 seconds for these copies. This is a trade-off between resource usage and freshness; it depends on how often the picture is modified.
image.png.php
<?php
require_once('http-conditional.php');
header('Content-type: image/png');
//Modification date of the file that contains the title
$dateLastModification=filemtime('label.txt');
if (httpConditional($dateLastModification,180,2)) //Public cache, 180 seconds
 exit(); //No modification since the client’s last update
if ($handle=fopen('label.txt','r')) //This file contains “Hello World!”
{
 $label=fread($handle,255); //Read the title for the label
 fclose($handle);
}
else $label='Error';
$im=@imagecreate(120,30) or die('GD library error');
$bgcolor=imagecolorallocate($im,160,150,255);
$color=imagecolorallocate($im,5,5,5);
imagestring($im,5,7,7,$label,$color);
imagepng($im);
imagedestroy($im);
?>
In this case, we want to know with one SQL query the most recent modification date of elements stored in several tables.
This RSS1.0 feed is composed of data coming from 3 tables: “articles”, “news”, “documents”.
Each table has a field called “date” containing a UNIX timestamp.
In order to avoid retransmitting the same articles each time the client asks for an update of the RSS feed, we will filter the articles by date on the server side, and send to the client only the articles that are newer than its version ($clientCacheDate).
This provides an interesting optimisation of bandwidth. But it implies that the response is client dependent, and the cache policy must be private ($cachePrivacy=0). This is ensured by the function when $feedMode is set to true.
Moreover, we will enable data compression if the client can handle it ($compression=true), in order to decrease the size of the text to send.
If no new article is available, a “304 Not Modified” response will be sent to the client, like in previous examples.
HTTP/1.1 and compression support are not as good in RSS readers (clients) as they are in Internet browsers, but it is getting better and better. Some clients such as SharpReader are already compliant enough. Here again, if the client does not handle HTTP/1.1, the optimisation cannot be done, but the communication goes on normally, without side effect.
rss.php
<?php
require_once('http-conditional.php');
if ($odbc=odbc_connect('basetest','user','password'))
{
 $sql='SELECT MAX(modif) AS lastmod FROM ('.
  'SELECT MAX(ar.date) AS modif FROM articles ar UNION '.
  'SELECT MAX(br.date) AS modif FROM news br UNION '.
  'SELECT MAX(dc.date) AS modif FROM documents dc)';
 if (($query=odbc_exec($odbc,$sql))&&odbc_fetch_row($query))
 {
  $dateLastModification=odbc_result($query,'lastmod');
  odbc_free_result($query);
  if (httpConditional($dateLastModification,1800,0,true,true)) //Private cache, 30 minutes and compression enabled
  {//No modification since the client’s last update
   odbc_close($odbc);
   header("Content-Type: application/rss+xml");
   exit();
  }
  //The global variable $clientCacheDate now contains the client’s last update date
 }
}
header('Content-Type: application/rss+xml');
echo '<','?xml version="1.0" encoding="UTF-8"?',">\n";
?>
<rdf:RDF xmlns="http://purl.org/rss/1.0/" xml:lang="en-GB"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
 <channel rdf:about="https://alexandre.alapetite.fr/blog/actu.en.html">
  <title>My RSS</title>
  <description>Description of my RSS</description>
  <link>https://alexandre.alapetite.fr/</link>
  <dc:language>en-GB</dc:language>
  <dc:date>2004-07-22</dc:date>
  <dc:creator>Alexandre Alapetite</dc:creator>
  <items>
   <rdf:Seq>
<?php
if ($odbc)
{
 /*
 Filter the articles:
 - Only the articles newer than $clientCacheDate (client’s last update)
 - Articles that are 30 days old at the maximum
 - Only the 20 most recent articles.
 */
 $clientDate=max($clientCacheDate-60,time()-(30*86400));
 $sql='SELECT TOP 20 * FROM ('.
  'SELECT title,link,date,description FROM articles WHERE date>'.$clientDate.
  ' UNION SELECT title,link,date,description FROM news WHERE date>'.$clientDate.
  ' UNION SELECT title,link,date,description FROM documents WHERE date>'.$clientDate.
  ') ORDER BY date DESC';
 if (($query=odbc_exec($odbc,$sql))&&odbc_fetch_row($query,1))
 {
  odbc_fetch_row($query,0);
  while (odbc_fetch_row($query))
   echo "\t\t\t\t".'<rdf:li rdf:resource="https://alexandre.alapetite.fr'.odbc_result($query,'link').'"/>'."\n";
 }
}
?>
   </rdf:Seq>
  </items>
 </channel>
<?php
if ($odbc&&$query)
{
 odbc_fetch_row($query,0);
 while (odbc_fetch_row($query))
  echo "\t\t".'<item rdf:about="https://alexandre.alapetite.fr'.odbc_result($query,'link').'">'."\n",
   "\t\t\t".'<title>'.odbc_result($query,'title').'</title>'."\n",
   "\t\t\t".'<link>'.odbc_result($query,'link').'</link>'."\n",
   "\t\t\t".'<date>'.substr(date('Y-m-d\TH:i:sO',$myDate=odbc_result($query,'date')),0,22).':'.substr(date('O',$myDate),3).'</date>'."\n",
   "\t\t\t".'<description><![CDATA['.odbc_result($query,'description').']]></description>'."\n",
   "\t\t".'</item>'."\n";
 odbc_close($odbc);
}
?>
</rdf:RDF>
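The filtering rule used in the SQL query above can also be sketched in plain PHP, which makes it easier to test without a database. This is a minimal illustration assuming PHP 7 or later; filterFeedArticles() and the sample articles are hypothetical and not part of the library.

```php
<?php
// Hypothetical in-memory stand-in for the SQL filtering done in rss.php.
// $articles: list of arrays with a 'date' key holding a Unix timestamp.
function filterFeedArticles(array $articles, int $clientCacheDate, int $now, int $maxItems = 20): array
{
    // 60 seconds of margin on the client's date; never more than 30 days back
    $threshold = max($clientCacheDate - 60, $now - 30 * 86400);

    $recent = array_filter($articles, function ($article) use ($threshold) {
        return $article['date'] > $threshold;
    });

    // Most recent first
    usort($recent, function ($a, $b) {
        return $b['date'] <=> $a['date'];
    });

    return array_slice($recent, 0, $maxItems);
}

$now = 1090000000; // Fixed "current time" to keep the example reproducible
$articles = [
    ['title' => 'Old',   'date' => $now - 40 * 86400],
    ['title' => 'Seen',  'date' => $now - 7200],
    ['title' => 'Fresh', 'date' => $now - 600],
];

// The client's last update was one hour ago: only "Fresh" is newer
$new = filterFeedArticles($articles, $now - 3600, $now);
echo implode(', ', array_column($new, 'title')), "\n";
```

With a client date of 0 (no cached version), the 30-day limit applies instead, so the same data would give both "Fresh" and "Seen".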
This library can be used together with sessions, by activating the session parameter.
It then automatically checks whether the data contained in $_SESSION has been modified since the last generation of the document, using an MD5 hash stored in the ETag HTTP header.
The three modification cases that are detected:
$_SESSION has been modified after calling httpConditional() during the last generation of the document
$_SESSION has been modified from another document
$_SESSION has been modified before calling httpConditional() in the current execution
It is necessary to disable automatic generation of headers with session_cache_limiter(''); and/or session.cache_limiter='' in php.ini
example.php
<?php
session_cache_limiter(''); //Disable automatic generation of header
session_start(); //Start the session
...
require_once('http-conditional.php');
//Date of the last modification of the content (Unix Timestamp format)
//Example: request from a database
$dateLastModification=...;
if (httpConditional($dateLastModification,0,0,false,false,true)) //Last argument: session mode
{//No modification since the client’s version
 ... //Close the database, and other cleaning
 exit(); //No need to send anything else
}
//!\ Do not send any text to the client before this line
... //The rest of the script, just like if this first part was not used
?>
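To show why a change in $_SESSION leads to a different ETag, here is a minimal sketch assuming PHP 7 or later. It follows the same recipe as the library (MD5 of the script name, the query string, the session content and the date), but sessionAwareEtag() and its arguments are hypothetical names used only for illustration.

```php
<?php
// Hypothetical helper following the same recipe as the library's ETag.
// $sessionCookie stands for session_name().'='.session_id()
function sessionAwareEtag(string $scriptName, string $query, array $session, string $sessionCookie, string $lastModified): string
{
    $key = $scriptName . $query . print_r($session, true) . $sessionCookie . '#' . $lastModified;
    return '"' . md5($key) . '"';
}

$date = 'Thu, 08 Jul 2004 17:33:54 GMT';
$before = sessionAwareEtag('/var/www/example.php', '?id=3', ['visits' => 1], 'PHPSESSID=abc123', $date);
$after = sessionAwareEtag('/var/www/example.php', '?id=3', ['visits' => 2], 'PHPSESSID=abc123', $date);

// The document date is unchanged, but the session content is not:
// the two entity tags differ, so the client's If-None-Match no longer matches.
var_dump($before === $after);
```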
See the example of a visitor counter in PHP/HTTP on its dedicated page.
When $cacheSeconds==0 (default value) a conditional revalidation against the server is requested at each access to the document; when $cacheSeconds<0 caching is disabled.
Content-Length header, when the compression parameter is activated, and that even for browsers not handling compression.
Accept-Encoding header not containing gzip nor deflate.
httpConditionalRefresh() to update HTTP headers if the content is modified by the client’s request
If-Modified-Since
This content is protected by a Creative Commons Attribution-ShareAlike 2.0 France “BY-SA (FR)” licence.