"Fossies" - the Fresh Open Source Software Archive

Member "incubator-pagespeed-mod-" (28 Feb 2020, 41526 Bytes) of package /linux/www/apache_httpd_modules/incubator-pagespeed-mod-

PageSpeed Authorizing and Mapping Domains

Authorizing domains

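In addition to optimizing HTML, PageSpeed rewrites the resources a page references (JavaScript, CSS, images), but only when they are served from domains, optionally restricted to paths, that are explicitly listed in the configuration file. Most examples in this document show the Apache form of a directive (ModPagespeed...) first, followed by the equivalent Nginx form (pagespeed ...;). For example: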

ModPagespeedDomain http://example.com
ModPagespeedDomain cdn.example.com
ModPagespeedDomain http://styles.example.com/css
ModPagespeedDomain *.example.org
pagespeed Domain http://example.com;
pagespeed Domain cdn.example.com;
pagespeed Domain http://styles.example.com/css;
pagespeed Domain *.example.org;

PageSpeed will rewrite resources found from these explicitly listed domains, although in the case of styles.example.com only resources under the css directory will be rewritten. Additionally, it will rewrite resources that are served from the same domain as the HTML file, or are specified as a path relative to the HTML. When resources are rewritten, their domain and path are not changed. However, the leaf name is changed to encode rewriting information that can be used to identify and serve the optimized resource.

The leading "http://" is optional; bare hostnames will be interpreted as referring to HTTP. Wildcards can be used in the domain.
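As a rough illustration of how the four example patterns above behave, the following Python sketch checks a resource URL against a list of authorized patterns. The names `AUTHORIZED`, `split_pattern` and `is_authorized`, the use of `fnmatchcase` for the hostname wildcard, and the prefix comparison on paths are all assumptions made for illustration, not PageSpeed's actual implementation. The sketch also ignores the rule that resources on the same domain as the HTML are always eligible.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Patterns mirror the example Domain directives above.
AUTHORIZED = [
    "http://example.com",
    "cdn.example.com",
    "http://styles.example.com/css",
    "*.example.org",
]

def split_pattern(pattern):
    """Split a pattern into (scheme, host, path); bare hostnames imply http."""
    if "://" not in pattern:
        pattern = "http://" + pattern
    scheme, rest = pattern.split("://", 1)
    host, _, path = rest.partition("/")
    path = "/" + path
    if not path.endswith("/"):
        path += "/"
    return scheme, host, path

def is_authorized(url, patterns=AUTHORIZED):
    """Return True if the URL's scheme, host and path fit any pattern."""
    parts = urlsplit(url)
    for pattern in patterns:
        scheme, host, path = split_pattern(pattern)
        if (parts.scheme == scheme
                and fnmatchcase(parts.hostname or "", host)
                and (parts.path or "/").startswith(path)):
            return True
    return False
```

Matching the hostname and the path separately keeps a wildcard such as `*.example.org` from matching text that merely appears in the path of an unrelated host.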

These directives can be used in location-specific configuration sections.

Mapping origin domains

In order to improve the performance of web pages, PageSpeed must examine and modify the content of resources referenced on those pages. To do that, it must fetch those resources using HTTP, using the URL reference specified on the HTML page.

In some cases, the URL specified in the HTML file is not the best URL to use to fetch the resource. Scenarios where this is a concern include:

  1. The server is behind a load balancer, and it is more efficient to reference the server directly by its IP address or as 'localhost'
  2. The server has a special DNS configuration
  3. The server is behind a firewall preventing outbound connections
  4. The server is running in a CDN or proxy, and must go back to the origin server for the resources
  5. The server needs to service https requests

In these situations the remedy is to map the origin domain:

ModPagespeedMapOriginDomain origin_to_fetch_from origin_specified_in_html [host_header]
pagespeed MapOriginDomain origin_to_fetch_from origin_specified_in_html [host_header];

Wildcards can also be used in the origin_specified_in_html, e.g.

ModPagespeedMapOriginDomain localhost *.example.com
pagespeed MapOriginDomain localhost *.example.com;

The origin_to_fetch_from can include a path after the domain name, e.g.

ModPagespeedMapOriginDomain localhost/example *.example.com
pagespeed MapOriginDomain localhost/example *.example.com;

When a path is specified, the source domain is mapped to the destination domain and the source path is mapped to the concatenation of the path from origin_to_fetch_from and the source path. For example, given the above mapping, http://www.example.com/index.html will be mapped to http://localhost/example/index.html.
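The path concatenation described above can be sketched in Python as follows. The function name `map_origin` and its argument order are hypothetical, and the hostname wildcard is approximated with `fnmatchcase`; this is a minimal sketch of the stated rule, not PageSpeed's code.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

def map_origin(url, fetch_from, specified_in_html):
    """Return the URL to fetch, or the original URL if the mapping does not apply."""
    parts = urlsplit(url)
    if not fnmatchcase(parts.hostname or "", specified_in_html):
        return url
    # fetch_from may carry a path after the host, e.g. "localhost/example".
    host, _, prefix = fetch_from.partition("/")
    path = ("/" + prefix if prefix else "") + parts.path
    query = "?" + parts.query if parts.query else ""
    # The fetch side is always plain http in this sketch.
    return f"http://{host}{path}{query}"
```

Called with `"localhost/example"` and `"*.example.com"`, the sketch maps `http://www.example.com/index.html` to `http://localhost/example/index.html`, as in the text above.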

The origin_specified_in_html can specify https but the origin_to_fetch_from can only specify http, e.g.

ModPagespeedMapOriginDomain http://localhost https://www.example.com
pagespeed MapOriginDomain http://localhost https://www.example.com;

This directive lets the server accept https requests for www.example.com without requiring an SSL certificate to fetch resources. For example, given the above mapping, and assuming the server is configured for https support, PageSpeed will fetch and optimize resources accessed using https://www.example.com, fetching the resources from http://localhost, which can be the same server process or a different server process.

ModPagespeedMapOriginDomain http://localhost https://www.example.com
ModPagespeedShardDomain https://www.example.com \
pagespeed MapOriginDomain http://localhost https://www.example.com;
pagespeed ShardDomain https://www.example.com

In this example the https origin domain is mapped to localhost and sharding is used to parallelize downloads across hostnames. Note that the shards also specify https.

By specifying a source domain in this directive, you are authorizing PageSpeed to rewrite resources found in that domain. For example, in the above directives, '*.example.com' gets authorized for rewrites from HTML files, but 'localhost' does not. See Domain.

When PageSpeed fetches resources from a mapped origin domain, it specifies the source domain in the Host: header in the request. You can override the Host: header value with the optional third parameter host_header. See Mapping Origins with a Shared CDN for an example.

See also LoadFromFile to load origin resources directly from the filesystem and avoid an HTTP connection altogether.

These directives can be used in location-specific configuration sections.

Mapping rewrite domains

When PageSpeed rewrites a resource, it updates the HTML to refer to the resource by its new name. Generally PageSpeed leaves the resource at the same origin and path that was originally found in the HTML. However, it is possible to map the domain of rewritten resources. Examples of why this might be desirable include:

  1. To serve static content from cookieless domains, reducing the size of HTTP requests from the browser. See Minimizing Payload
  2. To move content to a Content Delivery Network (CDN)

This is done using the configuration file directive:

ModPagespeedMapRewriteDomain domain_to_write_into_html domain_specified_in_html
pagespeed MapRewriteDomain domain_to_write_into_html domain_specified_in_html;

Wildcards can also be used in the domain_specified_in_html, e.g.

ModPagespeedMapRewriteDomain cdn.example.com *example.com
pagespeed MapRewriteDomain cdn.example.com *example.com;

The domain_to_write_into_html can include a path after the domain name, e.g.

ModPagespeedMapRewriteDomain cdn.com/example *.example.com
pagespeed MapRewriteDomain cdn.com/example *.example.com;

When a path is specified, the source domain is rewritten to the destination domain and the source path is rewritten to the concatenation of the path from domain_to_write_into_html and the source path. For example, given the above mapping, http://www.example.com/index.html will be rewritten to http://cdn.com/example/index.html.

Note: It is the responsibility of the site administrator to ensure that PageSpeed is installed on the domain_to_write_into_html. This might be a separate server, or there may be a single server with multiple domains mapped into it. The files must be accessible via the same path on the destination server as was specified in the HTML file. No other files should be stored on the domain_to_write_into_html -- it should be functionally equivalent to domain_specified_in_html. See also MapProxyDomain which enables proxying content from a different server.

For example, if PageSpeed cache-extends http://www.example.com/styles/style.css to http://cdn.example.com/styles/style.css.pagespeed.ce.HASH.css, then cdn.example.com will have to have a mechanism in place to either rewrite that file in place, or refer back to the origin server to pull the rewritten content.

Note: It is the responsibility of the site administrator to ensure that moving resources onto domains does not create a security vulnerability. In particular, if the target domain has cookies, then any JavaScript loaded from a resource moved to a domain with cookies will gain access to those cookies. In general, moving resources to a cookieless domain is a great way to improve security. Be aware that CSS can load JavaScript in certain environments.

By specifying a domain in this directive, either as source or destination, you are authorizing PageSpeed to rewrite resources found in this domain. See Domain.

These directives can be used in location-specific configuration sections.

Mapping Origins with a Shared CDN

Consider a scenario where an installation serving multiple domains uses a single CDN for caching and delivery of all content. The origin fetches need to be routed to the correct VirtualHost on the server. This can be achieved by using a subdirectory per domain in the CDN, and then using that subdirectory to map to the correct VirtualHost at origin. The host-header control offered by the third argument to MapOriginDomain makes this feasible.

In the example below, resources with a domain of sharedcdn.example.com and path starting with /vhost1 will be fetched from localhost but with a Host: header value of vhost1.example.com. Without the third argument to MapOriginDomain, the Host: header would be sharedcdn.example.com.

ModPagespeedMapOriginDomain localhost sharedcdn.example.com/vhost1 vhost1.example.com
ModPagespeedMapRewriteDomain sharedcdn.example.com/vhost1 vhost1.example.com
pagespeed MapOriginDomain localhost sharedcdn.example.com/vhost1 vhost1.example.com;
pagespeed MapRewriteDomain sharedcdn.example.com/vhost1 vhost1.example.com;

This would be used in conjunction with a VirtualHost setup for vhost1.example.com, and a single CDN setup for multiple hosts segregated by subdirectory.

Sharding domains

Best practices suggest minimizing round-trip times by parallelizing downloads across hostnames. PageSpeed can partially automate this for resources that it rewrites, using the directive:

ModPagespeedShardDomain domain_to_shard shard1,shard2,shard3...
pagespeed ShardDomain domain_to_shard shard1,shard2,shard3...;

Wildcards cannot be used in this directive.

This will distribute the domains for rewritten URLs among the specified shards. The shard selected for a particular URL is computed from the original URL.

ModPagespeedShardDomain example.com static1.example.com,static2.example.com
pagespeed ShardDomain example.com static1.example.com,static2.example.com;

Using this directive, PageSpeed will distribute roughly half the resources rewritten from example.com into static1.example.com, and the rest to static2.example.com. You can specify as many shards as you like. The optimum number of shards is a topic of active research, and is browser-dependent. Configuring between 2 and 4 shards should yield good results. Changing the number of shards will cause PageSpeed to choose different names for resources, resulting in a partial cache flush.
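To make the idea of a shard "computed from the original URL" concrete, here is a minimal Python sketch. The function name `pick_shard` is hypothetical, and CRC32 is only a stand-in: this text does not say which hash PageSpeed uses.

```python
import zlib

def pick_shard(original_url, shards):
    """Choose a shard deterministically from the original URL."""
    # CRC32 is an assumed stand-in for whatever hash PageSpeed really uses.
    index = zlib.crc32(original_url.encode("utf-8")) % len(shards)
    return shards[index]

shards = ["static1.example.com", "static2.example.com"]
chosen = pick_shard("http://example.com/styles.css", shards)
```

Because the index depends on `len(shards)`, adding or removing a shard in a scheme like this reassigns some URLs to different hostnames, which is consistent with the partial cache flush mentioned above.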

When used in combination with MapRewriteDomain, the rewrite mappings are applied first, and shard selection occurs afterwards. Origin domains are always tracked so that when a browser sends a sharded URL back to the server, PageSpeed can find it.

Let's look at an example:

ModPagespeedShardDomain example.com static1.example.com,static2.example.com
ModPagespeedMapRewriteDomain example.com www.example.com
ModPagespeedMapOriginDomain localhost example.com
pagespeed ShardDomain example.com static1.example.com,static2.example.com;
pagespeed MapRewriteDomain example.com www.example.com;
pagespeed MapOriginDomain localhost example.com;

In this example, example.com and www.example.com are "tied" together via MapRewriteDomain. The origin-mapping to localhost propagates automatically to www.example.com, static1.example.com, and static2.example.com. So when PageSpeed cache-extends an HTML stylesheet reference http://www.example.com/styles.css, it will be:

  1. Fetched by the server rewriting the HTML from localhost
  2. Rewritten to http://example.com/styles.css.pagespeed.ce.HASH.css
  3. Sharded to http://static1.example.com/styles.css.pagespeed.ce.HASH.css

Proxying and optimizing resources from trusted domains

Proxying resources is desirable under several scenarios:

It is possible to proxy and optimize resources whose origin is a trusted domain that may not be running PageSpeed. This cannot be directly achieved with MapRewriteDomain because that is a declaration that the domains listed are functionally equivalent to one another, either because they are backed by the same storage, or because the target is acting as a proxy (e.g. a CDN). MapProxyDomain makes it technically possible to proxy and optimize resources from any domain that you trust.

You must only proxy resources that are controlled by an organization you trust because it is possible for malicious content (e.g. GIFAR) proxied from an untrustworthy domain to gain access to private content on your domain, compromising your site or its viewers. You must never map directories that may contain files that may be controlled by a third party.

There may be legal issues restricting the optimization of resources you don't own. If in doubt consult a lawyer.

ModPagespeedMapProxyDomain target_domain/subdir \
                           origin_domain/subdir [rewrite_domain/subdir]
pagespeed MapProxyDomain target_domain/subdir
                         origin_domain/subdir [rewrite_domain/subdir];

If the optional rewrite_domain/subdir argument is supplied then optimized resources will be rewritten to that location. This is useful for rewriting optimized resources proxied from an external origin to a CDN.

It is important to specify a subdirectory in the target domain, because PageSpeed will need to be able to unambiguously identify the origin domain given the target when fetching content. Thus each MapProxyDomain command should be given a distinct subdirectory of the target domain.
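The need for distinct target subdirectories can be illustrated with a small Python sketch of the reverse lookup from a proxied URL to its origin. The table `PROXY_MAP`, the function `origin_for`, and every hostname and subdirectory in it are hypothetical and chosen only for illustration; the real lookup is internal to PageSpeed.

```python
# Hypothetical target-prefix -> origin-prefix table, one entry per
# MapProxyDomain directive. Each target prefix uses its own subdirectory.
PROXY_MAP = {
    "http://www.example.com/proxied-a/": "http://a.trusted.test/assets/",
    "http://www.example.com/proxied-b/": "http://b.trusted.test/media/",
}

def origin_for(target_url, proxy_map=PROXY_MAP):
    """Return the origin URL for a proxied target URL, or None if unmapped."""
    for target_prefix, origin_prefix in proxy_map.items():
        if target_url.startswith(target_prefix):
            return origin_prefix + target_url[len(target_prefix):]
    return None
```

If two entries shared the same target prefix, the lookup could not tell which origin a request belongs to, which is why each mapping needs its own subdirectory.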

It is important to specify a subdirectory in the origin domain to limit the scope of the proxying. For example, in picasaweb, all of a user's photos are underneath a single subdirectory; it is critical not to enable proxying for the entire site.


You can see proxy-mapping in action at www.modpagespeed.com on this example.

Fetch server restrictions

PageSpeed will only fetch resources from localhost and domains explicitly mentioned in domain configuration directives such as Domain, MapRewriteDomain and MapOriginDomain. As this security restriction is not desirable for some large deployments, in Apache it is possible to disable it via the following configuration directive (which has a global effect):

ModPagespeedDangerPermitFetchFromUnknownHosts on

Warning: Enabling DangerPermitFetchFromUnknownHosts could permit hostile third parties to access any machine and port that the server running mod_pagespeed has access to, including potentially those behind firewalls.

Before doing this, however, you must ensure that at least one of the following is true:

  1. The server running mod_pagespeed has no more access to machines or ports than anyone on the Internet, and the machines it can access will not treat its traffic specially (newer versions of mod_pagespeed make sure their own traffic to localhost does not appear to be local, but that does not work across machines)
  2. Every virtual host in Apache running mod_pagespeed (and, if applicable, the global configuration) has an accurate explicit ServerName, and sets the options UseCanonicalName and UseCanonicalPhysicalPort to On.
  3. A proxy running in front of the mod_pagespeed server fully verifies that the URLs and Host: headers that reach it refer only to machines the mod_pagespeed server is expected to contact.

If possible, you are strongly encouraged to use MapOriginDomain in preference to this switch.

Specifying additional URL-valued attributes

All PageSpeed filters that process URLs need to know which attributes of which elements to consider. By default they consider those in the HTML4 and HTML5 specifications and a few common extensions:

  <a href=...>
  <area href=...>
  <audio src=...>
  <blockquote cite=...>
  <body background=...>
  <button formaction=...>
  <command icon=...>
  <del cite=...>
  <embed src=...>
  <form action=...>
  <frame src=...>
  <html manifest=...>
  <iframe src=...>
  <img src=...>
  <input type="image" src=...>
  <ins cite=...>
  <link href=...>
  <q cite=...>
  <script src=...>
  <source src=...>
  <td background=...>
  <th background=...>
  <table background=...>
  <tbody background=...>
  <tfoot background=...>
  <thead background=...>
  <track src=...>
  <video src=...>

If your site uses a non-standard attribute for URLs, PageSpeed won't know to rewrite them or the resources they reference. To identify them to PageSpeed, use the UrlValuedAttribute directive. For example:

ModPagespeedUrlValuedAttribute span src hyperlink
ModPagespeedUrlValuedAttribute div background image
pagespeed UrlValuedAttribute span src hyperlink;
pagespeed UrlValuedAttribute div background image;

These would identify <span src=...> and <div background=...> as containing URLs. Further, the background attribute of div elements would be treated as referring to an image and would be treated just like an image resource referenced with <img src=...>. The general form is:

pagespeed UrlValuedAttribute ELEMENT ATTRIBUTE CATEGORY;

All fields are case-insensitive. Valid categories are:

When in doubt, hyperlink is the safest choice.

Note: In older versions, stylesheet was accepted by the configuration parser but was non-functional.

Loading static files from disk

By default PageSpeed loads sub-resources via an HTTP fetch. It would be faster to load sub-resources directly from the filesystem, however this may not be safe to do because the sub-resources may be dynamically generated or the sub-resources may not be stored on the same server.

However, you can explicitly tell PageSpeed to load static sub-resources from disk by using the LoadFromFile directive. For example:

ModPagespeedLoadFromFile "http://www.example.com/static/" "/var/www/static/"
pagespeed LoadFromFile "http://www.example.com/static/" "/var/www/static/";

tells PageSpeed to load all resources whose URLs start with http://www.example.com/static/ from the filesystem under /var/www/static/. For example, http://www.example.com/static/images/foo.png will be loaded from the file /var/www/static/images/foo.png. However, http://www.example.com/bar.jpg will still be fetched using HTTP.
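The prefix mapping just described, from URLs starting with http://www.example.com/static/ to files under /var/www/static/, can be sketched in a few lines of Python. The names `MAPPINGS` and `load_from_file_path` are hypothetical, and this is only a sketch of the stated behavior.

```python
# (URL prefix, filesystem prefix) pairs, one per LoadFromFile association.
MAPPINGS = [("http://www.example.com/static/", "/var/www/static/")]

def load_from_file_path(url, mappings=MAPPINGS):
    """Return the filesystem path for a URL, or None to fetch over HTTP."""
    for url_prefix, dir_prefix in mappings:
        if url.startswith(url_prefix):
            # Swap the URL prefix for the directory prefix; keep the rest.
            return dir_prefix + url[len(url_prefix):]
    return None
```

A return value of `None` stands for the case where no mapping applies and the resource is fetched using HTTP as usual.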

If you need more sophisticated prefix-matching behavior, you can use the LoadFromFileMatch directive, which supports RE2-format regular expressions. (Note that this is not the same format as the wildcards used above and elsewhere in PageSpeed.) For example:

ModPagespeedLoadFromFileMatch "^https?://example.com/~([^/]*)/static/" \
pagespeed LoadFromFileMatch "^https?://example.com/~([^/]*)/static/"

This will load http://example.com/~pat/static/cat.jpg from /var/www/static/pat/cat.jpg, http://example.com/~sam/static/images/dog.jpg from /var/www/static/sam/images/dog.jpg, and https://example.com/~al/static/css/ie from /var/www/static/al/css/ie. The resource http://example.com/~pat/images/static/puppy.gif, however, would not be matched by this directive and would be fetched using HTTP.
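The regular-expression mapping can be tried out with the Python sketch below. Two things are assumptions: Python's `re` module stands in for RE2 (the two differ in places, though not for this pattern), and the replacement string `REPLACEMENT` was chosen only so that the sketch reproduces the paths listed above; it is not taken from the directive.

```python
import re

# Pattern copied from the LoadFromFileMatch example above.
PATTERN = re.compile(r"^https?://example.com/~([^/]*)/static/")
# Assumed replacement, picked to reproduce the example paths.
REPLACEMENT = r"/var/www/static/\1/"

def match_to_file(url):
    """Return the mapped filesystem path, or None if the pattern does not match."""
    path, count = PATTERN.subn(REPLACEMENT, url, count=1)
    return path if count else None
```

The puppy.gif URL is not mapped because `([^/]*)` cannot span the extra `/images` path segment that sits between the user name and `/static/`.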

Because PageSpeed is loading the files directly from the filesystem, no custom headers will be set. For example, no headers set with the Header set (Apache) or add_header (Nginx) directives will be applied to these resources. If you have resources that need to be served with custom headers, such as Cache-Control: private, you need to exclude them from LoadFromFile. For resources PageSpeed rewrites in-place it will set a 5-minute cache lifetime by default, which you can adjust by changing LoadFromFileCacheTtlMs.

Furthermore, the content type will be set based only upon the filename extension, and only for common filename extensions we recognize (.html, .css, .js, .jpg, .jpeg, ... see full list: content_type.cc). Older versions served filenames with unrecognized extensions with no Content-Type header; newer versions do not load such filenames from file and instead fall back to ordinary fetching.

You can also use the LoadFromFile directive to load HTTPS resources which would not be otherwise fetchable directly. For example:

ModPagespeedLoadFromFile "https://www.example.com/static/" \
pagespeed LoadFromFile "https://www.example.com/static/"

The filesystem path must be an absolute path.

You can specify multiple LoadFromFile associations in configuration files. Note that large numbers of such directives may impact performance.

If the sub-resource cannot be loaded from file in the directory specified, the sub-request will fail (rather than fall back to HTTP fetch). Part of the reason for this is to indicate a configuration error more clearly.

As an added benefit, if resources are loaded from file, the rewritten versions will be updated immediately when you change the associated file. Resources loaded via normal HTTP fetches are refreshed only when they expire from the cache (by default every 5 minutes). Therefore, the rewritten versions are only updated as often as the cache is refreshed. Resources loaded from file are not subject to caching behavior because they are accessed directly from the filesystem for every request for the rewritten version.

See also MapOriginDomain.

This directive cannot be used in location-specific configuration sections.

Limiting Direct Loading

A mapping set up with LoadFromFile allows filesystem loading for anything it matches. If you have directories or file types that cannot be loaded directly from the filesystem, LoadFromFileRule lets you add fine-grained rules to control which files will be loaded directly and which will fall back to the standard process, over HTTP.

When given a URL, PageSpeed first determines whether any LoadFromFile mappings apply. If one does, it calculates the mapped filename and checks for applicable LoadFromFileRules. Considering rules in the reverse order of definition, it takes the first applicable one and uses that to determine whether to load from file or fall back to HTTP.

Some examples may be helpful. Consider a website that is entirely static content except for a /cgi-bin directory:


While most of the site can be loaded directly from the filesystem, guestbook.pl and visitcounter.pl are perl files that need to be interpreted before serving. Adding a rule disallowing the /cgi-bin directory tells us to fall back to HTTP appropriately:

ModPagespeedLoadFromFile http://example.com/ /var/www/
ModPagespeedLoadFromFileRule Disallow /var/www/cgi-bin/
pagespeed LoadFromFile http://example.com/ /var/www/;
pagespeed LoadFromFileRule Disallow /var/www/cgi-bin/;

The LoadFromFileRule directive takes two arguments. The first must be either Allow or Disallow while the second is a prefix that specifies which filesystem paths it should apply to. Because the default is to allow loading from the filesystem for all paths listed in any LoadFromFile statement, most of the time you will be using Disallow to turn off filesystem loading for some subset of those paths. You would use Allow only after a Disallow that was overly general.

Not all sites are well suited for prefix-based control. Consider a site with PHP files mixed in with ordinary static files:


Blacklisting just the .php files so they fall back to an HTTP fetch allows everything else to be loaded directly from the filesystem:

ModPagespeedLoadFromFile http://example.com/ /var/www/
ModPagespeedLoadFromFileRuleMatch Disallow \.php$
pagespeed LoadFromFile http://example.com/ /var/www/;
pagespeed LoadFromFileRuleMatch Disallow \.php$;

The LoadFromFileRuleMatch directive also takes two arguments. The first is either Allow or Disallow and functions just like for LoadFromFileRule above. The second argument, however, is a RE2-format regular expression instead of a file prefix. Remember to escape characters that have special meaning in regular expressions. For example, if instead of \.php$ we had simply .php$ then a file named example.notphp would still be forced to load over HTTP because "." is special syntax for "match any single character".

Consider a site with the opposite problem: a few file types can be reliably loaded from file but the rest need interpretation first. For example:


This site uses server-side includes (Apache, Nginx) in its JavaScript, and generate-image.pl needs to be interpreted to make images. The only resources on the site that are generally safe to load are .css ones. By first blacklisting everything and then whitelisting only the .css files, we can make PageSpeed do this:

ModPagespeedLoadFromFile http://example.com/ /var/www/
ModPagespeedLoadFromFileRuleMatch disallow .*
ModPagespeedLoadFromFileRuleMatch allow \.css$
pagespeed LoadFromFile http://example.com/ /var/www/;
pagespeed LoadFromFileRuleMatch disallow .*;
pagespeed LoadFromFileRuleMatch allow \.css$;

This works because order is significant: later rules take precedence over earlier ones.
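The evaluation order can be sketched in Python as shown below. The tuple layout and the names `should_load_from_file`, `CGI_RULES` and `CSS_ONLY_RULES` are hypothetical, and Python's `re.search` stands in for RE2 matching, so treat this as a sketch of the described behavior rather than the actual implementation.

```python
import re

def should_load_from_file(path, rules):
    """Decide whether a mapped filesystem path may be loaded directly.

    rules is a list of (kind, allow, pattern) tuples in definition order,
    where kind is "prefix" (LoadFromFileRule) or "regex" (LoadFromFileRuleMatch).
    """
    # Later rules take precedence, so walk the list backwards.
    for kind, allow, pattern in reversed(rules):
        if kind == "prefix":
            applies = path.startswith(pattern)
        else:
            applies = re.search(pattern, path) is not None
        if applies:
            return allow
    # No rule applies: loading from file is allowed by default.
    return True

# Rule sets mirroring the examples in this section.
CGI_RULES = [("prefix", False, "/var/www/cgi-bin/")]
CSS_ONLY_RULES = [("regex", False, r".*"), ("regex", True, r"\.css$")]
```

With `CSS_ONLY_RULES`, a .css path hits the later allow rule first, while every other path falls through to the catch-all disallow rule.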

Script Variables with LoadFromFile

Note: New feature as of

Note: Nginx-only

Nginx script variables are supported with the various LoadFromFile directives. Script support for those options makes it possible to configure a generic mapping of http hosts to disk, reducing the amount of configuration required when you want to load as much from disk as possible but have a lot of server{} blocks.

As an example, consider one server that hosts three sites, each of which have a directory /static that holds static resources and can be loaded from file. One way to configure this server would be:

http {
  server {
    server_name a.example.com;
    pagespeed LoadFromFile http://a.example.com/static /var/www-a/static;
  }
  server {
    server_name b.example.com;
    pagespeed LoadFromFile http://b.example.com/static /var/www-b/static;
  }
  server {
    server_name c.example.com;
    pagespeed LoadFromFile http://c.example.com/static /var/www-c/static;
  }
}

For three sites this is kind of annoying, but the more sites you have the worse it gets. With ProcessScriptVariables you can define one generic LoadFromFile mapping instead of defining each one individually:

http {
  pagespeed ProcessScriptVariables on;
  pagespeed LoadFromFile "http://$host/static" "$document_root/static";

  server {
    server_name a.example.com;
  }
  server {
    server_name b.example.com;
  }
  server {
    server_name c.example.com;
  }
}

This will use Nginx's $host and $document_root script variables instead of requiring you to explicitly code each one.
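To see what the generic mapping expands to for one request, the substitution can be imitated in Python with `string.Template`, which happens to use the same `$name` syntax. This only imitates the effect: Nginx performs the expansion itself, and the document root `/var/www-a` is an assumption carried over from the earlier per-site example.

```python
from string import Template

def expand(argument, variables):
    """Substitute $name placeholders in one directive argument."""
    return Template(argument).substitute(variables)

# Assumed per-request values for a request to a.example.com.
request_vars = {"host": "a.example.com", "document_root": "/var/www-a"}

url_prefix = expand("http://$host/static", request_vars)
dir_prefix = expand("$document_root/static", request_vars)
```

For this request the pair works out to the same mapping that the explicit LoadFromFile line for a.example.com expressed in the longer configuration.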

For more details on script variables, including how to handle dollar signs, see Script Variable Support.


LoadFromFile should only be used for completely static resources which do not need any custom headers or special server processing. If non-static resources exist in the specified directory, the source code will be used without applying SSI includes, CGI generation, etc. Furthermore, all the resources should have filenames with common extensions for their Content-Type (e.g. .html, .css, .js, .jpg, .jpeg, ... see full list: content_type.cc).

Inlining resources without explicit authorization

Several filters in PageSpeed operate by inlining content from resources into the HTML: inline_css, inline_javascript and prioritize_critical_css are a few of the filters that operate in this manner. If resources from third-party domains are not authorized explicitly, the effectiveness of these filters decreases. For instance, prioritize_critical_css attempts to remove blocking CSS requests needed for the initial render by inlining critical CSS snippets into the HTML, however, the CSS resources that are not authorized will continue to block. This option allows such resources to be inlined without having to authorize all the individual domains.

The InlineResourcesWithoutExplicitAuthorization directive can be used to allow resources from third-party domains to be inlined into the HTML without requiring explicit authorization for each domain. This option is "off" by default, and takes a comma-separated list of strings representing resource categories for which the option should be enabled. The list of valid resource categories is given here. Currently, only Script and Stylesheet resource types are supported for this option.

This option can be enabled as follows:
ModPagespeedInlineResourcesWithoutExplicitAuthorization Script,Stylesheet
pagespeed InlineResourcesWithoutExplicitAuthorization Script,Stylesheet;

Warning: Enabling InlineResourcesWithoutExplicitAuthorization could permit hostile third parties to access any machine and port that the server running mod_pagespeed has access to, including potentially those behind firewalls. Please read the following information for details.

This directive should only be enabled if all of the following conditions are met for the resource types for which this option is enabled:

  1. The webmaster is confident that the resources referenced on their pages are from trusted domains only.
  2. The site does not allow user-injected resources for the enabled resource types.
  3. Fetches from the PageSpeed server should have no more access to machines or ports than anyone on the Internet, and machines it can access should not treat its traffic specially. Specifically, the PageSpeed servers should not be able to access anything that is internal to a firewall. Please refer to the Fetch server restrictions section for more details.

Note that resources inlined into HTML via this option will not be accessible directly via a pagespeed URL, since that involves different security risks. Resources will also not be inlined into other non-HTML resources via this option. This means that flatten_css_imports will not flatten third-party CSS into another CSS resource, unless the relevant third-party domains are authorized explicitly via one of the techniques mentioned in the previous sections.