Before You Begin
- This article is intended for a technical audience. It assumes basic knowledge of web servers and related concepts.
- Generally, Web, IT and/or security teams can assist a marketer in setting up a Reverse Proxy.
Why Set Up a Reverse Proxy in PathFactory?
PathFactory helps Marketing and Sales use insights and automation to connect visitors with content, removing friction in the B2B buying journey. Using PathFactory’s Campaign Tools (Content Tracks, Explore Pages and Microsite Builder) and Virtual Event Experience (VEX), you can create PathFactory-hosted experiences for your audiences. To better brand these experiences, most customers use a custom subdomain (<>.mycompany.com). Some PathFactory customers wish to take this one step further by using a Reverse Proxy. This article discusses the pros and cons of using a reverse proxy and explains the steps involved in setting one up.
Setting up a Reverse Proxy for your PathFactory Campaign Tools and Virtual Events (VEX) means delivering them on a URL that is part of your website's root domain.
Example: If your company’s website has the URL https://www.mycompany.com, your content tracks can be served on a subdirectory of that domain, such as https://www.mycompany.com/resources, https://www.mycompany.com/content or https://www.mycompany.com/library.
While the experiences in this case are still hosted by PathFactory, the URL appears as if it is part of the company’s website.
Setting up a Reverse Proxy can be advantageous in certain situations including:
- You wish to iframe your webpages into PathFactory experiences, but a content security policy prevents the iframe from working. A PathFactory experience on a custom subdomain (e.g. resources.mycompany.com) will often be blocked by such a policy. If your IT/Security team will not whitelist the subdomain you are using with PathFactory, then using a reverse proxy to display the PathFactory experience on the root domain (e.g. mycompany.com/resources) will satisfy the content security policy, allowing the iframed webpages to work correctly.
- You want to seamlessly preserve some advanced technical functionality related to your website without having to modify it or create a dedicated version of it for the PathFactory custom subdomain.
- Example: You have a script running on your website that creates a cookie on the root domain www.mycompany.com to perform a certain function, and you want that to be functional on PathFactory experiences. Using a reverse proxy may allow that script/cookie to work across both the website and PathFactory experiences.
- Your IT/Security teams are familiar with reverse proxies, may prefer this set-up from a trust/security perspective, and are willing to set one up.
While using a reverse proxy has benefits, the set-up steps are complex and will generally require IT expertise. For many PathFactory clients a custom subdomain is sufficient, so this guide should not in any way be treated as a mandatory set-up step.
PathFactory Requirements for a Reverse-Proxy Setup:
PathFactory does not require any specific type of reverse proxy. You can choose the most appropriate option for your situation based on your preference or what you already have in place. Possible options include:
- A Content Delivery Network (CDN): such as Amazon CloudFront, Akamai or Cloudflare (Enterprise)
- A dedicated load balancer or reverse proxy server: such as HAProxy or NGINX
- A reverse proxy/gateway module on your web server: such as mod_proxy on Apache HTTP Server or Application Request Routing on Microsoft IIS
Configure the Reverse Proxy
PathFactory cannot provide a detailed step-by-step guide to setting up a reverse proxy on your end because environments vary. The reverse proxy technology you are using will have its own documentation that outlines the exact steps you need to follow to get the desired result.
We will, however, provide the general setup objectives you need to achieve, as well as the reasoning behind them.
Objective 1: Proxy Requests for a Subdirectory to a Specific Origin Server.
Create a rule in your reverse proxy. The rule must:
- Proxy requests for a subdirectory and anything below it to a specific origin server.
- Example: If your subdirectory path is http://www.mycompany.com/resources/*, the rule must direct that traffic to your out-of-the-box default PathFactory subdomain and keep the subdirectory in the path. The default PathFactory domain typically has the format mycompany.pathfactory.com. This means that requests for http://www.mycompany.com/resources/* would be proxied to mycompany.pathfactory.com/resources/*
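The mapping in Objective 1 can be sketched in Python as follows. This is only an illustration of the rule's logic, not a reverse proxy configuration: the function name and constants are hypothetical, the hostnames are the example values used in this article, and the origin URL uses HTTPS in line with Objective 4.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical example values taken from this article.
PROXIED_PREFIX = "/resources"
ORIGIN_HOST = "mycompany.pathfactory.com"


def origin_url_for(request_url: str) -> Optional[str]:
    """Return the origin URL for a request, or None if the rule does not apply."""
    parts = urlsplit(request_url)
    path = parts.path
    # Match the subdirectory itself and anything below it, but not
    # look-alike paths such as /resources-old.
    if path != PROXIED_PREFIX and not path.startswith(PROXIED_PREFIX + "/"):
        return None
    # The path and query string are kept unchanged; only the scheme and
    # host of the target differ.
    return urlunsplit(("https", ORIGIN_HOST, path, parts.query, ""))
```

Note that the subdirectory stays in the path sent to the origin: a request for /resources/my-track is proxied to /resources/my-track on the PathFactory domain, not to /my-track.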
Objective 2: Ensure the Required Request Methods are Passed
- Your rule must be able to pass all requests that use the following methods:
- You do not need to set up a rule that passes only requests using these methods. Requests using any other method will simply be ignored by the PathFactory origin server.
Objective 3: Ensure Requests are Passed Intact
- Your reverse proxy and/or your rule must be set up in such a way that requests passed to the PathFactory origin server are passed through as-is.
- This means that the origin server must receive the original request exactly as it was sent by the client. The reverse proxy must not make any changes to it.
Passing the requests intact also allows us to ensure the correct experience (e.g. the Content Track and the content within it) is served to the client. It will also allow us to identify visitors and accurately track their engagement with the content served via PathFactory.
Objective 4: Enable TLS (“SSL”) Between Your Reverse Proxy and the Origin Server
- Requests passed from the reverse proxy server to the origin server must use a secure connection (HTTPS over TLS).
- The connection will be terminated at the origin server and is therefore covered by PathFactory's certificate, so you do not need to provide a certificate to PathFactory.
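As an illustration of what Objective 4 asks for, the sketch below uses Python's standard library as a stand-in for your reverse proxy's upstream TLS settings. The function names and the origin hostname are assumptions taken from this article's examples. The point it shows is that the proxy acts as an ordinary TLS client: it verifies the certificate presented by the origin and needs no certificate of its own.

```python
import http.client
import ssl

# Hypothetical origin host, following the example used in this article.
ORIGIN_HOST = "mycompany.pathfactory.com"


def origin_tls_context() -> ssl.SSLContext:
    """TLS settings for the proxy-to-origin connection."""
    # create_default_context() enables certificate verification and
    # hostname checking against the system's trusted CAs.
    return ssl.create_default_context()


def open_origin_connection() -> http.client.HTTPSConnection:
    """Prepare an HTTPS connection to the origin (nothing is sent yet)."""
    return http.client.HTTPSConnection(
        ORIGIN_HOST, context=origin_tls_context(), timeout=10
    )
```

Most reverse proxies expose the equivalent of these settings as configuration options. Consult your proxy's documentation for the exact names.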
Objective 5: Disable Caching of the Origin Server (If Needed)
- Your reverse proxy may support caching of origin servers (e.g. if you are using a CDN). If so, you must specifically disable this feature for the PathFactory origin server, and preferably across your whole website.
Confirm Basic Functionality
When you have configured your reverse proxy to meet the objectives outlined above, you can inform your PathFactory point-of-contact (Client Success Manager or Solutions Architect). PathFactory will test the URL provided with the subdirectory to ensure that experiences (such as Content Tracks, Explore Pages, and Virtual Events) are served successfully through it.
Removing HTTP headers and changing the request method
Some proxy servers can be set up to change components of HTTP messages, including the method and the values in the HTTP request and response header fields.
This can cause problems within PathFactory, as various functionality of PathFactory’s experiences relies on our origin server receiving this information exactly as it was sent by the client's browser (and vice-versa):
For example, the User-Agent and Cookie header fields, among others, are used to collect visitor metrics.
If your proxy server changes the HTTP messages, this can break particular functionality, create security risks for the client's browser, or cause the PathFactory experience to stop working altogether.
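One way to check for this kind of modification during your own testing is to compare a request as the browser sent it with the same request as it arrived behind the proxy. The sketch below assumes you can capture both sides, for example by temporarily pointing the rule at a test server you control that logs incoming requests. That setup and the function name are hypothetical and not part of PathFactory.

```python
from typing import Dict, List


def request_differences(
    sent_method: str,
    sent_headers: Dict[str, str],
    received_method: str,
    received_headers: Dict[str, str],
) -> List[str]:
    """List the ways a proxied request differs from the one the client sent."""
    problems = []
    if sent_method.upper() != received_method.upper():
        problems.append(f"method changed: {sent_method} -> {received_method}")
    # Header names are case-insensitive, so compare them in lower case.
    received = {name.lower(): value for name, value in received_headers.items()}
    for name, value in sent_headers.items():
        key = name.lower()
        if key not in received:
            problems.append(f"header removed: {name}")
        elif received[key] != value:
            problems.append(f"header changed: {name}")
    return problems
```

An empty list means the method and every header the client sent arrived unchanged. This sketch does not flag headers that the proxy adds.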
Changing the value of the Host header field
In addition to generally ensuring that your reverse proxy is not changing any HTTP headers, it's particularly important that the Host header on HTTP requests is not modified in any way.
Some reverse proxies (such as Apache's mod_proxy) will change the hostname on the request header by default, and replace it with the hostname of the origin server. For example, it would change Host: www.mycompany.com to Host: mycompany.pathfactory.com (the Host header carries only the hostname, not the path).
With proxies that do this, you would generally need to specifically override this behavior by changing a setting (in mod_proxy's case, this setting is called ProxyPreserveHost) to ensure that the original host is kept on the proxied request.
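The difference between the two behaviors can be modeled in a few lines of Python. This is a simplified illustration rather than how any particular proxy is implemented. The function name is hypothetical, and the preserve_host flag stands in for a setting such as mod_proxy's ProxyPreserveHost.

```python
from typing import Dict

# Hypothetical origin host, following the example used in this article.
ORIGIN_HOST = "mycompany.pathfactory.com"


def upstream_headers(client_headers: Dict[str, str], preserve_host: bool) -> Dict[str, str]:
    """Return the headers a proxy would send to the origin server."""
    if preserve_host:
        # Desired behavior: forward every header, including Host, untouched.
        return dict(client_headers)
    # Problematic default of some proxies: replace the client's Host
    # header with the origin's hostname.
    headers = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != "host"
    }
    headers["Host"] = ORIGIN_HOST
    return headers
```

With preserve_host set to True, the origin still sees Host: www.mycompany.com, which is what PathFactory requires.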
Enabling caching on the proxy server
Some proxy servers, particularly those that are part of CDNs, offer the ability to cache server responses. With this functionality enabled, the proxy server can serve resources it has saved from its own cache, rather than requesting them from the origin server.
Using a proxy server's cache can have performance benefits for many web properties, but with PathFactory, doing so will break important functionality. As such, caching functionality should be disabled, as specified in Objective 5 above.
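To spot-check whether responses are still being served from a cache, you can inspect the response headers of a proxied page. The sketch below is a heuristic only. It looks at the standard Age header and at two vendor-specific headers (X-Cache, used by Amazon CloudFront among others, and CF-Cache-Status, used by Cloudflare). Other proxies may signal cache hits differently or not at all, and the function name is hypothetical.

```python
from typing import Dict, List


def cache_indicators(response_headers: Dict[str, str]) -> List[str]:
    """Return the response headers that suggest a cached response."""
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    found = []
    # An Age header implies the response passed through a cache.
    if "age" in headers:
        found.append(f"Age: {headers['age']}")
    # Some CDNs report cache hits as e.g. "Hit from cloudfront".
    if "hit" in headers.get("x-cache", "").lower():
        found.append(f"X-Cache: {headers['x-cache']}")
    # Cloudflare reports HIT, MISS, DYNAMIC and similar values.
    if headers.get("cf-cache-status", "").upper() == "HIT":
        found.append(f"CF-Cache-Status: {headers['cf-cache-status']}")
    return found
```

An empty result does not prove that caching is disabled, so treat it as a quick check rather than a replacement for reviewing the proxy's configuration.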
Frequently Asked Questions
Q: My website uses HTTPS (SSL/TLS). If I set up a custom domain for my PathFactory experiences on a subdirectory of my website, do I need to send PathFactory a certificate?
A: No, you don't have to send us a certificate in this case. When a client requests the subdirectory for your content track, the TLS connection is terminated at your domain, because the subdirectory is under that domain. This means that it is covered by your domain's certificate.
The request is then passed on by your reverse proxy server to the PathFactory origin server (mycompany.pathfactory.com/resources/). This connection also uses HTTPS, and as it terminates on PathFactory's domain, it is covered by PathFactory's certificate.
Q: Are there any specific reverse proxies that do not work with custom domains?
A: Yes. At present, we have confirmed that the following will not work:
- AWS ALB (Application Load Balancer), ELB (Elastic Load Balancer) and Classic Load Balancer.
- Netlify, because it strips the host header.
We will be adding to this list as we identify other reverse proxy services that are not compatible with PathFactory custom domains.