Webscale Application Architecture

A Brief Overview of the Application Architecture managed by Webscale

Webscale manages (and, optionally, hosts) applications running on hyperscale cloud providers, following its established architecture guidelines. This document provides insight into the application architecture that Webscale manages.

We deploy the code to a centralized server, also called a dataserver. From there, it is horizontally scaled across multiple application servers based on the configuration set by the customer. This collection of servers is also known as an Application Cluster. Several commands help manage the sync process and the symbolic links that allow all application servers to log to the dataserver.

Definitions:

Dataserver

This is the centralized database and content server. It hosts an NFS (network file system) file share for static content and several Redis datastores, which are fast, memory-based caches. The Magento cache, full page cache, and sessions use the Redis datastores. The dataserver is also the primary location for hosting the code. When you update your code, Webscale syncs it to the application servers. Cron is also normally hosted here. In some architectures, the database and the dataserver may be split onto separate servers for increased efficiency. The database may also have a replica in another region or zone that mirrors its contents to ensure business continuity.

Application server/Web server

This is the distributed, auto-scaling part of your application environment. It hosts the web server and the PHP portions. We use symbolic links inside the document root (the /var/www/sitename directory structure) to point to directories hosted on the NFS share. This way, the application can share static content without having to sync huge amounts of data from the dataserver to the application server. It also allows logging directories within Magento or other applications to log code-related errors to a centralized, non-transient location.

Webscale data plane/ADC/Proxy

These are all synonyms for the Webscale platform component that “fronts” your application infrastructure (ADC stands for application delivery controller). The Webscale data plane sits between your application and the internet, functioning as a reverse proxy that manages and protects traffic accessing the application. It provides load balancing, a web application firewall, caching, and more. It also has a powerful DIY policy engine called Web Controls. A common use case that we enable through Web Controls is protecting your admin login pages by blocking traffic from all but whitelisted addresses.
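
The admin-page use case boils down to a simple rule: requests to admin paths pass only when the client address is on an approved list. The shell sketch below illustrates that logic only. It is not Web Controls syntax, and the /admin prefix, the function name, and the addresses (taken from documentation-reserved ranges) are hypothetical.

```shell
# Illustration only: hypothetical admin prefix and whitelisted client addresses.
ADMIN_PREFIX="/admin"
ALLOWED_IPS="203.0.113.10 198.51.100.7"

# admin_request_allowed PATH CLIENT_IP
# Returns 0 (allow) for non-admin paths, and for admin paths only when
# CLIENT_IP exactly matches an entry in ALLOWED_IPS.
admin_request_allowed() {
  case "$1" in
    "$ADMIN_PREFIX"|"$ADMIN_PREFIX"/*) ;;   # admin path: continue to the IP check
    *) return 0 ;;                          # other paths are not restricted by this rule
  esac
  for ip in $ALLOWED_IPS; do
    [ "$2" = "$ip" ] && return 0
  done
  return 1                                  # admin path from an unlisted address
}

admin_request_allowed /admin/login 203.0.113.10 && echo "allow"
admin_request_allowed /admin/login 192.0.2.55   || echo "block"
```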

Separation of responsibilities

There are two distinct spheres of responsibility, with some overlap. Webscale is responsible for infrastructure, and you, the Webscale customer, are responsible for code. The overlap covers cron jobs, Apache settings, and PHP settings on the web servers.

Code is the PHP core of the site, including the interactions of code with the MySQL database. There are lots of interactions of code with infrastructure, but the Webscale staff are not PHP programmers, so we can only help up to a point. Code also includes images, documents, and other static assets. Static content usually goes on the network drive so that it doesn’t have to be synced to every web server created.

Infrastructure is the “backend hardware” of the site. Yes, your site is virtualized, but all that virtual stuff is designed to be a virtual representation of some bit of hardware that could be physically manifested outside the cloud—for example, servers, disks, RAM, server software, and network.

Code deployment layout

In the Webscale environment, there are typically two separate disks that hold code.

The first is /var/www/www.site.net.

The second is /var/www/shared/www.site.net.

Symbolic links connect the two. For example, we set up /var/www/www.site.net/var as a symbolic link that points to /var/www/shared/www.site.net/var.

The symbolic link is then replicated by the sync process to the web server so that it is consistent across a cluster. We can also use this for upload directories to ensure that any upload to any web server is immediately available across the cluster. Other links include images or other large media files. We like to sync as little as possible to the web server so that it can quickly start when needed.
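
To make the layout concrete, here is a minimal sketch that recreates it in a throwaway directory. The sandbox path and the example.log file name are assumptions made for illustration. On a real cluster the two directories live under /var/www and the links are set up as part of the Webscale deployment.

```shell
# Sandbox standing in for /var/www; nothing on the real system is touched.
sandbox="$(mktemp -d)"
docroot="$sandbox/www.site.net"
shared="$sandbox/shared/www.site.net"

mkdir -p "$docroot" "$shared/var"

# Point the document root's var directory at the shared copy.
ln -s "$shared/var" "$docroot/var"

# A write made through the link lands in the shared directory.
echo "example log line" >> "$docroot/var/example.log"

readlink "$docroot/var"          # prints the path of the shared var directory
cat "$shared/var/example.log"    # prints the line written through the link
```

Because only the link itself lives in the document root, the sync to each web server stays small while the data remains in one shared place.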

There are a couple of Webscale-specific commands that are very useful in this environment. The two most common are acs and webscale-cli, which are documented in the Webscale CLI Reference Guide on our website.

Deployment best practices

  1. Use the www-upload user for deployment. Use the command sudo -iu www-upload to switch to this user.
  2. To fix any permission issues on the dataserver, run sudo webscale-cli deploy permissions.
  3. To fix any permission issues on the web node, run sudo webscale-cli deploy app_permissions.
  4. To check for a sync issue and to ensure that sync is working, run acs sync && acs status -w && acs sync-status.
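
The commands from steps 2 through 4 could be wrapped in small helper functions. This sketch is one assumed way to script them, not a Webscale-provided tool: the function names and the DRY_RUN switch are hypothetical. With DRY_RUN=1 (the default here), the commands are only printed, so the sketch can be tried without touching a server.

```shell
# DRY_RUN=1 prints each command instead of running it (hypothetical switch).
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Steps 2 and 3: fix permissions for the dataserver, then for the web node.
fix_permissions() {
  run sudo webscale-cli deploy permissions &&
  run sudo webscale-cli deploy app_permissions
}

# Step 4: each command runs only if the previous one succeeded.
check_sync() {
  run acs sync &&
  run acs status -w &&
  run acs sync-status
}

fix_permissions
check_sync
```

Setting DRY_RUN=0 would execute the commands for real, so that mode belongs only on the dataserver and with sudo rights in place.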

Accessing your servers

As a customer, you usually do not need to access any web server, because web servers are transient and we forward their logs to the dataserver.

We control access to the dataserver with SSH (secure shell). We strictly enforce public key authentication, as passwords are too easily compromised. We add each customer login to a list of users that can use sudo to become www-upload, the primary web content owner. You also need sudo to run the two Webscale CLI commands mentioned above.

Webscale Control Panel

A vital window into your website is the Webscale Control Panel.

The Webscale Control Panel lets you view or act on different aspects of your application. Among other things, you can:

  • Monitor traffic
  • View security policies
  • Analyze performance
  • Troubleshoot errors
  • Configure the Webscale services available to the application (for example, PHP whitelisting and blacklisting)

Tips

PHP whitelisting and blacklisting are important for protecting your site from common PHP hacks. Typically, you blacklist all PHP scripts and then whitelist the scripts you know your site uses. You can and should also set up an IP whitelist for your Magento admin page and block all other IP addresses.
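
Blacklisting every PHP script and then whitelisting the known ones amounts to a default-deny rule. The shell function below sketches that decision for a request path. It illustrates the logic only and is not how the control panel is configured. The function name and the script names in the list are hypothetical, not a recommended set for any particular site.

```shell
# Hypothetical whitelist of PHP entry points the site is known to use.
PHP_WHITELIST="/index.php /get.php /static.php"

# php_request_allowed PATH
# Non-PHP paths pass; PHP paths pass only when they appear in PHP_WHITELIST.
php_request_allowed() {
  case "$1" in
    *.php) ;;          # PHP script: must be on the whitelist
    *) return 0 ;;     # not a PHP script: outside the scope of this rule
  esac
  for script in $PHP_WHITELIST; do
    [ "$1" = "$script" ] && return 0
  done
  return 1             # default: every other PHP script is blocked
}

php_request_allowed /index.php          && echo "allow"
php_request_allowed /media/uploaded.php || echo "block"
```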

You can also use the control panel to put your site into maintenance mode when you need to work on it.

Please explore the Webscale control panel to become familiar with it.

Further Reading

Have questions not answered here? Please Contact Support to get more help.


Last modified April 10, 2020