Under the hood of the Netscaler

After writing about the Netscaler hardware, specifically the SDX, it seemed logical to step back and discuss an oft-overlooked yet critical topic: the Netscaler software.

Netscaler joined the Citrix family of products by way of an acquisition in June of 2005. The Netscaler software is a customized kernel that resides within the user region of the FreeBSD operating system.


Source: Citrix

FreeBSD is a free, open-source, Unix-like operating system. Its modular kernel makes it a popular foundation for purpose-built applications. Citrix capitalized on this with the Netscaler Application Delivery Controller (ADC). They modified the FreeBSD kernel by removing the networking subsystem and replacing it with their own TCP/IP stack and zero-copy driver stack. The modifications were housed in a custom kernel module called the NetScaler Core Packet Processing Engine (PPE).

The Netscaler software is composed of two kernels: the BSD kernel and the Netscaler kernel. Both work in tandem as a cohesive unit thanks to a strict delineation of roles. The BSD kernel manages the boot process, file system access, and long-term logging. The Netscaler kernel controls time slicing for BSD, network access, SSL offloading, SNMP, and syslog processing.

netscaler kernel architecture

The PPE (also known as the packet engine, or PE) was designed to mine the performance gains that can be realized from parallelization. With the advent of multi-core CPUs and Intel’s Receive Side Scaling (RSS) technology, Citrix saw an opportunity to optimize packet processing. Each PPE process is assigned to a core and works as follows:
– monitor incoming packets
– pull them off the packet queue
– process them as required: content switching, front-end optimization, caching, etc.
– place the packets back in the packet queue
– listen for more packets

So, at any given time, each process is either working on a packet or listening for packets. With multi-core CPUs, this can be done in parallel. Cores can also be tasked with specific functions. For example, core 1 might be dedicated to managing network traffic, core 2 to processing TCP/IP, core 3 to layer 7 (e.g., HTTP) processing, and so on. This is possible because each process is a mini Netscaler capable of performing all Netscaler-supported application optimization tasks.
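To make the loop concrete, here is a minimal Python sketch of the idea. It is not Netscaler code: the function names, the packet dictionaries, and the use of CRC32 to spread flows across engines are all assumptions made for illustration (real RSS hashing happens in the NIC). Each engine either handles one queued packet or reports that it is still listening, and a round-robin loop stands in for engines running in parallel on separate cores.

```python
from collections import deque
import zlib


def process_packet(packet):
    # Hypothetical stand-in for the real work (content switching, caching, ...).
    return {**packet, "processed": True}


def packet_engine_step(rx_queue, tx_queue):
    """Run one pass of the loop; return True if a packet was handled."""
    if not rx_queue:
        return False  # nothing queued, so keep listening
    packet = rx_queue.popleft()               # pull it off the packet queue
    tx_queue.append(process_packet(packet))   # work on it, then place it back
    return True


def pick_engine(flow, num_engines):
    # Assumption: CRC32 stands in for the hash that RSS hardware computes.
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_engines


def simulate(packets, num_engines):
    """Spread packets over per-engine queues and drain them all."""
    rx = [deque() for _ in range(num_engines)]
    tx = [deque() for _ in range(num_engines)]
    for packet in packets:
        rx[pick_engine(packet["flow"], num_engines)].append(packet)
    # Round-robin stands in for engines running in parallel on separate cores.
    while any(rx):
        for i in range(num_engines):
            packet_engine_step(rx[i], tx[i])
    return tx
```

Because the engine is chosen from the flow tuple alone, every packet of a given connection lands on the same engine and stays in order, which is the property that lets the engines run without coordinating with each other.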

The upper bound on how much parallel processing can occur at any given time is determined by the number of cores in the CPU. For example, on a CPU with 4 cores, 3 cores are assigned to 3 separate PPEs, with 1 core reserved for management functions, e.g., SNMP. Note that one core is always reserved for management.
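The arithmetic can be written down as a small Python helper. This is only an illustration: `allocate_cores` is a hypothetical name, and applying the "one management core, the rest for PPEs" rule to every core count is an assumption extrapolated from the 4-core example.

```python
def allocate_cores(total_cores):
    """Split cores between management and packet engines (illustrative only)."""
    if total_cores < 2:
        raise ValueError("need at least one management core and one PPE core")
    management_cores = 1  # per the text, one core is always reserved
    return {
        "management": management_cores,
        "packet_engines": total_cores - management_cores,
    }
```

For the 4-core case, `allocate_cores(4)` yields 1 management core and 3 packet engines.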

When the Netscaler is powered up, FreeBSD boots and loads the Netscaler kernel. FreeBSD hands over all CPU cores except the management core to the Netscaler kernel, which then completes the boot-up.

1. Upon boot-up, the FreeBSD loader initiates the boot process.

Netscaler bootup - freeBSD

2. The Netscaler kernel concludes the boot-up process.

Netscaler bootup - custom kernel



Citrix NetScaler nCore Technology
