Microsoft Windows NT 4.0 Server, Terminal Server Edition: An Architectural Overview

Abstract

Microsoft Windows NT 4.0 Server, Terminal Server Edition (code named Hydra) gives the Windows NT Server operating system the capability to serve 32-bit Microsoft Windows operating system-based applications to terminals and terminal emulators running on PC and non-PC desktops. The Hydra environment is, by definition, a thin-client architecture where all application processing occurs centrally on the server. Because Hydra terminal emulator clients will be available for many different desktop platforms (Macintosh, UNIX, and others), Hydra provides access to 32-bit Windows-based applications from virtually any desktop, and provides a bridging technology for organizations that are transitioning to a pure 32-bit desktop environment.

Technology Update: What is Windows NT Hydra?

Microsoft Windows NT Hydra is the code name for a project that will add Microsoft Windows-based Terminal support to the Microsoft Windows NT Server operating system and a super-thin client to the Windows operating system family product line. This new technology will provide enterprise customers with a compelling new extension to the Windows computing environment that combines low total cost of ownership, the familiar 32-bit Windows user interface, and the power and choice of the Windows operating system family.

Microsoft Windows NT 4.0 Server, Terminal Server Edition is an extension of the Windows NT Server 4.0 product line. In the multi-user Windows NT environment, a super-thin client allows users to run the Windows desktop operating system and Windows-based applications completely off the server. Windows NT 4.0 Server, Terminal Server Edition will provide users access to 16- or 32-bit Windows-based applications from any of the following types of desktops:

A new class of low-cost hardware, commonly referred to as Windows-Based Terminals, that will be marketed by third-party hardware vendors.

Any existing 32-bit Windows desktop operating system, such as Windows 95, Windows NT Workstation, and even Windows NT Server (running the 32-bit Windows Terminal Server client as a window within the local desktop environment).

Older 16-bit Windows-based desktops running the Windows 3.11 (Windows for Workgroups) operating system (running the 16-bit Windows Terminal Server client as a window within the local desktop environment).

X-based Terminals, Apple Macintosh, MS-DOS operating system, Networked Computers, and UNIX-based desktops (through a third-party add-on product).

Product Overview

Windows NT 4.0 Server, Terminal Server Edition consists of three components: the Windows NT Server multiuser core, the Remote Desktop Protocol (RDP), and the "super-thin" Windows-based client software. Specifically:

Windows Terminal Server: A multiuser server core that provides the ability to host multiple, simultaneous client sessions on Windows NT Server 4.0, and on future versions of Windows NT Server. Windows Terminal Server is capable of directly remoting a 32-bit Windows NT-based desktop to a variety of Windows-based and non-Windows-based hardware. Standard Windows-based applications do not need modification to run on the Windows Terminal Server, and almost all Windows NT-based management infrastructure and technologies can be used to manage the client desktops. In this way, corporations can take advantage of the rich choice of applications and tools offered by today's Windows environment.

Remote Desktop Protocol (RDP): A key component of Windows Terminal Server is the protocol that allows a super-thin client to communicate with the Windows Terminal Server over the network. This protocol is based on the International Telecommunication Union's (ITU) T.120 protocol, an international-standard multi-channel conferencing protocol currently used in the Microsoft NetMeeting conferencing software product. It will also support encrypted sessions.

Super-Thin Client: The client software that presents, or remotes, the familiar 32-bit Windows NT 4.0 user interface to a range of desktop hardware:

- New Windows-based Terminal devices based on Windows CE.

- Personal computers running Windows 95 and Windows NT Workstation.

- Personal computers running Windows for Workgroups (Windows 3.11).

What's New in the Architecture of Windows NT 4.0 Server, Terminal Server Edition

To achieve the multi-user capabilities required in Windows NT 4.0 Server, Terminal Server Edition, components, services, and drivers have been added to, or modified in, the Windows NT 4.0 core operating system. Windows NT 4.0 components such as the Virtual Memory Manager and Object Manager have been modified to perform in a multi-user environment.[1] While these are not the only components modified, they are the most significant; other subsystem additions are detailed in later sections of this paper.

The Windows NT Object Manager (now called the Multi-User Object Manager) has been modified in Windows Terminal Server to provide the virtualization of objects (for example, mutant, timer, semaphore, process, thread, and so on) so that applications and system programs of different sessions do not collide. Every object name created has a unique identifier number appended to it, associated with an individual session (SessionID). For example, if Microsoft Word was started in Session 1, the Multi-User Object Manager would append the SessionID to the object name, giving \\winword:1.
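The naming scheme described above can be illustrated with a minimal sketch. It assumes only what the paragraph states (the SessionID is appended after a colon, as in \\winword:1); the ObjectNamespace class and its methods are hypothetical names used for illustration and are not part of Windows NT.

```python
def session_scoped_name(name, session_id):
    """Return the object name with the SessionID appended, as in \\winword:1."""
    return f"{name}:{session_id}"


class ObjectNamespace:
    """Toy namespace (hypothetical) keyed by session-qualified object names."""

    def __init__(self):
        self._objects = {}

    def create(self, name, session_id, obj):
        scoped = session_scoped_name(name, session_id)
        if scoped in self._objects:
            raise KeyError(f"object already exists: {scoped}")
        self._objects[scoped] = obj
        return scoped

    def lookup(self, name, session_id):
        return self._objects[session_scoped_name(name, session_id)]
```

Because the SessionID is part of the key, two sessions can each create an object with the same base name without colliding.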

The Virtual Memory Manager maps virtual addresses in the process's address space to physical pages in the computer's memory. The Virtual Memory Manager hides the physical organization of memory from the process's threads to ensure that the thread can access its own memory but not the memory of other processes. In Windows Terminal Server, to allow each session's processes to share the same kernel virtual address space, a new virtual address space called SessionSpace was designed and is managed by the Virtual Memory Manager. SessionSpace is the space in virtual memory that is specific to each user session's kernel. This virtual address space points to the same set of memory-management-mapped objects and physical pages for all processes that share the same SessionID. Other process groups, with different SessionIDs, point to a separate set of memory-mapped objects and physical pages at the same virtual address.
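As a rough illustration of the SessionSpace idea, the sketch below resolves the same virtual address to different backing storage depending on the SessionID. The address constant, the class name, and the 16-byte "page" are all assumptions chosen for readability; they do not reflect the actual kernel layout.

```python
SESSION_SPACE_BASE = 0xA0000000  # hypothetical address, for illustration only


class SessionSpaceMap:
    """Toy model: (SessionID, virtual address) selects the backing page."""

    def __init__(self):
        self._pages = {}

    def resolve(self, session_id, vaddr):
        # Processes with the same SessionID get the same page object;
        # a different SessionID gets a separate page at the same address.
        return self._pages.setdefault((session_id, vaddr), bytearray(16))
```

Two processes in Session 1 that resolve SESSION_SPACE_BASE see each other's writes, while a process in Session 2 resolving the same address does not.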

A new Windows NT service called Terminal Server (termsrv.exe) is the controlling process in the Hydra architecture. It is primarily responsible for session management, initiation and termination of user sessions, and session event notification. The Terminal Server service is entirely protocol-independent, so it can function using RDP or a third-party add-on protocol such as Citrix's ICA.

A user mode protocol extension provides assistance to the Terminal Server service. It is the responsibility of this component to provide protocol-specific functions and services, such as licensing, session shadowing, client session enumeration, and so on. Each Terminal Server protocol (for example, RDP and ICA) will have its own protocol extension providing a variety of these types of services.

RDP

Remote Desktop Protocol (RDP)

Remote Desktop Protocol is based on, and is an extension of, the T.120 family of protocol standards. As a multi-channel capable protocol, it allows for separate virtual channels for carrying presentation data, serial device communication, licensing information, highly encrypted data (keyboard, mouse activity), and so on. As RDP is an extension of the core T.Share protocol, several other capabilities are retained as part of RDP, such as the architectural features necessary to support multi-point (multi-party) sessions. Multi-point data delivery allows data from an application to be delivered in real time to multiple parties, without having to send the same data to each session individually (for example, Virtual Whiteboards). This first release of Windows Terminal Server, however, concentrates on providing reliable and fast point-to-point (single-session) communications. The flexibility of RDP leaves plenty of room for added functionality in future products.

One reason that Microsoft decided to implement RDP for connectivity purposes within the Windows Terminal Server is that it provides a very extensible base from which to build many more capabilities. This is because RDP provides 64,000 separate channels for data transmission. However, current transmission activities use only a single channel (for keyboard, mouse, and presentation data). Also, RDP is designed to support many different types of network topologies (such as ISDN and POTS) and many LAN protocols (such as IPX, NetBIOS, TCP/IP, and so on). The current version of RDP will run only over TCP/IP, but with customer feedback, other protocol support may be added in future versions.

The activity involved in sending and receiving data through the RDP stack is essentially the same as the seven-layer OSI model standards for common LAN networking today. Data from an application or service to be transmitted is passed down through the protocol stacks, sectioned, directed to a channel (through MCS), encrypted, wrapped, framed, packaged onto the network protocol, and finally addressed and sent over the wire to the client. The returned data works the same way only in reverse, with the packet being stripped of its address, then unwrapped, decrypted, and so on until the data is presented to the application for use. Key portions of the protocol stack modifications occur between the fourth and seventh layers, where the data is encrypted, wrapped and framed, directed to a channel and prioritized.
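The send and receive paths described above can be sketched as a pair of mirror-image functions. This is a simplified illustration under stated assumptions, not the RDP wire format: the 4-byte header layout, the segment size, and the function names are invented for the example, and the XOR step is only a stand-in for the real encryption stage.

```python
import struct


def xor_bytes(data, key):
    """Placeholder for the encryption stage (NOT RC4, illustration only)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def send(data, channel, key, segment_size=8):
    """Section the data, direct it to a channel, encrypt, and frame it."""
    frames = []
    for offset in range(0, len(data), segment_size):
        segment = data[offset:offset + segment_size]      # sectioned
        payload = xor_bytes(segment, key)                  # encrypted
        header = struct.pack("!HH", channel, len(payload)) # channel + framing
        frames.append(header + payload)
    return frames


def receive(frames, key):
    """Reverse the send path: unframe, decrypt, and reassemble per channel."""
    buffers = {}
    for frame in frames:
        channel, length = struct.unpack("!HH", frame[:4])
        payload = frame[4:4 + length]
        buffers.setdefault(channel, bytearray()).extend(xor_bytes(payload, key))
    return {channel: bytes(buf) for channel, buf in buffers.items()}
```

For example, sending 21 bytes on channel 7 with an 8-byte segment size produces three frames, and receive() returns the original bytes under key 7.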

One of the key points for application developers is that in utilizing RDP, Microsoft has abstracted away the complexities of dealing with the protocol stack. This allows developers to simply write clean, well-designed, well-behaved 32-bit applications; the RDP stack implemented by the Terminal Server and its client connections takes care of the rest. For more information on how applications interact on the Terminal Server and what to be aware of when developing applications for a Windows Terminal Server infrastructure, look at the White Paper on the Microsoft Web site at http://www.microsoft.com/ntserver/library/hydrawp2.exe

Four components worth discussing within the RDP stack instance are the Multipoint Communication Service (MCSMux), Generic Conference Control (GCC), Wdtshare.sys, and Tdtcp.sys. MCSMux and GCC are part of the International Telecommunication Union (ITU) T.120 family. MCS is made up of two standards: T.122, which defines the multipoint services, and T.125, which specifies the data transmission protocol. MCSMux controls channel assignment (by multiplexing data onto predefined virtual channels within the protocol), priority levels, and segmentation of data being sent. It essentially abstracts the multiple RDP stacks into a single entity, from the perspective of the GCC. GCC is responsible for management of those multiple channels. The GCC allows the creation and deletion of session connections and controls resources provided by MCS. Each Terminal Server protocol (currently, only RDP and ICA are supported) will have a protocol stack instance loaded (a listener stack awaiting a connection request). The Terminal Server device driver coordinates and manages the RDP protocol activity and is made up of smaller components: an RDP driver (Wdtshare.sys) for UI transfer, compression, encryption, framing, and so on, and a transport driver (Tdtcp.sys) to package the protocol onto the underlying network protocol, TCP/IP.
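The three duties attributed to MCSMux above (channel assignment, priority levels, and segmentation) can be sketched with a small priority queue. The ChannelMux class is hypothetical, and the convention that a lower number means higher priority is an assumption made for this example.

```python
import heapq
import itertools


class ChannelMux:
    """Toy multiplexer: segments data and emits it in priority order."""

    def __init__(self, segment_size=4):
        self._heap = []
        self._sequence = itertools.count()  # keeps FIFO order within a priority
        self._segment_size = segment_size

    def submit(self, channel, priority, data):
        """Segment data and queue each piece (lower number = higher priority)."""
        for offset in range(0, len(data), self._segment_size):
            segment = data[offset:offset + self._segment_size]
            heapq.heappush(
                self._heap, (priority, next(self._sequence), channel, segment)
            )

    def drain(self):
        """Yield (channel, segment) pairs, highest priority first."""
        while self._heap:
            _, _, channel, segment = heapq.heappop(self._heap)
            yield channel, segment
```

Input submitted on a high-priority channel is emitted ahead of display data queued earlier on a lower-priority channel, while segments within one channel keep their original order.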

RDP was developed to be entirely independent of its underlying transport stack, in this case TCP/IP. Because of this independence, other transport drivers can be added for other network protocols as customers' needs for them grow, with little or no significant change to the foundational parts of the protocol. These are key elements of the performance and extensibility of RDP on the network.

Walk-through of the Terminal Server Start-up

To assist in understanding how this new operating system operates, we will walk through the initialization process of a Hydra server and describe what occurs when a user connects to the server and runs an application.

Windows Terminal Server Initialization

As the Windows Terminal Server boots and loads the core operating system, the Terminal Server service (termsrv.exe) is started and creates listening stacks (one per protocol and transport pair), which listen for incoming connections. Each connection is given a unique session identifier or SessionID to represent an individual session to the Hydra server, and each process created within a session is tagged with the associated SessionID to differentiate its namespace from any other connection's namespace.

The console (Hydra server keyboard, mouse, and video) session is always the first to load and is treated as a special-case client connection and assigned SessionID 0. The console session starts as a normal Windows NT system session, with the configured Windows NT display, mouse, and keyboard drivers loaded.

The Terminal Server service then calls the Windows NT Session Manager (SMSS.exe) to create two (default = 2) idle client sessions (after creating the console session) awaiting client connections. To create the idle sessions, the Session Manager executes the Windows NT-based client/server runtime subsystem process (CSRSS.exe) and a new SessionID is assigned to that process. The CSRSS process will also invoke the Winlogon (Winlogon.exe) process and the Win32k.sys (Window Manager and GDI) kernel module under the newly associated SessionID. The modified Windows NT image loader will recognize this Win32k.sys as a SessionSpace loadable image by a predefined bit set in the image header. It will then relocate the code portion of the image into physical memory with pointers from the virtual kernel address space for that session if Win32k.sys has not already been loaded. By design, it will always attach to a previously loaded image's code (Win32k.sys) if one already exists in memory (for example, from any active application or session). The data (or non-shared) section of this image will then be allocated to the new session from a newly created SessionSpace pageable kernel memory section.

Unlike the console session, Hydra client sessions are configured to load separate drivers for the display, keyboard, and mouse. The new display driver is the Remote Desktop Protocol (RDP) display device driver (Tsharedd.dll). The mouse and keyboard drivers communicate into the stack through the multiple instance stack manager, Termdd.sys. Termdd.sys will send the messages for mouse and keyboard activity to and from the RDP driver Wdtshare.sys. These drivers allow the RDP client session to be both available and interactive, remotely. Finally, Terminal Server will also invoke a connection listener thread for the RDP protocol, again managed by the multiple instance stack manager (Termdd.sys), which listens for RDP client connections on TCP port number 3389.

At this point, the CSRSS process exists under its own SessionID namespace, with its data instantiated per process as necessary. Any processes created from within this SessionID will execute within the SessionSpace of the CSRSS process automatically. This prevents processes with different SessionIDs from accessing another session's data.
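The session bookkeeping in this initialization sequence can be sketched as follows. The sketch models only the facts stated above (the console receives SessionID 0, two idle sessions are created by default, and a new session is created on demand when no idle session is available); the SessionManager class and its method names are hypothetical and are not the SMSS interface.

```python
class SessionManager:
    """Toy model of console, idle, and active session bookkeeping."""

    def __init__(self, idle_target=2):
        self._next_id = 0
        self.console = self._new_session()  # console is created first: SessionID 0
        self.idle = [self._new_session() for _ in range(idle_target)]
        self.active = []

    def _new_session(self):
        session_id = self._next_id
        self._next_id += 1
        return session_id

    def attach_client(self):
        """Hand an idle session to a client, creating one if none is idle."""
        session_id = self.idle.pop(0) if self.idle else self._new_session()
        self.active.append(session_id)
        return session_id
```

With the default of two idle sessions, the first two clients receive SessionIDs 1 and 2 immediately, and a third client triggers creation of SessionID 3.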

Client Connection

The RDP client can be installed and run on any Windows-based Terminal (based on Windows CE), Windows for Workgroups, or Microsoft Win32 API-based platform (non-Windows-based clients are supported by the Citrix MetaFrame add-on). The Windows for Workgroups RDP client is approximately 170K in size (.exe), uses a 300K working set and 100K for display data. The Win32-based client is approximately 190K in size, uses a 300K working set and 100K for display data.

The client will initiate a connection to the Hydra server through TCP port 3389. The Terminal Server RDP listener thread will detect the session request and create a new RDP stack instance to handle the new session request. The listener thread will hand over the incoming session to the new RDP stack instance, and continue listening on TCP port 3389 for further connection attempts. Each RDP stack is created as the client sessions are connected in order to handle negotiation of session configuration details.

The first detail to negotiate is the encryption level for the session. The Hydra server will initially support three encryption levels: Low, Medium, and High.

Low encryption will encrypt only packets being sent from the client to the Hydra server. This input-only encryption is to protect the input of sensitive data like a user's password. Medium encryption will encrypt outgoing packets from the client the same as Low-level encryption, but will also encrypt all display packets being returned to the client from the Hydra server. This method of encryption will secure sensitive data as it travels over the network to be displayed on a remote screen. Both Low and Medium encryption use the Microsoft-RC4 algorithm (modified RC4 algorithm with improved performance) with a 40-bit key. High encryption will encrypt packets in both directions, to and from the client, but will use the industry standard RC4 encryption algorithm, again with a 40-bit key. A non-export version of Windows NT Terminal Server will provide 128-bit high-level RC4 encryption.
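The direction rules for the three levels can be summarized in a short sketch. The enum and function names are hypothetical; only the rules themselves (Low encrypts client-to-server traffic only, Medium and High encrypt both directions) come from the description above.

```python
from enum import Enum


class EncryptionLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Direction(Enum):
    CLIENT_TO_SERVER = "client_to_server"  # keyboard and mouse input
    SERVER_TO_CLIENT = "server_to_client"  # display data


def is_encrypted(level, direction):
    """Return True if packets in this direction are encrypted at this level."""
    if level is EncryptionLevel.LOW:
        return direction is Direction.CLIENT_TO_SERVER
    # Medium and High both encrypt traffic in both directions.
    return True
```

Medium and High differ in the algorithm variant used, not in which directions are protected, so this function treats them identically.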

All RDP clients will, by default, reserve 1.5 MB of memory for a cache that is used to cache glyphs and bitmaps, such as characters, icons, toolbars, cursors, and so on. When displaying characters on an RDP client, fonts are rasterized on the Hydra server and cached on the client as required. This method saves bandwidth and CPU cycles, and equalizes performance throughout the entire RDP client family. The client cache is tunable (through a registry key), and its entries are overwritten using a Least Recently Used (LRU) algorithm.
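A size-bounded LRU cache of the kind described above can be sketched in a few lines. This is a generic LRU implementation, not the client's actual cache: the class name is hypothetical, and interpreting 1.5 MB as 1.5 x 1024 x 1024 bytes is an assumption.

```python
from collections import OrderedDict

DEFAULT_CAPACITY_BYTES = int(1.5 * 1024 * 1024)  # 1.5 MB, binary MB assumed


class GlyphBitmapCache:
    """Byte-budgeted cache that evicts the least recently used entry first."""

    def __init__(self, capacity_bytes=DEFAULT_CAPACITY_BYTES):
        self.capacity = capacity_bytes
        self._items = OrderedDict()  # oldest entry first
        self._used = 0

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, data):
        if len(data) > self.capacity:
            return  # item can never fit, so it is not cached
        if key in self._items:
            self._used -= len(self._items.pop(key))
        while self._used + len(data) > self.capacity:
            _, evicted = self._items.popitem(last=False)  # drop LRU entry
            self._used -= len(evicted)
        self._items[key] = data
        self._used += len(data)
```

Reading an entry with get() refreshes it, so frequently drawn glyphs stay cached while rarely used bitmaps are overwritten first.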

The Hydra server also contains buffers to enable flow-controlled passing of screen refreshes to clients, rather than a constant bitstream. When user interaction at the client is high, the buffer is flushed approximately 20 times per second. During idle time or when there is no user interaction, the flush rate is reduced to 10 times per second. All these numbers are tunable through the registry.
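Expressed as intervals, the two flush rates above work out to roughly 50 ms between flushes during active use and 100 ms when idle. The small helper below makes that arithmetic explicit; the constant and function names are hypothetical, and the values are the defaults quoted in the text.

```python
ACTIVE_FLUSHES_PER_SECOND = 20  # default quoted for high user interaction
IDLE_FLUSHES_PER_SECOND = 10    # default quoted for idle periods


def flush_interval_ms(user_active):
    """Return the time between buffer flushes, in milliseconds."""
    rate = ACTIVE_FLUSHES_PER_SECOND if user_active else IDLE_FLUSHES_PER_SECOND
    return 1000.0 / rate
```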

After session details have been negotiated, the server RDP stack instance for this connection will be mapped to an existing idle Win32k user session and the user will be prompted with the Windows NT logon screen. If autologon is configured, the encrypted username and password will be passed to the Hydra server and logon will proceed. If no idle Win32k sessions currently exist, the Terminal Server service will call the Session Manager (SMSS) to create a new user space for the new session. Much of the Win32k user session is utilizing shared code and will load noticeably faster after one instance has previously loaded.

After the user types their username and password, packets are sent encrypted to the Hydra server. The Winlogon process then performs the necessary account authentication to ensure that the user has privilege to log on and passes the user's domain and username to the Terminal Server service, which maintains a domain/username SessionID list. If a SessionID is already associated with this user (for example, a disconnected session exists), the currently active session stack is simply attached to the old session. The temporary Win32 session used for the initial logon is then deleted. Otherwise, the connection proceeds as normal and the Terminal Server service creates a new domain/username SessionID mapping. If for some reason more than one session is active for this user, the list of sessions is displayed and the user chooses which one to reconnect to.
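The reconnection decision described above can be sketched as a lookup against a domain/username table. The SessionDirectory class, its method names, and the choose callback (standing in for the user picking from a displayed list) are hypothetical; the three outcomes follow the paragraph.

```python
class SessionDirectory:
    """Toy domain/username to SessionID list, as kept by the service."""

    def __init__(self):
        self._by_user = {}

    def add_existing(self, domain, username, session_id):
        """Record an additional session that already belongs to the user."""
        self._by_user.setdefault((domain, username), []).append(session_id)

    def logon(self, domain, username, temp_session_id, choose=None):
        """Return the SessionID the connection should end up attached to."""
        existing = self._by_user.get((domain, username), [])
        if not existing:
            # No prior session: keep the new one and record the mapping.
            self._by_user[(domain, username)] = [temp_session_id]
            return temp_session_id
        if len(existing) == 1:
            # One prior session: reattach; the temporary session is discarded.
            return existing[0]
        if choose is None:
            raise ValueError("several sessions exist; a chooser is required")
        chosen = choose(list(existing))
        if chosen not in existing:
            raise ValueError("chosen session does not belong to this user")
        return chosen
```

A first logon keeps its new session, a second logon by the same user is attached to the earlier session, and a user with several sessions is asked to pick one.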

Running an Application

After user logon, the desktop (or application if in single app mode) is displayed for the user. When the user selects a 32-bit application to run, the mouse commands are passed to the Hydra server, which launches the selected application into a new virtual memory space (2-GB application, 2-GB kernel).

All processes on the Hydra server will share code in kernel and user modes wherever possible. To achieve the sharing of code between processes, the Windows NT Virtual Memory (VM) manager utilizes copy-on-write page protection. When multiple processes want to read and write the same memory contents, the VM manager will assign copy-on-write page protection to the memory region. The processes (Sessions) will use the same memory contents until a write operation is performed, at which time the VM manager will copy the physical page frame to another location, update the process's virtual address to point to the new page location, and mark the page as read/write. Copy-on-write is extremely useful and efficient for applications running on a Hydra server.

When a Win32-based application such as Microsoft Word is loaded into physical memory by one process (Session) it is marked as copy-on-write. When new processes (Sessions) also invoke Word, the image loader will just point the new processes (Sessions) to the existing copy because the application is already loaded in memory. When buffers and user-specific data are required (for example, saving to a file), the necessary pages will be copied into a new physical memory location and marked as read/write for the individual process (Session). The VM manager will protect this memory space from other processes. Most of an application, however, is shareable code and will only have a single instance of code in physical memory no matter how many times it is run.
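Copy-on-write can be illustrated with a small user-level simulation. This is a sketch of the general technique only, assuming a reference count per page; the Page and Process classes are hypothetical and do not model the actual VM manager data structures.

```python
class Page:
    """A physical page with a count of the processes mapping it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 0


class Process:
    """Toy process whose page table maps page numbers to Page objects."""

    def __init__(self):
        self.pages = {}

    def map_shared(self, page_number, page):
        page.refs += 1
        self.pages[page_number] = page

    def read(self, page_number):
        return bytes(self.pages[page_number].data)

    def write(self, page_number, offset, value):
        page = self.pages[page_number]
        if page.refs > 1:
            # Shared page: copy it first, then repoint this process only.
            page.refs -= 1
            page = Page(bytes(page.data))
            page.refs = 1
            self.pages[page_number] = page
        page.data[offset:offset + len(value)] = value
```

Two processes that map the same page share one copy until one of them writes; the writer then receives a private copy and the other process continues to see the original contents.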

It is preferable (although not necessary) to run 32-bit applications in a Hydra environment. 32-bit applications (Win32) will allow sharing of code and run more efficiently in multi-user sessions. Windows NT allows 16-bit applications (Win16) to run in a Win32 environment by creating a virtual MS-DOS-based machine (VDM) for each Win16 application to execute. All 16-bit output is translated into Win32 calls, which perform the necessary actions. Because Win16 applications are executing within their own VDM, code cannot be shared between applications in multiple sessions. Translation between Win16 and Win32 calls also consumes system resources. Running Win16 applications in a Hydra environment can potentially consume twice the resources that a comparable Win32-based application would.

Session Disconnect/User Logoff

Session Disconnect

If a user decides to disconnect the session, the processes and all virtual memory space will remain and be paged off to the physical disk, if physical memory is required for other processes. Because the Terminal Server keeps a mapping of domain/username and its associated SessionID, when the same user reconnects, the existing session will be loaded and made available again. An additional benefit of RDP is that it is able to change session screen resolutions depending on what the user requests for the session. For example, let's say a user had previously connected to a Hydra session at 800 x 600 resolution and disconnected. If the user then moves to a different computer that only supports 640 x 480 resolution and reconnects to the existing session, the desktop will be redrawn to support the new resolution.

User Logoff

Logoff is typically very simple to implement. Once a user logs off from the session, all processes associated with the SessionID are terminated and any memory allocated to the session is released. Of course, if the user was running a 32-bit application like Microsoft Word and logged off from the session, the code of the application itself would remain in memory until the very last user exited from the application.

Summary

To enable Windows NT 4.0 Server to function in a multi-user environment, various core components have been modified and new functionality added. However, due to the modular design of Windows NT, most components remain unchanged and function equally well in a Windows Terminal Server-based architecture.

For More Information

For the latest information on the Windows NT Terminal Server product, see http://www.microsoft.com/ntserver.

Appendix A

The Windows NT 4.0 architecture merges the best attributes of a layered operating system with those of a client/server or microkernel operating system.[2]

The Windows NT operating system can be divided into two sections: the user mode (Windows NT protected subsystems) and kernel mode (Windows NT executive).

Privileged or Kernel mode is a highly privileged mode of operation in which the code has direct access to all hardware and memory in the system, including that of user mode processes. Application threads must be switched to privileged mode to run in operating system code. Applications call privileged-mode operating system services for essential functions such as drawing windows, receiving information about user keyboard and mouse input, and checking security.

User mode is the processing mode in which all applications run. User mode is a less privileged processor mode with code running only in its own address space and with no direct access to hardware. Processes running in user mode must call operating system functions to switch their threads to privileged mode to use operating system services.

The Windows NT executive is the kernel mode portion of the Windows NT operating system. The executive is made up of a series of components that provide a variety of functions and system services to subsystems and other executive components.

The function of each of the executive components is outlined below:

Microkernel

The efficiency of an operating system typically depends on the operation of the microkernel or kernel. It is the responsibility of the kernel to control how the operating system uses its processor(s) and to ensure that they are used effectively. The kernel typically controls process thread scheduling and dispatching, detects and responds to interrupts and exceptions, synchronizes the running of multiple processors, and provides system robustness in case of power failure.

I/O Manager

The I/O Manager coordinates and manages all input and output for Windows NT, primarily managing communication between other I/O drivers by providing a common interface that they can call. The I/O Manager typically controls access to: cache manager, which handles caching of frequently used files in memory for the entire I/O system; file system drivers, which allow support for multiple file systems such as FAT, FAT32, and NTFS; hardware device drivers, which allow support for peripheral devices such as printers, disks, and mice; and network drivers, which allow support for network redirectors, protocols and network adapters.

Object Manager

Object Manager is the part of the Windows NT Executive that creates, manages, and deletes objects. Objects are software components that consist of a data type, attributes, and a set of operations the object performs.

Object Manager also provides uniform rules for retaining, naming, and setting the security of objects, and creates object handles. An object handle consists of access control information and a pointer to the object. Processes use object handles to manipulate Windows NT objects.

Security Reference Monitor

The Security Reference Monitor enforces security policies. It provides services to both kernel and user modes to ensure the users and processes attempting access to an object have the necessary permissions. This component also generates audit messages when appropriate.

Process Manager

The Process Manager is the part of the Windows NT Executive that manages two types of objects: process objects and thread objects. A process is defined as an address space, a set of objects (resources) visible to the process, and a set of threads that run in the context of the process. A thread is the most basic schedulable entity in the system. It has its own set of registers, its own kernel stack, a thread environment block, and a user stack in the address space of its process.

The Process Manager provides a standard set of services for creating and using threads and processes in the context of a particular subsystem environment.

Local Procedure Call (LPC) Facility

Applications and protected subsystems have a client/server relationship. The subsystems provide services that applications can utilize. Applications communicate with subsystems by passing messages through the Windows NT executive's local procedure call (LPC) mechanism. The message-passing process is hidden from the applications by function stubs provided in special dynamic-link libraries (DLLs).

Virtual Memory Manager

The Virtual Memory Manager allocates memory in two phases for efficiency: reserving it, then committing it. Committed memory is backed by the paging file, the disk file to which pages are written when they are removed from physical memory. Reserved memory is held until needed, then committed. The act of reserving memory maintains a contiguous virtual address space for a process, which is then consumed as needed.
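The two-phase allocation described above can be modeled with a toy region object. This is an illustrative sketch, not the Win32 memory API: the Region class is hypothetical, and the 4 KB page size is an assumption.

```python
PAGE_SIZE = 4096  # assumed page size, in bytes


class Region:
    """A reserved, contiguous range of pages; pages are committed on demand."""

    def __init__(self, page_count):
        self.page_count = page_count  # reserving only sets aside address space
        self.committed = set()

    def commit(self, first_page, count=1):
        """Commit pages inside the reserved range so they can hold data."""
        if first_page < 0 or count < 1 or first_page + count > self.page_count:
            raise ValueError("pages lie outside the reserved region")
        self.committed.update(range(first_page, first_page + count))

    def reserved_bytes(self):
        return self.page_count * PAGE_SIZE

    def committed_bytes(self):
        return len(self.committed) * PAGE_SIZE
```

Reserving 16 pages consumes address space only; storage is charged as pages are committed, and committing an already committed page has no further cost in this model.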

The Virtual Memory Manager allocates to each process a unique protected set of virtual addresses available to the process's threads. Each process has a separate address space, so a thread in one process cannot view or modify the memory of another process without authorization. This address space appears to be 4 gigabytes (GB) in size, with 2 GB reserved for program storage and 2 GB reserved for system storage.

Window Manager and Graphics Device Interface (GDI)

The Window Manager is the part of the Windows NT Executive that creates the familiar screen interface. The Graphics Device Interface (GDI), also called the graphics engine, consists of functions in Win32k.sys that display graphics on the computer monitor and printers. It is implemented by the same component as Window Manager.

Applications call the standard USER functions to create windows and buttons on the display. Window Manager communicates these requests to GDI, which passes them to the graphics display drivers, where they are formatted for the display device.

The Window Manager notifies applications when changes occur in the user's interaction with the interface, such as movement or resizing of windows, cursor movement, and icon selection.

Prior to Windows NT 4.0, the Window Manager (the USER component) and GDI were part of the Win32 subsystem and ran in a protected process in user mode. With Windows NT 4.0, they were moved into the Windows NT Executive to run in kernel mode. This change was designed to speed up graphics calls and reduce the memory requirements of interprocess communication.

For More Information

For the latest information on Windows, check out our World Wide Web site at http://www.microsoft.com/windows

[1] For more information about the core architecture of Windows NT Server 4.0, see Appendix A at the end of this document.

[2] For a complete discussion of Windows NT architecture, see Inside Windows NT by Helen Custer.