Make Software Streaming Based on Light-weight
Virtualization*
Youhui Zhang, Gelin Su
Department of Computer Science and Technology, Tsinghua University, Beijing, PRC (100083)
Abstract
For the many tasks that IT performs in enterprises, deploying and managing personal applications is
very expensive. Tools and technologies that automate and speed up PC deployments can yield
significant savings, which represent a surprisingly high proportion of total deployment costs. This
paper presents such a solution for Windows systems based on lightweight virtualization technologies.
Namely, the LAN users’ data and configurations are stored on a central server. At run time,
desktop applications are downloaded from the server and run in a lightweight virtualization
environment without installation. In the multi-user environment in particular, a Copy-on-Write
mechanism is adopted for files that users modify. This paper also describes the overall design,
technical details and performance evaluation.
Keywords: streaming software; light-weight virtualization; on-demand software
1 Introduction
For the many tasks that IT performs, deploying and managing personal operating systems and
applications is very expensive. As noted by Gartner Group [1], PC hardware and operating
system choices are no longer the greatest determinants of PC total cost of ownership (TCO). Use
of tools and technologies to automate and speed up PC deployments can yield significant savings
that represent a surprisingly high proportion of total deployment costs. Moreover, for a personal
user, installing, maintaining and upgrading PC software is a tedious and difficult task, especially
on the open Internet with its many viruses and potential attackers.
Therefore, constructing a software deployment and management system for PCs that is easy to use,
especially for enterprises, is an important task: it can reduce TCO significantly and remove
the overhead of software installation from individual PCs.
Some similar technologies are already used in enterprises. For example, CITRIX [2] provides an
architecture that allows a variety of remote computers to connect to a Windows NT terminal
server and remotely access a powerful desktop and its applications, while the remote computers
execute only the graphical interface of the applications. However, the central server may become
the performance bottleneck.
Collective [3] proposes a virtual-machine-based solution. It runs virtual machines on a PC and
downloads OS images from a central server, so that all of a user’s applications, data and
configurations can be recovered. The VM-based solution is promising because the processing
power of the local machine can be utilized efficiently. However, virtual machines introduce
considerable performance overhead. As reported in [4], a VM-based configuration incurred a
26-29% increase in response time for Office Productivity and Internet Content Creation
applications. Moreover, carrying or downloading a whole VM-based OS is not economical in
terms of storage capacity.
In contrast to hardware-level virtual machine technologies, application-level technologies
position the virtualization layer between the operating system and application programs.
Every virtualization environment shares the same execution environment as the host machine and
retains only its divergences from the host as the VM’s local state. Therefore, such an
environment can have very small resource requirements, and thus its overhead is low.
* Supported by the High Technology Research and Development Program of China under .
Sciencepaper Online (中国科技论文在线)
Based on this lightweight virtualization technology, this paper presents a solution for
software streaming: application software can be used with a single click on the desktop, without
installation or configuration on the client. Both are completed on the central software-delivery
server. The result is a scalable software distribution system that is securely deployed, centrally
managed and immediately available anywhere, and it is compatible with most existing PC software.
In this approach, the user’s data, applications and configurations are made portable; each
personalized application runs in an OS-level virtualization environment layered on top of the local
machine’s OS. This environment intercepts the resource-accessing API calls issued by these
applications, including those that access the system registry, files/directories and environment
variables, and redirects them to the actual storage position(s) (the network server) rather than
the local host.
In other words, software can be handled just like data, which makes it easier to manage.
Most of the software installation, configuration, maintenance and upgrade work is
centralized on the server rather than repeated on each PC, while the running performance does not
drop, which reduces the TCO significantly.
2 Key Issues and Technical Solution
On-demand Software
Application-level virtualization is used to make applications instantly available and easy to
use, just like a web service: they no longer need to be installed, no longer conflict with each
other, and do not alter the operating system. Moreover, management of software is centralized on a
dedicated server. In this way, as soon as the user clicks an icon on the PC desktop, the related
software files, configurations and anything else needed are fetched from the server, much as in
video on demand (VOD).
Figure 1: Three parts of software
To meet these challenges, we analyze the installation process and runtime behavior of
software to derive a construction and runtime model. The design principles are then presented.
The Model
Most Windows applications need to be installed before they can run normally. Even
applications that can work without installation often save their customizations into
the system registry and/or into configuration files located in system folders.
Software can therefore be regarded as containing three parts: Part 1 includes all resources
provided by the OS; Part 2 contains what is created/modified/deleted by the installation process;
and Part 3 is the data created/modified/deleted during run time. For Windows OS, the
resources here mainly refer to files/folders, the related system registry keys/values and
environment variables.
[Figure 1 diagram labels: Running Process; Software Resource; Resource Accesses; Modifications; Part 1; Part 2; Part 3]
During the run time, the running instance will access resources of all parts on the fly: some
resources are read-only while some may be modified/added/deleted. Therefore, no part is fixed:
the resource modified by the application instance at run time will be moved into Part 3.
Then, the design principles are drawn as follows:
1. All parts should be captured and made portable.
The exception is Part 1, because our solution only makes software portable on compatible
hosts, which implies that all resources of Part 1 are available on the local system.
2. All related resource accesses should be intercepted dynamically by a runtime system for
access redirection.
Installation Snapshot
To make Part 2 portable, the modifications made by the software’s installation process must
be captured. There are usually two types of modifications: registry contents and files/folders.
A system monitoring tool such as InstallWatch [5] is used to complete the task. It can track
changes to the computer’s hard disk, registry and .ini files while a new application is being
installed.
In our implementation, a target application is installed on a clean Windows system while
InstallWatch is running to log the files created or modified in this process, as well as registry
additions and modifications. Then, the files/folders created or updated are copied to a separate
folder, called the private folder, with the directory hierarchy retained. Similarly, the contents of
the added/modified registry keys are collected and stored in a separate file, the private registry
file.
Furthermore, the captured snapshot is divided into six sets:
1. Added registry set (abbreviated to AR). It contains the entries created by the installation.
2. Deleted registry set (DR). It contains the entries deleted by the installation, so that
the entries in this set will not be accessed during run time.
3. Modified registry set (MR). It contains the entries whose values or sub-keys have been
modified or deleted.
4. Added file set (AF). It is similar to the added registry set, including new files and new
folders created by the installation.
5. Deleted file set (DF). It is similar to the deleted registry set.
6. Modified folder set (MF). For any file or folder in the added/deleted file set, its parent
folder will be included in this set.
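The six sets above can be derived by comparing a snapshot taken before the installation with one taken after it. The following is a minimal sketch under simplifying assumptions: the registry is modeled as a flat mapping from key path to value, files as a set of Windows paths, and the function names `diff_registry` and `diff_files` are hypothetical (they are not part of InstallWatch). Changes to sub-keys, which the MR definition also covers, are not modeled here.

```python
from pathlib import PureWindowsPath


def diff_registry(before, after):
    """Return (AR, DR, MR) for two snapshots mapping key path -> value."""
    added = {key for key in after if key not in before}
    deleted = {key for key in before if key not in after}
    # Simplification: only value changes are detected, not sub-key changes.
    modified = {key for key in after if key in before and after[key] != before[key]}
    return added, deleted, modified


def diff_files(before, after):
    """Return (AF, DF, MF) for two snapshots given as collections of paths."""
    added = set(after) - set(before)
    deleted = set(before) - set(after)
    # MF holds the parent folder of every added or deleted file/folder.
    modified_folders = {str(PureWindowsPath(path).parent) for path in added | deleted}
    return added, deleted, modified_folders
```

In such a sketch the private folder would be populated from AF and the private registry file from AR and MR, while DR and DF only need to be remembered as names.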
Runtime System
The goal of the runtime system is to make all parts transparently accessible to the application’s
executable file. API interception is employed here to build such a lightweight
virtualization environment.
API interception means intercepting calls from the application to the underlying
system and reinterpreting them. It is usually used to extend existing OS and application
functionality without modifying the source code. Detours [6], a library developed by
Microsoft Research, is used to intercept the Windows APIs that access the system
registry and files/folders. In detail, interception code is applied dynamically at
run time: Detours replaces the first few instructions of the target API with an unconditional
jump to the user-provided detour function. The detours are inserted at execution time, and the code
of the target function is modified in memory, not on disk, thus facilitating interception of binary
functions at a very fine granularity.
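The control flow of a detour (intercept the call, handle it or forward it to the original code) can be illustrated with a small analogy. This sketch does not use or reproduce Detours, which rewrites machine instructions in memory; it merely rebinds a Python attribute. The `system_api` object, its `open_key` function and the path prefix are hypothetical stand-ins, not real Windows bindings.

```python
import types


def install_detour(target, name, make_detour):
    """Rebind target.name to a detour; return the original as a 'trampoline'."""
    original = getattr(target, name)
    setattr(target, name, make_detour(original))
    return original


def remove_detour(target, name, original):
    """Restore the original function."""
    setattr(target, name, original)


# Hypothetical stand-in for a system API table.
system_api = types.SimpleNamespace(open_key=lambda path: f"system:{path}")

calls = []


def make_redirecting_detour(original):
    def detour(path):
        calls.append(path)  # every call passes through the detour first
        if path.startswith("HKCU\\Software\\App"):
            return f"private:{path}"  # serve from the private store
        return original(path)  # otherwise fall through to the original
    return detour


trampoline = install_detour(system_api, "open_key", make_redirecting_detour)
```

The caller keeps using `system_api.open_key` unchanged, which mirrors the transparency that the interception layer gives to an unmodified application.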
The detailed workflow is presented below, taking registry accesses as the example
(for APIs that access the file system, a similar method is adopted, because folders can be regarded
as registry keys and files can be regarded as values).
The private registry is a complete registry system that provides access APIs just as Windows
OS does. When the target application is launched, the three registry sets (AR, MR and DR) are
initialized. We use the absolute path to identify a single registry key and maintain a map structure
that maps the handle of any opened key to its full path. For example, when the application opens the
registry key "HKCR\.doc", the interception code maps the returned handle to the path string.
Then, every time the application uses this handle, its full path can be looked up.
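A minimal sketch of such a handle-to-path map follows. It assumes integer handles issued by the table itself and a hypothetical `HandleTable` class; real Windows key handles are opaque values returned by the OS, and the interception code would record them rather than create them.

```python
import itertools


class HandleTable:
    """Maps each opened key handle to the absolute path of the key."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._paths = {}

    def open_key(self, parent, subkey):
        """Open subkey under parent, which is a root name or an open handle."""
        base = self._paths[parent] if isinstance(parent, int) else parent
        full_path = base + "\\" + subkey if subkey else base
        handle = next(self._next_handle)
        self._paths[handle] = full_path
        return handle

    def path_of(self, handle):
        """Look up the full path recorded for an open handle."""
        return self._paths[handle]

    def close_key(self, handle):
        """Forget a handle once the application closes it."""
        self._paths.pop(handle, None)
```

Because a key may be opened relative to an already opened key, the table resolves the parent handle first, so that every later operation on the new handle can be decided by absolute path.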
At run time, the resource-accessing API calls issued by these applications, including those that
access the system registry, are intercepted and redirected to the actual storage position(s)
rather than the local host.
The access principle is that any modification is always saved in the private space, while any
query returns the combination of results from both registries (private and system). If an entry
exists in both, the private one has the higher priority.
Figure 2: Runtime virtualization environment
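The access principle stated above (writes go to the private space, queries combine both registries, the private copy wins) can be sketched as follows. Both registries are modeled as flat dictionaries from key path to value, and the class name `MergedRegistry` is hypothetical. How a deletion at run time is recorded is an assumption here; the text only states that entries in the deleted set are not accessed.

```python
class MergedRegistry:
    """Private-over-system view; the system registry is never modified."""

    def __init__(self, system, private=None, deleted=None):
        self.system = system                 # host registry, treated as read-only
        self.private = dict(private or {})   # AR/MR contents plus runtime writes
        self.deleted = set(deleted or ())    # DR contents plus runtime deletions

    def query(self, path):
        if path in self.private:             # the private copy has priority
            return self.private[path]
        if path in self.deleted:             # hidden from the application
            raise KeyError(path)
        return self.system[path]

    def set_value(self, path, value):
        self.private[path] = value           # modifications stay private
        self.deleted.discard(path)

    def delete(self, path):
        self.private.pop(path, None)
        self.deleted.add(path)               # assumption: remember as deleted

    def enumerate(self, prefix):
        """Combine entries of both registries under a path prefix."""
        visible = {p for p in self.system
                   if p.startswith(prefix) and p not in self.deleted}
        visible |= {p for p in self.private if p.startswith(prefix)}
        return sorted(visible)
```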
Deployment of On-demand Software across a LAN
Based on virtualization technologies, application software is downloaded from a central
server under central authentication, so that software can be managed and released
centrally. This eliminates the need to manage and install specific application sets for specific
users, and optimizes the use of the server.
Portable applications are therefore stored on a file server with a high-speed LAN
connection and are shared through the CIFS protocol.
On the client, which runs the Windows OS, when the user launches our shell program she
first has to enter her ID information, which is used to present distinct applications to
different users. Then the shared position on the server is mounted as a local drive.
[Figure 2 diagram labels: Virtualization Environment; App; File Access; Registry Access; Interception; Private FS; System FS; Private Reg.; System Reg.]
Besides the shared position, each user has her own space on the server (also mounted as
a local drive) to host her personal documents, folders and configurations. In
the current implementation, a simple Copy-on-Write (COW) strategy is adopted: once any file is
written, it is copied into the private position, and any subsequent access is redirected to the
new version.
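The COW strategy can be sketched as a path resolver in front of two directory trees. This is only an illustration under stated assumptions: the class `CowStore`, its method names and the example file are hypothetical, and ordinary local directories stand in for the shared and per-user network drives.

```python
import shutil
import tempfile
from pathlib import Path


class CowStore:
    """Redirects file paths between a shared tree and a user's private tree."""

    def __init__(self, shared_root, private_root):
        self.shared = Path(shared_root)
        self.private = Path(private_root)

    def resolve_for_read(self, relative):
        """Prefer the private version if one exists, else the shared file."""
        private_path = self.private / relative
        return private_path if private_path.exists() else self.shared / relative

    def resolve_for_write(self, relative):
        """Copy the shared file into the private tree on first write."""
        private_path = self.private / relative
        if not private_path.exists():
            private_path.parent.mkdir(parents=True, exist_ok=True)
            shared_path = self.shared / relative
            if shared_path.exists():
                shutil.copy2(shared_path, private_path)
        return private_path


def demo():
    """Write through the store; return (shared content, content as read back)."""
    with tempfile.TemporaryDirectory() as tmp:
        shared, private = Path(tmp, "shared"), Path(tmp, "private")
        (shared / "app").mkdir(parents=True)
        (shared / "app" / "settings.ini").write_text("theme=default\n")
        store = CowStore(shared, private)
        store.resolve_for_write("app/settings.ini").write_text("theme=dark\n")
        return (
            (shared / "app" / "settings.ini").read_text(),
            store.resolve_for_read("app/settings.ini").read_text(),
        )
```

The shared copy stays untouched, so other users who read the same application files are unaffected by one user's writes.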
Some Optimizations
As we know, some pre-installed Windows applications, such as IE and Outlook Express, are
integrated into the OS, so it is very difficult to separate them from the system. On the other hand,
they are always available on a compatible host. For such applications, only their customizations
and personalized data are made portable.
For example, when IE (located on the host system) is launched from our GUI, it runs in
the virtualization environment and its registry APIs are intercepted. Then, when it accesses
registry entries that store customizations, such as the home page, download folder, favorites,
browser history, temporary Internet files and so on, the interception code returns values from the
private registry, so that portable personalized customizations are achieved.
For registry entries, our principle is that any modification is always saved in the private space
while any query returns the combination of results from both registries. In practice, many
registry accesses can be skipped. Most applications are not implemented entirely from
scratch; instead, they employ many Windows components. For example, when an
application shows an open-file dialog, many registry accesses happen although they are
totally transparent to the application’s logic. Registry accesses can therefore be further divided
into two categories: the first is application-specific accesses that program developers make
intentionally, while the second belongs to system behavior and can be skipped. For instance, Adobe
Reader creates the key “HKEY_LOCAL_MACHINE\SOFTWARE\Adobe\Acrobat Reader” to save its
configurations. Therefore only the entries below this key are handled specially, and others are
left to the host system. Of course, how to differentiate the two types depends on the specific
application. Fortunately, for Windows OS, some publications, such as Microsoft® Windows®
Internals [7], explain which registry entries are system-related.
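The distinction between application-specific and system-related accesses can be sketched as a prefix test on the key path. The Adobe Reader key is the one named in the text; the function name `is_application_specific` and the idea of a per-application prefix list are assumptions made for illustration. The comparison ignores case because Windows registry key names are case-insensitive.

```python
# Keys that the virtualization layer handles specially for one application.
APP_PREFIXES = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Adobe\Acrobat Reader",)


def is_application_specific(path, prefixes=APP_PREFIXES):
    """Return True if path is one of the prefix keys or lies below one."""
    normalized = path.rstrip("\\").lower()
    for prefix in prefixes:
        root = prefix.rstrip("\\").lower()
        # Require a separator so that sibling keys sharing a name prefix
        # are not matched by accident.
        if normalized == root or normalized.startswith(root + "\\"):
            return True
    return False
```

Accesses for which the test fails would be passed straight to the host registry without consulting the private one.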
3 Prototype and Tests
Based on the design and technologies above, many desktop applications have been made portable,
including MS Word 2003, MS Excel 2003, MS PowerPoint 2003, Lotus Notes, Photoshop,
Internet Explorer, Outlook Express 6, Winzip, UltraEdit, FlashGet, Bittorrent and so on.
The detailed process flow of our prototype is as follows.
1. The user selects a program from the GUI to start.
2. During start-up, a wrapper DLL is injected into the target process’s virtual
address space, and the user-level virtualization environment for this process is established.
3. While the target process runs, all of its registry/file system accesses are
handled as described in the sections above.
4. When the target exits, all modifications to the system registry and the file system are stored
in the user’s own space. So, when the program is launched from another computer, its latest
modifications are accessible.
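Step 4 amounts to persisting the private state in the user's space at exit and reloading it at the next launch, possibly on another machine. The sketch below assumes a JSON file per application and hypothetical function names; the storage format of the private registry file is not specified in this paper.

```python
import json
import tempfile
from pathlib import Path


def save_private_state(user_space, app_name, private_registry):
    """Write the private registry (key path -> value) into the user's space."""
    path = Path(user_space) / f"{app_name}.registry.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(private_registry, indent=2, sort_keys=True),
                    encoding="utf-8")
    return path


def load_private_state(user_space, app_name):
    """Load the saved private registry, or an empty one on first launch."""
    path = Path(user_space) / f"{app_name}.registry.json"
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def round_trip_demo():
    """Save at 'exit', then load as a later launch would."""
    with tempfile.TemporaryDirectory() as user_space:
        save_private_state(user_space, "app", {"HKCU\\Software\\App\\Theme": "dark"})
        return (load_private_state(user_space, "app"),
                load_private_state(user_space, "other"))
```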
Performance Tests
The application start-up time is the key metric of our prototype’s usability: the time it takes
for applications to respond to user-initiated operations is a measure of what it feels like to use
the system for everyday work. Therefore, we construct a test environment to measure the key values
and compare them with the normal (baseline) case.
The client platform for the test is a Windows XP SP2 PC equipped with 2 GBytes of DDR2
SDRAM and one Intel Core Duo CPU. The hard disk is one 160 GBytes SATA drive.
A Windows server, equipped with one Intel Core 2 Duo E4500 CPU (2200MHz), 2 GBytes of
DDR2 SDRAM and one 240 GBytes SATA II disk, is used as the network server. The two
machines are connected by 100M Ethernet.
Ten applications are used, including office applications (AbiWord, Adobe Reader,
Lotus Notes and UltraEdit), media players (VLC and MPlayer), image processing (Adobe
Photoshop), a small game (Zuma) and network applications (GTalk and FlashFXP).
Moreover, to simplify the test process and presentation, a CDA (Common Desktop
Application) benchmark is used: it employs scripts to launch
the ten desktop applications in the Windows XP environment. First, the benchmark invokes
CreateProcess to start the application, and then another API, WaitForInputIdle, is used to
determine whether the new process has finished its initialization and is ready to respond to user
input. After all applications are launched, the whole elapsed time is recorded as the start-up value.
In this case, the COW mechanism is enabled, and n client machines (n = 1, 2, 3, 4)
run the benchmark simultaneously to show the response times under different access pressure.
Table 1 gives the test results of the enterprise scenario: compared with the baseline, LAN
access introduces a 102% increase in start-up time when the client number is 1. When more clients
are tested simultaneously, the time rises correspondingly.
Table 1: Start-up times of Enterprise Case
System Configuration        Run Time (ms)   Normalized Value
Physical, IDE (Baseline)    20348           1
Virtual, LAN (n=1)          41103
Virtual, LAN (n=2)          47614
Virtual, LAN (n=3)          60433
Virtual, LAN (n=4)          74270
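The normalized values for the LAN rows do not appear in this copy of Table 1, but they follow directly from the run times. The short calculation below divides each run time by the baseline; the helper names are hypothetical and the numbers are taken from the table. For n = 1 it reproduces the 102% increase quoted in the text.

```python
BASELINE_MS = 20348
LAN_RUN_TIMES_MS = {1: 41103, 2: 47614, 3: 60433, 4: 74270}


def normalized(run_time_ms, baseline_ms=BASELINE_MS):
    """Run time as a multiple of the baseline."""
    return run_time_ms / baseline_ms


def percent_increase(run_time_ms, baseline_ms=BASELINE_MS):
    """Increase over the baseline, in percent."""
    return (run_time_ms - baseline_ms) / baseline_ms * 100.0


for clients, run_time in LAN_RUN_TIMES_MS.items():
    print(f"n={clients}: {normalized(run_time):.2f}x baseline, "
          f"+{percent_increase(run_time):.0f}%")
```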
4 Discussions
Employ virtualization technologies to suspend & resume computation
across the Internet
A virtual machine monitor (VMM) cleanly encapsulates all volatile execution state of a VM.
A VMM typically maps the volatile state of its VMs to files in the local file system of its host. If
these files are copied to a remote host with a similar hardware architecture, a VMM on that host
can resume the VM.
Related research includes ISR (the Internet Suspend & Resume project at CMU, supported
by Intel) [8]. How to integrate the VM solution with our light-weight method is an interesting
open problem.
Design of a smart distributed storage system
A distributed storage system can serve as the transport mechanism for propagating
on-demand software. However, the software will occupy tens of gigabytes or more, so it is an
important issue to decrease its size and improve the access performance. Existing solutions
include content-addressable storage and so on, which lack knowledge of the upper-level file
system.
Our proposal is to develop a smart distributed storage system as the storage back end,
which will enable the storage system to learn more details of the stored data in order to adopt
more flexible storage optimizations (such as prefetching, different QoS levels and so on) in the
complicated Internet environment.
Moreover, much of the software used by diverse users is similar, which can be exploited to decrease
the state size and improve access performance through p2p-sharing technologies. In our plan, we
propose a file-aware block-level distributed storage. That is, this type of storage can identify
which file is being accessed by the upper-level OS while the original storage interface remains the
same. Then, p2p file sharing and other look-aside caching technologies can be adopted. Moreover,
some semantic prefetching methods will be enabled.
Employ Look-aside caching technologies to increase storage performance
The local hard disks and/or portable storage devices can be used as a data cache or personal
data storage to increase storage performance. These local data and the network data are all managed
by the light-weight virtualization environment. The user should hold different operation
privileges on different data, and the privacy and security of these data should be maintained, too.
Security and Privacy
Our solution does not write any customization to the host PC. This isolation keeps the
local OS pristine, helping to prevent and contain security breaches and infections.
Another question is whether our solution is safe if the host has already been infected
by a virus. Because our access control mechanism can prevent illegal access to the protected
application files, it seems that the virus cannot damage them.
Unlike some VM-based methods, this solution works at the OS level and depends heavily on the
host OS, so the user should trust the host before operating on it. In this respect the VM-based
solution looks safer, since it does not run any software previously installed on the host
and starts the host from a known power-down state, provided that the local BIOS is not
compromised.
Therefore, we believe it is necessary to construct a trusted chain between the user and the
host to solve this problem completely [9], which depends on the prevalence and availability of
trusted computing.
Business value
In [1], it is said that “use of tools and technologies to automate and speed up PC deployments
can yield significant savings that represent a surprisingly high proportion of total deployment
costs. Our model shows that automating these processes can reduce deployment costs by up to
$578 per PC.” IDC also estimates that 50-70% of the cost of application ownership is attributed
to management, because so much of it is manual. Therefore, we believe that our software-on-demand
solution, which presents a new software distribution system with central installation, configuration
and management, is attractive for enterprises that rely on IT for their tasks. It is also a
suitable low-cost solution for education, especially for electronic education systems in
underdeveloped areas of China.
5 Conclusion
This paper presents a solution for on-demand Windows applications/customizations based on
application-level virtualization. The solution separates an application’s private
files/folders/registry entries onto a portable device and/or a network server, and employs API
interception to make the application access these resources transparently at run time.
Compared with existing related work, our contribution has the following features:
1. high efficiency in performance and storage capacity;
2. direct support for existing desktop applications.
The design principles and prototype are introduced, as well as technical solutions for
some practical issues.
References
[1] Federica Troni, Michael A. Silver. Use Processes and Tools to Reduce TCO for PCs, 2005-2006 Update.
Gartner Group.
[2] CITRIX, .
[3] R. Chandra, N. Zeldovich, C. Sapuntzakis, and M. S. Lam. “The Collective: A Cache-Based System
Management Architecture”, Proceedings of the Second Symposium on Networked Systems Design and
Implementation (NSDI 2005), May, 2005.
[4] Ramon Caceres, Casey Carter, Chandra Narayanaswami and Mandayam Raghunath, “Reincarnating PCs with
Portable SoulPads”, Proceedings of the Third International Conference on Mobile Systems, Applications, and
Services (MobiSys2005), June, 2005.
[5]
[6] Galen Hunt and Doug Brubacher, "Detours: Binary Interception of Win32 Functions", Proceedings of the Third
USENIX Windows NT Symposium, July, 1999.
[7] Mark E. Russinovich and David A. Solomon. Microsoft Windows Internals, Fourth Edition: Windows 2000,
Windows XP, and Windows Server 2003. Microsoft Press. Dec. 2004.
[8] M. Kozuch and M. Satyanarayanan, “Internet Suspend /Resume”, Proceedings of 4th IEEE Workshop on
Mobile Computing Systems and Applications, June, 2002.
[9] Scott Garriss, Ramon Caceres, Reiner Sailer, Leendert van Doorn, Xiaolan Zhang. Towards Trustworthy Kiosk
Computing. In Proceedings of the Eighth IEEE Workshop on Mobile Computing Systems and Applications. 2007,
Pages 41-45.