IBM - Tuning LAMP Systems - Part 2 of 3

Tuning LAMP systems, Part 2: Optimizing Apache and PHP
http://www.ibm.com/developerworks/linux/library/l-tune-lamp-2.html
What slows Apache down, and how to get the most out of PHP
Level: Intermediate
Sean A. Walberg (sean@ertw.com), Senior Network Engineer
30 Apr 2007
Applications using the LAMP (Linux®, Apache, MySQL, PHP/Perl) architecture are constantly being
developed and deployed. But often the server administrator has little control over the application itself
because it's written by someone else. This series of three articles discusses many of the server
configuration items that can make or break an application's performance. This second article focuses on
steps you can take to optimize Apache and PHP.
Linux, Apache, MySQL, and PHP (or Perl) form the basis of the LAMP architecture for Web applications. Many open
source packages based on LAMP components are available to solve a variety of problems. As the load on an
application increases, the bottlenecks in the underlying infrastructure become more apparent in the form of slow
response to user requests. The previous article showed you how to tune the Linux system and covered the basics of
LAMP and performance measurement. This article focuses on the Web server components, Apache and PHP.
Tuning Apache
Apache is a highly configurable piece of software. It has a lot of features, but each one comes at a price. Tuning
Apache is partially an exercise in proper allocation of resources, and involves stripping down the configuration to only
what's needed.
Configuring the MPM
Apache is modular in that you can add and remove features easily. Multi-Processing Modules (MPMs) provide this
modular functionality at the core of Apache -- managing the network connections and dispatching the requests. MPMs
let you use threads or even move Apache to a different operating system.
Only one MPM can be active at one time, and it must be compiled in statically with
--with-mpm=(worker|prefork|event).
The traditional model of one process per request is called prefork. A newer, threaded, model is called worker, which
uses multiple processes, each with multiple threads to get better performance with lower overhead. The final, event
MPM is an experimental module that keeps separate pools of threads for different tasks. To determine which MPM
you're currently using, execute httpd -l.
Choosing the MPM to use depends on many factors. Setting aside the event MPM until it leaves experimental status,
it's a choice between threads or no threads. On the surface, threading sounds better than forking, if all the underlying
modules are thread safe, including all the libraries used by PHP. Prefork is the safer choice; you should do careful
testing if you choose worker. The performance gains also depend on the libraries that come with your distribution and
your hardware.
Regardless of which MPM you choose, you must configure it appropriately. In general, configuring an MPM involves
telling Apache how to control how many workers are running, whether they're threads or processes. The important
configuration options for the prefork MPM are shown in Listing 1.
Listing 1. Configuration of the prefork MPM
StartServers 50
MinSpareServers 15
MaxSpareServers 30
MaxClients 225
MaxRequestsPerChild 4000
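Listing 1 caps the server at 225 processes. Whether a cap like that fits in memory can be estimated with simple
arithmetic: divide the memory you can spare for Apache by the size of a typical Apache process. The following Python
sketch does this. The function name and all of the numbers are hypothetical; measure your own server under load
before trusting any result.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many Apache processes fit in RAM without swapping.

    All three inputs are measurements you take yourself; the numbers
    used below are hypothetical.
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Round down: a partial process does not fit.
    return max(0, int(available_mb // per_process_mb))


# Hypothetical server: 4 GB of RAM, 1 GB kept for the OS and other
# services, and Apache processes that average 12 MB each under load.
limit = estimate_max_clients(total_ram_mb=4096, reserved_mb=1024, per_process_mb=12)
print(limit)  # 256
```

Treat the result as a starting point for trial and error rather than a final answer, because process sizes vary
with the application and the modules loaded.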
In the prefork model, each request is handled by its own process. Spare processes are kept idle to handle incoming
requests, which reduces the start-up latency. The configuration in Listing 1 starts 50 processes as soon as the Web
server comes up and tries to keep between 15 and 30 idle servers running. The hard limit on processes is dictated by
MaxClients. Even though a process can handle many consecutive requests, Apache kills off processes after 4,000
connections, which mitigates the risk of memory leaks.
Configuring the threaded MPMs is similar, except that you must determine how many threads and processes are to be
used. The Apache documentation explains all the parameters and calculations necessary.
Choosing the values to use involves some trial and error. The most important value is MaxClients. The goal is to
allow enough worker processes or threads to run without causing your server to swap excessively. If more requests
come in than can be handled, then at least those that made it through get service; the others are blocked.
If MaxClients is too high, then all clients experience poor service because the Web server tries to swap out one
process to allow another one to run. Too low a setting means you may deny services unnecessarily. Checking the number
of processes running at high loads and the resulting memory footprint of all the Apache processes gives you a good
idea of how to set this value. If you go over 256 MaxClients, you must also set ServerLimit to the same number; read
the MPM's documentation carefully for the associated caveats.
Tuning the number of servers to start and keep spare depends on the role of the server. If the server runs only
Apache, you can use modest values as shown in Listing 1, because you're able to make full use of the machine. If the
system is shared with a database or other server, then you should limit the number of spare servers being run.
Sidebar: Compiling your own software
When I got started with UNIX®, I insisted on compiling my own software for everything I put on my systems.
Maintaining updates eventually caught up with me, so I learned how to build packages to ease this task. Eventually, I
realized that most of the time I was duplicating the effort the distribution was doing; now, for the most part, I
stick with whatever is provided by my distribution of choice when I can, and roll my own packages when I must.
Similarly, you may find that maintainability of vendor packages outweighs the benefits of going with the latest and
greatest code. Sometimes performance tuning and systems administration have conflicting goals. You may have to
consider vendor support if you're using a commercial Linux or relying on third-party support.
If you strike out on your own, learn how to build packages that work with your distribution and how to integrate them
into your patching system. This will ensure that the software, along with any tweaks you make, is built consistently
and can be used across multiple systems. Also keep on top of software updates by subscribing to the appropriate
mailing lists and Rich Site Summary (RSS) feeds.
Using options and overrides efficiently
Each request that Apache processes goes through a complicated set of rules that dictates any restrictions or special
instructions the Web server must follow. Access to a folder can be restricted by IP address, or a username and
password can be required. These options also include the handling of certain files, such as whether a
directory listing is provided, how certain filetypes are to be handled, or whether the output should be compressed.
These configurations take the form of containers in httpd.conf such as <Directory> to specify that the configuration to
follow refers to a location on disk, or <Location> to indicate that the reference is to a path in the URL. Listing 2 shows
a Directory container in action.
Listing 2. A Directory container being applied to the root directory
<Directory />
AllowOverride None
Options FollowSymLinks
</Directory>
In Listing 2, the configuration enclosed in the <Directory> and </Directory> tags is applied to the given directory
and everything under it (in this case, the root directory). Here, the AllowOverride directive dictates that users
aren't allowed to override any options (more on this later). The FollowSymLinks option is enabled, which lets Apache
look past symlinks to serve the request, even if the file is outside the directory containing Web files. This means
that if a file in your Web directory is a symlink to /etc/passwd, the Web server happily serves the file if asked.
With -FollowSymLinks used instead, this feature is disabled, and the same request causes an error to be returned to
the client.
This last scenario is a cause for concern on two fronts. The first is a performance matter. If FollowSymLinks is
disabled, then Apache must check each component of the filename (directories and the file itself) to make sure they're
not symbolic links. This incurs extra overhead in the form of disk activity. A companion option called
SymLinksIfOwnerMatch follows the symbolic link if the owner of the file is the same as that of the link.
This has the same performance hit as disabling following of symlinks. For best performance, use the options in
Listing 2.
Security-conscious readers should be alert by now. Security is always a trade-off between functionality and risk. In this
case, the functionality is speed, and the risk is allowing unauthorized access to files on the system. One of the
mitigations is that LAMP application servers are generally dedicated to a particular function, and users can't create the
potentially dangerous symbolic links. If it's vital to have symbolic link-checking enabled, you can restrict it to a
particular area of the file system, as in Listing 3.
Listing 3. Restricting FollowSymLinks to a user's directory
<Directory />
Options FollowSymLinks
</Directory>
<Directory /home/*/public_html>
Options -FollowSymLinks
</Directory>
In Listing 3, any public_html directory in a user's home directory has the FollowSymLinks option removed for it
and any child directories.
As you've seen, options can be configured on a per-directory basis through the main server configuration. Users can
override this server configuration themselves (if the administrator permits it through the AllowOverride directive)
by dropping a file called .htaccess into a directory. This file contains additional server directives that are loaded and
followed on each request to the directory where the .htaccess file resides. Despite the earlier discussion about not
having users on the system, many LAMP applications use this functionality to control access and for URL rewriting,
so it's wise to understand how it works.
Even though the AllowOverride directive prevents users from doing anything you don't want them to, Apache
must still look for the .htaccess file to see if there is any work to be done. A parent directory can specify directives that
are to be processed by requests from child directories, which means Apache must also search each component of the
directory tree leading to the requested file. Understandably, this causes a great deal of disk activity on each request.
The easiest solution is to not allow any overrides, which eliminates the need for Apache to check for .htaccess. Any
special configurations are then placed directly in httpd.conf. Listing 4 shows the additions to httpd.conf to enable
password checking for a user's project directory, rather than putting in a .htaccess file and relying on
AllowOverride.
Listing 4. Moving .htaccess configuration into httpd.conf
<Directory /home/user/public_html/project/>
AuthUserFile /home/user/.htpasswd
AuthName "uber secret project"
AuthType basic
Require valid-user
</Directory>
If the configuration is moved into httpd.conf and AllowOverride is set to None, disk usage can be reduced. A user's
project may not attract many hits, but consider how powerful this technique is when applied to a busy site.
Sometimes it's not possible to eliminate use of .htaccess files. In that case, just as an option can be restricted
to a certain part of the file system, overrides can also be scoped, as in Listing 5.
Listing 5. Scoping .htaccess checking
<Directory />
AllowOverride None
</Directory>
<Directory /home/*/public_html>
AllowOverride AuthConfig
</Directory>
After you implement Listing 5, Apache still looks for .htaccess files in the parent directories of a requested file,
but it goes no higher than the public_html directory because the rest of the file system has the functionality
disabled. For example, if a file that maps
to /home/user/public_html/project/notes.html is requested, only the public_html and project directories are searched.
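The effect of scoping can be made concrete by listing the .htaccess files the server would have to look for. The
Python sketch below uses a deliberately simplified model: overrides are assumed to be enabled from one directory
downward and disabled everywhere above it. The function name and the model are illustrative assumptions, not a
description of Apache's internals.

```python
from pathlib import PurePosixPath


def htaccess_candidates(requested_file, override_root="/"):
    """List the .htaccess files a server would look for, in a simplified model.

    Assumption: overrides are enabled from override_root downward and
    disabled everywhere above it.
    """
    root = PurePosixPath(override_root)
    directory = PurePosixPath(requested_file).parent
    candidates = []
    # Walk from the file's directory up to the top of the file system,
    # keeping only directories at or below override_root.
    for d in [directory, *directory.parents]:
        if d == root or root in d.parents:
            candidates.append(str(d / ".htaccess"))
    # Report them top-down.
    return list(reversed(candidates))


path = "/home/user/public_html/project/notes.html"
print(len(htaccess_candidates(path)))                     # overrides allowed everywhere
print(htaccess_candidates(path, "/home/user/public_html"))  # overrides scoped
```

In this model the unscoped case checks five locations for a single request, while the scoped case checks two, which
matches the public_html and project directories named above.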
One final note about per-directory configurations is in order. Any document about tuning Apache will tell you to
disable DNS lookups through the HostnameLookups off directive because trying to reverse-resolve every IP
address connecting to your server is a waste of resources. However, any limitations based on hostname force the Web
server to perform a reverse lookup on the client's IP address and a forward lookup on the result of that to verify the
authenticity of the name. Therefore, it's wise to avoid using access controls based on the client's hostname and to scope
them as described when they're necessary.
Persistent connections
When a client connects to a Web server, it's allowed to issue multiple requests over the same TCP connection, which
reduces the latency associated with multiple connections. This is useful when a Web page refers to several images: The
client can request the page and then all the images over one connection. The downside is that the worker process on the
server has to wait for the session to be closed by the client before it can move on to the next request.
Apache lets you configure how persistent connections, called keepalives, are handled. KeepAlive On at the global
level of httpd.conf enables persistent connections, and MaxKeepAliveRequests 5 allows the server to handle 5 requests
on a connection before forcing the connection closed. Setting MaxKeepAliveRequests to 0 removes the limit, and
KeepAlive Off disables persistent connections altogether. KeepAliveTimeout, also at the global level, determines
how long Apache will wait for another request before closing the session.
Handling persistent connections isn't a one-size-fits-all configuration. Some Web sites fare better with keepalives
disabled (KeepAlive Off), and some experience a tremendous benefit by having them on. The only solution is to try
both and see for yourself. It's advisable, though, to use a low timeout such as 2 seconds with KeepAliveTimeout
2 if you enable keepalives. This ensures that any client wishing to make another request has ample time, and that
worker processes aren't idling while waiting for another request that may never come.
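A back-of-the-envelope calculation shows why the timeout matters. The Python sketch below assumes, as a
simplification, that every connection holds a worker for the time spent serving it plus one full keepalive timeout.
The function name and the load figures are hypothetical.

```python
def workers_tied_up(new_connections_per_second, busy_seconds, keepalive_timeout):
    """Rough steady-state count of workers held by open connections.

    Simplifying assumption: every connection stays open for the time
    spent serving it plus one full keepalive timeout.
    """
    return new_connections_per_second * (busy_seconds + keepalive_timeout)


# Hypothetical load: 50 new connections per second, 0.5 s of real work each.
for timeout in (15, 2, 0):
    print(timeout, workers_tied_up(50, 0.5, timeout))
```

Under these assumptions a 15-second timeout would need several hundred workers just to hold idle connections, more
than the MaxClients value in Listing 1, while a 2-second timeout stays well below it. Real traffic is less uniform,
so measure before settling on a value.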
Compression
The Web server can compress the output before it's sent back to the client. This results in a smaller page being sent
over the Internet at the expense of CPU cycles on the Web server. For those servers that can afford the CPU overhead,
this is an excellent way of making pages download faster — it isn't unheard of for pages to be a third of their size after
compression.
Images are generally already compressed, so compression should be limited to text output. Apache provides
compression through mod_deflate. Although mod_deflate can be simple to turn on, it includes many
complexities that the manual is eager to explain. This article doesn't cover the configuration of compression except to
provide a link to the appropriate documentation (see the Resources section).
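The difference between text and already-compressed data is easy to demonstrate outside Apache. The Python sketch
below uses the standard zlib module, which implements the same deflate family of compression; the sample page is
hypothetical and far more repetitive than a real one, and random bytes stand in for an image that has already been
compressed.

```python
import os
import zlib


def compressed_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical page: repetitive markup, as HTML tends to be.
row = b"<tr><td class='row'>item</td><td>value</td></tr>\n"
html = b"<html><body><table>" + row * 200 + b"</table></body></html>"

# Stand-in for an image: random bytes behave like data that is
# already compressed, leaving nothing for deflate to remove.
image_like = os.urandom(4096)

text_ratio = compressed_ratio(html)
binary_ratio = compressed_ratio(image_like)
print(f"text: {text_ratio:.2f}  image-like: {binary_ratio:.2f}")
```

The text shrinks dramatically while the image-like data does not shrink at all, so the CPU spent compressing it is
wasted. Real pages compress less than this artificial sample.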
Tuning PHP
PHP is the engine that runs the application code. You should install only the modules you plan to use and have your
Web server configured to use PHP only for script files (usually those ending in .php) and not all static files.
Opcode caching
When a PHP script is requested, PHP reads the script and compiles it into what's called Zend opcode, a binary
representation of the code to be executed. This opcode is then executed by the PHP engine and thrown away. An
opcode cache saves this compiled opcode and reuses it the next time the page is called. This saves a considerable
amount of time. Several opcode caches are available; I've had a great deal of success with eAccelerator.
Installing eAccelerator requires the PHP development libraries on your computer. Because different Linux
distributions place files in different places, it's best to get the installation instructions directly from the eAccelerator
Web site (see the Resources section for a link). It's also possible that your distribution has already packaged an opcode
cache, and you just have to install it.
Regardless of how you get eAccelerator on your system, there are a few configuration options to look at. The
configuration file is usually /etc/php.d/eaccelerator.ini. eaccelerator.shm_size defines the size of the shared
memory cache, which is where the compiled scripts are stored. The value is in megabytes. Determining the proper size
depends on your application. eAccelerator provides a script to show the status of the cache, which includes the
memory usage; 64 megabytes is a good start (eaccelerator.shm_size="64"). You may also have to tweak
your kernel's maximum shared memory size if the value you choose isn't accepted. Add
kernel.shmmax=67108864 to /etc/sysctl.conf, and run sysctl -p to make the setting take effect. The value
for kernel.shmmax is in bytes.
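Because one setting is in megabytes and the other in bytes, it's easy to get the conversion wrong. The short Python
sketch below derives both values from a single cache size; the helper name is hypothetical.

```python
def shm_settings(cache_mb):
    """Return matching eAccelerator and kernel values for a cache size in MB."""
    shm_bytes = cache_mb * 1024 * 1024  # kernel.shmmax is expressed in bytes
    return {
        "eaccelerator.shm_size": str(cache_mb),  # expressed in megabytes
        "kernel.shmmax": shm_bytes,
    }


print(shm_settings(64))
```

For a 64-megabyte cache this yields the 67108864 bytes used above.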
If the shared memory allocation is exceeded, eAccelerator must purge old scripts from memory. By default, this is
disabled; eaccelerator.shm_ttl = "60" specifies that when eAccelerator runs out of shared memory, any
script that hasn't been accessed in 60 seconds should be purged.
Another popular alternative to eAccelerator is the Alternative PHP Cache (APC). The makers of Zend also have a
commercial opcode cache that includes an optimizer to further increase efficiency.
php.ini
You configure PHP in php.ini. Four important settings control how much of the system's resources PHP can consume, as
listed in Table 1.
Table 1. Resource related settings in php.ini

Setting              Description                                                        Recommended value
max_execution_time   How many CPU-seconds a script can consume                          30
max_input_time       How long (seconds) a script can wait for input data                60
memory_limit         How much memory (bytes) a script can consume before being killed   32M
output_buffering     How much data (bytes) to buffer before sending out to the client   4096
These numbers depend mostly on your application. If you accept large files from users, then max_input_time may
have to be increased, either in php.ini or by overriding it in code. Similarly, a CPU- or memory-heavy program may
need larger settings. The purpose is to mitigate the effect of a runaway program, so disabling these settings globally
isn't recommended. Another note on max_execution_time: This refers to the CPU time of the process, not the
absolute time. Thus a program that does lots of I/O and few calculations may run for much longer than
max_execution_time. It's also how max_input_time can be greater than max_execution_time.
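The gap between CPU time and elapsed time is not specific to PHP. The Python sketch below measures both for a task
that only waits, with a sleep standing in for a script blocked on I/O; the helper name is hypothetical.

```python
import time


def measure(task):
    """Return (cpu_seconds, elapsed_seconds) spent running task()."""
    cpu_start = time.process_time()    # CPU time used by this process
    wall_start = time.perf_counter()   # elapsed real time
    task()
    return time.process_time() - cpu_start, time.perf_counter() - wall_start


# Sleeping stands in for a script that is blocked on I/O.
cpu, elapsed = measure(lambda: time.sleep(0.5))
print(f"cpu={cpu:.3f}s elapsed={elapsed:.3f}s")
```

The elapsed time is about half a second while the CPU time is close to zero, which is why a limit based on CPU time
does little to stop a script that spends most of its life waiting.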
The amount of logging that PHP can do is configurable. In a production environment, disabling all but the most critical
logs saves disk writes. If logs are needed to troubleshoot a problem, you can turn up logging as needed.
error_reporting = E_COMPILE_ERROR|E_ERROR|E_CORE_ERROR turns on enough logging to spot
problems but eliminates a lot of chatter from scripts.
Summary
This article focused on tuning the Web server, both Apache and PHP. With Apache, the general idea is to eliminate
extra checks the Web server must do, such as processing the .htaccess file. You must also tune the Multi-Processing
Module you're using to balance the system resources used with the availability of idle workers for incoming requests.
The best thing you can do for PHP is to install an opcode cache. Keeping your eye on a few resource settings also
ensures that scripts don't hog resources and make the system slow for everyone else.
The next and final article in this series will look at tuning the MySQL database. Stay tuned!
Resources
Learn
"Quantify performance changes using application tracing" (developerWorks, July 2006) shows how to use
application tracing to show the effect of configuration changes on Apache.
"Using the new memory manager" (developerWorks, March 2007) covers the latest changes to PHP 5.2's
handling of memory. PHP is constantly refining its use of system resources.
mod_deflate is an Apache module that compresses output on the fly. This can also be done in PHP through
output compression.
Pre-caching compressed static files such as JavaScript code and CSS is another way to improve performance.
Compressing and concatenating all your JavaScript code and CSS is even better.
The Apache documentation on Multi-Processing Modules is worth reading to learn about the functionality of
each; follow the links to the specific documentation for the MPM you choose.
In the developerWorks Linux zone, find more resources for Linux developers.
Stay current with developerWorks technical events and Webcasts.
Get products and technologies

If your distribution doesn't include eAccelerator, the Install From Source instructions will be helpful.
The Alternative PHP Cache and Zend Platform are alternatives to eAccelerator.
Siege lets you simulate users, so you can find out how much traffic your site can handle.
Sooner or later you're going to want to cache certain elements of your site and distribute load across multiple
Web servers. Squid in accelerator mode (also known as a reverse proxy) or the Linux Virtual Server Project are
excellent tools.
Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®,
Lotus®, Rational®, Tivoli®, and WebSphere®.
With IBM trial software, available for download directly from developerWorks, build your next development
project on Linux.
Discuss
Check out developerWorks blogs and get involved in the developerWorks community.
About the author
Sean Walberg has been working with Linux and UNIX since 1994 in academic, corporate, and Internet
service provider environments. He has written extensively about systems administration over the past
several years.
DB2, Lotus, Rational, Tivoli, and WebSphere are trademarks of IBM Corporation in the United States, other countries, or
both. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. UNIX is a registered trademark
of The Open Group in the United States and other countries. Other company, product, or service names may be
trademarks or service marks of others.