Friday, November 17, 2006

An improvement to Serendipity.

Posted by Matthew Groeninger at 17:10
Tuesday, while I was goofing off trying to “load test” my server, I had a small epiphany. I realized that the Serendipity blog system (or s9y as it is sometimes called) was not taking full advantage of MySQL server caching because it was submitting actual current timestamps for each query. This results in MySQL treating each new query as a brand new query.
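To make the problem concrete, here is a minimal Python sketch of the idea. It is not Serendipity's code and not the patch itself. MySQL's query cache only returns a cached result when the incoming query text is identical to an earlier one, so a timestamp that changes every second means the text never repeats. The table name, the column names, and the 300-second window are all assumptions made up for illustration.

```python
import math


def cache_friendly_timestamp(now: float, window_seconds: int = 300) -> int:
    """Round a Unix timestamp down to the start of its window.

    The 300-second default is an assumption for illustration only.
    """
    return int(math.floor(now / window_seconds) * window_seconds)


def build_query(timestamp: int) -> str:
    # Hypothetical query shape; the real Serendipity SQL is not shown in the post.
    return f"SELECT id, title FROM entries WHERE timestamp <= {timestamp}"


# Two requests arriving 42 seconds apart inside the same 300-second window.
first = 1163783400.0
second = first + 42

# Exact timestamps: every request produces different query text.
exact = {build_query(int(t)) for t in (first, second)}

# Rounded timestamps: both requests produce the same query text.
rounded = {build_query(cache_friendly_timestamp(t)) for t in (first, second)}

print(len(exact), "distinct queries with exact timestamps")
print(len(rounded), "distinct query with rounded timestamps")
```

The trade-off is that a newly published entry may not show up until the current window ends, so the window size is a balance between freshness and cache hits.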
So I whipped up a patch which used floor() to round the timestamp down so that the queries would be cached for a period of time and sent it to Garvin. My patch was poorly constructed and had all kinds of problems, and I wasn’t really sure where it should sit in the overall scheme of Serendipity. Garvin didn’t use my patch. However, he did use the idea of my patch, and he gave me tons of credit. Overall, I think it is a nifty idea because it improves performance when the server is under a heavy load, which is not an easy thing to do. But how much does it improve performance?

Benchmarks

Of course, the only way to answer that is to benchmark a system. Unfortunately, benchmarking a system is a dark art and it has many pitfalls and problems. In my effort to quantify the performance increase, I have made no effort to avoid any of them. In fact, I drove over many of those pitfalls as quickly as I possibly could. So, what follows is an absolutely terrible discussion of benchmarking Serendipity v1.03 and the v1.1 snapshot from November 15th, 2006.

Qualifiers for this data

So let’s start with a brief discussion of why it is so terrible.
And finally, I will not discuss response times. Not gonna do it. The spreadsheet listed under “Results” offers some connection time data if one is interested, but I’m not going to even look at it. I don’t believe it is likely to be very accurate, given that it is measured in milliseconds and it faces all the problems listed above. Response times can be inferred from the graphs below, but they are not likely to be accurate. If you want to try to infer response times, I believe the best you are going to do is “O.K.”, “slow”, and “slower”.

Tests

Ok, so what tests did I do? Well, as I mentioned I used http_load from Acme Software. It makes connections to the web server and times the responses. I used two versions of Serendipity, v1.03 and the v1.1 SVN snapshot (revision 1506). That version of Serendipity v1.1 is post “beta5” but pre-release. I used a base install with the default template. I did not enable persistent connections to the database. In fact, all settings were default.

I did not attempt to simulate a real web client. Instead the client tool only requested the URL of the installed directory. (This is kind of important since Serendipity loads the framework to generate a CSS file. This is generally seen as somewhat of a weakness, I believe, but I have a plugin somewhere that makes pretty cool use of it. Anyway, I do not believe the CSS hook makes use of the database, so I don’t see this as a huge concern.)

I ran two sets of tests on each install. First, I ran a test which made a specific number of connections at a time until a fixed total number of requests had been made, using a variant of the command:

    # ./http_load -parallel N -fetches X urlfile

N is the number of requests made at one time, while X is the total number of requests. The “urlfile” contained a single URL to the installed directory. The combinations used for N and X with a fixed number of requests are shown in Table 1.

Table 1
Table 2
(See what I mean about not tested on a linear scale? It’s ok, though. I don’t use those 10/100 tests in the graphs below! Everything’s better now.)

Now if I understand this tool correctly, this test will make N requests at a time and wait until it receives all the responses (or times out) before making more requests. It will do this until it has made X requests. So this test is totally unrealistic. Nobody waits for a web page. If the page doesn’t come up pretty much immediately they just give up and move on. So why do this test? Well, because it is easy to type. Oh, and it tells us something about how Serendipity responds to large numbers of requests at one time and how quickly it recovers (if recovery is required).

I ran this test on an install with an empty dataset, with a copy of my live dataset (which is 286 entries... Man, you’d think I’d have written more in two years?), and then with the dataset and the Simple Cache plugin enabled. (The Simple Cache plugin uses PEAR Cache_Lite to create file-based caches of the web page. It should keep database access to a minimum.)

Then I ran a test which made a fixed number of requests per second for a fixed amount of time:

    # ./http_load -rate N -seconds X urlfile

N is the number of requests made per second, while X is the total number of seconds to make requests. The combinations used for requests in a fixed time period are shown in Table 2. I only ran these tests on the installations with the dataset and with the Simple Cache plugin installed.

So, if I can read my own typing, this test will open N requests per second for X seconds. A more realistic test? Not really. But it will tell us how well Serendipity does under a (short) load that isn’t nice enough to care about how well the server is doing. You know, kinda like users... Oh, those horrible users.

Set up

Ok, with me so far? Now, a word about the server set up. The server is a dual-processor 733 MHz Dell something or other.
(PowerEdge 2400 sounds right.) It is running the web server (Lighttpd 1.4.x), PHP (4.x using fcgi), and MySQL 5.0. PHP is also running eAccelerator. The operating system is OpenBSD 3.9. Each of those applications can be configured to do amazing things. I have not configured them that way. Please do not assume that I am an idiot, though. The server is built so that MySQL has plenty of room to breathe and so that the web server does everything it can to prevent MySQL from crashing. That means the pool of fcgi handlers is relatively small, and the web server should start dropping connections before MySQL melts. That being said, I am sure there are instances where the whole darn thing will just walk away. For those instances I have an amazing “stability through obscurity” plan.

Results

Ok, so let’s talk about data. Here are my results (in a text file dump of each command result) and here is a summary (in a silly little Excel spreadsheet). I am not going to do any real statistical analysis on the data, because it is such a small sample size. I will try to interpret the data broadly. And mostly with cheesy Excel graphs.

[Figures: “Fetches per second” and “Time to completion”]

The first graph looks at the fetches per second for a fixed number of requests and it looks pretty good. It actually looks a lot better than I believed it would. I am slightly suspicious of it. I am slightly suspicious of nearly everything, though. Anything I am not slightly suspicious of makes me deeply suspicious.

So, it appears that running Serendipity without content actually works pretty well, giving about ten fetches per second for version 1.03 and eleven to twelve fetches per second for Serendipity v1.1 rev 1506. Unfortunately, most users want some data in their blog, so I have to look at what happens when Serendipity uses a dataset.
For Serendipity v1.1 rev 1506 I saw a decent five to six fetches per second, but Serendipity v1.03 dropped down to a poor one fetch per second and refused to budge much above that.

I should note that the last test on Serendipity v1.03, which used 100 parallel connections, offered funky results. I have excluded these from the data, even though that is bad practice. I really should retest those results. I doubt I am going to. What kind of funky results? http_load would offer negative numbers or really large numbers, and, based on the percentage of CPU MySQL used for 10 minutes after the end of the test, I believe MySQL was failing to respond to the fcgi process quickly enough for the results to make sense. The other evidence that supports this belief is the increase in the number of failures that occurred during the 50 parallel connection test. In previous tests I had seen either full or near-full success (HTTP response code 200). In the test of 1000 connections with 50 in parallel on Serendipity v1.03 I saw nearly a quarter of the connections result in an error (HTTP response code 500). I believe it is safe to say that the 100 parallel connection test on the same install likely resulted in more failures.

Lastly, the Simple Cache plugin actually does have a significant effect, taking both versions of Serendipity back at least to the levels seen without data in the database.

Now if I can direct your attention to the “Time to Completion” graph you will get a different view of the exact same results. This view tells us that you are going to be waiting a lot longer for one thousand patient people to read a page on Serendipity v1.03, but that you won’t have to wait as long if you use Serendipity v1.1 rev 1506. And obviously the Simple Cache plugin helps quite a bit. (I do understand that the graphs are directly related. I used a fixed number of connections (1000) for each test. I measured how long it takes to complete the test.
I can divide the number of connections by the time it takes and I will have fetches per second. The second graph is just a nice illustration of how dependent on MySQL Serendipity v1.03 is.)

[Figure: Fetches per Second in 60 Second Connection Test]

The results of the 60 second connection test were perhaps more dramatic, since they show how well the patch allows Serendipity v1.1 rev 1506 to scale compared to Serendipity v1.03. If you would like to see it expressed in total fetches, there is a version of the graph for that too, but it is identical except for the scale on the axis (oh, and the labels are different, too). That nice linear scaling for both versions with the Simple Cache plugin is what you would expect to see if the application were not hitting a bottleneck (in this case, MySQL). Interestingly, the ability of Serendipity v1.1 rev 1506 to use the MySQL cache actually allows it to keep up with the versions running Simple Cache (at least through to five connections per second) while Serendipity v1.03 is limited to roughly one fetch per second no matter how many requests it is receiving at one time.

Conclusion

The patch works pretty well and Serendipity v1.1 should offer better performance right out of the box. From the data in the first test we can tell that it is able to offer about five fetches per second, and the data in the second test implies that this should work for sustained requests (at least for up to a minute). The tests also show that, on my hardware, Serendipity v1.03 is bound to about one fetch per second, and anything above that will become queued. It appears that this will not actually cause timeouts until the number of requests reaches somewhere between twenty and fifty per second, but I doubt the page response will be quick enough for any casual user to wait for it. Note that sustained requests of that magnitude are likely to bring down the application.

The results for the Simple Cache plugin are nice to know.
Even using Serendipity v1.03 the Simple Cache plugin offers enormous benefit if you are able to use it, allowing the application to scale to at least ten requests per second (see the spreadsheet for two tests with 10 parallel connections). Meanwhile, Serendipity v1.1, when using the Simple Cache plugin, should be able to reach twelve to fourteen fetches per second, if my data isn’t too far off the mark.

So what can I say? I can say that I think Serendipity v1.1 will have better performance than Serendipity v1.03 at the “high load” end of the market, and that using Simple Cache is likely to help out even more, if you can use it. Pretty cool stuff.
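For anyone who wants to redo the arithmetic behind the fixed-request graphs and the queueing remark in the conclusion, here is a rough Python sketch. The function names are made up for illustration. The rates are the rounded figures quoted above (about one and about five fetches per second with the dataset), not new measurements. The backlog model assumes nothing ever times out, which is a simplification.

```python
def fetches_per_second(total_fetches: int, elapsed_seconds: float) -> float:
    """Throughput for a fixed-size test: fetches divided by wall-clock time."""
    return total_fetches / elapsed_seconds


def time_to_completion(total_fetches: int, rate: float) -> float:
    """Inverse view of the same number: seconds needed at a given rate."""
    return total_fetches / rate


def backlog_after(arrival_rate: float, service_rate: float, seconds: float) -> float:
    """Requests still waiting after `seconds`, assuming nothing times out."""
    return max(0.0, (arrival_rate - service_rate) * seconds)


# Rounded rates quoted in the post; illustrative inputs, not new measurements.
V103_RATE = 1.0  # about one fetch per second with the dataset
V11_RATE = 5.0   # about five fetches per second with the dataset

for label, rate in (("v1.03", V103_RATE), ("v1.1", V11_RATE)):
    seconds = time_to_completion(1000, rate)
    queued = backlog_after(5.0, rate, 60)
    print(f"{label}: 1000 fetches take ~{seconds:.0f}s; "
          f"backlog after 60s at 5 req/s: {queued:.0f}")
```

With these rounded inputs, a 1000-fetch run takes roughly five times longer on v1.03 than on v1.1, and at five requests per second only v1.03 builds up a queue.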