Working Set Size
This is the set of data a system needs to address during normal operation. A complex system will have many distinct working sets, but one or two usually dominate. In stream-like apps such as email or a news feed, the working set can be much smaller than the total set. People rarely access messages more than a few weeks old; they might as well be considered a different system. It’s most useful to think in probability bands: Over a given period of time, what is the probability of various pieces of data being used? For the initial analysis, you can focus on the rough size of the working set, as opposed to the detailed characteristics. However, those details often come back to bite you.
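One way to make the probability bands concrete is to bucket accesses by the age of the data and find the smallest cutoff that covers most reads. Here is a minimal sketch in Python for an email-like workload; the bucket boundaries and probabilities are illustrative assumptions, not measurements:

```python
# Hypothetical access histogram: fraction of reads that hit data of at
# most a given age. These numbers are invented for illustration.
access_prob_by_age_days = {7: 0.80, 30: 0.15, 365: 0.04, 3650: 0.01}

def working_set_cutoff(prob_by_age, coverage=0.99):
    """Smallest age cutoff (days) whose data serves `coverage` of accesses."""
    total = 0.0
    for age in sorted(prob_by_age):
        total += prob_by_age[age]
        if total >= coverage:
            return age
    return max(prob_by_age)

print(working_set_cutoff(access_prob_by_age_days))  # 365
```

With these numbers, data older than a year could "might as well be a different system": keeping one year of data online serves 99 percent of reads.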
Average Transaction Size
This can be thought of as the working set of a single transaction performed by the system. How much data does the system have to touch in order to serve a transaction? Downloading a photo and running a Web search involve similar-sized answers sent to the client. However, the amounts of data touched in the background are very different. Note that we’re using the word “transaction” to mean a distinct piece of work. This idea equally applies to big analytical jobs.
Request Rate

How many transactions are expected per hour/minute/second? Is there a peak hour, or is demand steady? In a search engine, you may have five to 10 queries per user over a period of minutes. An online ebook reader might see constant but low volumes of traffic. A game may require multiple transactions per second per user. In short, consider the expected throughput. The combination of throughput and transaction size governs most of the total data flow of the system.
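That combination is a one-line multiplication, but it is worth writing down, because the answer is often surprisingly large. A sketch with hypothetical numbers; the peak rate and bytes touched per request are assumptions, not measurements:

```python
def data_flow_bytes_per_sec(requests_per_sec, bytes_touched_per_request):
    """Throughput times transaction size: total bytes touched per second."""
    return requests_per_sec * bytes_touched_per_request

# Hypothetical workload: 5,000 requests/s at peak, each touching 2 MiB
# of data in the background (indexes, caches, storage).
flow = data_flow_bytes_per_sec(5_000, 2 * 1024 * 1024)
print(flow / 2**30)  # roughly 9.8 GiB/s of internal data movement
```

A system serving kilobyte-sized answers can still be moving gigabytes per second internally, which is why transaction size matters as much as request rate.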
Update Rate

This is a measure of how often data is added, deleted and edited. An email system has a high add rate, a low deletion rate and an almost-zero edit rate. An ad auction use case has ridiculously high rates for all three. A useful way to gauge how much to worry about the update rate is to compare it to the read throughput. The growth rate of the data also ties into the working set size or retention policy. A 0.1 percent daily growth rate implies a three-year retention (365 times 3 is about 1,000 days), and vice versa. A 1 percent daily rate implies 100 days.
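The retention arithmetic above is just the reciprocal of the daily growth rate, which makes it easy to sanity-check either number from the other:

```python
def implied_retention_days(daily_growth_fraction):
    """At steady state, retention is roughly 1 / (daily growth rate)."""
    return 1.0 / daily_growth_fraction

print(implied_retention_days(0.001))  # ~1,000 days: about three years
print(implied_retention_days(0.01))   # ~100 days
```

If the stated retention policy and the observed growth rate disagree with this, one of them is wrong, or the data set has not yet reached steady state.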
Consistency

How quickly does an update have to spread through the system? For a keyword advertising bid, a few minutes might be acceptable. Stock trading systems have to reconcile in milliseconds. A comments system is generally expected to show new comments within a second or two, with frantic work backstage to provide the illusion of immediacy to the commenter. Consistency is a critical factor if the update rate is a significant portion of the request rate. It is also critical if propagating updates is especially important to the business, e.g., account sign-ups or price and inventory changes.
Locality

What portion of the working set does one request need access to? How is that portion defined? What is the overlap between requests? On one extreme you have search engines: a user might want to query bits from anywhere in your system. In an email application, the user is guaranteed to access only their own inbox, a tiny, well-defined slice of the whole. In another instance, you may have deduplicated storage for email attachments, leaving you prey to hot spots.
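The overlap between requests can be estimated from access logs by comparing the key sets that requests touch. A minimal sketch, using the Jaccard index as one reasonable overlap measure; the key names are invented:

```python
def overlap_fraction(keys_a, keys_b):
    """Jaccard overlap: shared keys as a fraction of all keys touched."""
    a, b = set(keys_a), set(keys_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Two search-like requests touching scattered, partly shared data:
print(overlap_fraction(["doc1", "doc2", "doc3"],
                       ["doc2", "doc3", "doc4"]))  # 0.5
```

High overlap suggests caching will pay off; near-zero overlap (the email-inbox case) suggests partitioning by user instead.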
Latency

How quickly are transactions supposed to return success or failure? Users seem to be okay with a flight search or a credit card transaction taking several seconds. A Web search has to return within a few hundred milliseconds. An API that outside systems depend on should return in 100 milliseconds or less. It's also important to think about the variance. It's arguably worse to answer 90 percent of queries in 0.1 seconds and the rest in 2 seconds than to answer all requests in 0.2 seconds.
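The variance point is easy to check with expected values. This sketch compares the two distributions from the text; the 90/10 split and the times come straight from the paragraph above:

```python
def mean_latency(distribution):
    """Expected latency from (probability, seconds) pairs."""
    return sum(p * s for p, s in distribution)

bimodal = [(0.90, 0.1), (0.10, 2.0)]  # 90% fast, 10% very slow
steady = [(1.00, 0.2)]                # every request in 0.2 s

print(mean_latency(bimodal))  # ~0.29 s on average, plus a painful tail
print(mean_latency(steady))   # 0.2 s, with no surprises
```

The bimodal system is worse on average and its 99th percentile is 2 seconds, which is what users and dependent systems actually feel.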