A lot has changed since I last blogged about making Epic Frontiers, so I decided to address something that many developers have a hard time with: server architectures. Server architecture is more of an IT skill than a game design skill, but with the proliferation of social gaming and online-enabled gaming in general, it’s something that designers need to be well aware of. As a basis for discussion, I’ll talk about server architecture as it relates to one of my own projects, an MMORPG demo named Epic Frontiers.
Now, Epic Frontiers was based on a T3D server which connected to a MySQL database directly, serving all of the information to the client machine through the game server. It’s an old architecture that looks like so (I’m omitting Authentication and Master Servers because they are mainly involved only in the beginning of the game interaction cycle):
So basic that it's not secure...
That’s a bad network design for a couple of reasons:
- Security: First and foremost, if that T3D server gets compromised, the hacker has a direct connection to the database, and you’re pretty much dead in the water. It’s as simple as that.
- Server Load: The T3D server is already doing so much for the game world that adding database queries to the list of things to do is asking a bit much. Throw 200 players onto a T3D server, and with all the crafting, trading, and combat, the server might start sparing a CPU cycle or two towards plotting to murder you in your sleep. Also, if you have any sort of scaling system in place, your server is probably going to take advantage of it earlier than you might want it to, translating into higher costs of running the game.
- Complexity: There are easier ways to get your data than assembling SQL statements in TorqueScript, and mixing database calls and their bugs with gameplay code just makes your debugging that much more complex.
So, “a couple” means three to me. It actually means two, but being a programmer, I’m counting from 0, so there you go… Anyway, going by the above, there was an iteration of server architecture that solved the above issues (two and a half of them, anyway) by placing a web-server between the T3D server and the database server. What this did was make the T3D server ask a web server to fetch the data, which it then passed to the player. The network architecture looked like so:
Better, but still very basic...
It’s a more secure design: a hacker who got into the server running T3D would then need to hack into the web server, which sits between the T3D server and the database server. They could instead go directly for the database server, but at least the password wouldn’t be sitting inside a T3D instance, and you could enforce very strict security rules on the database server that would force them to compromise the web server first or else get very creative.
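To make the idea of the web tier concrete, here is a minimal sketch in Python of what "ask the web server, not the database" can look like. It is an illustration under assumptions, not the Epic Frontiers code: the `inventory` table, the `/inventory` path, and the function names are all hypothetical, and SQLite stands in for MySQL so the example is self-contained. The point is that the game server can only ask a short list of named questions, and the SQL (with bound parameters) lives entirely on the web tier.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with a hypothetical inventory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (player_id INTEGER, item TEXT, qty INTEGER)")
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        [(1, "hammer", 1), (1, "iron_ore", 4), (2, "rope", 2)],
    )
    conn.commit()
    return conn


def get_inventory(conn, player_id):
    """Answer the one inventory question the game server is allowed to ask."""
    if not isinstance(player_id, int) or isinstance(player_id, bool) or player_id <= 0:
        raise ValueError("player_id must be a positive integer")
    # Bound parameter: the caller's value is never spliced into the SQL text.
    rows = conn.execute(
        "SELECT item, qty FROM inventory WHERE player_id = ? ORDER BY item",
        (player_id,),
    ).fetchall()
    return {item: qty for item, qty in rows}


def handle_request(conn, path, params):
    """Route a request to a whitelisted operation; refuse everything else."""
    if path == "/inventory":
        try:
            return 200, get_inventory(conn, int(params.get("player_id", "")))
        except ValueError:
            return 400, {"error": "bad player_id"}
    return 404, {"error": "unknown endpoint"}
```

With this shape, a compromised game server can still call `/inventory`, but it has no credentials and no way to send arbitrary SQL. In a real deployment `handle_request` would sit behind whatever web framework you prefer.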
Secondly, you’ve taken load off of the server by using things like TCPObject and HTTPObject to ask a web server for data, which is then fetched by that web server. T3D just uses the data whenever it comes in, and issues in grabbing that data won’t impact the entire server (unless the data relates to the operation of the entire server). So that will translate into lower operating costs over time, and higher performance of the server.
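The "use the data whenever it comes in" pattern is easier to see in code. The sketch below is a Python stand-in, not T3D's TCPObject or HTTPObject API: `slow_fetch_inventory` is a hypothetical placeholder for the HTTP round trip, and the loop is a toy version of a server tick. It shows the simulation continuing to tick while the request is in flight.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch_inventory(player_id):
    """Stand-in for an HTTP round trip to the web server (hypothetical)."""
    time.sleep(0.2)
    return {"player_id": player_id, "items": ["hammer", "iron_ore"]}


def run_ticks(player_id, max_ticks=1000, tick_seconds=0.01):
    """Run a toy game loop that keeps ticking while a fetch is in flight."""
    ticks_while_waiting = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_fetch_inventory, player_id)
        for _ in range(max_ticks):
            if pending.done():
                # The data arrived; hand it to whoever asked for it.
                return ticks_while_waiting, pending.result()
            # Simulation work (movement, physics, AI) would happen here.
            ticks_while_waiting += 1
            time.sleep(tick_seconds)
    raise TimeoutError("fetch did not complete within max_ticks")
```

Calling `run_ticks(7)` returns the number of ticks that ran while the request was outstanding, along with the fetched data. A slow or failed fetch delays only the one player who asked, not the whole zone.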
And second-and-a-halfly: Complexity. This is only a half-point, because T3D is still making all of the database calls on behalf of the player, not to mention all of the calculations for some of these things. Want to craft something? Your request goes to the T3D server, which then queries the database, and then uses that information to figure out if you’ve successfully crafted your thing, and then it notifies you. Hurray- you’ve just made T3D do something that you can offload to an app that was made specifically for the task!
That’s right- what are you doing asking T3D to mediate crafting, which (depending on your gameplay mechanics, but I’m assuming your run-of-the-mill MMO crafting here) doesn’t really happen in the game world? Think about it: When a player crafts, their client plays an animation while in the background, the game server tries to figure out the crafting mechanics. That can be done either in TorqueScript or in code, but either way, it’s being done by T3D, which has lots of other things to worry about, like positions of players, physics (if you’re using it), AI, etc. But crafting doesn’t have to happen within the T3D server to happen in the game, does it? I mean, we’re already having T3D ask a web server for which items are in a player’s backpack when they click on their inventory, right? Wouldn’t it be easier to offload crafting to an application server running a crafting app which is optimized to just figure this stuff out? Can’t crafting queries be directed to that application server using TCPObjects and/or HTTPObjects in the same way that database queries can?
That network architecture would look like this:
More robust architectures are...more robust...
What we’re doing here is implementing application servers that take the workload for non-realtime gameplay mechanics off of T3D’s already burdened back, and turn it into a set of queries which T3D fires off to be done, relaying the results once that’s done. As you can see, this requires a server for that application, and so in a scaling scenario, you’re probably going to scale your crafting server at a different rate than that of your T3D servers, because a single crafting server can almost certainly handle many T3D servers’ queries. And your T3D servers will run much better, and you’ll probably fit a few more players into your zones as a result.
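As a rough illustration of what such a crafting application might compute, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `RECIPES` table, the skill-plus-roll rule, and the field names are hypothetical, not the Epic Frontiers mechanics. The design point is that the whole decision is a plain function of the request, so it can run on any box that is not also simulating a zone.

```python
import random

# Hypothetical recipe table; a real service would load this from the database.
RECIPES = {
    "iron_sword": {"materials": {"iron_ingot": 3, "leather_strip": 1}, "difficulty": 40},
}


def resolve_craft(recipe_id, inventory, skill, roll=None):
    """Decide a craft attempt and return the outcome plus the updated inventory.

    The caller's inventory dict is never mutated, and materials are consumed
    only when the attempt is actually resolved.
    """
    recipe = RECIPES.get(recipe_id)
    if recipe is None:
        return {"status": "unknown_recipe", "inventory": dict(inventory)}
    missing = {
        item: qty - inventory.get(item, 0)
        for item, qty in recipe["materials"].items()
        if inventory.get(item, 0) < qty
    }
    if missing:
        return {"status": "missing_materials", "missing": missing, "inventory": dict(inventory)}
    if roll is None:
        roll = random.randint(1, 100)  # injectable so tests can be deterministic
    updated = dict(inventory)
    for item, qty in recipe["materials"].items():
        updated[item] -= qty
    success = roll + skill >= recipe["difficulty"]
    if success:
        updated[recipe_id] = updated.get(recipe_id, 0) + 1
    return {"status": "success" if success else "failure", "inventory": updated}
```

The T3D server would send the recipe, the player's relevant inventory, and their skill, then relay the returned status to the client once the animation finishes.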
Going further (but not by much, since it’s been mentioned several times in years past by others), you can offload your chat functions to a specific chat server (or IRC server, for that matter), and any other functionality which doesn’t need realtime feedback. Offload combat? Probably not- unless it’s some kind of turn-based mechanic which can stand waiting a second or so for the player to get a result without the AI eating him alive during that time. For Epic Frontiers, I have an NPC conversation feature that is quite logic and database intensive. Offloading that would take a huge load off of the server. The same goes for some of the AI functionality which, while relatively fast, is not always needed for realtime use (the NPCs aren’t the only things with AI in the game). Offloading it means not only does the server not have to process that itself, but also that I can devote a few more cycles to it.
That’s all the good news! Now, because I can’t be honest without doing this, here’s the bad news:
- Network Complexity: The network will be more complex as you have more servers to manage. If an application server goes down and you do not have some kind of load-balancing in place, you’re in for a world of hurt as portions of the game simply no longer work (“every time I try to craft I lose a hammer? OMGWTFBBQ!!!11!11!one1!”). So if you’re going to go down this road, you need to have both scaling and load-balancing in place to ensure that these portions of your game service run as smoothly as your T3D instances. Additional servers also mean additional cost, though as a Crafting Server would likely need fewer resources to handle crafting queries from multiple T3D servers, this would not be as big a hit as having to spawn more T3D servers because the crafting slows them down.
You can expand these techniques out to design the network how you need so that it performs well...
- Design Complexity: Yes, I did say previously that your T3D code would be simplified because the work previously done by T3D basically gets reduced to web queries. But you can’t get anything back from those web queries unless there’s something doing something with that query and returning information- and that requires writing specialized applications that do nothing but crunch that data. Just as you can implement a Master Server using PHP, ASP, AS3, C/C++/C#, or anything else, so can you implement a Crafting Server in any of those languages (though I’d recommend a low-level language, for performance reasons). This means that changes to a query function in your Crafting Server may dictate changes to the query formation in T3D. A bug in T3D may mean changes to Crafting Server code. These two applications need to talk to each other (securely- even on the back end!), and that means you need to design them with each other in mind.
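One way to keep the two applications honest with each other is to agree on a small, versioned message format and to sign every message, even on the back end. The Python sketch below is one possible approach, not a prescription: the `version` field, the message layout, and the shared key are all hypothetical, and HMAC-SHA256 is simply a common choice for authenticating messages between services that share a secret.

```python
import hashlib
import hmac
import json

PROTOCOL_VERSION = 1  # hypothetical; bump when the request shape changes
SHARED_KEY = b"replace-with-a-real-secret"  # placeholder for illustration only


def sign_request(payload, key=SHARED_KEY):
    """Serialize a request and attach an HMAC-SHA256 tag over the exact bytes."""
    body = json.dumps({"version": PROTOCOL_VERSION, **payload}, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_request(message, key=SHARED_KEY):
    """Return the decoded payload, or raise ValueError if it cannot be trusted."""
    body = message["body"]
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    if not hmac.compare_digest(expected, message["tag"]):
        raise ValueError("bad signature")
    payload = json.loads(body)
    if payload.get("version") != PROTOCOL_VERSION:
        raise ValueError("unsupported protocol version")
    return payload
```

The version check turns "someone changed the query format on one side" into an immediate, obvious error instead of a subtle gameplay bug, and the signature means a request that did not come from one of your own servers is rejected before any crafting logic runs.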
Now, I realize that this is quite a bit of food for thought for some of you Indies who want to make an Online (MMO or otherwise) game and already find the task daunting, but just remember that in this case, the crafting code is already needed, and the additional overhead of wrapping that code in its own application and spinning up a server for it is minimal in comparison to going further down the road and trying to wring more performance out of a server that is already doing too much.
Designers are creative people, and Indie designers have to be both creative and technical in order to be viable. Be creative with technology, leverage flexibility, and automate as much as you safely and responsibly can. And of course, the above is not at all the final word on network design for online games, and as someone who has spent 13 years in IT can tell you, network architectures are custom things. Customize your network to serve your needs!