
#1 (permalink) |
|
DICE
Join Date: Jan 2010
Posts: 275
|
What follows is a description of how the stats backend functions for BFBC2, what happens during high load, and what we are doing to resolve it. Consider it a peek 'under the hood' of BFBC2.
System overview
When playing online, all game clients and game servers are permanently connected to the game's backend servers. There is a separate backend for each of the PC/PS3/360 versions of BFBC2. A backend is split into two portions: a group of machines which run some custom software, and a database. The database is not directly accessible by game clients/servers; they can only reach it by sending requests to the custom software portion, which in turn talks to the database. Each database is a cluster of machines which run Oracle 9i with RAC enabled.

There are a few modules in the backend, and a few tables in the database, which are shared between multiple platforms / titles. Those are generally rather low-intensity processes. However, they have to be taken into account if one wants to change the physical configuration of the machines that run the backend.

Stats
A stat is a short identifier with an accompanying value. Stats are tracked for each player, and they are saved between game sessions. For BC2 there are approximately 2000 unique stats. Some of the stats have a direct meaning - your current score with a specific kit, number of kills with a specific weapon and so on - whereas other stats are meaningless on their own and track your progress toward various achievements/trophies, pins and insignias. The stats are kept in a couple of big tables in the Oracle database.

Game client and stats
The game client only reads from the stats database; it never writes. Stats reads happen on two occasions: when a player logs in, and when a player exits from a server back to the main menu. The client has a local cache of all stats. When one of those two events occurs, the game client requests a handful of stats (for instance, the player's total score and accumulated online playtime). If any of those stats are different from the locally cached values, the game client goes out and grabs all stats (approximately 2000 values).
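A minimal sketch of that probe-then-full-fetch check, in Python. The stat names in `PROBE_KEYS` and the `fetch` callable are hypothetical stand-ins for illustration; the post does not describe the real client code or the backend protocol.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical probe set. The post only names total score and playtime
# as examples of the "handful" of stats that get checked first.
PROBE_KEYS = ("total_score", "online_playtime")

# Assumed interface: fetch(keys) returns the requested stats,
# fetch(None) returns every stat for the player.
Fetch = Callable[[Optional[Sequence[str]]], Dict[str, int]]


def refresh_cache(cache: Dict[str, int], fetch: Fetch) -> bool:
    """Refresh the local cache only if the probe stats have changed.

    Returns True if a full fetch was performed, False if the cache
    was already up to date.
    """
    probe = fetch(PROBE_KEYS)
    if all(cache.get(key) == value for key, value in probe.items()):
        return False  # small request only, cache still valid
    cache.clear()
    cache.update(fetch(None))  # the expensive ~2000-value request
    return True
```

The point of the design is that the common case (nothing changed since the last visit to the main menu) costs only the small request.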
The game client uses these stats to display information in the main menu. They are not used in-game in multiplayer.

Game server and stats
The game server reads and writes to the stats database. When a player enters a server, the server requests approximately 1000 stats for that player from the database. Anything that has to do with stats and ranks is controlled by the server (for instance, which weapons are unlocked for a specific player). The server writes back a player's stats when the player leaves the server. Also, all players' stats are written to the database at the end of each round. This is to minimize the risk that player progress is lost because of a server crash.

When writing stats, the server will only write those stats that have changed. In addition, whenever possible the server will issue commands like "add 3 to stat named ABCD" rather than "write 27 to stat named ABCD". This minimizes the risk that any bugs in the code or network communication problems will trample stats; the worst that can occur is that a stat is not increased. It will not get lowered or set to zero inadvertently. Usually the game server will write a lot fewer than 1000 stats. I don't have figures at hand, but perhaps 100 stats are usually updated after a player has played a full round.

High load scenarios and the backend
Normally the database responds to the custom software's read/write queries very quickly. The database can service requests from a couple of game clients/servers in parallel; if too many requests are made at once, new ones are put into a queue. Normal turnaround time for retrieving 2000 stats is approximately a second. Requesting 2000 stats takes a bit more time than requesting 1000 stats - probably about twice as long. The database completes the queued-up entries as quickly as it can. The requests do not come in a steady flow, however. Sometimes many servers and clients will ask for stats data at nearly the same time.
The database will then service some of those requests a bit slower than usual. The database is the weaker portion of BFBC2; that is, the custom software can handle more simultaneously active players than the database can. If the clients/servers are making a lot of requests to the database over a long period of time, then the backlog of queries in the database's queue will get longer and longer. When the queue is so long that the database is unable to service queries within 10 seconds, the custom software will give up on those queries and respond with an error to those clients/servers.

High load scenarios and the game client/server
With the above in mind, let's imagine what happens when the number of simultaneous players increases. At first, there are not a lot of players. The database will handle any requests quickly and its queue is nearly empty all the time.

As the number of players goes up, the database will still be able to keep up with most requests. However, occasionally a lot of servers/clients will happen to perform stat requests at nearly the same time. This causes the queue to fill up a bit more than usual. Some of those queries will then time out when they hit the 10 second cutoff. Since clients normally request more data, it is usually the game client's requests that fail first. If the game client's request fails, the game client will attempt to retrieve stats for 10 or 20 seconds and then give up, and the game's main menu will claim that the player is Rank 1 and has zero score etc.

As the load increases further, the game server read requests will also fail more often. When game server read requests fail, the players who are affected will play with rank 1 and no stats-related unlocks. When this happens, the game server will not record & write back progress for the affected players either.

Finally, with a really high load, all requests from game clients & game servers will fail.
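The queue-and-cutoff behaviour described above can be illustrated with a toy simulation in Python. This is only a sketch under simplifying assumptions that are not from the post: a single worker serving requests strictly in arrival order, fixed service times, and timed-out queries that still consume database time. The real cluster services several requests in parallel.

```python
from typing import List, Tuple


def simulate(requests: List[Tuple[float, float]], timeout: float = 10.0) -> List[bool]:
    """Simulate a first-in, first-out database queue with a response cutoff.

    requests: (arrival_time, service_time) pairs, sorted by arrival time.
    Returns one bool per request: True if it was answered within
    `timeout` seconds of arriving, False if the caller got an error.
    """
    db_free_at = 0.0
    answered = []
    for arrival, service in requests:
        start = max(arrival, db_free_at)  # wait until the database is free
        finish = start + service
        db_free_at = finish               # assumed: work is done even if too late
        answered.append(finish - arrival <= timeout)
    return answered


# A steady trickle is served without trouble.
trickle = [(float(t), 1.0) for t in range(5)]
print(simulate(trickle))

# A burst of 15 one-second requests at the same instant: the ones at the
# back of the queue wait longer than the cutoff and fail.
burst = [(0.0, 1.0)] * 15
print(sum(simulate(burst)), "of", len(burst), "answered in time")
```

Even in this crude model, the requests that fail are simply the ones that arrived at a bad moment, which is why outages look random to individual players.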
High load versus too high load
One important thing to notice about some online systems and load is that the load does not behave like you would intuitively expect it to. Usually it rises slowly... until it gets to a certain point, and then it all spirals out of control and horror ensues. There are several reasons for this.

One is the human factor: when the load is at such a level that stats requests are failing intermittently, it appears to the player like he/she has lost all his/her progression, but either logging in/out (in the case of no stats in the main menu) or disconnecting/reconnecting (in the case of no stats in the game) has some chance of getting the stats back. People will then naturally do this over and over until they either get stats, or are frustrated enough to give up. This behaviour will cause more load on the backend than normal gameplay behaviour, which worsens the problem overall.

Another can be in the code; sometimes game client/game server code is written to retry a couple of times when an operation fails. This is a good thing when the backend is not under high load - after all, the error might be due to a momentary hiccup. However, when the load is high this will make the problem worse (in just the same way as the "human factor" example).

There are also some things happening in the background on databases, like backups or regularly scheduled maintenance / data processing jobs.

This means that some online systems can seem to be running fine, with a steady load, and then something happens and within minutes they grind to a halt. How well-behaved the system is depends on what functions it performs, and on the behaviour of the users of the system.

BFBC2's custom backend software is well-behaved in most respects. The database suffers a bit from the problems described above - the step between "players are occasionally not getting stats" and "players are never getting stats" is smaller than theory would predict.
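The way retries turn a small overload into a large one can be shown with a deliberately crude feedback model in Python. Everything here is an illustrative assumption rather than a measurement: load is counted in requests per time step, anything above `capacity` fails, and each failure produces `retries_per_failure` extra requests in the next step (players re-logging plus automatic retries in code).

```python
from typing import List


def offered_load(base: float, capacity: float,
                 retries_per_failure: float, steps: int = 50) -> List[float]:
    """Return the offered load per time step under a simple retry feedback loop."""
    load = base
    history = []
    for _ in range(steps):
        history.append(load)
        failed = max(0.0, load - capacity)          # requests that time out
        load = base + retries_per_failure * failed  # fresh demand plus retries
    return history


# Below capacity nothing fails, so retries never kick in.
print(offered_load(90, 100, 0.8)[-1])

# 5% over capacity with modest retrying settles well above the base demand.
print(round(offered_load(105, 100, 0.8)[-1], 2))

# Same 5% overload, but every failure causes more than one retry:
# the load never settles and keeps growing.
print(offered_load(105, 100, 1.5)[-1] > 1000)
```

In this model the system is fine at 90% of capacity, degraded but stable at 105% with mild retrying, and runs away at the same 105% once each failure triggers more than one new request. That is the shape of the "rises slowly, then spirals" behaviour described above.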
A closer look at the database itself
Somehow the stats database used to handle considerably more players back when it launched than it does now. In other words, reads/writes against the database now take more time to complete. There are two main reasons for this.
Defining the problem
The problem we will tackle is the following: the current player population is suffering from stats outages. That shouldn't be happening. Stats should be reliable with roughly the player numbers that we have now, plus a bit of headroom. We will not attempt to make it handle 100,000 concurrent users on a single backend.

Tackling the problem
One can attempt to make individual database accesses faster.

* Taking the database offline, and rebuilding the tables. This is certain to help. That is also the first thing that we will do. (And schedule new rebuilds whenever necessary in the future.)
* Making disk cache sizes larger. Memory is faster than disk, so if more of the database is kept in memory then accesses will go faster. The PC and 360 database clusters have as much memory as is possible. The PS3 cluster has room for more memory though. We will add it.
* Redesigning the tables. The table layout is not designed specifically for BFBC2; the same design is used by many other EA titles. Changing the design would improve performance for most requests by a fair bit. However, the time required for getting such a modification implemented, tested, and live is far too long. We will therefore not do it.
* Adding more machines to the database clusters. One might think that doubling the number of machines in a database cluster will also double the performance of the cluster. In reality, all those machines need to coordinate their work with each other. Therefore, adding more machines only helps sometimes. In some cases, performance actually gets worse. We will therefore not do it.
* Moving to a newer Oracle version or another database altogether. Again, the turnaround time for doing this to a live system is far too long. We will therefore not do it.

Or one can reduce the number of database accesses.

* Making game clients request fewer stats. The game client is already doing a small fetch before doing a full fetch (in case score/time or a couple of other stats have changed). If the client doesn't update all the stats in its cache, the main menu will not be able to show the player's ingame progression correctly. It is perhaps possible to split the stats fetching into two portions - one portion for showing the most important stuff in the main menu (in the case of BC2 PC, the stats-related items in the main screen), and another portion for showing all the achievements/trophies etc. It is under consideration.
* Making the game servers cache stats for players. The servers could have a cache like the game clients, but cache stats for many different players. This would help with people who play near-exclusively on one server. It is doubtful whether it would make a difference (I don't have statistics on this, just guessing). We will not do it.
* Making the game servers request fewer stats. Fetching fewer stats will make the game server unable to evaluate the full player progression. We will therefore not do it.
* Making the game servers write fewer stats. If the game servers wrote stats to the backend at each Nth round instead of at each round, then there would be fewer unique stats written. There is a tradeoff here - is there a risk that players lose their progression due to server crashes? - but N=2 or N=3 keeps both risk and impact very small. We have already implemented this change for both consoles, and will implement it for PC.

Once one set of changes is in place, we will then reassess the situation. Etc.
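To make the last option concrete, here is a minimal Python sketch that combines the two write-reduction ideas from this post: sending increments ("add 3 to stat named ABCD") for changed stats only, and flushing at every Nth round instead of every round. The class name, method names and the `send` callable are hypothetical; the actual game server code is not shown anywhere in the post.

```python
from typing import Callable, Dict

# Assumed interface: send(player_id, deltas) issues one "add" request
# to the backend containing only the stats that changed.
Send = Callable[[str, Dict[str, int]], None]


class StatsWriter:
    def __init__(self, send: Send, flush_every: int = 2) -> None:
        self.send = send
        self.flush_every = flush_every  # the "N" in "each Nth round"
        self.rounds_played = 0
        self.pending: Dict[str, Dict[str, int]] = {}

    def add(self, player_id: str, stat: str, amount: int) -> None:
        """Accumulate an increment locally; nothing is sent yet."""
        deltas = self.pending.setdefault(player_id, {})
        deltas[stat] = deltas.get(stat, 0) + amount

    def player_left(self, player_id: str) -> None:
        """A leaving player's progress is written immediately."""
        deltas = self.pending.pop(player_id, None)
        if deltas:
            self.send(player_id, deltas)

    def end_of_round(self) -> None:
        """Write everyone's progress, but only at every Nth round."""
        self.rounds_played += 1
        if self.rounds_played % self.flush_every != 0:
            return
        for player_id, deltas in self.pending.items():
            if deltas:
                self.send(player_id, deltas)
        self.pending.clear()
```

With `flush_every=2`, a stat that changes in both rounds is written once instead of twice, which is where the saving comes from. The cost is that a server crash can lose up to N rounds of accumulated increments instead of one.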
__________________
Follow Battlefield on Twitter at http://twitter.com/OfficialBF2 and http://twitter.com/OfficialBF2142 and http://twitter.com/OfficialBFBC2
Last edited by MikaelKalms; 21-01-2011 at 08:17 PM.
#2 (permalink) |
|
Forum Junkie
Join Date: Aug 2010
Location: Norn Irn
Age: 20
Posts: 4,622
|
First things first: I do not know an awful lot about what I'm about to say, so don't quote me on this. That sounds like it would use up a lot of resources. Why not just have the client connect to the backend servers when stats need to be updated or read by a stats website?
__________________
My E-peen CPU: Intel Core i5 2500K @ 3.3GHz RAM: Corsair Vengeance 2x4GB DDR3 @ 1600MHz GFX: PNY GeForce GTX 560Ti OC @ 850 MHz HDD 1: WD Caviar Blue 500GB x2 PSU: High Power 800W |
#4 (permalink) |
|
Forum Guru
Join Date: Jan 2009
Location: US
Age: 31
Posts: 1,030
|
Looks like there is basically a cap on how many customers can play at the same time, and that there is very little you're able or willing to do about it?
For your next game, can you just tell us how many players you can handle, so only that many people buy it? |
#5 (permalink) | |
|
Forum Junkie
Join Date: Aug 2010
Location: Norn Irn
Age: 20
Posts: 4,622
|
Quote:
#6 (permalink) |
|
Elite
Join Date: May 2010
Location: Originally hyperuk
Posts: 5,385
|
All I see is a lot of excuses, and a lot of "We will not do it".
What I would like to know is whether this system is the main cause of lag and bad hit detection. Also, why is it only since the release of Vietnam that the problem is very bad?
#8 (permalink) |
|
Rookie
Join Date: Jan 2011
Posts: 4
|
I am unable to log in at all. When I try to log in, after the "updating stats" message I get the error "failed to connect to ea online". This has been going on for 2 weeks now.
My question: could it be that my stats have been corrupted in the db, and that this is the reason for the error every time I log in?
#9 (permalink) | |
|
Elite
Join Date: May 2010
Location: Originally hyperuk
Posts: 5,385
|
Quote:
#10 (permalink) |
|
Elite
Join Date: May 2010
Location: Originally hyperuk
Posts: 5,385
|
Sorry to quote myself, but could limiting the number of soldiers to 1 help the system recover? I'm sure you have lots of people with a lot more than 1 soldier, which will obviously be creating a lot more stats and requests than is necessary.
#12 (permalink) |
|
Elite
Join Date: Jun 2010
Location: Vienna, Austria
Age: 33
Posts: 6,696
|
First of all:
Kalms, I LOVE your threads. It's awesome to have a dev giving some detailed insights into the development/maintenance process. For one, those threads explain some basic things to the regular player who's got nothing to do with programming - or in this case backends - so they get a clearer insight and don't need to be fooled by random rumors. And then of course, they also show the other portion of the player base how exactly you guys developed and run this game.

Just something I noticed:
* Plexus (or whatever) servers are outdated and underperforming
* Tables were not optimized for BC2
* No rebuild schedule for BC2 tables

There are quite a few disturbing things about the sizing/priority of BC2. After some threads by Kalms it seems that BC2 was never expected to be so attractive to so many players. Okay, forgetting the rebuild schedule is one thing; I know such things can be overlooked (although it shouldn't happen with decent release management). But the first 2 points I mentioned actually mean to me that EA never expected this title to sell so well. Which is quite ... well ... interesting.

I know you can't reveal this sort of information, but it seems like EA didn't know anything about the player base of BF PC. I wonder if this maybe was one of the driving factors to pay a little bit more attention to PC players for BF3? (Leaving aside the economic reasons why it wasn't worth optimizing FB for BC2, which would have pushed the release date back by roughly a year.)

@Phlapskate: If you let the clients update the stats, you open the door for the padders to alter the data sent to the server, which would result in people jumping from 1 to 50 after one 10 second round. Hacks also work this way - a wallhack or aimbot only works because you get all the player positions sent every time. So your computer knows where every player on the server is. If you would, for example, only get the information of the players in your sight, wallhacks wouldn't work anymore.
Last edited by xnorb; 21-01-2011 at 07:56 PM. |
#13 (permalink) |
|
Forum Junkie
Join Date: Aug 2010
Location: Norn Irn
Age: 20
Posts: 4,622
|
How about only letting consoles update stats this way then, and letting PCs continue to use the server-to-backend system?
#16 (permalink) | |
|
Forum Guru
Join Date: Jan 2009
Location: US
Age: 31
Posts: 1,030
|
Quote:
Once someone fails to connect, or connects to see that their stats are gone in-game, he tends to continue logging-in and out in an attempt to "force" a "refresh." That only exacerbates the problem. The only fix for the problem as it stands now is for people to just quit trying to play, and to go find another game. When enough people quit playing, it works for the ones that stayed. It's happened since the game came out. Any time the numbers jump, usually due to a mode pack or patch or expansion, things go south. We're basically being told that we should take a look at Dice's competitors next time we want to hand out money for a game. Any fixes will either take too long or are just too hard. They told us the same thing about patching, due to their cooking process. As terrible as Black Ops is, in my opinion, at least it's possible for them to hotfix the game. |
#18 (permalink) | |
|
Forum Guru
|
Quote:
I'm thinking of an intelligent switch that splits requests across, say, 3 identical RAM drive RAIDs? When there is low traffic the RAIDs compare and repair. The write requests go to all 3, but the read request is taken from whichever is available. Plus you get 200% redundancy over a single setup. Don't RAM drive prices now make multi-gig RAIDs practical? It would be like putting the whole database in cache, and would probably be faster to 'crunch' too? Would that improve it? I don't know enough to say either way, but it maybe sounds like it should, if the bottleneck is at the read request queue? I guess it depends on the size of your database for each version. But if it's a single lookup table of reasonable size, it should be possible? Maybe? Meh, you probably do something a lot smarter than that already. Just my interested 2c.
#23 (permalink) | |
|
Elite
Join Date: Jun 2010
Location: Vienna, Austria
Age: 33
Posts: 6,696
|
Quote:
@UKGN: Storage is cheap for you, but for companies we're talking about the price of the hardware + backup storage + maintenance costs + energy and so on. So storage for a live system is quite expensive, and don't forget, there's no monthly fee, so their budget is limited to sales income minus maintenance minus development costs minus profit.
#24 (permalink) |
|
Forum Junkie
Join Date: Apr 2007
Location: YOU HAVE BEEN DICED
Age: 35
Posts: 3,491
|
Good read as always, although it only confirms what many people already guessed: BC2 runs on lousy hardware/software that isn't even optimized for the game itself, and the costs of correcting that are something EA will not take on. Nothing new here.
Still, I would like to read your opinion on whether this can affect the hit reg/rubberbanding issues or not. If the stats were turned off, would the game run better or much better? How about patching the game to stop reading the stats every single time we leave a server? I'm sure that would drop the request load by a fair amount. Or better yet, only retrieve/update stats when a player presses a refresh button in the stats menu? I could live very well with only updating my stats every week or so if that would make the game a better experience. Of course a new player would want to update a lot more because of the unlocks and stuff, but it would still be less than one at login and one at each server drop, and after a certain level that need to update grows much smaller, so I think that if this is doable it should be considered too.