
7 posts tagged with "Networking"


· 4 min read
Kevin Glass

At Rune, the majority of the games on the platform are multiplayer. This is largely because we provide an SDK that enables JavaScript developers to build multiplayer experiences very easily, and our player base has come to expect it. Of course, as mentioned in Modern Game Networking Models, this means we focus on making the backend networking something special.

There are a lot of ways of making games multiplayer, from hot seat to shared screen and of course networking itself. Even in networking, there are multiple models to choose from, each of which suits a different type of game or level of programming complexity.

If you’re building a network layer for a single game or a bunch of very similar games then choosing the network model that’s the easiest and satisfies those game constraints is the best move.

However, at Rune, we’re pretty opinionated about a single model that works for all cases: predict-rollback. We need to provide a single common framework for all the games on Rune and so we focus on one networking model that supports the massive variety of games on the platform.

Predict-Rollback

In Modern Game Networking Models we talked in a bit of detail about how predict-rollback works. In summary, we let all clients continue moving forward, predicting the current game state based on the inputs they know about. If another client provides a new input (via the authoritative server) that occurs before the game time the current client is at, we roll back the game state, apply the input, and then re-predict the current state.
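
The loop described above can be sketched in a few lines. This is only an illustration under simplifying assumptions, not how the Rune SDK is implemented: the `State`, `Input`, `step` and `RollbackSim` names are hypothetical, the state is a single number, and a snapshot is kept for every tick.

```typescript
// Hypothetical types for illustration; not the Rune SDK API.
type State = { x: number }
type Input = { tick: number; dx: number }

// Deterministic step: the same state and inputs always give the same result.
function step(state: State, inputs: Input[]): State {
  return { x: state.x + inputs.reduce((sum, i) => sum + i.dx, 0) }
}

class RollbackSim {
  // snapshots[t] is the state at the start of tick t
  private snapshots: State[]
  private inputs: Input[] = []
  tick = 0

  constructor(initial: State) {
    this.snapshots = [initial]
  }

  get state(): State {
    return this.snapshots[this.tick]
  }

  // Predict forward one tick using the inputs known so far.
  advance(): void {
    const now = this.inputs.filter((i) => i.tick === this.tick)
    this.snapshots[this.tick + 1] = step(this.snapshots[this.tick], now)
    this.tick++
  }

  // An input arrives, possibly for a tick that has already been simulated.
  addInput(input: Input): void {
    this.inputs.push(input)
    if (input.tick >= this.tick) return // not simulated yet: nothing to undo
    // Roll back to the snapshot at the input's tick, then re-predict to the present.
    const target = this.tick
    this.tick = input.tick
    while (this.tick < target) this.advance()
  }
}

const sim = new RollbackSim({ x: 0 })
sim.addInput({ tick: 0, dx: 1 }) // local input, known immediately
for (let i = 0; i < 5; i++) sim.advance()
const predicted = sim.state.x // 1
sim.addInput({ tick: 2, dx: 10 }) // late remote input for tick 2
const corrected = sim.state.x // 11
```

The late input costs three re-simulated ticks here, which is the CPU cost of the model: the further in the past an input lands, the more ticks have to be replayed.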

So why do we think predict-rollback is the future of networking games and the best fit for a generic networking framework?

  • Some great games have used it to provide excellent multiplayer experiences, like Rocket League and Street Fighter. They also do an amazing job of hiding the rollback/changes when they occur.
  • It works for all cases, whether it's turn-based, RTS, or faster-paced twitch games; predict-rollback provides a stable, consistent approach. Even in turn-based games, where there should be no rollbacks, the simple simulation modified by inputs approach still fits the bill.
  • There’s growing library and platform support. Unity, Godot, and even Valve’s Source engines all have plugins that support this model.

What’s so great about the model then?

  • Low bandwidth: you only need to send the initial state and changes to that state. That’s pretty powerful right there. Network quality varies widely, and with emerging nations becoming huge consumers of games, this is super important.
  • Best player experience: in many cases, clients can run forward without latency between player input and response. Of course, you need to deal with conflicts when they occur, but this seems to be much easier than the alternatives.
  • Most consistent implementation: once you’ve got determinism handled, it’s the most consistent approach across platforms and devices. Every device acts the same and gets the same results.

What are the downsides? The process of rolling back and re-calculating the game state can be CPU heavy. Depending on your approach you may have to calculate many frames of change quickly based on the new input. However, this is why it’s now the right choice. Devices have reached a point where CPUs are extremely overpowered for what they’re trying to achieve in games - so there’s room to have a smart and utility based network model.

Of course, if you’re building a network model for a specific game, there are many tricks and game-specific approaches you could take.

If, however, you're building a library/framework that supports many types of games in many different environments and on different devices, predict-rollback is the right choice for now and the future.

Want to learn more about our approach or simply want to discuss the content of this article? Stop by our Discord and let’s chat!


· 7 min read
Kevin Glass

I’ve built a few game servers over my career, including session-based mini games and an MMORPG. At Rune, that experience comes in handy as we’re building a game architecture that needs to support 10 million players. Getting the architecture right is key to having something scale in the long term.

There are lots of resources around the web on game server architecture, but it’s also worth noting that the requirements for gaming are very similar to voice/video conferencing (maybe that’s why WebRTC fits so well?) - so it’s worth checking out the architectural approaches there too.

The following are some of the issues and requirements to consider when building your own scalable multiplayer game architecture.

Matchmaker vs. Real-Time

Most games have some form of matchmaking, or working out the best pairing of players. Some games also have social and configuration type activities that don’t require player-to-player interaction. These are normally categorized as “matchmaking.” The requirements on latency in these scenarios are reasonably low. If a player is choosing a map or finding another set of players to compete with, a second or more response time is acceptable.

Once the players actually start playing the game, they immediately have a different set of requirements. In the “real-time” phase, player latency changes the game experience dramatically. It’s important that we’re optimizing for low latency on the network and high throughput on the server activities.

Since there are two different sets of requirements, it’s good practice to split these two scenarios into different components in the architecture. In the diagram above, we have the split between the central server - responsible for our matching type activities, and the real-time regional servers used for relaying in-game messages.

Regional Servers

As mentioned in Modern Game Networking Models, one of the first things to think about in any networked game is making sure that the connection between client and server is the best it can be.

Even if you assume everyone is on a great connection (something that isn’t true!), in the best case, network packets travel at the speed of light. If players are connecting from anywhere in the world, then the distances between a central server and the clients add up to significant latency.
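
To put a rough number on that, here is a back-of-the-envelope calculation. It assumes signals in optical fiber travel at about two thirds of the speed of light and ignores routing, queuing and processing delays, so real figures will be higher. `roundTripMs` is just an illustrative helper.

```typescript
const SPEED_OF_LIGHT_KM_PER_S = 299_792
// Assumption: light in optical fiber travels at roughly two thirds of c.
const FIBER_FACTOR = 2 / 3

// Best-case round trip time in milliseconds over a straight fiber run.
function roundTripMs(distanceKm: number): number {
  const oneWaySeconds = distanceKm / (SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR)
  return oneWaySeconds * 2 * 1000
}

// A player 10,000 km from the server: roughly 100 ms before any other overhead.
const rtt = roundTripMs(10_000)
```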

To solve this, it’s ideal to introduce regional servers, that is, servers that are closer to clients. This limits the distance the packets have to travel and hence lowers the latency.

The downsides here are that running lots of servers is costly, and of course, if players want to play together from very distant locations, you have to choose one region for them to play on that might not be optimal for them.

Where players are a long distance from each other you have essentially two options:

  1. Pick a regional server that’s equally distant from both. This ensures they both get a similar network experience - although not optimal for either.
  2. Rely on backbone trunking. Connecting across the public internet can be slow for lots of reasons (e.g., poor cell/wifi, lots of network hops). The backbone fabric that your servers run on is often much faster. We can choose to have the players access a local regional server as an access point and then connect these regions via the faster backbone.
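
The first option can be sketched as picking the region with the lowest worst-case latency across the players. This assumes you already have measured pings from each player to each region; the region names and numbers below are made up for illustration.

```typescript
// Hypothetical measured round-trip times (ms) from one player to each region.
type Pings = Record<string, number | undefined>

// Choose the region that minimizes the worst latency any player experiences.
function pickRegion(players: Pings[]): string | undefined {
  if (players.length === 0) return undefined
  let best: string | undefined
  let bestWorst = Infinity
  for (const region of Object.keys(players[0])) {
    // A region a player has no measurement for is treated as unreachable.
    const worst = Math.max(...players.map((p) => p[region] ?? Infinity))
    if (worst < bestWorst) {
      bestWorst = worst
      best = region
    }
  }
  return best
}

const region = pickRegion([
  { "us-east": 20, "eu-west": 95, "ap-south": 240 },
  { "us-east": 210, "eu-west": 120, "ap-south": 40 },
]) // "eu-west": neither player's best region, but the fairest for both
```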

Load Balancing

Any scalable system needs to be able to add capacity through adding servers and this means load balancing. The split between matchmaking and real-time also changes how we do load balancing.

On the central server, the load balancing can act largely like a web application, that is, load balancing can be stateless and we rely on the database as the synchronization layer. If a player applies a change to their skin, the application can make a change, store it in the database and on the next invocation read it back from the database. There’s no running state that needs to be maintained between invocations.

However, on the real-time server, there will be several pieces of state:

  • The connections from the players themselves. As mentioned in WebRTC vs WebSocket, for high performance we want to establish and maintain a connection between clients and the real-time server. The connection must not be dropped between interactions with the server.
  • The server is running the authoritative part of the game, making sure that the players see the same state and don’t cheat. The server may also be running part of the game logic such as computer-controlled actors and in-game events. This needs to continue to run whether players are taking actions or not.

Since the real-time server is stateful, the load balancer needs to connect all players in the same session to the same server. In an MMORPG this means zones are allocated to a server and all players in those zones connect to that server. In Rune, this means that the room/game combination is allocated to a server in the same way.

More generically in load balancing, this is known as “sticky sessions.” When a connection is made to the load balancer an attribute/parameter of the connection is used to determine where the connection needs to go. This of course makes the load balancer that bit more complicated and often leads to custom load balancing solutions.
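
As a minimal sketch of the idea, the routing attribute can be hashed to pick a server, so every player in the same room lands in the same place. The server names and the `serverForRoom` helper are hypothetical, and this is not a description of Rune's actual load balancer.

```typescript
// Hypothetical server list; a real deployment would discover these dynamically.
const servers = ["rt-1.example.com", "rt-2.example.com", "rt-3.example.com"]

// Simple deterministic string hash (32-bit FNV-1a over UTF-16 code units).
function hash(key: string): number {
  let h = 0x811c9dc5
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i)
    h = Math.imul(h, 0x01000193)
  }
  return h >>> 0
}

// Every connection for the same room/game combination maps to the same server.
function serverForRoom(roomId: string, gameId: string): string {
  return servers[hash(`${gameId}/${roomId}`) % servers.length]
}
```

Plain modulo hashing moves most rooms to a different server whenever the server count changes, which is one reason real systems tend toward consistent hashing or an explicit room-to-server lookup table.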

Database

For most online games, databases are key to maintaining the long running state and storing player profiles, levels etc. The central server uses the database to maintain state between invocations so it’s heavily used and operations often rely on results from the database. This behavior is common in web application architectures too and means being very careful with your database performance.

On the other hand, the real-time server use of a database cannot block operations. Real-time exchanges are measured in millisecond latency and any blocking based on a database is likely to degrade player experience. On the real-time server it’s generally preferred to avoid any database read access in the core network flows - instead pulling the data required at the start of the session and holding it in memory.

There are of course cases where actions in the real-time server mean the database needs to be updated to reflect changes the player has made. Whenever possible, keeping these interactions asynchronous is the best approach.
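
One common way to keep those writes off the hot path is a write-behind queue: the real-time loop only appends to an in-memory list, and something else flushes it later. The sketch below assumes a hypothetical `save` function that persists a batch; the `Write` shape is made up for illustration.

```typescript
type Write = { playerId: string; score: number }

class WriteBehindQueue {
  private pending: Write[] = []
  // `save` is a hypothetical function that persists a batch (e.g. via a DB client).
  private save: (batch: Write[]) => Promise<void>

  constructor(save: (batch: Write[]) => Promise<void>) {
    this.save = save
  }

  get size(): number {
    return this.pending.length
  }

  // Called from the real-time loop: returns immediately, never waits on the DB.
  enqueue(write: Write): void {
    this.pending.push(write)
  }

  // Called on a timer or at session end, outside the core network flow.
  async flush(): Promise<number> {
    const batch = this.pending
    this.pending = []
    if (batch.length > 0) await this.save(batch)
    return batch.length
  }
}
```

Note that this sketch drops a batch if `save` fails; a production version would need retries or a durable queue.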

The matchmaking and real-time servers should have a separate database (or at least a different set of tables/schema) that they act on. This allows us to have different rules of engagement to the database in each case and to be able to tune the database in each scenario for its specific expected use.

The final point of database interaction is where the updates on the real-time server need to be passed back to the central server, or vice versa. Again, whenever possible this needs to be an asynchronous process so that it doesn’t interfere with the real-time server run-time operations.

Metrics

Final, but important, point: don’t forget metrics/observability. When scaling any system it’s key to be able to understand how the system is operating and even more how any change that you make affects the performance and stability.

Applying metrics after the fact is actually pretty hard. Trying to instrument or interpret everything once it’s all in place and the implementation has passed is time consuming and error prone.

When building a scalable system, design and add the metrics to the features as they’re implemented. Having well-thought-out and intentional metrics is the only way to really tune and improve an architecture.

If you found this article interesting or have questions, drop by the Discord.


· 6 min read
Kevin Glass

At Rune, we’ve got a platform where millions of users are now playing casual multiplayer games. As I blogged about previously, we care a lot about making the multiplayer networking excellent. No matter which model you’re using, making the best use of the available internet transport is key.

When it comes to real-time games, the majority of modern releases use UDP rather than TCP. The web, however, for a long time only had access to TCP (via HTTP) and developers have found novel ways to make use of the reliable stream to build real-time games. However, the reason most real-time games outside of the web use UDP is its unreliable nature.

Why is unreliable good?

Real-time games rely on prompt message delivery, or your ‘ping’ from a gamer’s point of view. With TCP and reliable delivery, if one packet gets delayed so does everything sent afterward. With UDP and unreliable delivery, packets aren’t dependent on each other so delays don’t compound. This does leave the developer working out which packets/data need to be reliable and how to ensure it, but the finer level of control gives options for a better game experience.

As I said above, the web used to be TCP only. A few years after WebRTC's initial release, we were given RTC Data Channel which, while not pure UDP, can act like it.

What are WebSockets?

WebSockets have been with us since about 2008. They were the first game development-usable two-way communication available in the browser. Before this, we had server-side pushes and some polling technologies, but the response time was high.

When a browser retrieves pages, it connects out to the server, requests the page, and gets a response. Browsers generally reuse their connections, so the TCP connection stays open to the server for subsequent requests. The server, however, can’t send anything with HTTP unless it has received a request to respond to.

Enter WebSockets and the 101 Upgrade / Switching Protocols messages. Since the TCP connection is staying up anyway, can’t we just leave it there but let the server and client exchange data (or frames) as and when they want?

Now we have two-way communication between the client and server, and this works well for certain types of games. However, the connection is still TCP, which isn’t great as I’ll describe in more detail below.

The great thing about WebSockets is how easy they are to implement on both client and server. There is a wide range of libraries easing the development of WebSockets, most notably Socket.io, and cheap hosting for them is widely available.

What is WebRTC and Data Channel?

WebRTC is a technology to allow voice and video communications directly in the browser. It arrived around 2011, got formalized, and was widely supported by 2015. As part of the specification, the RTC Data Channel was added. Data Channel provides a real-time method to send raw data between browsers and servers, or “peers” in WebRTC language.

Since WebRTC originated from the voice/video world, when they added data channels, they used a telecoms-based protocol called SCTP. Luckily, in WebRTC SCTP runs on top of UDP (tunneled through DTLS) and it can be configured to act like raw UDP, i.e., send a packet once when I tell you to.
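
In the browser API that configuration comes down to two options on `createDataChannel`: `ordered: false` and `maxRetransmits: 0`. The sketch below uses small structural types so it can run outside a browser; the `"game"` label and the `openGameChannel` helper are assumptions for illustration.

```typescript
// Minimal structural types so the sketch runs outside a browser. In a browser,
// RTCPeerConnection and RTCDataChannelInit from the DOM typings fit these shapes.
type ChannelInit = { ordered?: boolean; maxRetransmits?: number }
type PeerLike<C> = { createDataChannel(label: string, init?: ChannelInit): C }

// Unordered with zero retransmits: each message is sent once and may be dropped.
const UNRELIABLE: ChannelInit = { ordered: false, maxRetransmits: 0 }

function openGameChannel<C>(peer: PeerLike<C>): C {
  return peer.createDataChannel("game", UNRELIABLE)
}

// In the browser: const channel = openGameChannel(new RTCPeerConnection())
```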

WebRTC is considerably more complicated to work with than WebSockets. The initial setup of the data channel requires the peers to exchange signaling information (SDP, a text payload). This means you’ll need some other transport (normally WebSocket) to pass the signaling before you can even get a channel set up.

Since the final connection is using UDP, network traversal can also be more complicated. Again, WebRTC used the telecom standards for determining the best path through a network to connect peers using STUN, ICE, and TURN.

  • STUN to determine your own address
  • ICE to describe possible ways to connect
  • TURN as a relay when other connection methods can’t work.

The infrastructure you need to run WebRTC is more intricate although there are now free and inexpensive cloud services available.

Library support is also limited with respect to WebRTC, with the majority of the few libraries available focusing on the voice/video side of the API rather than the data channel that we’d like for game use. There are some great examples like node-datachannel that are becoming mature enough to use in production services along with great cloud services like livekit.io.

Head of Line Blocking

So, given that WebRTC is harder to use and develop with, why would we use it?

As mentioned above, there’s a significant difference between TCP and UDP that matters for real-time games: Head of Line Blocking.

TCP attempts to fully manage the connection between endpoints for you. This includes resending packets for reliability if the far end has not acknowledged them, windowing to make sure we’re not attempting to send “too much” and congestion control to detect when the network can’t handle what's waiting. This is great for general connections - the internet is a difficult place to work in and managing the connection like this takes a lot of work off the developer.

However, for real-time games, every millisecond counts and having TCP ensuring order and delivery for every packet often gets in the way. As the Gaffer article above describes, if one packet on a TCP connection doesn’t get through then every other packet is now waiting for the first’s delivery and the developer can’t access any of the data (that may have already arrived) until that first packet completes its journey.
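
A toy model makes the effect concrete. This is not a network simulation, just the ordering rule applied to made-up arrival times: with ordered delivery a packet is usable only once every earlier packet has arrived, while with unordered delivery each packet is usable as soon as it lands.

```typescript
// arrival[i] is the time (ms) at which packet i reaches the receiver.

// Ordered delivery (TCP-like): packet i is usable only once packets 0..i have arrived.
function orderedDelivery(arrival: number[]): number[] {
  let latest = 0
  return arrival.map((t) => (latest = Math.max(latest, t)))
}

// Unordered delivery (UDP-like): each packet is usable as soon as it arrives.
function unorderedDelivery(arrival: number[]): number[] {
  return [...arrival]
}

// Packet 1 is lost and retransmitted, so it arrives much later than the rest.
const arrivals = [10, 260, 30, 40]
const tcpLike = orderedDelivery(arrivals) // [10, 260, 260, 260]
const udpLike = unorderedDelivery(arrivals) // [10, 260, 30, 40]
```

Packets 2 and 3 were already on the machine at 30 ms and 40 ms, but in the ordered case the game can't read them until 260 ms.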

In short, if you’re wanting to do real-time communications, you need UDP, and in the current web browser, that means WebRTC.

Is it worth the work? Yes.

The Future

Of course, with all things web, there’s a new standard coming that might shake things up again, WebTransport. This new standard looks like it’s going to give us game developers everything we might want.

Unfortunately, at the time of writing, the standard is still changing, browsers aren’t quite there yet, and the server-side support is very limited.

I’m still holding out hope that WebTransport is the answer, but what do you think?

If you’re interested in this topic or the Rune platform, drop by our Discord and let us know.


· 10 min read
Kevin Glass

In the second tech-demo, we look at making a platform game multiplayer using the Rune SDK. For anyone following along the base code is the same as the last technical demo, but we'll cover most of it in this article.

You can give it a go on the tech demos page.

First a re-cap of the architecture of a Rune game. We separate the rendering and logic like so:

It’s good practice to separate your data model and logic from the rendering, i.e. the MVC pattern. With multiplayer this isn’t just best practice, it’s absolutely required to let us run copies of the game logic on both the server and client.

The logic should contain the data that is required to update the game and how winning/losing can be evaluated - i.e. the game state. We want to try and keep it fast since in the predict-rollback network model (that Rune uses) we will be running multiple copies of the logic. The logic is implemented to maintain determinism and to allow it to be executed both on the browser and server.

The renderer, or client, is the code that renders the game for the player and accepts their input. The client can be implemented using any library or framework that can run in the browser.

Let's get to the code. If you need directions on creating a game project they're in the docs. In this demo we’re going to have a map and some players. So first let's declare some types to describe those:

// the extra data for the player
export type Player = {
  x: number
  y: number
  sprite: string
  playerId?: PlayerId
  // the state of the controls for this player - this
  // is the bit that's actually sent regularly across
  // the network
  controls: Controls
  animation: Animation
  vx: number
  vy: number
  // true if the player is facing left instead of right
  // as the sprites are designed
  flipped: boolean
}

// the controls that we're applying to the game state
// based on which inputs the player is currently pressing
export type Controls = {
  left: boolean
  right: boolean
  jump: boolean
}

For Rune to synchronize the clients we'll need to define the shared data, in Rune that’s as easy as this:

// this is the core of what we're trying to keep
// in sync across the network. It'll be held on clients
// and server and the Rune platform will keep it
// in sync by applying deterministic actions
export interface GameState {
  players: Player[]
}

Next we can initialize the logic for the game which all clients will start from before applying changes they receive from clients:

Rune.initLogic({
  setup: (allPlayerIds) => {
    const initialState: GameState = {
      // for each of the players Rune says are in the game
      // create a new player entity. We'll initialize their
      // location to place them in the world
      players: allPlayerIds.map((p, index) => {
        return {
          x: 20 + (index + 1) * 32,
          y: 260,
          playerId: p,
          type: "PLAYER",
          sprite: PLAYER_TYPES[index % PLAYER_TYPES.length],
          animation: Animation.IDLE,
          controls: {
            left: false,
            right: false,
            jump: false,
          },
          flipped: false,
          vx: 0,
          vy: 0,
        }
      }),
    }

    return initialState
  },

In the game logic we need to declare what the clients can do and how the game should update each frame. In Rune, the game update is defined as part of setting up the Rune SDK like so:

update: ({ game }) => {
  // go through all the players and update them
  for (const player of game.players) {
    player.animation = Animation.IDLE

    if (player.controls.left) {
      player.vx = Math.max(-MOVE_SPEED, player.vx - MOVE_ACCEL)
      player.flipped = true
    } else if (player.controls.right) {
      player.vx = Math.min(MOVE_SPEED, player.vx + MOVE_ACCEL)
      player.flipped = false
    } else {
      // no input: slow down toward zero without overshooting
      if (player.vx < 0) {
        player.vx = Math.min(0, player.vx + MOVE_ACCEL)
      } else if (player.vx > 0) {
        player.vx = Math.max(0, player.vx - MOVE_ACCEL)
      }
    }

    player.vy += GRAVITY
    ...

The game logic is configured to run at 30 updates a second and on each update we’re going to move the players based on what their controls are - i.e., are they pushing left/right/jump. We're going to have two sets of collision, one against a tile map for the level and another between players. This lets players use each other as platforms!

The collision code is brute force: look for tiles that we might be colliding with and then check rectangle/rectangle collision for the players. You can see that in isValidPosition.
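
The real isValidPosition isn't shown here, so the following is only a sketch of what such a check could look like. The 16 pixel tile size, the `map[row][col] > 0` convention for solid tiles, and the `Rect` type are all assumptions for illustration rather than details from the tech demo source.

```typescript
// Assumed layout: square tiles, map[row][col] > 0 means solid.
const TILE_SIZE = 16
type Rect = { x: number; y: number; width: number; height: number }

// Axis-aligned rectangle/rectangle overlap test (touching edges don't count).
function rectsOverlap(a: Rect, b: Rect): boolean {
  return (
    a.x < b.x + b.width &&
    a.x + a.width > b.x &&
    a.y < b.y + b.height &&
    a.y + a.height > b.y
  )
}

function isValidPositionSketch(map: number[][], self: Rect, others: Rect[]): boolean {
  // Only look at the tiles the rectangle could be touching.
  const minCol = Math.floor(self.x / TILE_SIZE)
  const maxCol = Math.ceil((self.x + self.width) / TILE_SIZE) - 1
  const minRow = Math.floor(self.y / TILE_SIZE)
  const maxRow = Math.ceil((self.y + self.height) / TILE_SIZE) - 1
  for (let row = minRow; row <= maxRow; row++) {
    for (let col = minCol; col <= maxCol; col++) {
      // Anything outside the map is treated as empty space.
      if ((map[row]?.[col] ?? 0) > 0) return false
    }
  }
  // Then check against the other players' bounding boxes.
  return !others.some((other) => rectsOverlap(self, other))
}
```

Because this only reads its arguments and uses plain arithmetic, it stays deterministic, which matters for logic that runs on every client and the server.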

So how does this synchronize the clients?

The Rune platform runs this logic on the server and each of the clients. When a change is made to the game state, it is first applied locally - so latency in controls is very low - and then sent to the server and subsequently to all other clients. This is all timed so that the local client isn’t applying the changes too early and gives the server time to schedule the change at the right time.

Everyone playing and the server have a copy of the game logic which they’re keeping up to date based on the changes they receive. This relies on the game logic being fully deterministic but from a developer point of view means you don’t really have to think about how the sync is happening. As long as you keep your updating code in the game logic, the clients will stay in sync.

The client runs a copy of this logic and its update() loop, so changes are applied immediately. The server also runs a copy of this logic and update() loop, but slightly behind the client to allow for any action conflict resolution, e.g. two players try to take the same item. When the server has resolved the conflict, the client rolls back its changes if needed and applies the new actions from the authoritative server, putting the client back in the correct state.

The final bit of the game logic is how the "changes" to the game state can be indicated by players, what Rune calls actions.

// actions are the way clients can modify game state. Rune manages
// how and when these actions are applied to maintain a consistent
// game state between all clients.
actions: {
  // Action applied from the client to set up the controls the
  // player is currently pressing. We simply record the controls
  // and let the update() loop actually apply the changes
  controls: (controls, { game, playerId }) => {
    const player = game.players.find((p) => p.playerId === playerId)

    if (player) {
      player.controls = { ...controls }
    }
  },
},

The actions block defines the set of calls the renderer can make to translate player input into changes to the game state. In this case we simply take whatever the client has said the controls from the player are and store them in the player entity. As mentioned above, because the client is running its own copy of logic these changes are quickly applied.

You can see in this case we’re sending the controls rather than explicit positions, which at first might seem a little strange. This makes sense when you consider one more factor, conflict resolution.

If two players both make actions on their local copy of logic that conflict in some game specific way then the clients have to rollback their game state, apply the actions in the correct order and recalculate game state. Let’s say they both try to take an item at the same time, because their logic is running locally they’ll both think they took it. Once the actions reach either end it becomes clear that one player took the item first and the Rune SDK calculates the state to match the correct situation.
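
The Rune SDK handles this for you, so the following is purely an illustration of the idea: once every copy of the logic applies the same actions in the same authoritative order, they all agree on who got the item. The `PickupAction` shape and `serverOrder` field are hypothetical.

```typescript
// Hypothetical action shape; serverOrder is the order the server assigned.
type PickupAction = { playerId: string; itemId: string; serverOrder: number }

// Replay pickups in authoritative order: the first action for an item wins.
function resolvePickups(actions: PickupAction[]): Map<string, string> {
  const owner = new Map<string, string>()
  const ordered = [...actions].sort((a, b) => a.serverOrder - b.serverOrder)
  for (const action of ordered) {
    if (!owner.has(action.itemId)) owner.set(action.itemId, action.playerId)
  }
  return owner
}

// Both clients predicted they took the coin; the server ordered bob's action first.
const owners = resolvePickups([
  { playerId: "alice", itemId: "coin", serverOrder: 2 },
  { playerId: "bob", itemId: "coin", serverOrder: 1 },
])
```

Alice's client predicted that she had the coin, so it is the one that rolls back and re-predicts; Bob's prediction already matched.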

Now, if we sent explicit positions this conflict resolution would result in significant jumps - where a player’s actions were completely disregarded because they were in complete conflict. If we send the controls then the resolution is much smoother, the player still pressed the controls and had them applied, just the resulting game state is a little different. A lot of the time this can be hidden altogether in the renderer.

Now we have the game logic, the players can update controls and they’ll move thanks to our update loop. The final part is to get something on the screen and let our players play! The tech demo uses a very simple renderer without a library or framework. It just draws images (and parts of images) to an HTML canvas and uses DOM events for input. Check out graphics.ts and input.ts if you want to see the details.

First we need to register a callback with Rune so that it can tell us about changes to the game state:

// Start the Rune SDK on the client rendering side.
// This tells the Rune app that we're ready for players
// to see the game. It's also the hook
// that lets the Rune SDK update us on
// changes to game state
Rune.initClient({
  // notification from Rune that there is a new game state
  onChange: ({ game, yourPlayerId }) => {
    // record the ID of our local player so we can
    // center the camera on that player.
    myPlayerId = yourPlayerId

    // record the current game state for rendering in
    // our core loop
    gameState = game
  },
})

The rendering itself is purely taking the game state that it’s been given and drawing entities to the canvas:

// if the Rune SDK has given us a game state then
// render all the entities in the game
if (gameState) {
  // render the game state
  for (const player of gameState.players) {
    const frames =
      player.animation === Animation.JUMP
        ? playerArt[player.sprite].jump
        : player.animation === Animation.WALK
          ? playerArt[player.sprite].run
          : playerArt[player.sprite].idle

    drawTile(
      player.x - 16,
      player.y - 16,
      frames,
      Math.floor(Date.now() / 50) % frames.tilesAcross,
      player.flipped
    )
  }
  ...

The only other thing the renderer needs to do is convert player inputs into that action we defined in game logic:

// we're only allowed to update the controls 10 times a second, so
// only send if it's been 1/10 of a second since we sent the last one
// and the controls have changed
if (
  Date.now() - lastActionTime > 100 &&
  (gameInputs.left !== lastSentControls.left ||
    gameInputs.right !== lastSentControls.right ||
    gameInputs.jump !== lastSentControls.jump)
) {
  lastSentControls = { ...gameInputs }
  lastActionTime = Date.now()
  Rune.actions.controls(lastSentControls)
}

There’s a couple of conditions put on sending actions. We don’t want to send unchanged controls into the game logic, since it won’t change anything. The Rune SDK also ensures we send a maximum of 10 actions per second from any client to prevent swamping the network.

That’s it, we have a game logic that will keep the client’s game state in sync and a renderer that will let our players play.

If you have any questions or comments on the tech demo or Rune in general, be sure to join our Discord.

Assets from Pixel Frog.


· 6 min read
Kevin Glass

I’ve written a few multiplayer games and seen patterns in the things I got wrong across projects. As part of my work here at Rune I’m getting to see more and more developers making network games and there’s definitely some common themes in the problems they face.

When you write a lot of single player games it’s easy to get into habits that make game development faster. For instance, knowing that your data model is as fast to update as your rendering is a real bonus when trying to move quickly from concept to MVP. Some of these habits can make multiplayer game development harder, so here’s a few things to think about when you’re building your next hit game.

Players Leaving and Joining

In a session/room based game players can leave and join at will. This might be due to active choices, connection issues or being removed from the game. For each player we’re normally holding state - that could be their position, the items they're holding or their score.

What happens when a player leaves the game? The state could just be deleted and the other players carry on without them. There’s room for interesting game design here though, where a leaving player gives their state/progress to other players. Maybe the player needs to leave a marker where they were. Even more, what if a leaving player has a detrimental effect on others - encouraging people to keep playing? Do we need to adjust the difficulty of the game based on the number of players?

What happens when a player joins the game mid-way through? They could become a spectator of the game, they could jump in with the other players at their current point. Is there any bonus for players joining? Again, consider if we need to scale the difficulty of the game based on the new player count.

What happens if a player leaves and then re-joins quickly? This might mean they just lost connection, so consider a grace period. Rejoining players often get their state reset, so check whether this can be used as an exploit. If there is a way to gain through rejoining, players will find it.
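
A grace period can be as simple as holding on to a leaver's state for a while and handing it back if they return in time. The sketch below is one way to do it; the 10 second window, the `PlayerState` shape and the `Roster` class are assumptions, and the current time is passed in explicitly so the logic stays deterministic and testable.

```typescript
const GRACE_MS = 10_000 // assumed value; tune per game

type PlayerState = { score: number }

class Roster {
  private active = new Map<string, PlayerState>()
  private departed = new Map<string, { state: PlayerState; leftAt: number }>()

  join(playerId: string, now: number): PlayerState {
    const saved = this.departed.get(playerId)
    this.departed.delete(playerId)
    // Within the grace period: treat it as a reconnect and restore the old state,
    // so leaving and rejoining can't be used to reset anything.
    const state = saved && now - saved.leftAt <= GRACE_MS ? saved.state : { score: 0 }
    this.active.set(playerId, state)
    return state
  }

  leave(playerId: string, now: number): void {
    const state = this.active.get(playerId)
    if (!state) return
    this.active.delete(playerId)
    this.departed.set(playerId, { state, leftAt: now })
  }
}
```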

Determinism

As described last week, determinism in multiplayer games is often key to the network model. It’s quite easy to make a deterministic data model that ends up being non-deterministic on the rendering side. Pay special attention to how the renderer converts the inputs given from the network data model - a common issue is running animations independently from the synchronized data model, resulting in different player interpretations of what's happened.

Be intentional when making effects take place on the client side that aren’t driven by shared data.
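
For animations that matter to gameplay, one option is to derive the frame from synchronized data rather than from the local clock. The sketch below assumes the shared game state exposes some kind of tick counter, which is a hypothetical field here.

```typescript
// Pick an animation frame from a synchronized tick counter, so every client
// shows the same frame for the same game time.
function animationFrame(tick: number, ticksPerFrame: number, frameCount: number): number {
  return Math.floor(tick / ticksPerFrame) % frameCount
}

// At 30 updates a second, changing frame every 3 ticks gives 10 frames a second.
const frame = animationFrame(95, 3, 8) // 7
```

Purely cosmetic animations can still run off the local clock; the point is to choose which ones deliberately.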

Player Interactions and Rollbacks

Interactions between players are always hard in networked games, but of course that’s where the fun is too. Thinking about interactions up front is critical to making a game “feel” right. These interactions generally fall into the following categories, from easiest to hardest to get right:

  1. Data exchange - like swapping or sharing objects and consumables. In these cases there’s nothing real time visual to see (other than stats or inventories updating) so delay doesn’t feel bad. These are an easy win to make a game feel multiplayer.
  2. Out of band effects - like power-ups that affect other players. We see these in popular karting games: you collect an item, use it, and all the other players get their screen blacked out or become mini versions, etc. The nice thing about these is that they can be delayed slightly from the player input without it seeming wrong - meaning that latency can be absorbed.
  3. Ranged Interactions - like shooting an opponent. These are still latency and rollback sensitive but visually the rollback won’t look so bad because it’s so far away. If you appear to hit but actually miss, the player doesn’t feel quite so cheated.
  4. Close Interactions - like two players pushing each other around. These are by far the hardest to get right, which is why so many multiplayer games ignore them. The latency really hurts here, e.g. the remote player stops moving and you get a rollback causing a sudden movement in the shared object. These problems can be mitigated with slow moving objects and by using acceleration in the player.

Categorizing your interactions early on can help you both choose a networking model and de-scope features that will make your game either too hard to implement or unsatisfying as an experience.

What to Synchronize

In contrast to Determinism above, sometimes the first stop of a developer working on a multiplayer game is to synchronize everything. I know I fall into this trap regularly. At the least it means you’re sending more across the wire than you need to but at worst you can be causing rollbacks for things that don’t matter.

I repeat, be intentional when making effects take place on the client side that aren’t driven by shared data.

Player Base Size

One of the hardest things as an indie game developer is getting from your early access build to something popular enough to have regular players. This means when you’re designing your multiplayer experience for large groups you need to keep in mind that in the early days there will likely be far fewer players, or even solo play.

Even if it's temporary, having a design that's flexible enough to make it fun for solo play and then grow into multiplayer is great. Once your game is off and running you can throw the solo aspects away if needed.

Equally, knowing the upper end of the player count supported by your design is important. Designing for small groups brings different complexities than designing for massively multiplayer numbers. Of course, depending on the game you might end up with some aspects of each.

Real Life Internet

As mentioned in game networking models, the real internet isn’t always smooth sailing. High latency is pretty common as is varying latency on a single connection. Try to think about how the game will feel with players on significantly different network connections. Accessibility-wise you want to get as many players on board as possible, so this may well mean that not all connections are equal.

I’ve seen a couple of games out there that have a lag bonus - if your connection was poor or unstable you get an easier ride in the game. With the right tuning this feels like it could be a good solution.

There are probably many other gotchas and areas to think about when designing multiplayer games; these are just a few that have come up for me. If you have other areas you think should be mentioned or want to discuss any of the above, be sure to visit our Discord.

· 8 min read
Kevin Glass

Rune is a platform for multiplayer games, so as you can imagine the networking code is core to what we do.

The initial approach that people take to game networking is to send all of the game state across the network every frame. This doesn’t scale. Imagine you’re playing an RTS with 1000 units each - the network simply won’t support sending that much data all the time.

Most network strategies only send the inputs from the client across the network, so in our example: send these 50 units to position X. This works by having all the clients maintain their own game state and applying the received inputs to that.

Games make sure everyone starts from the same point. Then we apply the same changes to all the game states and expect them to stay in synchronization. This is called determinism - if we’re at state X and we apply change Y then we end up at state Z.
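As a rough illustration of that idea, the sketch below applies the same list of inputs to the same starting state on two "clients". The state shape, the `MoveInput` type and the function names are assumptions made for the example.

```typescript
// Hypothetical state and input shapes for illustration.
interface UnitsState {
  positions: Record<string, { x: number; y: number }>
}

interface MoveInput {
  unitIds: string[]
  x: number
  y: number
}

// A pure function: the same state and input always give the same next state.
function applyInput(state: UnitsState, input: MoveInput): UnitsState {
  const positions = { ...state.positions }
  for (const id of input.unitIds) {
    if (id in positions) positions[id] = { x: input.x, y: input.y }
  }
  return { positions }
}

function applyAll(start: UnitsState, inputs: MoveInput[]): UnitsState {
  let state = start
  for (const input of inputs) state = applyInput(state, input)
  return state
}

// Both clients start from the same state and receive the same inputs in the same order.
const initial: UnitsState = { positions: { a: { x: 0, y: 0 }, b: { x: 5, y: 5 } } }
const inputs: MoveInput[] = [
  { unitIds: ["a"], x: 10, y: 20 },
  { unitIds: ["a", "b"], x: 3, y: 4 },
]
const clientA = applyAll(initial, inputs)
const clientB = applyAll(initial, inputs)
// clientA and clientB hold identical positions without any state being sent.
```

Only the two small input messages cross the network here, however many units they move.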

This sounds like the sort of thing a computer would be really good at, but unfortunately the JavaScript specification leaves some things open to interpretation. This means that different implementations of the JavaScript runtime have slightly different behaviors. These slight differences can result in significant changes to the output.

When it comes to determinism, details count, especially when you’re dealing with a wide variety of devices and runtimes. If the result of a math operation on device A is 0.001 different from device B then the resulting direction and final game state can be massively different. Consider if that value is the angle that you’re flying in a game, and then you go on to fly 10^6 units forward. Your positions are now significantly different.
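To put a number on that example, here is a small worked calculation. It assumes the angle is measured in radians and that the flight is a straight line from the origin; `flyForward` is a made-up helper.

```typescript
// Straight-line flight from the origin at a given angle (radians, assumed).
function flyForward(angle: number, distance: number): { x: number; y: number } {
  return { x: Math.cos(angle) * distance, y: Math.sin(angle) * distance }
}

const distance = 1e6
const deviceA = flyForward(0.5, distance)
const deviceB = flyForward(0.5 + 0.001, distance) // same input, 0.001 of error in the angle

// Gap between where the two devices think the player ended up.
const gap = Math.hypot(deviceA.x - deviceB.x, deviceA.y - deviceB.y)
console.log(gap) // roughly 1000 units apart
```

A difference in the third decimal place of one value has become a gap of about a thousand units in the game world.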

Here at Rune we have this exact problem: JavaScript isn’t consistent across devices and runtimes, yet we need all of them to react to game state changes in exactly the same way. On the bright side, because JavaScript is so flexible, we’re able to patch the parts that aren’t deterministic.

So what needed patching?

Math Functions

The first place to start is all of those functions on the Math object, and the warning sign is right there on the MDN page:

Note: Many Math functions have a precision that's implementation-dependent.

This means that different browsers can give a different result. Even the same JavaScript engine on a different OS or architecture can give different results!

Outside of the constants, nearly all the functions need patching to ensure that everyone in a game gets the same result. This simply means that all the results need to be rounded to a common precision, namely single-precision floating-point format. Luckily there is a Math function that does exactly that: Math.fround.

The functions that needed patching for our use case are:

abs, acos, acosh, asin, asinh, atan, atan2, 
atanh, cbrt, ceil, clz32, cos, cosh, exp, expm1,
floor, hypot, log, log10, log1p, log2, max,
min, pow, round, sign, sin, sinh, sqrt, tan, tanh,
trunc
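A minimal sketch of that rounding, using `Math.fround` (which returns the nearest 32-bit single-precision float). The wrapper name `toSinglePrecision` and the `deterministicMath` object are assumptions for illustration, and this is not necessarily how Rune applies its patches.

```typescript
// Wrap a numeric function so its result is rounded to single precision.
function toSinglePrecision<A extends number[]>(fn: (...args: A) => number) {
  return (...args: A): number => Math.fround(fn(...args))
}

// A few wrapped functions; the same wrapper would apply to the rest of the list.
const deterministicMath = {
  sin: toSinglePrecision(Math.sin),
  cos: toSinglePrecision(Math.cos),
  sqrt: toSinglePrecision(Math.sqrt),
  atan2: toSinglePrecision(Math.atan2),
  pow: toSinglePrecision(Math.pow),
}

console.log(Math.sin(1)) // full double precision
console.log(deterministicMath.sin(1)) // rounded to the nearest 32-bit float
```

The sketch builds a separate object rather than overwriting `Math` itself, which keeps it easy to test in isolation.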

So, does this break anything? No, not really. Results are still accurate enough for games to function normally. There is one exception to all of this that we’ll see next.

Random Numbers

The second piece of the puzzle is random numbers. One of the things that most developers know is there’s no such thing as random numbers in computers, only pseudo random numbers. So you’d think it’d be perfect for determinism - unfortunately not.

The Math.random() specification in JavaScript is deliberately vague to allow runtime implementers the freedom to build appropriately. Worse than that, the specification doesn’t support passing in a seed, so there is no way to start the random number generation in a predictable manner.

First stop, we’ll patch Math.random() with a custom seeded random number generator function. There are many well-documented ones out there in the public domain. Brilliant, that gets us deterministic! I've used mulberry32 many times in the past so it was a pleasant surprise when I joined Rune to find us using the same algorithm.

function randomNumberGeneratorFromHash(hash: number) {
  return function () {
    let t = (hash += 0x6d2b79f5)
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

In Rune however we use the predict-rollback networking model, which requires us to be able to rollback time and reapply events to our game state. But what if we generated random numbers? A pure seed isn’t enough anymore because we want to generate the same random numbers we generated back when we first ran that time step of logic.

To solve this problem we have to keep track of the seed independently and allow for rolling back to previous seeds. With that approach we now have fully deterministic random numbers even with rollbacks! We use the hash of each named update loop and xmur3 to generate the seed:

function hashFromString(str: string) {
  for (var i = 0, h = 1779033703 ^ str.length; i < str.length; i++) {
    h = Math.imul(h ^ str.charCodeAt(i), 3432918353)
    h = (h << 13) | (h >>> 19)
  }
  const seed = () => {
    h = Math.imul(h ^ (h >>> 16), 2246822507)
    h = Math.imul(h ^ (h >>> 13), 3266489909)
    return (h ^= h >>> 16) >>> 0
  }
  return seed()
}
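To make the "roll back to a previous seed" idea concrete, here is one possible sketch. `RollbackRandom`, `snapshot` and `restore` are hypothetical names, and the class simply keeps the mulberry32 counter as a field so it can be saved and restored; Rune's actual bookkeeping may look quite different.

```typescript
// Hypothetical wrapper around mulberry32 whose internal state can be saved and restored.
class RollbackRandom {
  private state: number

  constructor(seed: number) {
    this.state = seed | 0
  }

  // Same arithmetic as mulberry32, with the counter kept as a 32-bit integer.
  next(): number {
    this.state = (this.state + 0x6d2b79f5) | 0
    let t = this.state
    t = Math.imul(t ^ (t >>> 15), t | 1)
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }

  // Capture the state before a time step so it can be restored on rollback.
  snapshot(): number {
    return this.state
  }

  restore(saved: number): void {
    this.state = saved
  }
}

const rng = new RollbackRandom(12345)
rng.next()
const saved = rng.snapshot() // taken before simulating a time step
const firstRun = [rng.next(), rng.next()]

rng.restore(saved) // rollback, then re-simulate the same step
const secondRun = [rng.next(), rng.next()]
// firstRun and secondRun contain the same numbers
```

Because the whole generator state is a single integer, storing one snapshot per time step is cheap.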

Sorting and Shuffling

A very common action in games programming is to sort an array, whether that’s for z-sorting in 2D or line of sight checking. What’s more, shuffling an array for game logic is often done using a combination of Math.random() and sorting. JavaScript of course has array sorting built in, but unfortunately the exact details of that array sort vary between implementations - especially in regard to strings.

Again MDN has a little hint:

Due to this implementation inconsistency, you are always advised to make your comparator well-formed by following the five constraints.

There’s a decent chance that a developer will accidentally use a comparator that doesn’t quite end up with the same result on different implementations. To remedy this we can patch the Array.prototype.sort function with a default comparator.

Reading around on the topic, this Stack Overflow answer had the implementation that seemed most appropriate for us, which ended up looking like this:

const defaultCmp = (x: any, y: any) => {
  // INFO: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort
  // ECMA specification: http://www.ecma-international.org/ecma-262/6.0/#sec-sortcompare

  // undefined values always sort to the end
  if (x === undefined && y === undefined) return 0
  if (x === undefined) return 1
  if (y === undefined) return -1

  // toString is a helper defined elsewhere (not shown here) that converts a
  // value to its string form; everything else is compared as strings
  const xString = toString(x)
  const yString = toString(y)

  if (xString < yString) return -1
  if (xString > yString) return 1
  return 0
}

For details on how this is applied see the next section.

How to Patch

Just a quick note on patching the functions above, in case you haven't done it before. JavaScript is flexible, so much so that you can override default functions of the core libraries themselves. Since functions are properties of objects they can (in most cases) be overridden using Monkey Patching. So in our example of sorting above we do:

// override the sorting operation on Array 
// to use our deterministic function
globalThis.Array.prototype.sort = arraySort

Where arraySort is a sorting function that makes use of a default comparator.
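A sketch of what such a sorting function might look like. This is an assumption for illustration rather than Rune's actual arraySort: it delegates to the native sort and falls back to a string-based default comparator when none is given. `sortDeterministically`, `defaultCompare` and `nativeSort` are hypothetical names.

```typescript
// Keep a reference to the native sort so the wrapper still works after patching.
const nativeSort = Array.prototype.sort

// Default comparator: undefined last, everything else compared as strings.
function defaultCompare(x: unknown, y: unknown): number {
  if (x === undefined && y === undefined) return 0
  if (x === undefined) return 1
  if (y === undefined) return -1
  const xString = String(x)
  const yString = String(y)
  if (xString < yString) return -1
  if (xString > yString) return 1
  return 0
}

// Sorts in place, like Array.prototype.sort, but always with an explicit comparator.
function sortDeterministically<T>(items: T[], compareFn?: (a: T, b: T) => number): T[] {
  return nativeSort.call(items, compareFn ?? defaultCompare) as T[]
}

console.log(sortDeterministically([10, 9, 1])) // [1, 10, 9] (string order, as with the built-in default)
console.log(sortDeterministically([10, 9, 1], (a, b) => a - b)) // [1, 9, 10]
```

A patched `Array.prototype.sort` could then be a thin method that forwards `this` and the comparator to a function like this one.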

Since we don't want to change the functionality outside of the game logic, we can also take a copy of the original function and reset it after our code has executed:

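const originalSort = globalThis.Array.prototype.sort

// runGameLogic is an illustrative wrapper name; the return below needs an enclosing function
function runGameLogic() {
  globalThis.Array.prototype.sort = arraySort
  try {
    return gameLogic()
  } finally {
    // always restore the original, even if the game logic throws
    globalThis.Array.prototype.sort = originalSort
  }
}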

Bonus Content!

The other thing we can do is help developers to recognize when their code might produce non-deterministic results. There are many possible causes of this, e.g. access to global scope and using locale related functions. In the JavaScript world ESLint is a common tool for applying a set of rules to code as the developer is working and at build time. At Rune we provide an ESLint plugin that encapsulates all the common errors we've seen so developers are warned right inside their IDEs when something might cause an issue.

As you can see, determinism in JavaScript isn’t straightforward but it is achievable. We now have games running in perfect synchronization.

If you have comments or want to learn more head over to the Discord.

· 12 min read
Kevin Glass

At Rune we’re building a place where developers can share their multiplayer web games with the world. This of course means we care a lot about game networking code and how it performs in the real internet at scale.

Networking in games isn’t easy. You’re dealing with a conflict between the speed of light, the distances between people playing on the internet and players’ expectations. Worse still, a player’s emotional state, i.e. whether they win or lose, can often be dependent on the networking model that’s been used and whether it seems “fair”.

There are several types of game networking models used in modern games. Normally when you’re planning out a network game you’ve got to consider:

  • Latency for the Player - or “how does it feel?”. Will players still feel attached to their characters or units? If there’s noticeable latency between controls and action, can it be explained away in some game fitting way?
  • Available Bandwidth and Battery - some network models can have high network usage and can be expensive on battery (as well as bandwidth). Can your expected player base absorb that?
  • Is it difficult to implement? - We often try to aim for the simplest solution, assuming that the differences between the average and optimal solutions won’t matter. That’s not the case in network code. Is a simple model good enough?
  • What happens when the network goes wrong? The internet isn’t consistent. Across all the different connections you’ll see a lot of variance in latency and constraints. More annoyingly, you may see a large variance in a single connection. A great connection may still have lag spikes at 5x the average latency. What happens in your network model when there is a lag spike?

That fourth one is really painful. It’s extremely difficult to test properly meaning you’ll most likely only find out there are issues once the code is out in the wild. Game networking is a lot like video conferencing - if there’s a problem your users will just know. They’ll detect even tiny corner cases and it’ll distract from their experience.

The internet has constraints that make it hard to predict what it’ll look like when you try to play a game:

  • Packet Size - We’re all very used to being able to download large files. Libraries often abstract developers from the underlying transport and allow them to send whatever they want. However, the actual packets that run across the network have a maximum size (the MTU, typically about 1500 bytes) so if you care about latency you need to be thinking in the packet model.
  • Physical Constraints - The internet is fast, but it’s not that fast. Let’s assume there was one fiber line around the world. A packet, even at light speed through fiber, would still take ~200ms to travel around the world. Now add in routers and switches on that path, and remember that most of the internet isn’t fiber. Don’t forget the last hop to the end user device. This tells us it’s important to think about regional servers (outside of any network model).
  • End User Devices - There are a great variety of devices on the internet. Mobile games (which Rune targets) can have a wide spectrum of device power and connection availability, especially in emerging markets. The game of course still needs to feel fair.
  • Congestion and Loss - The internet is a shared resource. Most of the time this doesn’t matter, since most routes and backbones are hugely undersubscribed. When congestion does become an issue the internet has one solution - drop low priority packets (that means yours).
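The ~200ms figure in the Physical Constraints point can be checked with some quick arithmetic. The refractive index of 1.5 is an assumed typical value for glass fiber, and the constant names are made up for the example.

```typescript
const EARTH_CIRCUMFERENCE_KM = 40075 // equatorial circumference
const LIGHT_SPEED_VACUUM_KM_S = 299792 // speed of light in a vacuum
const FIBER_REFRACTIVE_INDEX = 1.5 // assumed typical value for glass fiber

// Light travels more slowly through glass than through a vacuum.
const speedInFiberKmS = LIGHT_SPEED_VACUUM_KM_S / FIBER_REFRACTIVE_INDEX
const aroundTheWorldMs = (EARTH_CIRCUMFERENCE_KM / speedInFiberKmS) * 1000

console.log(aroundTheWorldMs) // about 200 ms, before any routers, switches or last hops
```

Halving the distance halves this floor, which is the case for regional servers in a nutshell.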

It’s pretty clear it’s a hard nut to crack but of course we do see many successful internet playable games. Clearly there are network models that work. Let’s look at a (non-exhaustive) list.

Turn Based

Some games don’t need real time network or for that matter low latency controls. If the game is either actually turn based - or can be treated as turn based - we can use a simple authoritative server and message passing.

Player 1 takes their turn. They send a message describing their actions to all the other players via the server (or peer to peer). Each receiving player (and the original player) plays out those actions in their game state. The next player then takes their turn and so on. If needed the server can also take a turn to move the computer controlled elements. Games like X-COM did this with great success. With additional rules like overwatch, the game still felt dynamic and fun but the networking didn’t need to have any pressure at all.

There are a few caveats here.

  1. The players must start from the same game state.
  2. The application of the actions must be deterministic, that is playing out the actions must result in the same state on all instances of the game logic.
  3. If a player joins mid-game you still need to be able to serialize the complete game state and pass it to them.
  4. If a player lags then all the other players get to wait too.

Huge pro on this one, it’s really easy to implement!
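To show how little is involved, here is a minimal sketch of the message-passing core. The board game, the `TurnMessage` shape and `applyTurn` are all invented for the example; the transport (server relay or peer to peer) is left out.

```typescript
// Hypothetical state and turn message for a simple board game.
interface BoardState {
  cells: (string | null)[]
  nextPlayer: string
}

interface TurnMessage {
  player: string
  cell: number
}

// Every client (and the server) runs this same deterministic function for each turn message.
function applyTurn(state: BoardState, turn: TurnMessage, players: string[]): BoardState {
  const inRange = turn.cell >= 0 && turn.cell < state.cells.length
  // Ignore moves made out of turn, out of range, or onto an occupied cell.
  if (turn.player !== state.nextPlayer || !inRange || state.cells[turn.cell] !== null) return state
  const cells = state.cells.slice()
  cells[turn.cell] = turn.player
  const nextPlayer = players[(players.indexOf(turn.player) + 1) % players.length]
  return { cells, nextPlayer }
}

const players = ["p1", "p2"]
let board: BoardState = { cells: new Array<string | null>(9).fill(null), nextPlayer: "p1" }
board = applyTurn(board, { player: "p1", cell: 4 }, players)
board = applyTurn(board, { player: "p1", cell: 0 }, players) // ignored: it is p2's turn
```

Serializing `board` as JSON is also enough to cover caveat 3, a player joining mid-game.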

Brute Force

Some of the early real time shooters out there did networking but they didn’t obsess over the details of making it “fair”. If your players can cope with accepting what they saw was wrong and they died anyway then you can skip trying to be clever.

The approach was simply to spam the network with packets describing where you said you were and what you said you were doing. The server would accept what you said as fact and try to resolve any interactions within a set of rules. e.g. if A says he shot B and they are close enough then he must have, tell B to take the damage and die.

Surprisingly the games were still fun and players weren’t screaming about it being unfair. They simply didn’t know anything better and just accepted lag deaths. In most cases the rules on the server would be good enough to make things seem ok and players enjoyed the games - until the advent of unofficial modding of clients.

As soon as you’re trusting clients you’re in a dangerous place where a nefarious player can modify their game to let them cheat by just keeping within the bounds of the server rules. Game makers went back and forth with the cheaters, trying to implement more and more complex rules to prevent the modified clients from working. Of course there were only a few devs and thousands of players so it was a losing battle.

Lock-Step

In lock step all the clients in the game are peers of each other. They all run their own copy of the game logic. Each frame the client sends the player’s current inputs to all the other clients. Clients only move the game forward once they’ve received the inputs from all players for any given frame - since it’s only then they know they have enough information to work out what needs to be shown.

This worked very well on local networks but became more painful on the internet. In the lock step model if one player is lagging - all of the players lag since they need to wait for a step from that player.
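The waiting behaviour described above can be sketched as follows. `LockStepClient` and its string inputs are illustrative assumptions; a real client would run its game logic where this one just records what it applied.

```typescript
// Inputs received so far, keyed by frame number and then by player id.
type FrameInputs = Map<number, Map<string, string>>

class LockStepClient {
  private received: FrameInputs = new Map()
  private playerIds: string[]
  frame = 0
  applied: string[] = []

  constructor(playerIds: string[]) {
    this.playerIds = playerIds
  }

  // Called when an input for a given frame arrives from any peer (or locally).
  receive(frame: number, playerId: string, input: string): void {
    if (!this.received.has(frame)) this.received.set(frame, new Map())
    this.received.get(frame)!.set(playerId, input)
    this.advance()
  }

  // Step forward only while the current frame has an input from every player.
  private advance(): void {
    for (;;) {
      const inputs = this.received.get(this.frame)
      if (!inputs || inputs.size < this.playerIds.length) return
      // Apply in a fixed player order so every peer simulates identically.
      for (const id of this.playerIds) {
        this.applied.push(`${this.frame}:${id}:${inputs.get(id)}`)
      }
      this.received.delete(this.frame)
      this.frame++
    }
  }
}
```

If one player's input for frame 0 is late, `frame` stays at 0 for everyone, even when later frames have already arrived. That is the shared lag in action.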

Again, since the clients were being trusted, games were open to cheating. This didn’t stop some games being very successful using the model. Traditional RTS games (like StarCraft) used the model along with some older shooters like Doom 1.

In the RTS/Click-To-Move games the lock-step model works particularly well, since players expect a certain delay between telling a unit to move or act and seeing it happen. This delay period can be used as a buffer for any lag between peers. Diablo 1 used this approach, and found out something really interesting about players: they didn’t hate the delay as long as it was the same delay every time. Diablo actively managed the delay between the player’s click and response by moving the delay window until it could remain constant.

Delayed Action

Another common network model is delayed action. In this model each player runs a copy of the game logic / state. When a player takes an action it’s sent to the server for scheduling. The server game logic is ticking along; when it receives an action from a player it schedules it in its game logic and notifies the clients (including the client it received it from). Each of the clients schedules the action at the time the server has determined. The client and server copies of the logic execute the action and then the results get played back.

It feels a bit like turn based, but the server time step is always moving forward. Two players can take actions at the same time and the server will schedule them deterministically on all clients.
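A rough sketch of the server side of that scheduling. `DelayedActionServer` and the three tick look-ahead are assumptions for the example; clients would keep the same list of scheduled actions and run each one when their own copy of the logic reaches its tick.

```typescript
interface ScheduledAction {
  playerId: string
  action: string
  tick: number // the tick at which every copy of the logic runs the action
}

const SCHEDULE_AHEAD_TICKS = 3 // assumed buffer so the message can reach clients in time

class DelayedActionServer {
  tick = 0
  scheduled: ScheduledAction[] = []

  // Called when an action arrives from a client; the result is sent to all clients.
  schedule(playerId: string, action: string): ScheduledAction {
    const entry = { playerId, action, tick: this.tick + SCHEDULE_AHEAD_TICKS }
    this.scheduled.push(entry)
    return entry
  }

  // Advance one tick and return the actions due on it, in arrival order.
  step(): ScheduledAction[] {
    this.tick++
    return this.scheduled.filter((entry) => entry.tick === this.tick)
  }
}
```

Two actions that arrive during the same tick get the same scheduled tick and a fixed order, so every client resolves them identically.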

Sounds great, but there are a few issues:

  • Client game states can’t move forward any faster than the server, so the client is waiting on a tick message from the server. If there's a lag spike clients feel it immediately. Clients generally run along behind the server for a few frames to give a buffer should a spike happen.
  • There’s a latency between when the player pressed the button to move the unit and it actually moving since the message has to get from client to server and back again before being executed. Since clients are generally running a bit behind this latency can be larger than just the network delay.
  • If the client gets a lag spike it might well get behind by a significant amount since it’s waiting on ticks from the server. A client will then get a whole series of packets in one go causing it to have to speed up. Games do this very subtly in the hope players don’t notice.
  • Game logic has to be deterministic such that when actions are applied on all clients they end up with the same state.

Predict-Rollback

Many modern games use the Predict-Rollback model which tries to balance low latency client controls with authoritative servers that prevent cheating.

In this model clients and a central server keep a copy of the game state. Each client is running ahead at its own rate trying to stay somewhere close to server time. When a player takes an action the client schedules it locally at a time based on its local measurement of the delay to the server.

It then sends the action to the server which schedules it at the same time the client decides - assuming that it's not in the server’s past. The server passes the action out to all clients with the scheduled time. Since we’ve accounted for server delay in the scheduling in many cases all the clients and the server apply the action at the same point in game logic and everything stays in sync.

The client is predicting what the outcome of its action will be and letting the player continue on the assumption it’s correct.

So where does the rollback happen? There are a couple of cases:

  1. If the time that the client wanted to schedule the action at was in the past for the server, then the action needs to be rescheduled at a different time in the server’s future. All clients will be told the new time (including the original sender).
  2. If the client that’s predicting the outcome of the local actions receives an action from another client that’s in its past, its game state prediction may be incorrect.
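The scheduling decision behind the first case can be sketched like this. The function names, treating a requested time equal to the current server time as already in the past, and rescheduling to `serverTime + 1` are all assumptions for illustration.

```typescript
interface TimedAction {
  playerId: string
  action: string
  time: number // game time at which the action should be applied
}

// Client side: schedule slightly in the future, based on the measured delay to the server.
function scheduleOnClient(
  playerId: string,
  action: string,
  localTime: number,
  delayToServer: number
): TimedAction {
  return { playerId, action, time: localTime + delayToServer }
}

// Server side: keep the client's time unless the server has already simulated it.
function scheduleOnServer(requested: TimedAction, serverTime: number): TimedAction {
  if (requested.time > serverTime) return requested
  // Too late: move the action into the server's future. Clients that predicted
  // the original time will need to roll back.
  return { ...requested, time: serverTime + 1 }
}
```

When the delay estimate is good the server returns the action unchanged and nobody has to roll back.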

When these cases happen the clients have to roll back their game state to a point before the incorrect or new action happened, then play the actions forward again until the current time - thereby getting to the correctly synchronized game state. The clients then continue from that point predicting the outcomes of player actions.

Some implementations have rollback functions that are “anti-actions” - in that they undo whatever the action would have done. Others keep a copy of the last known valid server state on the client by maintaining two copies of the game state.
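The second approach, keeping a confirmed copy alongside the predicted one, could be sketched as below. The tiny "value doubles every tick" simulation, `PredictingClient` and the other names are invented for the example, and a fuller version would also move the confirmed copy forward as the server confirms time.

```typescript
interface CounterState {
  time: number
  value: number
}

interface AddAction {
  time: number // game time at which the action applies
  amount: number
}

// Deterministic toy simulation: each tick the value doubles, then due actions are added.
function simulate(from: CounterState, actions: AddAction[], untilTime: number): CounterState {
  let state = { ...from }
  while (state.time < untilTime) {
    const time = state.time + 1
    let value = state.value * 2
    for (const a of actions) if (a.time === time) value += a.amount
    state = { time, value }
  }
  return state
}

class PredictingClient {
  private confirmed: CounterState // last state known to match the server
  private pending: AddAction[] = [] // actions after the confirmed time
  predicted: CounterState // what the player currently sees

  constructor(start: CounterState) {
    this.confirmed = { ...start }
    this.predicted = { ...start }
  }

  // The local clock moves forward: extend the prediction.
  advanceTo(time: number): void {
    this.predicted = simulate(this.predicted, this.pending, time)
  }

  // An action arrives via the server. If it is in our past the prediction is stale.
  receive(action: AddAction): void {
    this.pending.push(action)
    if (action.time <= this.predicted.time) {
      // Rollback: restart from the confirmed copy and replay up to the current time.
      this.predicted = simulate(this.confirmed, this.pending, this.predicted.time)
    }
  }
}
```

The replay inside `receive` is where the CPU spike mentioned below comes from: the further behind the late action is, the more ticks have to be re-simulated in a single frame.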

Predict-Rollback is great because for many cases the prediction is accurate and the player gets very responsive controls. It also supports an authoritative server without stopping clients that are doing the right thing from moving forward. Lag is limited to the player with the bad connection.

Tricky bits with this model:

  • It’s hard to implement. It’s a complex model that requires engineering to get right.
  • When there’s a conflict and a rollback occurs there’s a spike in CPU that needs to be managed carefully.
  • Rollbacks that cause significant change need to be managed by the game rendering - hiding issues and making things seem natural.

All that being said, the pros of quick input response time, low bandwidth and conflict resolution mean this is becoming more and more the standard approach to game networking.

Summary

As you can see there are different ways to make game networking work, and as you can tell from the games mentioned people have been trying to solve it for decades.

At Rune we implement predict-rollback. It’s complicated to implement, but the bright side is you don’t have to: it’s just part of the platform. If you want to learn more about how to implement a multiplayer game on Rune, check out our examples, tech demos and documentation - or join our Discord.
