3D Dice at Scale
dddice is a simple platform to roll 3D dice with friends. Well ... we say simple, but the infrastructure that powers over 60,000 rolls (and growing) has also grown in complexity. Our users depend on us for their game nights, and there is nothing worse than something going wrong during a fun night with friends.
We sincerely value our amazing community and in order to support the next 1,000,000 rolls and users, we have taken on some unseen but necessary backend work to ensure we are shipping consistently stable and performant software.
If you enjoy articles about "how the sausage is made", then read on!
What powers dddice?
Since our humble beginnings, dddice has been powered by a monolithic PHP backend using MySQL and Redis databases to serve up pages and WebSocket connections. We believe in choosing boring technology that is predictable and dependable.
All of our sites have been hosted on a single Virtual Machine (VM) living in the cloud. During our growth, we only upgraded this server once - from a 2 vCPU/4GB RAM VM to a beefier 4 vCPU/8GB RAM VM.
This VM, up until now, powered our main application, several background workers, cron jobs, this blog, our documentation site, an internal company handbook, a WebSocket server, and a chromedriver instance used to generate previews for custom dice themes.
It wasn't much, but it was honest work and it served us well.
While nothing was particularly wrong with our VM, it was clear that dddice was growing and so were our users' expectations. We wanted to ensure that every single update was ready to handle the traffic that a large gaming group brings. We did not just want to meet our users' expectations; we wanted to exceed them. A single VM was not going to make the cut for a global-ready platform designed for everyone to enjoy.
In order to ready ourselves for a global audience, we began the arduous process of analyzing our MySQL tables, dissecting every bit of our infrastructure, and making the necessary changes needed to ensure we stay performant and stable throughout our growth.
Picking the right horses
Our company handbook meticulously outlines how to architect our applications. When we began, we set a high standard to ensure everything we did was "portable". This meant that if we wanted to drop a hosting provider or a particular service and move to another, we could do so with little to no change to our codebase or deployment system. Because dddice is built by two people working part-time, we needed processes in place to keep changes sane to develop and deploy. Building a gaming service is no easy task, especially given the high standards that gamers often have.
We have since adopted a new philosophy: "picking the right horses for the race." This means we carefully analyze our past and future needs to pick infrastructure that makes sense for the next 1,000,000 users. Rather than writing messy configuration files for the sake of portability, we have decided to write the interfaces needed to interact with the services that give us the best performance and stability.
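To make the idea of "writing the necessary interfaces" concrete, here is a minimal sketch in Python. It is not our actual code (our backend is PHP), and every name in it (`PreviewStore`, `InMemoryPreviewStore`, `save_theme_preview`) is hypothetical. It only illustrates the pattern: application code talks to a small interface, and each service we pick gets its own adapter behind it.

```python
from typing import Protocol


class PreviewStore(Protocol):
    """Hypothetical interface: somewhere to keep rendered dice-theme previews."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryPreviewStore:
    """Stand-in adapter for this sketch; a real one would wrap a hosted service."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def get(self, key: str) -> bytes:
        return self._items[key]


def save_theme_preview(store: PreviewStore, theme_id: str, png: bytes) -> str:
    """Application code depends only on the interface, not on a provider."""
    key = f"previews/{theme_id}.png"
    store.put(key, png)
    return key
```

Swapping a service then means writing one new adapter rather than reworking configuration across the whole codebase.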
So what are the right horses?
Our new infrastructure uses Fly.io as our hosting platform, which allows us to deploy our application close to our users. Instead of a single VM hosted in Newark, NJ, we now have the luxury of deploying dddice anywhere our users are and serving pages and WebSocket connections quickly. We can scale to dozens, hundreds, or thousands of servers anywhere in the world, all with a single command.
We also moved our database from plain MySQL to Vitess, a MySQL-compatible clustering system, hosted with PlanetScale. Instead of dealing with the headaches that come with running MySQL read-write replicas ourselves, we now offload those tasks to someone we can trust. Our backend has been updated to support read-write replication, which makes page loads faster for our global users. Similar to Fly, we can deploy multi-region databases with a single command.
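For readers unfamiliar with read-write replication, the core idea is that writes go to a single primary while reads can be served by a nearby replica. The sketch below is a simplified illustration in Python, not our PHP implementation; the `ConnectionRouter` class and the connection names are assumptions made up for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ConnectionRouter:
    """Hypothetical router: picks which database connection a query should use."""

    primary: str
    replicas: list = field(default_factory=list)

    def pick(self, sql: str) -> str:
        # Look at the first keyword of the statement to classify it.
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        # Plain reads can go to any replica; everything else goes to the primary.
        if verb == "SELECT" and self.replicas:
            return random.choice(self.replicas)
        return self.primary


router = ConnectionRouter(primary="primary", replicas=["replica-a", "replica-b"])
read_target = router.pick("SELECT * FROM rolls WHERE room_id = 1")
write_target = router.pick("INSERT INTO rolls (room_id, total) VALUES (1, 17)")
```

A real router also has to handle cases this sketch ignores, such as reads inside a transaction or a read that must see a write made moments earlier; both of those generally need to go to the primary.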
All of our services and sites are reliably deployed using GitHub Actions, which has replaced our growing mess of Terraform templates.
These services give us the peace of mind that growth spikes are not going to bring us to a grinding halt.
Testing, Testing ... Is this thing on?
We pride ourselves on stability; however, dddice is considered to be alpha software, which means things can be expected to break from time to time. Breaking things is far from our intention, but we have been moving fast to meet user demand.
v0.6.0 is the first release to focus heavily on testing, which is giving us more confidence in our systems. We have begun to focus on three fundamental types of testing for maximum assurance: unit, integration, and end-to-end.
Our unit tests are starting to capture the small user-interface nuances and bugfixes we make to ensure these work correctly through updates. Our integration tests are providing clarity into our API and how it responds to good and bad inputs. End-to-end tests are ensuring everything works together in harmony. While end-to-end tests are sometimes slow, we are focusing heavily in this department to ensure our tests capture and assure the exact user experience.
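As a rough illustration of what "good and bad inputs" means for an API test, here is a small Python sketch. It does not reflect our real API: the `validate_roll_request` function, the payload shape, the status codes, and the list of allowed die sizes are all assumptions invented for this example.

```python
import unittest

# Assumption for this sketch: only the standard polyhedral dice are accepted.
ALLOWED_SIDES = {4, 6, 8, 10, 12, 20}


def validate_roll_request(payload):
    """Hypothetical handler: returns (status_code, body) for a roll request."""
    dice = payload.get("dice")
    if not isinstance(dice, list) or not dice:
        return 422, {"error": "dice must be a non-empty list"}
    for die in dice:
        sides = die.get("sides") if isinstance(die, dict) else None
        if not isinstance(sides, int) or sides not in ALLOWED_SIDES:
            return 422, {"error": "unsupported die"}
    return 201, {"count": len(dice)}


class RollRequestTest(unittest.TestCase):
    def test_accepts_good_input(self):
        status, body = validate_roll_request({"dice": [{"sides": 20}, {"sides": 6}]})
        self.assertEqual(status, 201)
        self.assertEqual(body["count"], 2)

    def test_rejects_bad_input(self):
        bad_payloads = [{}, {"dice": []}, {"dice": "d20"}, {"dice": [{"sides": 7}]}]
        for payload in bad_payloads:
            status, _ = validate_roll_request(payload)
            self.assertEqual(status, 422)
```

Saved to a file, tests like these can be run with `python -m unittest`. The point is that every rejected input is pinned down as explicitly as every accepted one.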
We hope that over the next few minor releases (v0.7.x and on) we will start to transition from alpha software to something much more predictably stable.
We have made significant improvements to our company culture, architecture, and infrastructure to best support the next cycle of growth. We continue to operate with a tiny-but-mighty team to build the best dice rolling service possible.
We are proud of the platform we built and the community we have fostered so far.
If you're interested in joining our growing community, join our Discord, follow us on Twitter, join the subreddit, and stay tuned for more updates.
We hope to dive deeper into some of our infrastructure in future posts.
Until then - happy rolling!