This is part 2 of my 3-part blog series on real world software development process. In part 1, I talked about the roles and responsibilities of software team members, and why it’s important to have them clearly defined. In this post, I will be talking about development environments and servers, which is also a very essential topic for everyone in a software team to understand.
Both the terms “environment” and “server” can mean different things in different contexts. Therefore, let me specify what I mean by “environment” and “server” in the context of software development process and this blog post series.
Let me first talk about “server” because it’s the easier of the two. In the context of this blog post series, a “server” pretty much holds its traditional meaning: something that serves or stores resources. For example, a web server serves up web resources, and a file server serves up files. One thing I would say is that the term “server” can refer to either software or hardware, and quite often, both. This can sometimes be confusing to new programmers, but with time they should be able to tell what people mean by “server” from the context. Really, it’s all about the context with software development! (Maybe I should write a blog post just on that!)
Now let’s discuss “environment”. When I say “environment” here, I roughly mean a collection of hardware and software resources working together as a whole in order to deploy and run working software. Another way to put it is that an “environment” is a self-sufficient container that provides a complete runtime for a piece of software. For example, a developer’s machine is considered a “local” or “development” environment and is usually fully equipped to develop and run working software. Otherwise, how would developers know that their work is done? Another great example is the “production” environment, which runs the software that actual users are using. Because of that, “production” environments are usually given “special” treatment compared to other environments.
Let me also quote Wikipedia here, which actually gives a pretty good definition on what an “environment” is, in my opinion:
In software deployment, an environment or tier is a computer system in which a computer program or software component is deployed and executed. In simple cases, such as developing and immediately executing a program on the same machine, there may be a single environment, but in industrial use the development environment (where changes are originally made) and production environment (what end users use) are separated, often with several stages in between, in order to allow phased deployment (rollout), testing, and rollback in case of problems.
So, hopefully at this point you have a rough idea of what I mean by “environment”. If not, keep reading. You will get it soon enough.
A typical software development process will include 4 to 6 test and deployment phases that happen in chronological order. These phases usually have a one-to-one correspondence with environments, which means 4 to 6 environments. For the sake of simplicity, I will condense the phases/environments into 4, which should be sufficient for most development needs. The environments are:
- Local/Dev
- Test/QA
- Staging
- Production
Software is deployed into these 4 environments in this order:
Local/Dev –> Test/QA –> Staging –> Production
Let me explain.
The local/dev environment is where the software’s code is written. It is usually a developer’s machine. It contains code editors and IDEs such as Visual Studio, and local servers such as web and database servers. Whenever possible and appropriate, try to make your local/dev environment match the production environment as closely as you can to reduce environmental discrepancies and, with them, unforeseen runtime errors.
A lot of times bugs are not caused by your code, but by environmental differences such as different versions of web or database servers on your local and production machines. These bugs won’t show up during development, compile or build time, but only at execution time, which makes them hard to detect before deployment. That’s why it’s a good idea to strive to set up your local environment to match that of production. Of course this is not always feasible due to licensing issues, cost, technological limitations, and such. In that case just do the best you can.
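To make the idea of environmental discrepancies concrete, here is a minimal Python sketch that compares two version manifests and reports where they differ. It is only an illustration: the function name `find_version_drift`, the component names, and the version numbers are all made up, and a real team would collect the versions from its actual machines.

```python
from typing import Dict, Optional, Tuple


def find_version_drift(
    local: Dict[str, str], production: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the components whose versions differ between two environments.

    A component missing from one side is reported with None for that side.
    """
    drift = {}
    for component in sorted(set(local) | set(production)):
        local_version = local.get(component)
        production_version = production.get(component)
        if local_version != production_version:
            drift[component] = (local_version, production_version)
    return drift


# Hypothetical manifests: names and versions are invented for illustration.
local_env = {"web_server": "10.0", "database": "13.0", "runtime": "4.8"}
production_env = {
    "web_server": "10.0",
    "database": "12.0",
    "runtime": "4.8",
    "cache": "6.2",
}

for name, (local_v, prod_v) in find_version_drift(local_env, production_env).items():
    print(f"{name}: local={local_v} production={prod_v}")
```

Running a check like this before deployment will not remove the differences, but it at least tells you which ones to keep in mind when a bug only shows up in production.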
The test/QA environment is where developers deploy working software for testing and QA purposes. It is the next deployment phase. This phase is typically the first time the QA team gets their hands on the new software, and they will start testing and reporting problems they encounter. There are several things to mention about this phase. Developers should only push working software that contains new features, implementations, and/or bug fixes to test/QA. By working software, I mean the software should be running without errors in the developer’s local/dev environment first. The reason for this should be straightforward. Without new features, implementations or bug fixes, there’s really nothing to test.
One other important thing: the process of pushing new software to test/QA should be extremely simple, preferably one step. This is because once software is deployed to QA, tweaking the software and fixing bugs is usually a very fast-paced, iterative process between the dev and QA teams. We want to make that process as frictionless and painless as possible. For example, being an ASP.NET web developer myself, once I get bug reports after deploying code to the test/QA environment in a sprint, I want to be able to fix those bugs quickly while things are still fresh in my mind, and push the fixes to test/QA easily so that the QA team can verify them. Visual Studio has a feature called “Web Publish”: once I set it up the first time, it literally takes one click of a button to push updates to our QA environment. A very, very convenient and powerful feature!
The Staging environment is where the last phase of testing happens before we deploy software into the wild. It’s the last chance for the team to catch any issues before the end users get their hands on it. Its purpose is to provide the software team with an environment that is identical, or very similar to, the production environment so that we have an idea of how our software will deploy and run when it’s deployed to production.
It’s possible that it’s not feasible to have a staging environment that is identical to production due to cost, technological constraints, and various other reasons. In that case, the software team should strive to do their best to imitate the production environment. The value proposition of a staging environment which is identical to production is that it allows you to catch all unforeseen issues that you would encounter when deploying to production before you actually deploy to production, because in this case deploying to staging is essentially the same as deploying to production. Some software teams will find this a worthwhile investment for their needs, others may not.
The Production environment is where we host the resources that power the software that our end users use. That’s the only environment that the end users touch and care about. For this reason, we should closely monitor its performance and keep it optimized as much as possible in order to give the end users the best user experience possible. It is very typical and desirable that production uses the best resources compared to other environments. For example, the production environment should have the most reliable backups and fail-overs, efficient caching mechanisms, more web farm nodes (if it’s web-based software), the best indexed and optimized databases, etc.
Another thing I’d like to mention about the production environment is – access. Who should have access to the production environment, and the power to affect it? There’s no fixed answer for this question, and it will differ from team to team. But I would say the general principle is: as few people as possible. Trust me…I learned this lesson the hard way. Limit access to production to only the people who absolutely need it, ideally one person.
Yes, I understand having only one person who has access to production may cause some inconveniences and bottlenecks at times, but I believe the benefits outweigh the problems. When you only have one person, or one point of contact, for production and something goes wrong, there’s only one person to talk to about changes made to production that could have caused the issue. One person == one source of truth. This alone can eliminate so much of the drama, unnecessary questions and finger pointing that arise when multiple people could have effected the changes in production that caused the issue.
If you work for a large software company where there are many teams and products, I certainly do NOT mean you should still have only one person that has access to the production environment for everything. That would be unrealistic. In that case I would recommend carefully considering a “divide-and-conquer” strategy, where each team chooses one person to be the production admin for one product.
Now that I’ve given you an idea about the 4 different environments, let’s talk about servers.
Servers are an absolutely essential part of the development process. They are also a big part of the environments we just discussed above. There are many kinds of servers, each serving a particular purpose. If you are new to software development, you may or may not be aware of them. In this section I will list several commonly used servers that exist in the different environments, and what they do.
- Project Management Server
- Build Server
- Source Control Server
- Web Server
- Database Server
- Cache Server
- File/FTP Server
- Email/Messaging Server
- External/3rd Party Servers
Project Management Server
First, let me clarify that I am loosely defining “project management server” here. Everybody probably has something slightly different in mind when they think about project management. What I mean here is that every development team needs a centralized place, a server, to manage their projects so that everyone on the team can use it to track progress and stay up-to-date. No new revelation here. Just common sense.
A project management server and its client tools allow the development team to manage many aspects of software projects, including requirements (PBIs/bugs/tasks/to-dos), sprint planning, reporting, etc. Often times the project management server is also integrated with other servers, such as build and source control servers. Your team WILL end up using some kind of project management tool, and sometimes it might be as primitive as pen and paper, and a whiteboard. Hopefully not for long if that’s the case :-). It’s really worth your time long term to invest some time initially to find a good project management tool for your team.
Since my company is a .NET shop, we drink the Microsoft Kool-Aid for the most part. We use TFS (Team Foundation Server) for project management, builds, source control, release management, and such. It does pretty much everything imaginable under the sun when it comes to application lifecycle management. Of course it integrates really well with Visual Studio. If your company is a .NET shop, I can’t think of a better tool than TFS.
Build Server
Put simply, the build server is a centralized server where committed source code gets built into deployable binaries and assets according to build configurations. A build server is also often called a Continuous Integration (CI) server. Usually every time a developer commits code into the central repository, it will trigger a build. (This automatic build behavior is the most typical build configuration that I’ve seen, but it certainly doesn’t have to be this way. The build server admin can configure it however they see fit.)
Every time a build happens, the build server usually assigns it a unique folder and a build ID number based on the timestamp to differentiate it from other builds. Often when a build fails, email notifications are sent out to the relevant team members who need to do something about the broken build. Nobody likes to get annoying emails about broken builds all the time, so broken builds usually get fixed pretty quickly, even if only to stop the annoying emails.
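The bookkeeping described above can be sketched in a few lines of Python. This is not how TFS or any particular build server does it internally; the ID format, the `/builds` drop folder, and the function names are assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import List


def make_build_id(project: str, started_at: datetime) -> str:
    """Derive a sortable build ID from the project name and start time."""
    return f"{project}_{started_at.strftime('%Y%m%d.%H%M%S')}"


def build_output_folder(drop_root: str, build_id: str) -> PurePosixPath:
    """Give every build its own folder so outputs never overwrite each other."""
    return PurePosixPath(drop_root) / build_id


def failure_notifications(build_id: str, succeeded: bool, committers: List[str]) -> List[str]:
    """Return one message per committer for a broken build, none for a good one."""
    if succeeded:
        return []
    return [f"To {person}: build {build_id} is broken, please take a look." for person in committers]


# Hypothetical build triggered by a commit; the names and paths are made up.
started = datetime(2015, 3, 14, 9, 26, 53, tzinfo=timezone.utc)
build_id = make_build_id("WebApp", started)
print(build_id)
print(build_output_folder("/builds", build_id))
for message in failure_notifications(build_id, succeeded=False, committers=["alice", "bob"]):
    print(message)
```

Note that a timestamp with one-second resolution is only unique if two builds of the same project never start in the same second; a real build server would typically also add a counter or revision number.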
Each dev should be able to build their code in their local environment in order to run and test the software. The team will use the centralized build server to manage team builds and possibly automated deployments. For a small to medium size software company, a single build server is usually sufficient.
Source Control Server
This one should be pretty straightforward – it’s a server that stores and versions source code. You may not have dealt with source control servers in school because you didn’t have to work on the same code base in a team environment. If that’s the case, that’s too bad because I really believe a software engineering program should introduce students to version control as early as possible. It’s such an essential and intrinsic part of software development, even for projects where there’s only one programmer working on it. I have to admit that I was guilty of this myself. Or, perhaps I should say my college CS department was guilty of this. I didn’t know much about how source control worked out of college.
Any real world software project requires you to use some kind of source control tool, whether you are working on the project yourself or with a team. If just yourself, you might find versioning your code useful in case you want to reference a previous version, or someone else later might pick up where you left off. If working with others, then it’s a no-brainer. There’s no way anybody on earth would want to share, distribute and merge code without using a source control tool. It’s just not possible. I’ve done projects with a team in the past where the source code is not managed using any source control tool. Boy! What a mess! Esp. when someone left the team and nobody knew where the source code he worked on was! Things like that kept me up for many nights because I had to keep track of the versions in my own head! Not fun. That’s one thing I won’t do again.
As an example, TFS supports 2 types of source control: TFVC (Team Foundation Version Control) and Git. Visual Studio integrates really well with both. We use TFVC for our projects and it works like a charm. I’ve never really had any major problems with it working in team projects. When the source control server is doing its job, it’s a beautiful thing to work with a team and pull in your teammates’ code with the click of a button and instantly see new features. It’s team work at its finest! But be warned that you need to “respect” a source control tool and use it like you are supposed to, otherwise it can bite you where you don’t expect and it can make your life miserable for a while!
Web Server
Since I’m a web developer, I work with web servers on a daily basis. In a nutshell, a web server serves up web sites, pages, web apps, and other resources such as images, text, JSON/XML data, etc. “Web server” can also be a loosely used term, but pretty much any time resources/data travel over HTTP, chances are they are served up by a web server somewhere. It would be hard to imagine a development environment without a web server these days, since HTTP is so ubiquitous. I’m sure you’ve seen or heard about Apache and IIS, which are 2 of the most popular web servers used today.
Note that in a QA, staging, or production environment, it’s very common to see web resources being served up by a load-balanced server cluster such as a web farm, the purpose of which is to handle more web requests at a time and improve overall user experience. To the end user the server cluster is acting as a single web server with a single IP address, but in reality many web servers are working together with a network load balancer in front of them.
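As a rough picture of what the load balancer in front of a web farm does, here is a small Python sketch of round-robin routing, one of the simplest strategies. Real load balancers support other strategies too (least connections, health checks, and so on), and the class name and node names here are purely hypothetical.

```python
from itertools import cycle
from typing import Sequence


class RoundRobinBalancer:
    """Hand each incoming request to the next web server node in turn."""

    def __init__(self, nodes: Sequence[str]) -> None:
        if not nodes:
            raise ValueError("at least one web server node is required")
        self._nodes = cycle(nodes)

    def route(self, request_path: str) -> str:
        """Pick the next node and describe which request it will handle."""
        node = next(self._nodes)
        return f"{node} handles {request_path}"


# Hypothetical three-node web farm; the node names are made up.
balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
for path in ["/", "/login", "/cart", "/checkout"]:
    print(balancer.route(path))
```

The caller only ever talks to `balancer`, which mirrors how end users see a single address while several web servers share the work behind it.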
As an ASP.NET web developer, I can run web apps against the web server (in my case: IIS or IIS Express) in my local environment. When the web apps are ready for testing or deployment, I can deploy them to QA, staging and production, each of which has its own web server.
Database Server
This is definitely another ubiquitous type of server: a server that stores and serves up data. The vast majority of apps we use today are data-driven, so you will definitely have at least one database server in your environment. I would say this is another topic that is underrated and under-taught by software engineering schools.
Today, there are 2 major types of data: structured and non-structured data. Structured data are usually relational data that can be queried using SQL (Structured Query Language), and are stored in RDBMS (Relational Database Management System) database servers such as Microsoft SQL Server, MySQL, Oracle, etc. Non-structured data are usually stored in NoSQL (Not Only SQL) database servers such as MongoDB, Redis, and CouchDB, which have gained tremendous popularity in the last decade due to certain advantages they offer over relational databases for some workloads. I’m not here to talk about which one is better than the other because each one has advantages over the other. It depends on your needs.
However, as a software developer, you need to have a good understanding of how a database works, or at least how it works in relation to the applications that you are building. Your application’s data and schema can affect the architecture and design of your code. Without a good understanding of your data, your software project is doomed to fail even before you start. Hopefully as a software developer you can be involved in the initial data model design process. If that’s not possible for you, which can happen due to the presence of a dedicated database team, then do your best to work with them to get a good handle on the data you are working with.
If your data-driven software has any amount of success, then the users are using your app often, which means they are interacting with the database server by generating and modifying data. This is, of course, a good thing because you want lots of users and lots of data in your system. But this frequent change of data can fragment the database’s indexes, which will slow it down over time. So in order for a large, busy database to run smoothly, it needs a lot of “babysitting”, so to speak: it constantly needs monitoring, index maintenance, and other types of upkeep. So usually a software company will have one or more dedicated team members who constantly maintain the databases and servers to make sure they are always running efficiently, especially in the production environment.
Another major factor when making decisions about database servers is cost. Usually database server licensing fees are very, very expensive. Because of that, developers may not have access to a full database server in their local environment. In that case they will just have to point their software to run against databases hosted in other environments. But QA, staging and production should have their dedicated database servers in order to create the totally isolated environments for testing.
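One common way to handle this is to keep the database settings for each environment in one place and pick the right set at startup. The Python sketch below assumes a hypothetical `APP_ENV` environment variable and made-up host and database names; note how the local entry simply points at the QA database server, as described above for developers without a local database server.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatabaseSettings:
    host: str
    database: str


# Hypothetical hosts and database names, one entry per environment.
SETTINGS_BY_ENVIRONMENT = {
    # No local database server: borrow the one hosted in the QA environment.
    "local": DatabaseSettings(host="qa-db.example.internal", database="app_dev"),
    "qa": DatabaseSettings(host="qa-db.example.internal", database="app_qa"),
    "staging": DatabaseSettings(host="staging-db.example.internal", database="app_staging"),
    "production": DatabaseSettings(host="prod-db.example.internal", database="app"),
}


def load_database_settings(environment: Optional[str] = None) -> DatabaseSettings:
    """Look up settings by name, falling back to APP_ENV and then to 'local'."""
    name = (environment or os.environ.get("APP_ENV", "local")).lower()
    try:
        return SETTINGS_BY_ENVIRONMENT[name]
    except KeyError:
        raise ValueError(f"unknown environment: {name!r}") from None


print(load_database_settings("staging"))
```

Because the application code only ever calls `load_database_settings()`, the same build can move from QA to staging to production without code changes; only the environment name differs.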
Cache Server
A cache server is a server dedicated to caching, and it can help dramatically improve software performance. Otherwise why pay the $$$ for another server?! A cache server is not as ubiquitous as a web or database server because caching can also be done in places other than a dedicated cache server. With that said, if you want your application to perform well across the board, then caching should be a major component in your software’s architecture and should be considered in every aspect of your app, instead of being an afterthought.
Often times the cache server can also be used as a short-term persistent store. This gives the benefit of a fast-performing temporary persistent store without having to make expensive database calls. Another positive way to look at a cache server is that it can serve as a “shield” for the database server. If needed data are found in the cache, then there’s no need to make an expensive call to the database. Getting data from the cache server is not nearly as expensive as getting it from the database, so the cost-savings might add up over time. All the above reasons may warrant the need for a dedicated cache server. Something to think about if your team doesn’t currently employ a dedicated cache server.
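The “shield” idea is often implemented with what is commonly called the cache-aside pattern: look in the cache first, and only go to the database on a miss. Here is a minimal Python sketch of it. An in-process dictionary stands in for the dedicated cache server, and `slow_lookup` stands in for an expensive database call; both are assumptions made to keep the example self-contained.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class CacheAsideStore:
    """Serve reads from the cache and fall back to the database on a miss."""

    def __init__(
        self,
        load_from_database: Callable[[Hashable], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is never touched
        value = self._load(key)  # cache miss or expired entry: one database call
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a cached entry, for example after the underlying row changes."""
        self._entries.pop(key, None)


database_calls = []


def slow_lookup(user_id):
    """Stand-in for an expensive database query (hypothetical)."""
    database_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


store = CacheAsideStore(slow_lookup)
store.get(42)
store.get(42)
print(len(database_calls))  # prints 1: the second read was served from the cache
```

The time-to-live is the main design knob: a longer value shields the database more, at the cost of serving data that may be slightly out of date.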
File/FTP Server
Many software applications require the transfer and management of files. Obviously every major OS has built-in file sharing. But here I’m talking about a dedicated file server whose sole purpose is to store and serve up files, instead of doing computational tasks. A dedicated file server can be a central location for storing documents, audio files, videos, images, etc., so that other servers don’t have to spend their resources on these I/O intensive tasks.
If you’ve been a software dev for a while, chances are you have crossed paths with an FTP server one way or another. When it comes to accessing files over the Internet, FTP (File Transfer Protocol) is probably one of the most commonly used protocols. Hence you will hear people talk a lot about using an FTP server for file sharing. Of course sharing files over the web using HTTP is another very common way.
Email/Messaging Server
In this context I’m not talking about your company’s corporate email server that your desktop email client (like Outlook) connects to. I’m talking about an email server that your software talks to in the background to generate emails. Most of the time these emails are generated based on some kind of business logic and pre-existing template. They are then automatically sent out to the relevant users at specified times to notify them of some kind of event.
As software users, you know exactly what I’m talking about. We’ve all seen plenty of these auto-generated emails by now and probably quite often feel overwhelmed or sick of them. But regardless, email notifications have become an essential and expected part of a great software experience. Your team will want a dedicated email server that you can program against.
In the case of an email server, you won’t need one in your local environment. As a matter of fact, your team will probably be OK with a single email server for all your environments, since its job is just to send out emails.
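Here is a small Python sketch of the template-driven emails described in this section, using only the standard library. The template text, the addresses, and the “order shipped” event are invented for illustration; the message is only built here, and actually sending it would need a reachable mail server.

```python
from email.message import EmailMessage
from string import Template

# Hypothetical pre-existing template for an "order shipped" event.
ORDER_SHIPPED_TEMPLATE = Template("Hi $name,\n\nYour order $order_id has shipped.\n")


def build_notification(sender: str, recipient: str, name: str, order_id: str) -> EmailMessage:
    """Fill in the template and wrap it in a ready-to-send email message."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Order {order_id} shipped"
    message.set_content(ORDER_SHIPPED_TEMPLATE.substitute(name=name, order_id=order_id))
    return message


message = build_notification(
    sender="no-reply@example.com",
    recipient="pat@example.com",
    name="Pat",
    order_id="A-1001",
)
print(message)

# Sending is left out on purpose. With a real mail server it could look like:
#     import smtplib
#     with smtplib.SMTP("mail.example.internal") as smtp:  # hypothetical host
#         smtp.send_message(message)
```

Keeping message construction separate from sending makes the business logic easy to test without an email server, which fits the point above that you don’t need one in your local environment.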
External/3rd Party Servers
Owning your own servers gives you the freedom to do whatever you want with them, but they certainly come with a price, and it is usually not cheap. You need to buy the hardware and pay for software licensing fees, and hire people to constantly maintain them. Before your team heads down this path, I hope many considerations have been made weighing it against 3rd party and cloud based solutions. It will be worth your time. We live in a day and age where we can have technology companies running their IT infrastructures entirely in the cloud, and it might even be cheaper and more favorable to do that!
Some good questions to ask yourself would be: What are the core values that our products bring to our users? What are we spending a lot of effort on that is not making a significant difference to our core values? Can we and should we outsource those things to somebody else, so that we can spend more time and energy focusing on our core values?
As an example, my company uses Amazon Web Services (AWS) for many things that we don’t want to bother with, such as email messaging with SES (Simple Email Service) and serving up images and files with S3 (Simple Storage Service). These external resources provide us with the infrastructure we need at an affordable price so that we don’t have to worry about maintaining it, and they help alleviate the demand on our core infrastructure. My point is that there are many cloud based solutions available similar to Amazon’s, and you should include them in your team’s decision-making process. Do your homework and see if there’s one out there that is a good fit for your team’s needs.
In this blog post I discussed what people mean when they talk about environments and servers in the context of software development. I hope you understand by now why this is an essential topic to grasp. If you are new to software development, chances are a lot of these concepts are foreign to you. They were to me when I first started at my job. I wish somebody had sat me down and gone over these fundamental concepts with me, because you can’t really find them in a textbook. Whether you are a newbie or a seasoned dev, I hope this blog post has given you a good overview of these concepts. I believe it will help you keep the big picture in mind when coding and remove a lot of mental barriers when working in a team environment.
Stay tuned for my next blog post in this 3-part series where I will talk about the development process step by step. I will list out every single step with lots of attention to detail and my personal take on it. I bet you won’t find this kind of information anywhere else. Thanks for reading!