In light of events during my most recent sprint, I’m going to break this post into two parts. The first will be a wonderful and meaningful story of an aspiring SRE and his triumph over the industry, and the second will cover my first couple of sprints. In case you haven’t guessed already, the last part of that first statement might not have been entirely accurate, and here’s why. Yesterday I committed my code for this week’s sprint and, in doing so, kicked off a chain of events that went on to completely brick our whole module. After discovering this and getting hit with a thorough reality check, I spent the next couple of hours figuring out what went wrong and how I can prevent it from happening again.
Part 1
It begins with an improper understanding of code committing “procedures”. I say procedures with the quotes because we don’t have any actual documented procedures for committing. If they did exist, the procedure would be something like the following.
- Complete current project
- Commit to local Git
- Run the Pester tests (if successful, go to step 4; otherwise... you get the idea)
- Sync to Master Repo.
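The four steps above can be sketched as a single PowerShell function. This is only a rough sketch, not our actual procedure: the function name `Invoke-CommitWorkflow`, the `./Tests` path, and the `origin`/`master` names are all assumptions for illustration, and it assumes the finished changes have already been staged with `git add`.

```powershell
# Hypothetical helper: the name, test path, remote, and branch are assumptions.
function Invoke-CommitWorkflow {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string]$Message,

        [string]$TestPath = './Tests',
        [string]$Remote   = 'origin',
        [string]$Branch   = 'master'
    )

    # Step 2: commit the already-staged work to the local repo.
    git commit -m $Message
    if ($LASTEXITCODE -ne 0) {
        throw 'git commit failed; nothing was synced.'
    }

    # Step 3: run the Pester tests and stop if any of them fail.
    $result = Invoke-Pester -Path $TestPath -PassThru
    if ($result.FailedCount -gt 0) {
        throw "$($result.FailedCount) Pester test(s) failed; not syncing."
    }

    # Step 4: sync to the shared repo only after the tests pass.
    git push $Remote $Branch
}
```

The point of wrapping it up like this is that the push in step 4 is unreachable unless the tests in step 3 pass, so the order can't be forgotten.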
Not only did I mess up this crazy extensive and difficult four-step procedure, I also managed to make half of the repo’s sub-directories disappear. “How the hell did you pull that off?”, you might be asking yourself. Well, it takes some serious skill and here’s how I did it.
First Mistake: Being unfamiliar with Git, I’ve been hesitant to work directly out of my local repo (the exact opposite of what I should have been doing). The whole point of the local repo is that your changes don’t touch the Master until you sync them. Instead, up until today, I’ve been working on scripts out of a separate directory on my local machine. After completing a script, I just run a quick command in PowerShell to move it to my Git directory and then commit to the local repo. I’m sure you’ve seen this command before…
Copy-Item -Path <path/file> -Destination <destination path> -Recurse -Container
I know, it’s like I’m basically David Blaine, right? Maybe not so much. It turns out that my first mistake was the -Recurse and -Container parameters. I used this command a while back at my last position to migrate end users’ local files to their share on our server. This command went on to be the primary source of my problem. If you are a Windows PowerShell user, pay close attention to the next paragraph. Either you will know what mistake I made before actually reading it, or this will be very valuable information for you if you’re not familiar with the Copy-Item cmdlet.
When I copied my script over, I mistakenly copied it to the root directory of the module and realized it the moment it finished. “Oh, no problem, I’ll just copy it to the correct directory.” Up arrow, update the destination path by adding the sub-directory, and then press Enter. ... OOPS. The file was then copied over to the correct sub-directory, I happily committed it to my local repo, and synced with the Master, like a good boy should. What I did not realize is that I had also managed to copy all of the other directories located in the root (all of my sub-directories) into the specific directory I was trying to copy my script to. Along with my script and all the sub-directories, I also copied the module’s data files, including the module file itself. I’m not a specialist yet, but I’m pretty sure this is what broke the functionality. Even the Pester module wouldn’t run at this point. If I attempted to load the module, it would begin running the last script that I had committed the day before. All I could think was, “Shit, what have I done?”
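For what it’s worth, here is a minimal sketch of a more defensive way to copy a single script into a repo folder. The function name `Copy-ScriptToRepo` and its parameters are made up for illustration; the only real pieces are `Test-Path`, `Copy-Item`, and the standard `-WhatIf` support that `SupportsShouldProcess` provides. It simply refuses to run unless the source is one file and the destination folder already exists, and it never uses -Recurse or -Container.

```powershell
# Hypothetical helper: the name and parameters are illustrative only.
function Copy-ScriptToRepo {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)]
        [string]$Path,

        [Parameter(Mandatory)]
        [string]$Destination
    )

    # Refuse anything that is not a single file, so a directory tree can't come along.
    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) {
        throw "'$Path' is not a single file; refusing to copy."
    }

    # Require the target folder to exist already, which catches a mistyped destination.
    if (-not (Test-Path -LiteralPath $Destination -PathType Container)) {
        throw "'$Destination' is not an existing directory."
    }

    # With -WhatIf, ShouldProcess prints what would happen and returns $false.
    if ($PSCmdlet.ShouldProcess($Destination, "Copy file '$Path'")) {
        Copy-Item -LiteralPath $Path -Destination $Destination
    }
}
```

Running it with `-WhatIf` first (for example `Copy-ScriptToRepo -Path <path/file> -Destination <destination path> -WhatIf`) previews the copy without touching anything, which is exactly the pause I skipped when I hit the up arrow.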
Lucky for me, my “Oh shit” moment was short-lived, as the wonderful folks who developed Git knew that someday, in the not so distant future, an ADHD-toting SRE would make a dumb mistake like this. Because of this, they created a version control platform which emerged to save the day. It only took me a few minutes to figure out what mistakes I had made. I corrected my errors, committed to my local repo, tested it to make sure the module was still healthy, and finally synced with “masta”. This was a hell of a learning experience for me and is my inspiration for what I’m going to attempt to do for every post.
What did you learn?
- The importance of a well-documented and practiced development, testing, and committing procedure. This was not, in whole or in part, my colleague’s fault for not having this documented. As I said before, he’s been running this role solo for 7 months and developed probably 95% of the module from scratch. Instead, the fault is 100% on me because of the next item.
- This might just be in my case, but DO NOT rely on your memory when you are training and a simple four-step procedure is mentioned to you. WRITE THAT SHIT DOWN.
- Version Control – Not only did this kind of save my butt by keeping the breakage from rolling out to prod and the rest of the enterprise, it also helped me solve the problem by detailing every single change that I made. I had no idea that there was this much functionality involved with it and with that, it’s time to move on to part 2.
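To give a feel for the “detailing every single change” part, here is a small sketch that lists the files touched by the most recent commit. The function name `Get-LastCommitFile` and its `-RepoPath` parameter are my own invention for illustration; underneath it is just the standard `git show` command, and it assumes `git` is on the PATH. It is not necessarily how I tracked down my own mess.

```powershell
# Hypothetical helper: lists the paths changed by the latest commit in a repo.
function Get-LastCommitFile {
    [CmdletBinding()]
    param(
        # Path to the local repo; defaults to the current directory.
        [string]$RepoPath = '.'
    )

    # --name-only prints one changed path per line; the empty --pretty format
    # suppresses the commit header, so only the paths (and blank lines) remain.
    git -C $RepoPath show --name-only --pretty=format: HEAD |
        Where-Object { $_ -ne '' }
}
```

If that list contains a pile of directories that have no business being there, `git revert HEAD` is one standard way to back the commit out without rewriting history.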
Part 2
My first sprint consisted of some more basic tasks. I updated a module function, committed a dashboard page that I threw together to Git, and “publicized it” by sharing it out to various support teams. Admittedly, I made this page for shits and giggles and was only planning on using it as a personal tool. However, my colleague saw it in passing, asked what it was, and said, “That is awesome, you should commit that ASAP so I can play around with it as well.” The page is a pretty simple HTML webpage that displays up to four of our monitoring sites on one page using iframes, as well as some quick links to tools, KBs, etc., and a search function for our internal wiki. This eventually led to what is currently a TBD future sprint, during which I will be building my first LAMP stack. Not to sound too nerdy, but I’m really excited about that.
My second sprint was a little more in depth but consisted mostly of writing functions for various scripts dating from when my colleague first started. For example, one of the scripts contained a function that obtained information about the server that a specific computer was connecting to and returned it as a string. To mitigate this, I created a return object in the function to hold the ComputerName, IP, ServerName, and connection test Result as string properties. This really was a great learning experience for me, as I was very unfamiliar with custom objects as well as configuring required parameters for a function. That said, it didn’t take very long for me to complete this task, but the knowledge obtained will very likely be used over and over.
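Here is a rough sketch of what a function like that can look like. It is not the actual function from our module: the name `Get-ServerConnectionInfo`, taking the server name as a parameter, treating IP as the server’s address, and the 'Success'/'Failed' values are all assumptions for illustration. Only the four property names come from the real thing.

```powershell
# Hypothetical reconstruction: the name, parameters, and lookup logic are assumptions.
function Get-ServerConnectionInfo {
    [CmdletBinding()]
    param(
        # Mandatory parameters: PowerShell prompts for them if they are left out.
        [Parameter(Mandatory)]
        [string]$ComputerName,

        [Parameter(Mandatory)]
        [string]$ServerName
    )

    # Resolve the server's first address; keep an empty string if the lookup fails.
    $ip = ''
    try {
        $ip = [string][System.Net.Dns]::GetHostAddresses($ServerName)[0].IPAddressToString
    } catch { }

    # Ping the server once; any error counts as a failed test.
    $reachable = $false
    try {
        $reachable = [bool](Test-Connection -ComputerName $ServerName -Count 1 -Quiet -ErrorAction Stop)
    } catch { }

    # Return one object with named string properties instead of one formatted string.
    [PSCustomObject]@{
        ComputerName = $ComputerName
        IP           = $ip
        ServerName   = $ServerName
        Result       = if ($reachable) { 'Success' } else { 'Failed' }
    }
}
```

Calling it as `Get-ServerConnectionInfo -ComputerName 'PC01' -ServerName 'localhost'` (made-up names) hands back a single object whose properties can be read individually, for example `.Result`.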
What did you learn?
- Objects, objects, objects – At my last job, the majority of the scripts I wrote prompted for input or required some form of intervention. As a result, I was able to get away with just storing strings in variables. What I learned is that a function written that way works but is pretty much useless to the rest of the module. Having a return object is what enables us to automate this script or have other scripts or functions utilize this one.
- Param declaration – Specifically declaring a “required” parameter. I’m still wrapping my head around this part. So far, it’s my understanding that doing this is an effective way to ensure that functions aren’t run incorrectly and subsequently break something or cause errors. It also enables us to pass the object of this function along when the function is called with its correct parameter.
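To make the objects-versus-strings point concrete, here is a tiny comparison. The sample data is completely made up and just stands in for whatever a real function would return.

```powershell
# Made-up sample data standing in for a function's output.
$results = @(
    [PSCustomObject]@{ ComputerName = 'PC01'; ServerName = 'SRV-A'; Result = 'Success' }
    [PSCustomObject]@{ ComputerName = 'PC02'; ServerName = 'SRV-B'; Result = 'Failed' }
)

# With objects, downstream code filters and selects by property name.
$failed = $results |
    Where-Object { $_.Result -eq 'Failed' } |
    Select-Object -ExpandProperty ComputerName

# With a string, the same question depends on fragile text parsing:
# this only works while the wording and word order never change.
$asText     = 'PC02 connects to SRV-B: Failed'
$parsedName = ($asText -split ' ')[0]
```

Both approaches find the failing computer here, but only the object version keeps working if someone rewords the message.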
Overall, I feel that my first two sprints went well. I got my feet wet, learned quite a bit, and got to have my first experience with releasing my own code into the wild. With that in mind, my next post will discuss my third sprint, you know, the one where I broke our entire module. However, this time I will be discussing and sharing the script that I was committing when I broke the module, the first script that was 100% designed and written by yours truly.