Lessons learned in the past 3 hours
Since last week, my pile of architecture mistakes has grown. Thanks, 2024.
Hey you, today I’m going to share some lessons I learned in the last 3 hours that may help you on your cloud architecture journey.
Usually, I design my projects with a lot of certainty, even if I have never used the tech before. And typically this strategy works, with some adaptations along the way.
Last year, I designed a project based on a premise that I didn’t validate. I just assumed that I would be able to use some shared infrastructure, in this case a Kong Gateway, and everything would be fine.
The advantages of using the shared Kong infrastructure were clear: I would not have infrastructure to maintain, I would delegate the configuration of the integration to the users, and I would only have to integrate my part of the project with something that already works. Delegate everything and focus on the value of my product. It was perfect.
Lesson one: don’t be too certain
Fast-forward to last week: the first part of the project was ready, so I reached out to the team that owns that Kong Gateway to see how to integrate with it, only to discover that I can’t deploy to it. It doesn’t have a testing environment, and they don’t offer a pipeline that can build custom Kong images with my plugins built in.
“From Rusticus… I learned to read carefully and not be satisfied with a rough understanding of the whole, and not to agree too quickly with those who have a lot to say about something.” ~ Marcus Aurelius, Meditations, 1.7.3
I don’t know if you know the book Daily Stoic, but the lesson of January 24 was like a slap in the face after I figured out that the integration would not be possible. It’s wonderful how sometimes the lesson matches what’s happening in my life.
The certainty of this premise was my first mistake. And now I’m starting to pay the price.
Lesson two: don’t be too certain - again!
Since this project is time-sensitive, I went into panic mode and took a day to redesign the project using AWS API Gateway. I just skimmed the docs and figured that a private API Gateway + NLB + ALB would work as a solution. Most of the code that was written for the Kong plugins could be reused in a Lambda Authorizer. To my relief, not everything would be thrown away. Most of the infrastructure would be serverless, so no maintenance needed.
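For anyone who has never seen one, here is a minimal sketch of what a token-based Lambda Authorizer for a REST API can look like. It is not my plugin code: the `EXPECTED_TOKEN` check, the `example-user` principal and the function names are placeholders I made up for illustration. The only parts that come from AWS are the `authorizationToken` and `methodArn` fields of the event and the shape of the returned policy.

```python
# Hypothetical sketch of a TOKEN-type Lambda authorizer for a REST API.
# EXPECTED_TOKEN and "example-user" are placeholders, not real validation.

EXPECTED_TOKEN = "allow-me"  # assumption: stands in for the real auth check


def build_policy(principal_id, effect, method_arn):
    """Build the response shape API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,  # "Allow" or "Deny"
                    "Resource": method_arn,
                }
            ],
        },
    }


def handler(event, context):
    """Entry point: allow the call only when the token matches."""
    token = event.get("authorizationToken", "")
    method_arn = event["methodArn"]
    effect = "Allow" if token == EXPECTED_TOKEN else "Deny"
    return build_policy("example-user", effect, method_arn)
```

The real validation logic would replace the token comparison, which is where the reused plugin code would go.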
But spoiler alert: I was wrong — again!
I spent the whole day developing this new architecture, but an hour ago something hit me: the way this thing is being built, it won’t scale. All I wanted from the start was to keep things simple, and with the way I was writing my Terraform, the simplicity was disappearing line by line.
This API Gateway was supposed to integrate with several resources, including more than one Application Load Balancer (ALB). Since I was going with the REST API mode of the Gateway because of the requirement that this should be a private resource, I needed to use a Network Load Balancer (NLB) to integrate with the ALB.
After I created the target group and listener for one of my ALBs, something felt off. How would I configure more than one ALB using the 443 listener? NLBs do not have listener rules like ALBs do… And then I was reading the NLB docs only to find this:
And I was like: fuck. I did it again.
My boss asked me before this redesign if it wouldn’t be easier to get our own Kong Gateway deployed, and I said no, because I didn’t want to have infrastructure to maintain. Going with the API Gateway seemed right since: serverless. He agreed with me, and I started the redesign.
However, as I realized an hour ago, the architecture wouldn’t scale. The Terraform configuration for other integrations would be a mess. My plan to have less infrastructure was going to hell because of the amount of boilerplate that I needed just to serve ONE LOAD BALANCER in the API Gateway.
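To put a number on that boilerplate, here is a rough back-of-the-envelope model, not my actual Terraform. It assumes, based on my reading of the situation, that each ALB needs its own target group, its own attachment, and its own NLB listener on a separate port, because an NLB listener has no rules to route between several ALBs. The resource kinds, the `plan_private_integration` name and the `base_port` value are labels I invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    kind: str  # hypothetical label, not a real Terraform resource type
    name: str


def plan_private_integration(alb_names, base_port=8443):
    """List the resources needed to expose each ALB behind one shared NLB.

    Assumption: one listener (and so one port) per ALB, since an NLB
    listener cannot route to different targets by rule.
    """
    resources = [Resource("nlb", "shared"), Resource("vpc_link", "shared")]
    ports = {}
    for offset, alb in enumerate(alb_names):
        port = base_port + offset
        ports[alb] = port
        resources += [
            Resource("alb_target_group", alb),
            Resource("target_group_attachment", alb),
            Resource("nlb_listener", f"{alb}:{port}"),
        ]
    return resources, ports


resources, ports = plan_private_integration(["orders", "billing"])
print(len(resources), ports)
```

Under these assumptions every extra ALB adds three more resources and one more port to keep track of, before even touching the API Gateway side of the integration.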
So I was wrong again, blinded by my panic and certainty that the API Gateway would work like my version of the Kong Gateway.
Lesson 3: Let’s try not to be wrong a third time
These past 3 hours of messing with Terraform and trying to figure out from the AWS docs how to proceed were a real challenge. The pros and cons of going with the private integration of the API Gateway are not clear. And since I was in “panic mode”, more errors were made.
This was my first time playing around with AWS API Gateway, and from the start it looked a bit off to me: too much configuration and too many linked resources to do a simple thing, which was to expose internal APIs with some authorization handled by the gateway. But this limitation on the ALBs that could be integrated was something that I wasn’t expecting.
It is so frustrating. However, on the bright side, it is good to be wrong. I’m used to being right, to not being challenged on my choices, and that was working so far. My boss was right in his advice, but I didn’t listen to him, and now I’ve lost a week of development on re-architecting these solutions and trying to build the infrastructure for them.
But, at least, not everything is lost. The Kong plugin code still exists; I just need to deploy our gateway and see what happens. I just don’t want to hope that this will be easy, because in these past weeks nothing has been easy for me.
Maybe I can write a second part of this adventure. I was planning to write an article about how awful the Kong docs are and how building Go plugins is even worse, since every tutorial around the web doesn’t go beyond the “Hello World”. But since I was dropping everything in favor of the AWS API Gateway, that plan was not going forward. Now, I may write it after all.
Half an hour ago I told my boss that I was wrong, again, and that we needed to get back to the original plan and to his suggestion. I’m fairly sure that my PO will want my head on a spike. But those are tomorrow’s problems.
Deep down, I’m happy. I know that this can sound like arrogance, but if I’m right all the time, things start to lose their shine. I had been hoping for a while to be challenged or wrong. I just wasn’t expecting to be wrong in a time-sensitive project.
Yesterday I was watching “The Playlist” series on Netflix, which is a dramatization of Spotify’s creation. In “The Coder” episode, the developer says that in the computer everything is clear. Only 0’s and 1’s and no ambiguity. That resonated with me, because this certainty in my opinions comes from this “perfection” of the duality in tech. But I forgot the primary lesson of everything in tech: it always depends.
But it is what it is. And if I have learned another lesson from the Daily Stoic, it is that I can only control how I react. And my reaction right now is sharing this story with you, to turn off my “panic mode”, put into words what happened, and think through my decisions.
Tomorrow is a new day. And the lesson for the next days is to think through everything, to not have more blind spots, and to not panic when something that I planned goes wrong.
But can we agree that we shouldn’t need a lot of resources just to access an ALB behind a REST API Gateway in private mode? C’mon AWS, your job is to make my life easier, not harder.
That’s all folks!