Here's the recording from our DevOps “Office Hours” session on 2020-02-26.
We hold public “Office Hours” every Wednesday at 11:30am PST to answer questions on all things DevOps/Terraform/Kubernetes/CICD related.
These “lunch & learn” style sessions are totally free and really just an opportunity to talk shop, ask questions and get answers.
Register here: cloudposse.com/office-hours
Basically, these sessions are an opportunity to get a free weekly consultation with Cloud Posse where you can literally “ask me anything” (AMA). Since we're all engineers, this also helps us better understand the challenges our users have so we can better focus on solving the real problems you have and address the problems/gaps in our tools.
Machine Generated Transcript
Let's get the show started.
Welcome to Office hours.
It's February 26, 2020.
My name is Eric Osterman and I'm going to be leading the conversation.
I'm the CEO and founder of Cloud Posse.
We are a DevOps accelerator.
We help startups own their infrastructure in record time by building it for you and then showing you the ropes.
For those of you new to the call the format is very informal.
My goal is to get your questions answered.
So feel free to unmute yourself at any time if you want to jump in and participate.
If you're tuning in from our podcast or YouTube channel, you can register for these live and interactive sessions by going to cloudposse.com/office-hours.
Again, that's cloudposse.com/office-hours.
We host these calls every week. We'll automatically post a video of this recording to the office hours channel, as well as follow up with an email.
So you can share it with your team.
If you want to share something in private that's OK.
Just ask and we'll temporarily suspend the recording.
With that said, let's kick it off.
So I have only two talking points for you today, one of which has been on the docket for the last few weeks.
We never got to it. And a new one, which is some practical tricks for change management, which Igor on the Slack team asked about this week.
So I thought I'd just jot some notes down.
This is a draft, it's not finalized, but these are just some of the things that we do when we work on our engagements with customers.
So with that said, I can turn it over to everyone who's attending.
Are there any questions you guys have related to Terraform, Kubernetes, Cloud Posse, DevOps, you name it?
Maybe me.
All right.
Go for it.
I mean, can I ask anything, actually, more or less?
I mean, let's try and keep it topical to, you know, what we do here.
Yeah, SweetOps.
But yeah OK.
Yeah Security right.
So actually, the reason I'm in SweetOps is because I was googling these questions and your website is quite well positioned.
Google's who I am.
I was looking for things like, I mean, whether there's any tool out there that can help with the automation of software deployment to devices. For example, let's say I have a set of devices out there, and then we have software, for example a couple of repositories.
And it's a bit too chaotic to keep all those devices in the same state. I always end up with devices running different software, because they go offline and then come online again.
What I have right now is an Ansible playbook running continuously in a loop, which is not very efficient.
I believe there must be tools that can make this easier, or maybe provide some kind of dashboard or graphical interface that can help me keep track of things.
And there are so many tools out there, and I have no idea what could be better than my Ansible loop.
That's a good question.
So just to recap what you said: basically, you're looking to learn what are some of the strategies for managing deployments at IoT scale across lots of devices.
And to date you've just been using Ansible in a continuous loop.
So IoT is not a specialty that we have at Cloud Posse.
So I can't really speak to that.
I do know that Amazon has some offerings specific to IoT, and I believe just recently they announced another offering related to IoT and deployments.
It was just this past week or something.
I think I'd have to look it up.
But maybe there are people on the channel here who have other insights. Anyone else here doing IoT deployments, or any armchair commentary?
I'll turn my video on.
I'm not doing IoT deployments right now.
But I did interview a company that had hundreds of thousands of devices doing energy monitoring.
They wrote everything from scratch and had all of their energy devices polling for updates.
So it's a very interesting system, because you get things that hadn't checked in for a while pop back up online.
And they'd be really old and have to get updates.
So they always had to maintain some update process from older stuff, like from their first initial stuff, to the latest. They ran into problems there. They built all of their code in Go.
So it was a single binary that got shipped.
I can't remember exactly their process for shipping it.
But yeah, maybe it's just because people have different...
There are a lot of different chips. You know, embedded systems are a very diverse ecosystem, kind of like Android.
It's something that every company seems to do a little bit differently.
But I would just check out the cloud offerings for their IoT edge gateway type stuff and see if there's any documentation, like getting-started tutorials, there.
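The polling model described above, where a device that has been offline for a long time checks in on an old version and has to be walked forward to the latest one, might be sketched roughly like this. It is only an illustration under stated assumptions, not that company's actual system: the version numbers, the UPGRADE_STEPS table, and the upgrade_path function are all hypothetical.

```python
# Hypothetical table: each released version maps to the next version a
# device must install. The versions are made up for illustration.
UPGRADE_STEPS = {
    "1.0": "1.1",
    "1.1": "2.0",
    "2.0": "2.1",
}
LATEST = "2.1"


def upgrade_path(current, steps=UPGRADE_STEPS, latest=LATEST):
    """Return the ordered list of versions a device must install to reach latest."""
    path = []
    seen = {current}
    while current != latest:
        if current not in steps:
            raise ValueError(f"no upgrade step defined from version {current!r}")
        current = steps[current]
        if current in seen:
            raise ValueError("upgrade table contains a cycle")
        seen.add(current)
        path.append(current)
    return path


if __name__ == "__main__":
    # A device that was offline for a long time checks in on the first release.
    print(upgrade_path("1.0"))
    # A device that is already current has nothing to install.
    print(upgrade_path("2.1"))
```

Keeping the steps in a table like this is one way to make sure the oldest device in the field still has a defined route to the newest release.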
Can I ask, what is your IoT system like? What are the devices doing, and how complex is the structure?
Yeah, thanks a lot.
Thanks for your answer.
So our system is very similar to what you described before.
So we have energy management and monitoring systems. We have devices receiving data from sensors, and all of this is sent to our platform hosted on AWS. And basically, what must happen is that we keep all these devices connected through different networking possibilities. It can be LTE, 4G, 3G, Ethernet, or Wi-Fi, because these devices are placed in factories, and sometimes these factories are in remote areas, in mountains.
So you know how the internet connection is. It's very likely that they only connect once in a while.
And this is why we need to look for a solution that is going to realize when those devices are online again and deploy. And sometimes the tool running in the loop is not fast enough to pick up those devices.
You also have to be worried about what happens if you break the device, right?
That's always the thing I'm so impressed about with things like, I run Sonos, for example, at home.
Imagine if they send out some update that breaks, you know, two million devices.
Oh my god.
Yeah, I broke a couple of devices, and I had to go there the next week with a replacement.
But thankfully that doesn't happen anymore.
So I'm happy about that.
Yeah, I know what you mean.
Right, you joined the SweetOps Slack team?
Yeah Yeah.
Yeah Oh.
Are you in the office hours channel.
I didn't know what your slack username was.
I guess I didn't work.
But make sure if you're in the office hours channel, you'll see the link that I just shared there, which was what I was thinking of at least.
But no experience with it.
OK Yeah.
I just saw the link.
Thanks for sharing.
Yeah Cool.
All right.
Any other questions from anyone.
Thank you.
I have a question that I posted in the Docker channel yesterday.
Let's see.
Yeah, it's basically that I see dockerd eating a whole lot of memory, especially when we don't run deploys very frequently, and we're using the version of Docker that is in Amazon's yum repo.
I don't know if that is different from the mainline version.
But yeah.
There seems to be some sort of memory leak.
We don't think it's coming from our services, because it's dockerd that ends up eating up all the memory.
So I don't know if anyone else has any experience with that.
Yeah, anybody seeing dockerd memory leaks lately?
Yeah, here is the message in the Slack team.
It seems to scale with the number of requests that they're receiving, to the Docker daemon?
So I mean, it seems to scale with the number of requests we see to our services.
Oh, god.
You got you guys I know.
That's why I think it's log related, because we log to stdout and pipe it elsewhere.
Yeah, and you don't see it recover?
So it could just be a buffering thing.
No, it just goes up and up and up until we restart the daemon.
So from first-hand experience, the up and up and up thing is sometimes up and up and up to a point, and then it will recover, like with Prometheus.
We thought there was a memory leak.
But it turned out that, in our case, it needed a minimum of 12 gigs of memory.
And we had set our limits to what we thought was still pretty high.
So interesting.
So maybe it seems like a leak, but the limit is just not high enough.
Yeah, I guess. It's just interesting, though: when we do redeploy, it drops down dramatically.
And that's when you redeploy the daemon itself?
Yeah, exactly.
No Yeah.
So it starts off.
Yeah, no first-hand account.
My gut just says that maybe it's probably not a leak, and just buffering.
OK, cool.
All right.
Any other questions.
Yeah, this is Hari, and we have a different kind of scenario.
We use common environment files between multiple deployments.
So we are trying to find the best solution to share that common environment file between multiple projects.
In fact.
Yes, certainly.
I mean, I'm sure there are a lot of ways you can solve this.
I don't know about your specific use cases.
But I mean, you're saying environment files. These are actually, like, files that have environment variables in them?
That's correct.
Yes And so.
So the good thing there is your applications themselves support environment variables.
Now we just want to maybe consider an alternative interface for passing those settings with environment variables.
Are you familiar with a tool called chamber, by Segment?
Oh, no.
This is new for me.
Yeah, so this is a great little tool to be aware of.
Are you on Amazon by any chance, AWS, or are you using a different cloud provider?
No, we don't use AWS.
OK, sorry.
So my advice here is.
OK, so this particular tool I was going to recommend doesn't really apply to what you're doing.
But the pattern translates to another tool I'm going to share in a second.
So just for everyone else's sake, I'm going to explain in 30 seconds what chamber is, if you're not familiar with it.
So chamber is a CLI tool.
It's written in Go, so you can download a single binary.
And what you do is, when you call chamber, you pass it the SSM service, or SSM namespace, for your environment variables, like production.
OK, sorry.
This is exec...
This is not chamber here. Here, where is chamber being used? chamber exec.
So you see, you call chamber exec and the service name and then your command.
And then it'll export all those environment variables from that service.
But you can add any number of services there, just list them separated by spaces, and it'll concatenate, or rather merge, those service namespaces and the environment variables therein into one, overwriting them in order.
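The merge-in-order behavior described above can be sketched in a few lines. This is not chamber's implementation, only an illustration of the pattern: the plain dictionaries stand in for service namespaces that would really live in a parameter store, and the names merge_namespaces and exec_with_env are hypothetical.

```python
import os
import subprocess
import sys


def merge_namespaces(namespaces):
    """Merge per-service settings into one dict; later services override earlier ones."""
    merged = {}
    for settings in namespaces:
        for key, value in settings.items():
            # Environment variable names are conventionally upper case.
            merged[key.upper()] = value
    return merged


def exec_with_env(namespaces, command):
    """Run command with the merged settings layered on top of the current environment."""
    env = dict(os.environ)
    env.update(merge_namespaces(namespaces))
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Stand-ins for two service namespaces (hypothetical values).
    shared = {"db_host": "db.internal", "log_level": "info"}
    app = {"log_level": "debug"}
    # The child process sees LOG_LEVEL from the last namespace listed.
    exec_with_env(
        [shared, app],
        [sys.executable, "-c", "import os; print(os.environ['LOG_LEVEL'])"],
    )
```

Listing a shared namespace first and an app-specific one second is what lets several apps reuse common settings while still overriding individual values.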
This will help you.
So basically, if you defined a service namespace for your apps you could then very easily share environment variables between them.
So you said that you're not using AWS.
Is this bare metal, or are you using a different cloud?
We use cloud.
OK, which cloud?
What provider? Oracle Cloud.
OK, and so yeah, that's definitely outside of our wheelhouse.
But if you're using HashiCorp...
Do you guys have HashiCorp Vault?
No, we haven't started using it.
Do you have Consul?
So, Consul by HashiCorp... Actually, what we are using is, we are mounting, we are creating an NFS file system on our nodes, and we're keeping these files inside the contents of the mount.
OK. And then we are using a shell command to source them, sourcing them as environment variables. Yeah.
So I mean, that's certainly a common way of doing it is having a shared file system like that.
But I mean, it makes that file system your central point of failure, and traditionally scaling the storage, or the I/O, or the availability on that could be trickier.
So Consul is used for service discovery, but also basically for sharing configuration or settings in a highly available manner.
I believe it uses the Raft protocol for consensus.
It's relatively easy to deploy.
It's very common in enterprise and other kinds of settings.
If you don't have Consul today, it sounds like that might be a gap, actually, in what you guys are running.
The reason why I bring up Consul is that if you use Consul together with a tool called envconsul, you can achieve the same outcome as you can with the chamber command I was telling you about.
So with envconsul, you can have these shared settings that are distributed in this highly available distributed key-value store, Consul.
And then expose those as environment variables to your commands.
So it's not exactly.
Maybe the answer you were looking for.
I mean, if you're going with the NFS route, I would just generally avoid NFS if you can, from a best practices perspective.
I guess I have a related question.
If we were to use chamber, how would you get those secrets into, like, running Terraform?
Oh, yeah.
So Well, you have two options.
That's a great question that you ask.
I will.
I'll give you two options for using chamber with Terraform.
And I'll show you kind of the progression that we've taken at Cloud Posse, because, you know, we also learn what works and what doesn't.
So for the longest time, what we were doing is just setting all those settings in chamber as SSM parameters.
So let's see, if we go back to chamber here.
So Terraform supports the use of environment variables for setting the values of Terraform variables.
But there's one annoying thing that Terraform does: Terraform requires that those environment variables look like... let me open a window here.
So if we were using Terraform, for example, you'd say something like export TF_VAR_foobar=123.
And now foobar will be available if I call terraform plan or something like that.
So when you're using chamber, you write:
chamber write TF_VAR_foobar 123.
Then you can now call chamber exec...
I did this slightly wrong.
It's going to be chamber write, then my namespace,
so let's say prod, and then TF_VAR_foobar, and then, I think, 123. I might get the syntax slightly wrong, but you get the gist.
And then chamber exec prod -- terraform plan. What this is going to do is fetch the variables from the prod service namespace and export them before I call terraform plan.
Now, one thing to be aware of with chamber is that it automatically normalizes the case of everything.
So actually, what happens is these things become TF_VAR_FOOBAR, which means that in your variables file in Terraform, where typically you would have something like variable "foobar",
and this is lower case...
Well, if you're using chamber, here's the thing that sucks:
you've got to call this upper-case FOOBAR. But there are two ways of looking at this.
One is that it makes it very clear that you expect this to be set as an environment variable, and upper-case environment variables are more or less a standardised convention.
So that's one way of looking at it.
But at Cloud Posse, we didn't like that too much.
So we wrote a small little utility called tfenv. tfenv is not a version manager. tfenv works like the env command, and it will re-export all of these environment variables in a way that is consumable by Terraform.
So I'm just presenting this as an alternative way.
So if you're using tfenv, you can use it together with chamber and then pass the environment.
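The re-export idea described above can be illustrated with a small sketch. This is not tfenv's actual implementation; it only shows the general mapping, assuming upper-case environment variables should become TF_VAR_ entries with lower-case names. The function name to_terraform_env is hypothetical.

```python
def to_terraform_env(environ, prefix="TF_VAR_"):
    """Return a copy of environ with a TF_VAR_<lowercase name> entry added for each variable."""
    result = dict(environ)
    for name, value in environ.items():
        if name.startswith(prefix):
            # Already prefixed (e.g. TF_VAR_FOOBAR): only normalize the case of the suffix.
            suffix = name[len(prefix):]
        else:
            suffix = name
        result[prefix + suffix.lower()] = value
    return result


if __name__ == "__main__":
    # Hypothetical input, shaped like what an upper-casing tool would export.
    sample = {"FOOBAR": "123", "TF_VAR_REGION": "us-west-2"}
    for key, value in sorted(to_terraform_env(sample).items()):
        print(f"{key}={value}")
```

With a mapping like this, the Terraform variable can stay declared as lower-case foobar even though the exported variable was upper case.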
All right.
I'm happy to go into more details on this.
This was a little bit hand wavy but the reason why I want to kind of skip over it is I want to point out that if you're using chamber and you're using Terraform there's almost no reason to use them together at the same time.
And let me explain.
So all chamber is doing is reading and writing settings to SSM. Well, Terraform supports SSM natively.
So if you use the Terraform data source,
the AWS SSM parameter data source from the AWS provider, you can just fetch those parameters directly from SSM.
So here's an example of reading a parameter named foo from SSM.
If you do this...
So basically, what we're describing is something similar to what could be achieved with remote state
in Terraform, where you're pulling the outputs from some other plan or some other process.
But the difference is, because it's in SSM, it's usable by all kinds of services, not just Terraform.
So of those two solutions presented.
Is there one you'd like to know more about, or was that sufficient detail?
I think that's a great starting off point.
Thank you.
We also... it's a little bit out of date,
but if you go to the Cloud Posse docs, or just search our docs,
we have kind of how we use chamber here.
It's more or less current, because the interface of chamber hasn't changed too much.
But we explain a lot more of this here.
All right.
Any other questions I can help out with from anyone.
I can ask something that I probably already have part of the answer to,
but some extra validation might be good.
So we just had a conversation on our team about secrets.
We've recently started to put secrets in Vault, related to a previous question here. And we recently deployed a whole bunch of stuff in Helm, got upset with Helm, and pulled them out, and we're now deploying our own stuff via Kustomize and then other stuff via Helm.
It seems to work in an OK fashion.
We've got Argo CD shipping those, which is pretty sweet.
Really like what we've seen from Argo so far.
Thanks for the help on previous questions; it really did get that to work.
But the way that different applications want secrets differs.
Some need them as environment variables, some need them as a mounted volume, and for others we've injected them some other way, not through either of those.
So we have three different paradigms for injecting secrets out of Vault.
Yes, there is.
And that's what I wanted to ask.
The group is sort of like, OK, in those three situations.
What are the bad things that I can do.
So I can write best practices. Like, for example, we just changed a bunch of our mounted secret volumes,
which I think is my current preferred version, to have them in memory instead of on disk.
I forget the weird incantation, where you put something called Memory in to get your mounted volumes to be in memory.
But is that better in memory, or worse in memory?
How else can I make it harder for people who have maybe broken in anyway, or found an exploit
that's just not published, and make it at least a little harder for them to get the crown jewels of our kingdom?
Any opinions anyone.
Andrew Roth on the call.
Yeah So I'll chime in.
Yes, I think first, to recap your question: you bring up a good point. We talk often about secrets management, and like there is kind of a canonical way of doing it.
But you're really ultimately at the mercy of the applications you use.
So if you're using some Helm chart, you're at the mercy of how that Helm chart was written and how it manages its secrets. Or if you have your own custom-built in-house applications,
you have pretty much every option available to you,
because you are in control of how they work.
Sometimes apps support environment variables.
Sometimes apps support configuration files.
So the reality of secrets management is that one size does not fit all when you're integrating lots of different software.
So I think that what we should think about is prioritizing what you're doing for your internal apps and then going down from there. The other consideration with all of this is local development. One can create some pretty robust solutions for managing secrets,
but it's also going to complicate how you're writing that code, or using that code, perhaps, for local development.
And when I say that, I'm pointing to certain native bindings to AWS, or to HashiCorp Vault, or to, like, ASM, Amazon Secrets Manager.
So if you're making your app natively talk to these, I think on the one side you're getting a better, more secure application.
But you're also vendor locking and making it harder to develop those things locally, versus using environment variables or perhaps a config file.
Yeah.
That's a very valid point.
When you go.
I just never go back.
So one thing that I've seen done is to give every developer access to, like, a development namespace with some dummy stuff.
So when you do a Helm deploy in a specific environment, you can pull secrets from dev in the same way on your minikube cluster that you would in any k8s cluster from your prod Vault namespace. It's going to be a lot of tooling to get that to work smoothly.
But that is what I guess our team is doing right now.
I don't know if it would scale to the whole engineering team.
And then for a lot of the other apps, we have a separate kind of branch, like a non-production branch, of how you run local development. Right now that's a Docker Compose file that has a file that gets put in the right place.
And then the app is happy with that.
But we've got a lot of fragmentation because things are done differently across different microservices, which is concerning to me.
So yeah, the.
And then the downside of, I think, putting it in environment variables is that if you inspect a container, you can get most of those, which is the downside.
So I mean environment variables are not the most secure.
They're just the most portable.
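One way to keep the portability of environment variables while still preferring a mounted file when one is available is sketched below. This is only an illustration of the trade-off being discussed, not anything either team described running: the read_secret helper, the default /run/secrets directory, and the naming scheme are all assumptions.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", environ=None):
    """Look up a secret from a mounted file first, then fall back to an environment variable."""
    environ = os.environ if environ is None else environ
    secret_file = Path(secrets_dir) / name.lower()
    if secret_file.is_file():
        # A file-based secret (for example on an in-memory volume) is not
        # listed among the container's environment variables.
        return secret_file.read_text().strip()
    if name.upper() in environ:
        # Portable fallback, e.g. for local development.
        return environ[name.upper()]
    raise KeyError(f"secret {name!r} not found in {secrets_dir} or the environment")
```

An application written this way can run locally with plain environment variables and pick up mounted secret files in the cluster without code changes.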
Yeah but which is you're always weighing pros and cons right.
And maybe.
So I guess let me rephrase that question.
So it sounds like you'd prefer environment variables,
just for portability.
And I've also been on calls with SecOps teams which bring up exactly the point you talk about. Like, environment variables are a best practice
under the 12-factor pattern.
But the 12-factor pattern is not necessarily a best practice for security.
So these can be at odds with each other.
So I suspect, since you're running a few more Kubernetes clusters than we are, that you've also thought about other SecOps pieces that we don't really have.
I mean, we send logs and metrics out, logs to Splunk and metrics to New Relic, to my dismay at the lack of open source, and we've got some metrics on various different things.
If weird stuff happens to customers.
But what are you doing to make it harder for somebody to break in and inspect a container?
Or do you have any sort of basic recommendations for somebody wanting to, in general, secure not just secrets
but their cluster in general?
I don't think I have the answer that you want to hear her second one.
But OK you fire away.
Taylor Yes go for it.
All right.
So there is Falco, which is a CNCF tool.
What was it called again, one more time?
It's called Falco.
Falco, yeah.
So that's for runtime. You run Sysdig or Falco or other stuff. We're not quite there yet.
Currently, we actually use Twistlock, which does the same thing at runtime.
So it's like the scanning of images.
Falco is just the open source version of achieving that.
So, refresh my memory, there are too many things that I can't remember: process tree analysis is something that I could trigger a Falco alert off of?
So you said your game server would tell me about that process.
How much time have you invested into deploying the open source tool?
Did you look at just giving Sysdig a pile of money to use their stuff, which gives you some of that for free?
What do you know?
Yeah, so we're trying to get away from that space and perhaps replace Twistlock with something very similar.
I've actually looked at a few of them, but there's not a lot of time invested, really, at this point, to actually make that investment.
I can Matthews take the post.
Watch out once they can deal.
It's a little hard to hear you.
Are you far from a microphone?
Yeah Yeah.
I keep thinking I can talk to the screen and I can hear much better.
Yeah Yeah.
Yeah Cool.
Yeah, so I can post a link to the conference talks,
in terms of how Falco
was used to actually manage and help to evade any intruders
in containers. So that's one thing.
But in terms of actual use, we haven't actually gotten around to that yet.
Yeah, it's been a while since I left.
If I go to a much larger company right now that accorded to stay for a while and we deployed that whole structure as OPEC.
I don't know if we ended up engaging with them.
But I loved working with that team.
There are really smart engineers there. It's similar to how HashiCorp and any other company is: they've got a bunch of their stuff open sourced, and then you hit this cliff.
And if you want the rest of the features you have to pay, which is totally fine.
But I haven't found any anybody that's using Falco in a purely open source fashion and being really happy with it.
But I would love to look into that. Something actually related to this:
Mahesh asked earlier today about something kind of related, which was about ways to lock things down.
Is there a way, without RBAC, to lock down
the ability to kubectl exec, other than maybe eliminating the shell altogether?
That's probably the right answer right there.
But even then, you can attach to a running process with kubectl attach or something or other and do something there.
I can't remember how exactly. Are you familiar with attach?
What does that... where does that attach you to?
To that process's console, the running process's console.
Let me not fail at answering this question and instead just find a Stack Overflow post.
But and then meet my own second focus that I just like to interact container.
So maybe you still need a shell to attach. Uncertain.
I will try at some point to remove bash and sh, or remove all the shells, and see if I can attach. You'd have a tough time running anything without a shell.
Yeah, you certainly reduce it.
Dale, do you know off the top of your head?
I do not.
Gotcha all right.
And the only two cents that I would add to this...
I mean, I think you have it right.
Also, the general thing in security is always to have different layers of security.
And I think one of the first layers is to eliminate the vast population of your company
being able to even interface directly with the clusters in the platform, and instead move towards Git-driven workflows for all of this stuff, where you have audit trails, you have an approval workflow, you can do linting, you can do policy enforcement, you can do all those things.
The second you want to try and create policies for humans,
you're in for a world of hurt, and the scope and complexity of doing this explodes rapidly.
So what I mean by that is, this is not really the answer you wanted, because, well, this doesn't solve it.
OK, so what do we do about our SREs, maybe, or other personnel
who need to access the clusters and do things? For those, there are different checks and balances.
All right.
Any other questions?
Eric, is anybody using helmfile among the participants?
We started using helmfile a lot.
Yeah, I mean, we use helmfile every day, all the time, all day long.
I think a lot of folks here now are using helmfile as well.
Is there some specific helmfile question you have?
Yeah, the same question, when we use helmfile:
is there a way that we can share
environment files between multiple deployments?
We haven't started using environment files,
but I heard
there is a possibility that we can share a common environment file.
Yes, you can have an environment file and then use that.
Basically, when you call helmfile, you specify the environment that you want to use with a flag on the command line, and then that environment file more or less becomes your custom schema, or your custom interface, to your Helm charts.
So the problem with Helm charts is that there is no standard schema. Every
Helm chart author has their own schema that they come up with.
And it's improving with Helm 3, right?
In that you can have, what is it,
JSON schema specifications to control it.
But it doesn't standardize it any more than we have today.
So when you get to defining your own environments in helmfile, you basically get to define your own interface.
Thank you.
Hey Eric, to kind of pull off of that:
we followed your code examples to set up helmfile a little bit.
And eventually it wasn't working,
and we ripped it all out and don't have any helmfile anymore.
Sorry. That's OK.
It could have been easily fixed, probably.
But the other reason that we ripped it all out was because we wanted to use Argo to deploy not only our own apps
but also other people's apps.
And so we have that all set up in Argo CD to do.
I don't know if I could describe that quickly or pull it up.
A series.
I don't know what's deployed.
So I don't have a demo environment that I can sort of like look at that maybe I could one second.
I mean, this is not terrifying.
One thing I'd say that we are moving away from, that we have been doing in helmfile...
So we're actively engaged in a project right now using helmfile as well.
But what we're doing is we're moving away from the copious use of environment variables, because of the complexity of managing configuration as code with environment variables.
So therefore, we're moving to using environments in helmfile. Basically, long story short:
when we started using helmfile, environments didn't exist. Environments came much later.
And now we're looking to leverage environments a lot more.
Yeah, which is sort of the answer to that previous question a little bit:
how do you get that basic helmfile to get deployed to, you know, staging, prod, et cetera.
Yeah Yeah.
I can't share my screen because you're sharing.
I guess.
But let me stop.
There we go.
It's time to share hopefully.
Nothing nothing's terrifying.
Can you see my screen.
Yeah Yeah.
So we.
This is what our Argo CD looks like.
You can sync, refresh.
I'm not going to click too much.
But you can see we've got a bunch of stuff being deployed for one of our own apps, which looks like this. It has various different processes running in it, like cert-manager and our metrics and these types of things. It's working as expected.
We've only put one of our main services into it, the node service, which is arguably less critical. Like, if this goes down, our users won't be really, really upset.
So we're waiting until we feel comfortable with all the setup before we put more of our important services behind it.
We'll get a bunch of angry users,
but, you know, they'll have customer support, and, you know, the world won't go on fire.
There won't be any New York Times news stories about it.
So figuring out how to make all this work with Argo CD is something that we have made a lot of progress on.
But this is a way where you would never use helmfile. And you would use helmfile to deploy a bunch of these things via Helm, instead of the way that Argo CD is deploying it, correct?
Yeah, we also might have different ambitions or different things we want to optimize for.
So often an individual company can make some trade-offs that we can't make.
An individual company can build a system for deploying these apps that is highly tailored to your own unique requirements of what you want to deploy.
And you know what those requirements are today.
So, sorry.
One of those requirements is that it's easy for us to maintain it, to grow our team, and to hire people who know how to maintain it.
And the more customized it is, the harder that becomes.
Yeah, one of the reasons why we're using Kubernetes itself, for example, is to make it easier to containerize our services in a way that can be maintained by people who we don't have to train from the ground up.
Because it turns out everybody else is doing Kubernetes too, for example.
Yeah, I'd love to be able to follow the same principles with the more nuanced aspects of our system.
But you're right.
We're getting into the weeds, like, OK, we still use this Splunk thing and Splunk only supports secrets one way.
So we have to do secrets their way in order to get this third-party vendor service running, for the Splunk forwarder, I should say.
Yeah, yeah.
I hear that all the time.
And I am sympathetic to it, and I don't want it to be this hard.
I don't want it to be this complicated.
I guess we're just still in the early phases of letting this stuff get figured out.
Plus the term best practices is so fleeting in this world, because the capabilities are changing faster than the practices can evolve.
So making generalized statements on how to do this is a quick way to end up eating one's words about six months from now, and we'll change everything 15 times before I wake up in the morning.
On the helmfile versus Argo CD thing:
I'm sure there is a way where you could use just Argo CD for your own internal applications and helmfile for just your external applications.
We're using a combination of Helm charts and Kustomize manifest files around our applications, in part because of the way Argo CD restricts you.
It's not actually doing a Helm deploy.
You can't even go back and manage it with Helm.
It's doing a helm template, which expands the chart, and then applying the manifest files.
But there are other features you get.
I won't click through to show those manifests, but you can see the diff between deploys of each manifest file in the UI.
Which is great for developers to sort of understand the different basic Kubernetes pieces.
Do you have an idea, from your experience, of where helmfile shines versus something like this?
Yeah let me let me show a little bit of both.
Also Alex was kind of asking to see an example of what I was talking about.
If we go here too.
Why does that.
Yeah, so I'm going to go back here to the Cloud Posse helmfiles and show those first, I guess.
Let me go here.
So, arguably, let's not even talk about Helm or helmfile yet, right now.
What are the characteristics of systems that have proven massively successful, almost viral, in their adoption?
Back in the day, we go back to Ruby, and then we had Ruby gems, and it's that ability to have this registry where you can easily download all these gems and get immediate benefit out of it.
Moving on to Python same thing.
And then moving on to Docker: the idea of containerization existed long before Docker came about.
What Docker did is they combined these concepts, made it bite-sized and easy to consume, and had a registry, so that it was highly reusable.
Let's step forward to Helm.
The reason why I'm also still a big proponent of Helm is that we still need a way to package the knowledge of how to distribute apps, apps which maybe were never designed to run on Kubernetes, in a way that we can run them on Kubernetes.
And I get all the negative critique of Helm.
But I don't see any better alternative that achieves the other things that I just talked about.
So we have Kustomize and these other things.
I still don't see a registry, so to speak, for distributing that knowledge outside of your organization.
And if that stuff isn't built from day one to live outside of your organization, what we're doing is we're building snowflakes, so to speak.
So I'm a huge fan of Helm and a proponent of it, especially with Helm 3 and the downfall of Tiller, which we have not ported over to yet.
But I specifically was looking at helmfile, like the use of deploying charts through it.
Yeah, because we don't have any intentions of moving away from all of our vendors that we pay money to maintain outright.
So we're going to use Helm, because that's the easiest way for them, and to handle upgrades and whatnot.
We're plugging it into Argo CD, which takes away some of the value of Helm.
But not any of the things that you just said.
But we're plugging into Argo CD in order to get a bunch of other value for our Kustomize.
With Helm plus Kustomize, you can glue those two together.
They're not exclusive.
But in that way it's using Helm as a template engine, not as a package manager, and I want to say that these are distinctly different things.
If I were using Helm strictly as a template engine, I'd be less excited about it.
But yeah, I mean, the idea is we would publish our Helm charts.
So that the world can use them, and then deploy them in our cluster our way, without Helm doing the deploy.
Yeah, you're totally fair to do.
You're totally allowed to do if you wanted to do that.
But here, let me just continue with helmfile.
And I brought up the example of Terraform, and why I love Terraform so much.
And I think it's been massively successful it's partially because of modules and how easy it is to distribute the knowledge of building infrastructure with modules.
Compare that to CloudFormation, which, in my mind, hasn't had the level of popularity of Terraform, possibly because the syntax is ugly as sin, but also because of reusability across organizations to get started.
They're doing a little bit more of this, I think, today with CDK, but I can't speak intelligently about that.
My point here is that with helmfiles we get much the same thing.
So what is the problem with Helm?
The problem with helm is every vendor out there gets to define their own schema for how to install and manage that application.
Also, sometimes there are additional things that we want to achieve that the Helm chart doesn't do.
So we need the escape hatch to be able to do that.
That's what we're getting with helmfile: basically this ability to share the knowledge of how to install the Elasticsearch exporter.
So all you kind of need to do is this.
And then we as Cloud Posse define our own schema for how to install this.
OK, I want to pause there for a moment.
And then say why I'm a little bit unhappy with what we have been doing with helmfiles.
So we were using environment variables to the extreme.
And I'll talk about my soul searching on that on a separate office hours.
But let's go and look at one of these guys like, hey, I am so busy and in here we.
This is kind of like the schema that we need to follow.
And that's in order to be able to install all the custom CRDs using the raw chart.
Then we come down here.
And so there's just this.
Don't get me wrong.
There's a lot of ugly here.
But we're still able to wrap all this support provided upstream by the Helm chart maintainers.
So then our interface has been environment variables.
But for the aforementioned reasons, I'm less jazzed about using environment variables to the extreme, to the extent we have.
And that's why what we're doing now is using the environment files instead.
And the environment files are designed to be more digestible by developers, or by the consumers, the users of helmfiles.
I'm just going to try and pull up an example here, if I can, quickly.
I'm in another window here, so if you don't see what I'm doing, just wait a second; this relates to your question, Alex, on what this looks like.
Sorry I'm still logging into this system.
Well, I love it.
Objects on phones.
Yeah, this provisioner.
So this is what it ultimately looks like now when you're using environments.
So here's the EFS provisioner; here's the helmfile.
Here's what that looks like.
So they've defined their schema.
This is what the upstream Helm chart maintainers want.
This would be a lot to ask of an ordinary developer to figure out.
But by using environments we reduce this.
So this is the configuration file format that we follow that we use.
So this is all that they need to set up and configure.
And we can add or subtract from this, we can define defaults.
So what we've defined is our default environment.
So these are all the settings that we have by default.
And then these are what we override per other environment.
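The default-plus-override layering described here can be sketched in a few lines. This is an illustration in Python, not helmfile's own YAML format, and every name in it (the `efs_provisioner` key, `replicas`, `storage_class`, the `staging` and `prod` environments) is hypothetical rather than taken from the Cloud Posse helmfiles. The recursive merge is an assumed simplification of how layered values behave, not a description of helmfile internals.

```python
from copy import deepcopy

# Hypothetical default settings shared by every environment.
DEFAULTS = {
    "efs_provisioner": {"installed": True, "replicas": 1, "storage_class": "efs"},
}

# Hypothetical per-environment overrides; only the differences are listed.
OVERRIDES = {
    "staging": {},
    "prod": {"efs_provisioner": {"replicas": 2}},
}


def deep_merge(base, override):
    """Return a new dict in which values from override win over values from base."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the default.
            merged[key] = deepcopy(value)
    return merged


def settings_for(environment):
    """Resolve the effective settings for one named environment."""
    return deep_merge(DEFAULTS, OVERRIDES[environment])


if __name__ == "__main__":
    for name in OVERRIDES:
        print(name, settings_for(name))
```

The point of the design is that an environment file only has to state what differs from the defaults, so the surface a developer has to read stays small.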
So this is what I mean.
I like the interface that we're then ultimately able to expose with helmfiles, and you might be able to see the parallels between Terraform and helmfile when you architect it this way.
So basically, our helmfiles become modules for Helm, and our environments are like the variables that you pass in tfvars files in Terraform, but we're doing it in YAML files with helmfile.
Is that more clear now, Adam?
Yeah, right.
It's helping me sort of understand more about how you guys are doing it.
I understand how it is.
But it would be difficult for me to do a direct apples-to-apples, or even apples-to-oranges, comparison, because we also have three clusters and an application deployed to each of the three clusters with different values.
As it stands right now, when we update those, they get pushed.
So I think I would need something like a Jenkins job to monitor for a helmfile change.
And then automatically deploy it.
I would like to control that aspect of things a little bit.
But yeah.
So in this case here.
OK, so I hear that.
OK, so in this case here, we inlined the helmfile.
But this could have been a remote file.
The reason why we're inlining it here is because we're in transition, moving to this format from our upstream ones that use environment variables.
So what happens here is that we treat this as a monorepo, and then when this repository changes we're able to apply those changes systematically.
And since the configurations are per environment: you said you deploy to multiple clusters.
Well, so do we; we deploy multiple times to multiple clusters.
That's not a problem.
It's just more environments that you would be defining here, right?
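A rough sketch of "more clusters is just more environments": the same helmfile gets applied once per environment. The environment names and the `helmfile.yaml` path are hypothetical, and the sketch assumes the mapping from an environment to a particular cluster is configured elsewhere (for example in the helmfile or the kube context), which the conversation does not spell out. It only builds and prints the commands; it does not run them.

```python
import shlex

# Hypothetical environment names, one per target cluster.
ENVIRONMENTS = ["dev", "staging", "prod"]


def helmfile_command(environment, helmfile_path="helmfile.yaml"):
    """Build the argv for applying one helmfile against one named environment."""
    return [
        "helmfile",
        "--file", helmfile_path,
        "--environment", environment,
        "apply",
    ]


def plan(environments):
    """Return one helmfile invocation per environment, in order."""
    return [helmfile_command(env) for env in environments]


if __name__ == "__main__":
    # Print the commands instead of executing them, so this is safe to run anywhere.
    for argv in plan(ENVIRONMENTS):
        print(shlex.join(argv))
```

A CI job (the Jenkins job mentioned earlier, say) could loop over the same list when the repository changes.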
I think so.
I meant to say that these are good, useful functionalities in both paradigms.
The Kustomize piece for us.
Let me think.
So when using some Helm charts, we wanted to patch things, and you can't do that easily.
And with Kustomize, the idea is that you can.
Which is one of its selling points.
And so it's easier for me to compare Kustomize versus Helm, or Kustomize wrapped around Helm, because I know a lot about those.
But the wrapper around Helm using helmfile, I can't speak to as strongly.
So I was trying to get a little bit more of the specifics on that.
Yeah, we'll see.
Good Yeah.
No there are.
You're right that there are certain types of things we cannot do with this strategy like if we need to change the actual structure of what was generated by the helm chart.
Yeah, no go.
We can maybe rely on some other third party things that might inject things in there like an Etl to inject the pods.
But by and large, we haven't needed to do that as much as we've needed to add additional resources.
Oftentimes, we need to change the way ingress works.
Well, most of the time, the authors allow a way to disable that built-in ingress, so what we always do is disable the built-in ingress, and then we use things like the raw chart or our monochart to add all those additional resources we need.
So here's a perfect example, where we're deploying cert-manager.
And cert-manager doesn't do everything we want it to do.
We want to install the CRDs, for example.
So here what we have is we're using the Kubernetes raw chart to deploy each of these CRDs, just defined as inline values.
I remember asking myself, why does the Cloud Posse chart not need to install CRDs separately?
And you just answered that question even though I didn't ask it, which is great.
I asked that question like a month ago.
We're not using your chart right now.
We have a separate manifest file that gets deployed after that chart gets deployed, to deploy them.
But looking at this, I would love to switch over to this.
Well, there's also that Helm 2 has limitations that Helm 3 doesn't, and managing CRDs in Helm 2 versus Helm 3 is different.
So I don't believe we've adapted this strategy to embrace the capabilities of Helm 3, and maybe cert-manager as well has updated the later releases to support the CRD hooks, or whatever they're called, in Helm 3.
They didn't when I was looking but that was now a month ago.
So maybe they did.
Yeah Yeah.
My example that came to mind, where I was saying that we wanted to patch something, was actually Atlantis, which I know you guys have on ECS.
Yeah, like it.
We didn't like running Atlantis in Kubernetes, because then you have a god pod that somebody can exec into to do anything.
Yeah, it's pretty terrifying.
Hence my security questions that I prompted with at the beginning of this.
Yeah Hands off truckers.
Yeah, as much as I love Kubernetes, just because you can doesn't mean you should.
And that's kind of why we went the Fargate route.
Our first V1 of Atlantis was doing it in Kubernetes.
And we were successful with that almost immediately.
But then I realized, dude, this is scary as all hell.
There's a pod sitting there which has god mode in that account, at least, or whatever roles you give that pod to do things.
And the fact that Kubernetes out of the box lets you exec into that, that wasn't sitting well.
We put it in a dedicated tools cluster, which has different things like Argo CD and Vault.
It's just a cluster that basically doesn't get used by other teams and is properly segmented.
In some ways, that one tools cluster does have god mode over the rest of our clusters.
Yeah, so it's a great command and control plane.
If you want to get into that cluster.
What's the host, by the way.
Hold on.
Let me guess.
But it's behind a VPN, et cetera, so there are those aspects.
The previous setup we had was just running on a standalone EC2 instance spun up with Terraform, and had a maybe not super polished playbook.
But so this seems like an upgrade in many cases: it's much more HA, and we've added a bunch of metrics.
And in the process we got all the other benefits at the same time.
Right, but as an extra plus, I started to worry more about the security of our clusters.
So we're also improving our security by being scared out of your mind.
Plus you're streaming all the output from Terraform into your log.
So, you know, your RDS passwords and everything are being shared that way.
And we haven't put any passwords into Terraform, which is a pain in the butt.
So because it means that there's a few pieces of structure that are not mean came by in order to get those passwords.
There's a few ways you can mask them and not get them streamed to logs, and figure that out.
But yeah, we went down the masking route.
But it's like whack-a-mole.
Again, there's a lot of things you can't mask out of the box.
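To make the whack-a-mole point concrete, here is a minimal sketch of the masking approach: replace known secret values in each log line before it is stored. The function name, the placeholder, and the sample secret are all hypothetical, and this is plain string replacement, not how Atlantis or Terraform mask output. The demo also shows the weakness being described: a secret that shows up in a different encoding slips straight through.

```python
import base64


def mask_secrets(line, secrets, placeholder="***"):
    """Replace every known secret value that appears verbatim in a log line."""
    # Mask longer secrets first so a secret containing another is hidden whole.
    for secret in sorted(secrets, key=len, reverse=True):
        if secret:  # skip empty strings, which would match everywhere
            line = line.replace(secret, placeholder)
    return line


if __name__ == "__main__":
    known = ["hunter2"]  # hypothetical secret value

    # A verbatim occurrence is masked.
    print(mask_secrets('password = "hunter2"', known))

    # The same secret, base64-encoded, is not recognized and leaks.
    encoded = base64.b64encode(b"hunter2").decode()
    print(mask_secrets("encoded = " + encoded, known))
```

Masking only catches values you already know about, in the exact form you know them, which is why keeping secrets out of the Terraform output in the first place tends to be the more robust option.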
Yeah, you can pull it out of, like, KMS or Vault to do some stuff, and try not to have Terraform handle it, and do it after the fact, basically.
Yeah, yeah, thanks.
This conversation was helpful for me to feel like.
Sure Yeah, sure, sure.
We didn't get to cover everything today.
But this was great.
Good conversation, guys.
We reached the end of the hour here.
So I got to wrap things up.
Thank you for joining us today.
Here are some things to check out if you're new to Cloud Posse, our SweetOps community, and what we do.
Make sure to sign up for our Slack team if you haven't already.
If you're tuning in from like a podcast or other media make sure you register for office hours.
So you can attend these in real time.
Go to cloudposse.com/office-hours.
We syndicate these as a podcast.
So if you'd like to consume that way.
Go ahead to cloudposse.com/podcast.
You can subscribe with whatever you use.
Anyways, a recording of this is going to be posted to the office hours channel in a little bit automatically, as well as posted to all our social media.
Thank you guys.
Talk to you next time.
Yeah, thank you.
Thank you, everyone.
Yeah, thanks.
Thanks Thank you.