Cloud Crunch

Episode · 2 years ago

S1E14: Best Practices for Cloud Optimization

ABOUT THIS EPISODE

What are the top 5 challenges we see with clients when it comes to cloud optimization? We talk about the future of cloud optimization and key best practices to achieve and maintain optimization.

Involve, solve, evolve. Welcome to Cloud Crunch, the podcast for any large enterprise planning on moving to, or in the midst of moving to, the cloud, hosted by the cloud computing experts from Second Watch: Ian Willoughby, Chief Architect of Cloud Solutions, and Skip Berry, Executive Director of Cloud Enablement. And now, here are your hosts of Cloud Crunch.

Welcome back, everybody, to Cloud Crunch for another week's episode with my cohost, Skip Berry. We have a special guest today, one of our coworkers, Willie Sennott. He is passionate about all things data and beer, but not necessarily in that order. He's been a colleague of ours for over five years. He is in charge of our optimization practice here at Second Watch and is a very data-driven individual. Willie, thank you for joining us today. Welcome to the show.

Absolutely, glad to join.

Yeah, this is going to be, I think, a very fun and interesting conversation. The three of us have worked together extensively for quite a while now, and we really want to get into a lot of the aspects of optimization in the cloud. I've had the pleasure of working with Willie for quite some time and watching his engagements, and it is a very fascinating approach that he takes. Again, the data-driven aspect is the crucial piece and the secret sauce, I believe. He may disagree, but certainly we're going to dive into that as well. So welcome again.

Yeah, welcome. Don't tell anyone else, but this is the one that I've been waiting for. You know, all our other colleagues will probably get upset about that, but I've been anxiously awaiting your arrival. Thank you, Willie, for joining.

It just helps that we talk about beer.

There you go, and we can dive into beer as well. So let's go ahead and get into some questions. Skip, why don't you get into what's top of mind that you want Willie to answer?

Thank you, Ian. So, I think when we look at this, you know, from how we approach customers' problems,
Willie, when we go out there, really from the optimization perspective, what are the top five challenges that you see customers facing today, maybe obvious, maybe not so obvious? Talk us through how, when you effectively start an engagement, we would handle that, and get down to the top five, if you would.

Yeah, absolutely. The number one challenge, honestly, is just the sheer complexity of the task at hand. The complexity inside of a given cloud, be it AWS or Azure or GCP, is constantly evolving and changing, and trying to stay on top of that as an individual business is a mountain of effort. So number one is just the sheer complexity: all the different ways that you can optimize and save money, be it the introduction of the brand-new Savings Plans that are, I guess, what, seven months old now, or the other pillars involved, you know, EDPs in play, EAs in play, all the different levers that are necessary to pull to be really good at optimizing. There are layers and layers of complexity involved in each one of those. So that's the number one challenge.

The number two challenge, honestly, is the decentralized business units you see in a larger organization and the disparate stakeholders that are involved in the process, be it central IT or, when you get down to it, the actual individual app teams and business unit owners. If there's not an overarching structure around governance and responsibility, then it becomes almost an impossible exercise. If you think about optimization in general, in a lot of the surveys that are out there, almost everybody will tell you that they know they are overprovisioned, that they're spending more than they should, and that they've known it for some time, but they still can't solve the problem. And some of it is just a lack of overall governance. So...

...we talked about complexity and the decentralized nature of a lot of large enterprises. And then, it's a very data-driven exercise, and I'm constantly surprised, when we're first brought into an engagement, by the lack of data that they may not even realize they're missing. It could be data around metrics on individual applications. A lot of people think they have complete metrics but find out at the end of the day they're not capturing and monitoring memory, which is a huge piece of the puzzle. If you start to think about whether an individual EC2 instance, for example, is overprovisioned, they're trying to make decisions with only part of the puzzle, and oftentimes it's memory that is the most constrained. And then, inside of the data realm, tagging is so incredibly important. Optimization, the way I think about it, isn't necessarily the optimization that a lot of the entities and tools out there talk about. It's really a cloud economics exercise that requires you to have data beyond what you may normally be thinking about, such as the operational metrics and the tags: who's spending it, what are they spending it on, and how much are they spending?

The lack of visibility, which is the next one I'm going to talk about, is that entities often don't actually know who's spending their money and what they're spending it on. If we go into a large entity for the first time to do optimization, I've stopped being surprised by how many individual linked accounts we find that nobody really knows who owns, and then when we do find out who owns them, they're like, 'Oh, that person hasn't been with the company for eight months.' The first few times I found that, I was shocked. I've stopped being shocked. It's a pretty consistent thing. And then probably the last one.
I would really say it's just a lack of technical expertise internally to keep up on all the different ways that you can optimize, because they're often not intuitive with respect to the task at hand. The technical and business teams are not necessarily thinking about optimization. They're thinking about some other performance or technical aspects, and they don't put on an optimization hat. Quite honestly, they don't even own the hat to be able to think about optimization. It's not a topic that is forefront. They can make two decisions that in their mind may be technically the same, but one has a much different optimization outcome, and because they're unaware of it, they'll go down the wrong path.

Right, and interesting, different business drivers, really, at the end of the day, right?

Yep.

Yeah, deterministic behavior. So that's good, I appreciate that. It's amazing. I don't know how we got so far away from the chargeback and showback days of the original onset here. It's like it's broken. You know, we broke it. And when I say we, I mean we as an industry out there. We've somehow missed a step or forgot about something. But those are interesting. I'd definitely rate them high up in the top five as well. So, a lot of people we approach say, 'We can optimize, we already have some tooling and things along those lines.' Then they say, 'That's all we need, right? Why not just use a tool?' Can you talk a little bit about that and some of your experience where you've run into this?

I think those are fighting words.

Yeah. Well, I can, and I'm going to add some fuel to the fire a little bit there. I was on a call recently and somebody used the following phrase that really resonated with me, and that is: a fool with a tool is still a fool. And oftentimes it comes down to that. It's too easy of a solution.
That's what people are looking for, kind of a push-button solution, and it just doesn't work that way. Let's be clear, however: tools are a very key part...

...of an overall solution, and it's important to select the right one for your needs. However, tools alone won't solve it. If you go back to the challenges, the decentralized business units, the lack of governance, the lack of visibility to some extent, a tool doesn't solve your data gaps for you. One of my biggest issues with some of the tools, and it's been an issue for me from the very beginning, is that the tools have specific algorithms behind their recommendations, and they will make said recommendations whether they have complete data or not. That just ends up creating noise. A lot of people want the solution to be just a tool because they honestly don't want to do the work that is necessary to get to the best end result. And it does take work. Someone has to be able to take the information out of the tool and put it into context for the various disparate decision makers and stakeholders that I referenced earlier, because the conversation changes depending on who's in the room and what problem you are going to solve, and tools just don't have that capability. They are an incredibly key piece of the solution, but they are not the solution.

The other thing I would say with respect to tools is that I probably get an email a day about some brand-new tool touting that it can optimize, and they all kind of do the same things. And with those disparate internal stakeholders we talked about, everybody may get behind a different tool, and that creates noise internally as well for somebody, maybe the CTO or whoever's trying to make a decision about a tool. You get these tool champions inside of an organization that just create noise. Think about how there are individuals inside of an entity who may be fans of Zoom, and some may be fans of Chime, and some may be fans of Teams.
Right, it's the same basic concept with respect to optimization tools. So tools can't and won't solve your problem on their own. And the other piece is: find one, pick one, stick with it.

I think, too, when we see our clients and other people out there going to the multicloud world as well, it even adds another layer of complexity that makes it very hard for tools to work alone.

Correct, right, and even the ability to understand different services in different clouds and how they line up. Most tools won't give you a single-pane-of-glass view of your cloud spend, even if you're just thinking about it from a cloud economics standpoint. It'll be, 'Oh, I can see my AWS spend,' or 'I can see my Azure spend,' or 'I can see my GCP spend.' I can't see them all together.

Yeah, it also doesn't have the soft skills of understanding how to negotiate between all of them as well. So I think that's a very good point to note. That's kind of a segue to the next question I have. Where do you see this going over the next two years? If you look at it from a historical perspective, with the space junk, quote unquote, that we've created here, where do you think we'll be, Willie, over the next two years? Will we get better at this? Will we continue to stumble along here?

Yeah, I think the conversation is going to change, and it's already in the process of changing, about the best way to optimize. In the past it was, okay, how do I look at my IaaS spend? Think lift and shift, right? I want to spend less on what I'm running in IaaS. The real conversation going forward is, okay, that's great, that's low-hanging fruit. Let's get some Savings Plans in place, let's buy some RIs where they're appropriate, maybe some auto-parking, some other pretty standard pillars of optimization inside of your IaaS spend.
The real gains going forward are going to be in moving things out of standard IaaS and up the cloud maturity model, right, and identifying in a data-driven manner, because if it's not data-driven,...

...it's not scalable. In a data-driven manner, you try to figure out which workloads you may be running in a current fashion today, and how you re-architect and refactor them to really take advantage of where the cloud is going. And again, it's another area where I'm not necessarily surprised anymore, but I see really large entities, and I'll get in there and dive into what it is they're running, and I step back and it's: you're in the cloud, but you're not really in the cloud. You're just in somebody else's data center. You lifted and you shifted and you didn't really change anything. You're running everything steady state. You're not taking advantage of the underlying sexiness of the cloud to begin with: paying for what you use and not paying for it when you're not using it, making sure you're sized correctly, taking advantage of Auto Scaling groups, using containers where you should, and using Spot when you should. That's the next level of where I see optimization going. Oftentimes that's where we hit what I like to call an adoption hurdle. We've identified a candidate set for optimization savings, but as soon as you start to have the conversation with the application owners and the business teams, you realize pretty quickly they're not architected correctly to be able to take advantage of the savings to begin with. I think that's where everything is moving. And we talked a little bit about how tools can't solve your problem. Going forward, tools will become an even larger portion, I believe, of your overall solution. They're still only going to be a portion of your solution, but getting the right one will be key.

Now, you've touched on a lot of different aspects of optimization: containers, steady state, IaaS. It seems like there's an evolutionary path that people can take as well, and maybe sometimes they're not quite ready for some of these steps.
So I guess it's a multipart question: is there a path that we can help people with, and what else do they need to do to help themselves get prepared for that evolutionary step?

Absolutely. I like to think about it as: begin with the end in mind. Where are you trying to get to? It's always having that conversation internally: this is where I ultimately want to get to, and there's a process and an order of operations to get there. There's nothing wrong with coming to the cloud in a lift and shift. That's a very fine way to get in, especially if you're just coming to the cloud for the first time. You get comfortable with it, you get internal stakeholders comfortable that the cloud is secure, it's safe, it'll work, it'll run, and all that stuff. But you have to intentionally know that after you are X months into the cloud, you need to come back and start thinking about the next pieces of the puzzle, the refactoring and the re-architecting. It has to be an intentional piece of the timeline, because if it gets forgotten about, you end up spending more and more when you don't need to. Let's say you're a large enterprise, you brought some web apps to the cloud early just to get over the hump, and you've been in there for eighteen months. You're comfortable, and you've solved a lot of the internal risk issues. Then you start to change the way that you actually come into the cloud, because you have enough knowledge about your business systems or your applications to say, okay, I don't need to just lift and shift anymore. I want to actually come in in a re-architected or refactored manner so I can take advantage of the other niceties of the cloud in different ways, to be more efficient, performance-driven, and cost less right out of the gate.
And it's having that conversation with clients so that they understand that this is the time path, and there are multiple places along the path where you need to be able to readdress the...

...way that you're running things. It's so much more successful if you have that conversation up front so everybody knows what's coming. Hopefully that answered your question.

Sure did. Now, you touched on overprovisioning, and I'd like to share this little story first, because you say everybody's overprovisioned. I met with a prospect and they said, 'No, we're not overprovisioned. We're in our data centers, we've got everything right-sized.' So we picked a random workload: two percent memory, four percent CPU. So why is it that these organizations can't solve for this?

Okay, and that's kind of funny too. That's one of my favorite conversations, when people say, 'We don't have anything to optimize, we're fully optimized already.' I have yet to ever see that. So it's a function of a couple of things. One is lack of good data; not everybody necessarily has all the data they need. And the other thing, honestly, is the incentives to be optimized. If you think about the way a lot of central IT or technology groups in large enterprises come to the cloud, their entire mission is, one, to get there, and two, to drive performance. I had a conversation with our CEO some time ago about optimization as a whole, and he said, 'Look, it's not really the number one driver right now. Cost isn't what's driving people to the cloud, and cost isn't necessarily as important as maybe we think it is.' And my reply back to him was, 'Until it is.' And it usually happens on a dime. Somebody's midway through a year and all of a sudden realizes their run rate is, you know, fifty percent over budget, and they're in panic mode. Or you get something like we're going through right now with COVID, where saving money is an absolute number one priority, and all of a sudden they're more willing to address the needs than they have been in the past.
One of the challenges to get there, if we go back and think about stepping through the process: it's probably a CTO or somebody at the top who notices costs are out of control. They push down, and then you get down to the individual business unit or app owner teams, and the minute you meet with them and talk about optimizing, the walls go up. Here's the thought process, two things. One, if you come in and tell my boss that we're overprovisioned when I've been telling them we're optimized, you're going to make me look bad. Or two, you're going to come in and reduce my spend, and I'm never getting that budget back. The defense mechanisms are pretty consistent the first time you start to meet with app teams. Part of the trick is to get them to understand that, look, this is a good thing, because you can reduce your spend on what you're currently delivering from a performance standpoint and deliver greater performance while still being inside of your budget. So it's building the partnership with the app teams so that they don't think you're coming in to touch their stuff or take away their cheese. It's a challenge, but that's part of the reason why: you've got to navigate through so many different layers of the business in order to actually achieve savings, and the decision makers are different along the way. Many entities, whether they want to understand it or not, actually lack strong governance at the central IT level, or anybody who can drive these policies down through the organization. The ones who have the most success in achieving optimization are those that have a good mandate internally to make it a priority and to drive change down through the organization. Because if you just show some data, give it to the app teams, say, 'Hey, here are ways you could save money,' and then meet...

...with them every month, you'll come back with the exact same list, because they never do anything about it if they're not driven to make change from on high, if you want to think about it that way. It's got to be a mandated priority throughout the organization. You know, Ian, I think you and I have worked in a couple of entities where we've seen it come down from on high, and it's very effective, because people all of a sudden start paying attention.

Oh yes, they do. Yeah, I was going to say, I think it takes the whole organization. If you look at the normal array of stakeholders, there are usually about six parts to that and six areas of concern. They all have to work together on it. It goes to security as well, operations, networking; it just goes on and on. It's understanding the different impacts of some of these decisions and making sure that the proper governance is in place. That doesn't mean stopping you from doing your business; you can optimize and go faster at the same time. So I think that's the really key takeaway.

Yeah, I think that's the part that excites me the most too. And if we look at it from a Second Watch perspective, shameless plug, what would you say are the key best practices, places where people could make some easy wins, and how do they maintain, I guess, being optimized, or optimization, as it were? It'd just be great to hear that from you, Willie.

Yeah, for sure. If we go back and start from the beginning, understand that this is very much a data-driven exercise. So from a best-practices standpoint, it's having governance and drive in place to fill the data gaps where they exist, because without data you just can't. The tools that are out there are so much more effective when they have good data, just like anything.
So understanding where data is lacking, where to get it throughout the organization, and closing those gaps is an incredibly key piece of being able to achieve optimization and maintain it going forward. The other thing to think about is that optimization is not a one-and-done exercise. Continuous optimization is a must. If you just think about it, it seems like every day AWS is introducing some new instance family. Maybe it's the C5a, with AMD underlying, and those are roughly ten percent cheaper than the standard C5s, and it's understanding whether that is a good candidate. They're constantly introducing new families and new technologies, be it containers, if you think a couple of years ago, or, a couple of years before that, serverless, and all of those different pieces of the puzzle. Have people solved the Spot risks? All of those different things are key to maintaining continuous optimization. Where I'm leading with this, honestly, and this really isn't a plug for Second Watch: I would highly recommend that anybody who is going to attempt optimization engage with a partner to help. It's not something that I think necessarily needs to be a skill set internal to an organization, because it will most likely only be a portion of somebody's responsibility, and to do this right, it needs to be what you do every single day, every hour of every day.

The other thing I would say is that tagging practices are incredibly key, from so many different avenues. It's not just tagging from a technical standpoint, where maybe you're using a tag to launch your instances, or you're using a scheduling tag to control your auto-parking methodology. It's really around simple things that drive optimization and understanding of the data, be it an environment tag or an account owner tag or an application tag, so that you really understand what is driving your spend.
Who's driving it? Is it across production environments? Is it development and testing environments?...

And talk about it. The area where most people are most overprovisioned in the cloud is not in production environments. It's all the money they spend in development, test, staging, and QA environments that they may not even realize. And then there's the extraneous spend. What I find people miss most often when they're building out cost-of-cloud models is all the other services that they'll end up spending money on. They go, 'Here's my EC2 instance, it's going to cost X, and I know I may have an EBS volume of Y, and maybe I've got some S3.' If you look at the hundred-plus services from AWS alone, and you look at the bill of somebody who's been in the cloud for a while, you're talking twenty to thirty percent of their bill coming from services they may never have known existed at the time they launched into the cloud. It's managing those as well, and creating visibility and accountability. That's probably the number one thing you can do: understand your spend, set up a mechanism internally to review it with your app teams, and then hold them accountable to it. It can be as simple as developing a dashboard with tagging grades, because everybody understands what it looks like when you walk into a room with your boss and somebody's going to show a tagging compliance dashboard that has a big F on the top of it.

Talk about creating a visceral reaction. That's the old [unclear] tactic, a little bit.

Well, different buddy, different view. You know, honestly, it kind of works. I put that under accountability. You have to be able to drive it, and drive it through data. The data, I like to say, doesn't really lie. You can lie with data, but the data itself doesn't lie. And a lot of times entities will say that they have a process in place or a technology in place to, maybe, clean up all unattached volumes, for example.
'We have a tool that does that.' And then you show them the data and there's a whole bunch of unattached volumes, and the answer is almost always, right out of the gate, 'Oh, we must have a little bit of a gap.' Yeah, you do. But it's because, again, they're assuming the tool or whatever process they put in place is working instead of actually checking that it's working, and having that visibility into those things is key.

Those are some great practice ideas around that. If you had one call to action for people, something they had to start with or do today, what would it be?

I think I would say empowering someone within the organization to drive the process, because without that it won't happen. Like I say, it's got to be a priority, and you've got to give someone internally the power to drive it. And one more thing, don't forget: find a good partner.

Makes a lot of sense. Yeah, I was going to say, just to circle back on that, I found even in my own experience before, when, quote unquote, I was a customer, having that in there creates a better bias, right? You get away from the friction of the internal fighting, because now you can turn and blame the vendor guy at the end of the day.

Yeah, no, you're incredibly correct, Skip. So much of this, honestly, is just navigating internal politics and red tape. And even when it comes down to, let's say, purchasing Savings Plans or RIs, if you don't understand how cash is obtained internally by business units and so forth, you'll end up driving yourself nuts. So it's understanding how to navigate all the different pieces, because that can also change your recommendations. You know, cash may be something they only get one time during budgeting season, and then they can never ask for it again.
Then you change your timing, or maybe you go for a No Upfront option. It's understanding those things so that you can make the most intelligent recommendations possible.

Very good. I know that you love data,...

...love good data. I just want to ask you for one data point before we wrap this up: how many breweries have you visited?

Well, I currently have a hundred and seventy-four different brewery shirts, and I pretty much buy a shirt at every brewery I go to. I'd say at ninety-five percent of them I will buy a shirt. So do the math: it's probably upwards of, you know, a hundred and ninety. I will get to three hundred sixty-five, and all fifty states, soon, you know, when I can actually travel again.

Yeah, yeah, there's that. Yep, waiting for a little normalcy.

Well, Willie, I want to really thank you for joining us today. I found that incredibly insightful, and I'm sure our audience will as well. Skip, cohost, always great to have you here too.

Likewise, likewise.

We'll look forward to next week's episode. In the meantime, if the audience has any questions or suggestions, please reach out to us at Cloud Crunch at Second Watch dot com. Thanks again. We'll talk to you next week.

You've been listening to Cloud Crunch with Ian Willoughby and Skip Berry. For more information, check out the blog, Second Watch dot com, company blog, or reach out to Second Watch on Twitter.
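
The tagging compliance dashboard mentioned in the episode, with its letter grade at the top, can be sketched in a few lines of Python. This is only a minimal illustration, not something the speakers describe building: the required tag keys (environment, owner, application) follow the examples Willie names, while the function names, the sample inventory, and the grade cut-offs are all assumptions made for the example.

```python
# Hypothetical required tags, based on the examples named in the episode.
REQUIRED_TAGS = ("environment", "owner", "application")


def compliance_ratio(resources, required=REQUIRED_TAGS):
    """Return the fraction of resources with a non-empty value for every required tag."""
    resources = list(resources)
    if not resources:
        return 1.0  # nothing to tag, so nothing is out of compliance
    compliant = sum(
        1
        for tags in resources
        if all(tags.get(key, "").strip() for key in required)
    )
    return compliant / len(resources)


def letter_grade(ratio):
    """Map a compliance ratio to a letter grade (cut-offs are illustrative assumptions)."""
    for cutoff, grade in ((0.9, "A"), (0.8, "B"), (0.7, "C"), (0.6, "D")):
        if ratio >= cutoff:
            return grade
    return "F"


# Made-up inventory: each dict holds the tags found on one resource.
inventory = [
    {"environment": "prod", "owner": "team-a", "application": "web"},
    {"environment": "dev", "owner": ""},   # empty owner, missing application
    {"application": "batch"},               # missing environment and owner
    {},                                     # untagged
]

ratio = compliance_ratio(inventory)
print(f"{ratio:.0%} compliant, grade {letter_grade(ratio)}")
```

In practice the tag dictionaries would come from a cloud inventory or billing export rather than a hard-coded list, and the grade would be computed per team or per account so that each owner sees their own score.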
