Cloud Crunch

Episode · 1 year ago

S2E05: Data, AI & ML on Google Cloud


If you’re trying to run your business smarter, not harder, chances are you’re utilizing data to gain insights into the decision-making process and gain a competitive advantage. Today we talk with data and AI/ML expert, Rui Costa at Google Cloud, about why and when to use cloud data offerings and how to make the most of your data in the cloud.

...Involve, solve, evolve. Welcome to Cloud Crunch, the podcast for any large enterprise planning on moving to, or in the midst of moving to, the cloud, hosted by the cloud computing experts from 2nd Watch: Ian Willoughby, Chief Architect, Cloud Solutions, and Skip Berry, Executive Director of Cloud Enablement. And now here are your hosts of Cloud Crunch.
Welcome back, everybody. If you're trying to run your business smarter, not harder, chances are you're utilizing data to gain insights into the decision-making process and gain a competitive advantage. We all need that competitive advantage. Today we talk with data and AI/ML expert Rui Costa at Google Cloud about why and when to use cloud data offerings and how to make the most of your data in the cloud. Welcome, Rui.
Thank you. Thank you, Ian.
Yeah, it's great to have you on here. I'm going to tell everybody a little bit about your background. Rui has worked at Google in various roles, most recently as a learning consultant, working with strategic customers and partners to create and execute on their Google Cloud learning plans. He is also the founder of Speech Analysis Framework, which successfully graduated to become a product within Google. Rui is also releasing a book from O'Reilly Media titled Building Cloud Native Applications on Google Cloud. Wow, with all that, it sounds like you're pretty busy.
It's fun. It's good busy.
Yeah, you sound pretty passionate about this stuff, so I think this is going to be a lot of exciting information. How long have you been at Google now?
Four years.
So in cloud years, that's like 28. I believe things change every day and move so quickly. I think obviously you love Google's culture. We work with Google as a partner as well, and we find it very interesting and exciting to dig deep with you guys, and you guys are building something pretty special over there.
Thank you.
So, yeah, we're going to be talking a lot about the data side of the cloud today.
So maybe we could just start off and talk about a summary of the data offerings that GCP has and why you would use those in certain circumstances.
Yeah, sure. And I apologize if I miss one or if one was just recently announced that I'm not aware of. That's a thing.
Yeah. So everybody knows, we're recording this in December of 2020, still in the middle of the pandemic. But as you know, with the cloud, obviously there could be an announcement in an hour that we weren't anticipating.
I think most people are probably familiar with our BigQuery. And that's really our core, I think, for a lot of our data manipulation and data transformation. When we think cloud, what a lot of customers do choose Google Cloud for is our enterprise data warehouse offering, which is BigQuery. But outside of BigQuery, we have other database offerings. We have the traditional ones that you're accustomed to and probably using, provided as a managed service, like MySQL and Postgres. So we have those as well; those are more the traditional relational databases. We also have Spanner. And most of these products that I talk about, outside of things like MySQL and Postgres and things like Redis that are open source projects or that other vendors are distributing, are created by Google. So when I talk about BigQuery, that's something that we organically created to service our own needs, Google's needs, right? And that's for YouTube, Gmail and Search and all the other billion-user products that we have. Spanner is another offering. Now you want a little bit of both, right? You want OLAP as well as relational, transactional databases. We needed something internally, and organically we just built Spanner, and that's what serves a lot of our ads business. And now we've externalized that to our customers.
So I like to think about having the ability to have two databases across regions and being able to commit a transaction and then have it be available in both regions. I think that's just super powerful and something you don't usually see, and if you're a database administrator, you know that's very difficult to accomplish.
Yeah, to have that ACID compatibility across geographies far away, that's absolutely amazing.
Absolutely. We have another one that came from the Firebase world,...

...which I'm very passionate about, which is Firestore. That's another NoSQL database offering we have. And I'm passionate about it because I love to build these little mobile apps just to have some fun, right? To geek out a little bit. Let's say I build an app and I want this app to dynamically get new updates. How do you do that? You start thinking, I'm going to do push notifications. Well, with Firestore, I can attach a listener to the database and listen for things like updates and deletes. So if someone makes a change to the database, I'll get notified in my application, and I can render this right to the user. So Firestore came from Firebase, it's now part of Google Cloud, and it's another powerful NoSQL database option for our customers. And just so you know, there are other ones, like Datastore, and a few others that we have. You could even consider our object-based storage a database, because there are so many cool things that we can do with it. Now, with BigQuery, we can actually use standard SQL to query objects sitting in our object storage. And even within these products there are a lot of other features that make them so powerful.
Yeah, I think Bigtable obviously is out there as well.
Forgot about Bigtable, see? I mean, it's hard to keep track of them all, right? There are so many of them, and they have such interesting use cases around each one, too.
Yeah. No, that's fantastic. We see Firestore quite a bit, definitely on the mobile side. We're talking to a lot more people, and they're very, very interested in BigQuery, obviously from the data warehousing side, but also tying into AI and ML, forecasting, those types of things. Let's take it a little bit further into BigQuery, if you don't mind.
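The listener idea described here (attach a callback, then get told about adds, updates and deletes) can be sketched with a small in-memory stand-in. This is a toy model for illustration only: `ToyCollection`, `Change` and `on_change` are hypothetical names, not the Firestore client API. The real Firestore client libraries expose the same idea through snapshot listeners.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Change:
    kind: str                        # "ADDED", "MODIFIED", or "REMOVED"
    doc_id: str
    data: Optional[Dict[str, Any]]   # None when the document was removed


class ToyCollection:
    """Hypothetical in-memory collection that notifies listeners on every write."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        self._listeners: List[Callable[[Change], None]] = []

    def on_change(self, callback: Callable[[Change], None]) -> Callable[[], None]:
        """Register a listener; returns a function that detaches it again."""
        self._listeners.append(callback)

        def unsubscribe() -> None:
            self._listeners.remove(callback)

        return unsubscribe

    def set(self, doc_id: str, data: Dict[str, Any]) -> None:
        kind = "MODIFIED" if doc_id in self._docs else "ADDED"
        self._docs[doc_id] = dict(data)
        self._notify(Change(kind, doc_id, dict(data)))

    def delete(self, doc_id: str) -> None:
        if doc_id in self._docs:
            del self._docs[doc_id]
            self._notify(Change("REMOVED", doc_id, None))

    def _notify(self, change: Change) -> None:
        # Copy the list so a listener may unsubscribe while being notified.
        for callback in list(self._listeners):
            callback(change)


# Example: an app keeps its screen in sync without polling.
scores = ToyCollection()
stop = scores.on_change(lambda c: print(c.kind, c.doc_id, c.data))
scores.set("alice", {"score": 10})
scores.set("alice", {"score": 12})
scores.delete("alice")
stop()
```

The app never asks "has anything changed?"; it only reacts when the store pushes a change to it, which is the property the speaker is pointing at.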
What are some of the more powerful business use cases that you're seeing, and why is it attracting people?
I'm going to take a little bit different direction. I'll come back to why customers are using it, but I'll tell you one thing that customers sometimes don't look to BigQuery for, and it is an option. When we talk about dashboards, that's easy, right? You're going to use some type of BI tool like Tableau or Looker to visualize those dashboards coming off of BigQuery. But other people look at dashboards differently depending on the industry you're in. A dashboard could be, for a gamer, their scores. And that's just gamers. In the government space, it could be a dashboard, in the situation that we're in right now, for how many cases of COVID-19 there are. They want to display this, and these users are not going to have Looker, Tableau or some other tool. They want it, they need it, on the website. BigQuery can power that piece as well; it can power that web-based front end. So you start thinking, well, this is pretty cool. You can use BI tools, you can use a web-based front end. So, JavaScript: is there an SDK for BigQuery? Yes, there is. Wow, that becomes pretty powerful. And that alone is a big reason customers are coming to Google: the ecosystem of BigQuery is so broad that it becomes this source of truth for their reports, for their dashboards, for displaying data to their users. Outside of being that source of truth and having such a large ecosystem of partners that allow integration with BigQuery, then comes really the magic, right, which is the super powerful engine that it has, how it basically takes your query and can shard it across nodes. Think about it. That's exactly what it does.
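The "shard the query across nodes, then combine" idea can be sketched as a scatter-gather aggregation. This is a minimal illustration under stated assumptions, not how BigQuery is implemented: threads stand in for nodes, the `amount` field and the function names are made up for the example, and the query is a simple average.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional, Tuple

Row = Dict[str, float]


def shard(rows: List[Row], n: int) -> List[List[Row]]:
    """Split the rows into n roughly equal shards (round-robin)."""
    return [rows[i::n] for i in range(n)]


def partial_sum_count(rows: List[Row]) -> Tuple[float, int]:
    """Work done independently on one shard: a partial sum and a row count."""
    return sum(r["amount"] for r in rows), len(rows)


def scatter_gather_avg(rows: List[Row], workers: int = 4) -> Optional[float]:
    """Scatter the shards to workers, then gather and merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, shard(rows, workers)))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


rows = [{"amount": float(i)} for i in range(1, 101)]
print(scatter_gather_avg(rows))
```

The average is computed from per-shard sums and counts rather than per-shard averages, so the merged answer matches what a single machine would return regardless of how the rows were split.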
It takes a query, shards it across all these different nodes and computers sitting in the data center, executes it, and then comes back with the response. And we're not talking minutes. We're talking seconds, depending again on what you're trying to do, if there are aggregations or whatever you are doing. But if it's this petabyte query, it's going to come back to you in seconds, maybe minutes. It depends on the query that you're doing. And for users, that's powerful. They can mine their data. They can start producing insights for their users quickly. I mean, I've heard stories of customers, and I've come across them, where they'll run a job once a month or once a week or once a day because it takes so long that they just...

...can't run it, you know? And they can't provide that data in real time to customers. Moving to BigQuery allowed them to say, you know what, now I can execute these queries a lot faster. And once the data is in BigQuery, I can present it to the user, and I don't have to run reports once a month or, sorry, once a day. I can run them whenever the user needs; it's going to be readily available for them. So I think that's the big thing that customers are choosing it for. It's not just one thing, right? But that's why customers are choosing BigQuery. It's providing insights in real time, or quickly. It's providing them a large ecosystem, so they don't have to change the way that they're doing things today. And then for people that are like, oh, I've got to go to BigQuery, how different is this going to be? Well, it's standard SQL, so if you know SQL, you're going to know BigQuery. And it's a managed platform.
So, yeah, I like that aspect of it, too. It was very easy for me to learn how to use it, just populating the data. We'll get into some experimentation things hopefully at the end of this, but the amount of public data that's available out there, large datasets that you can bring in, I thought was a really great way of learning how to use this ecosystem. The other thing, and I want to check to see what you think of this: often we have customers that are coming from a traditional IT environment on-prem. They may have some type of data warehousing, some of the traditional ones that we're aware of. And I know that Google makes it very easy to import that. But are you seeing that people are doing a one-for-one?
Because sometimes what we're seeing, too, is, let's say they have a very large, I would say proprietary but licensed, heavily expensive system out there with a lot of data, and they're pulling it in. They're not going to just one platform. Some of the data may go into BigQuery, as an example. Some may end up in a MySQL or Postgres database, and then something else as well, maybe Bigtable, because they have a little bit of unstructured data or however else they're doing it. So do these tools work pretty well together, do you feel?
I think so. If you look at our platform, everything is pretty similar. Once you're accustomed to one of the solutions or one of the products, it's pretty easy to get accustomed to another one. And you might be like, wait a minute, you're talking about Compute Engine and Kubernetes. How is that similar to BigQuery or Cloud SQL, which is what you mentioned around Postgres and MySQL? What I mean by that is just the interface itself. It's easy to navigate; nothing really changes when you're going from one to the other. So things within the platform are fairly similar. And you might be thinking, okay, that's good, but how about endpoints? Well, that's the same thing, too. Since we came to market last out of the big cloud vendors, we had the opportunity to expose our services in such a way that we knew how users wanted to consume them. So when you're using one of our endpoints, it's similar to our other endpoints. What I mean by that is, if you're going to do a POST or a GET, whatever the endpoint is, a lot of the verbs that are associated with that endpoint are not much different from the verbs associated with a different endpoint. So a BigQuery endpoint, a Kubernetes endpoint: the verbs are the same.
That makes using the platform pretty easy, like starting to get accustomed to it. Going back to your specific question around bringing all these data sources in and how do I do things: we have so many really powerful tools. One we haven't really touched on, and I have mentioned before that BigQuery is a core of Google Cloud, is that within BigQuery there are so many other things. There's BigQuery ML, so I can start doing machine learning in BigQuery. But then there's also BigQuery Omni, which is the ability to query data sources outside of BigQuery. So if I have data sources sitting in maybe Amazon or in other locations, I have the ability to query those other data sources using standard SQL. So we made it simple not just to use Google Cloud, by keeping our APIs similar and keeping our platform console similar. We also are making it easier to use other cloud vendors like...

...and access other data sources that are living in other locations, with something like BigQuery Omni. So now you don't have to worry about where the data sits. I mean, you do kind of have to worry about what that data looks like, but you don't have to learn that language. You're still working with standard SQL. So I think from that perspective, Google's done a phenomenal job. And there are other tools that we haven't even mentioned that still fall into the data world. Data Fusion is another one, which allows me to move data from different sources to different targets. So hopefully that provides you some clarity on, at least, my perspective.
Yeah, and that's a good point, I think, from the aspect of connectivity and other multi-cloud kinds of platforms. And you talked about Data Fusion. Amazing. That was really interesting for me, to be able to combine data sources into one and keep that moving. So a lot of cool things; that's very exciting. Now, you mentioned AI and ML. Obviously there's forecasting, things along those lines. Can you talk a little bit about some of the different AI offerings that Google is offering these days?
Yeah, there are a lot. But let me put it this way. I like to look at it in three segments. One is giving a developer access to AI tools without being a data scientist. The other is giving everyone the ability to do something with AI without being a developer, maybe just being a data analyst. And third is being a data scientist. Those are the three segments I like to break the user personas into, and we cover all those personas within Google Cloud. So let's talk a little bit about the first persona, the developer. And I think that's the one that's most exciting for me.
Having a background as a software engineer, I can take APIs and incorporate machine learning into my product without ever having to build a model. That becomes super powerful. And you're like, okay, so what can I do? Well, think about something as simple as a video that you're recording while you're in a meeting. It could be Zoom, could be WebEx, could be Google Meet, whatever you want. And you recorded not just one but 1,000 videos, and let's say that they're training videos. Now what do I do with these training videos? I'm going to say to my users, go to these training videos if you want to look again at some topic. And let's just say that these training videos are about different Google Cloud Platform products. So there are videos on MySQL, there are videos on BigQuery. There are so many videos. Am I going to go watch those videos? Probably not, because they're long. There might be a full-day session on BigQuery, and how do I find what's in them? So now you take the developer and you take our AI tools and you say, okay, what can I do with that? Well, how about if I use our Video Intelligence API and transcribe the video, literally take that eight-hour video and transcribe it? It's not only going to give you the transcription of the video word for word, it's going to tell you the start and end time of every single word in that video. So now you start thinking, wow, as a developer, that's really cool. Now I can build a search index on top of it. I can then allow users to search, and because I have the start and end time of every word, I can take the user right to that segment of the video. So now, with these 1,000 videos that I just recorded for Google Cloud Platform, I want to learn more about BigQuery Omni. I'm going to go to my search engine and type in BigQuery Omni. It's going to give me all the relevant videos and then give me a timestamp of where BigQuery Omni was said in each video. And now I can go and play those videos. That persona, the developer, just using a simple API, just created such a powerful tool for their community or their users or their business.
Yeah, and you guys certainly have a lot of training data.
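The search tool described here can be sketched as an inverted index over word-level timestamps. This is an illustration only: it assumes the transcription step has already produced `(word, start_seconds, end_seconds)` tuples per video, and the video IDs, sample words and function names are invented for the example rather than taken from any API.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Word = Tuple[str, float, float]            # (word, start_seconds, end_seconds)
Index = Dict[str, List[Tuple[str, int]]]   # word -> [(video_id, position), ...]


def build_index(transcripts: Dict[str, List[Word]]) -> Index:
    """Map each lowercased word to every (video, position) where it is spoken."""
    index: Index = defaultdict(list)
    for video_id, words in transcripts.items():
        for pos, (word, _start, _end) in enumerate(words):
            index[word.lower()].append((video_id, pos))
    return index


def search_phrase(index: Index, transcripts: Dict[str, List[Word]],
                  phrase: str) -> List[Tuple[str, float]]:
    """Return (video_id, start_seconds) for every place the phrase is spoken."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    for video_id, pos in index.get(terms[0], []):
        window = transcripts[video_id][pos:pos + len(terms)]
        if [w.lower() for w, _, _ in window] == terms:
            hits.append((video_id, window[0][1]))  # start time of the first word
    return hits


# Hypothetical transcripts standing in for the transcribed training videos.
transcripts = {
    "training-01": [("welcome", 0.0, 0.4), ("to", 0.4, 0.5),
                    ("BigQuery", 0.5, 1.1), ("Omni", 1.1, 1.5)],
    "training-02": [("BigQuery", 3.0, 3.6), ("ML", 3.6, 4.0)],
}
index = build_index(transcripts)
print(search_phrase(index, transcripts, "BigQuery Omni"))
```

Each hit carries the start time of the first matching word, which is what lets a player jump straight to the right segment instead of starting the video from the beginning.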

I think there's this platform called YouTube that you might have in the ecosystem.
So just keep that in mind, and that's one API. That's the Video Intelligence API. We have the Natural Language API. We have the Speech-to-Text API. We have the Text-to-Speech API. We have the DLP API. And these are all in the AI space, machine learning. There are probably some others out there that I know I missed, definitely, because we have a lot of APIs for developers in that developer persona. We take the next step when we go to the persona that, again, has no data science experience and is not a developer, but they want to create a model. Maybe they want to create a prediction model; let's just call it a prediction model. They have all this data. What do they do? We have something called AutoML, and what AutoML does is bridge the gap between the API and going out and building your own TensorFlow model, as an example, or PyTorch, whatever it is. It bridges that gap. Let's actually use Translate as an example, because I forgot about that one. You might think, why would I use AutoML with Google Translate? There are so many dialects. There are so many domain areas of your business where there are certain ways of saying certain things in a certain language that maybe only you, as a domain expert in your area, know. And you have all this already transcribed. So you have the English version and, let's say, the Spanish version, and you want to build your own translation model. You don't want to use Google's; you want to build your own. You're not a data scientist. Well, we have something called AutoML. You can use it to ingest your data. It's all graphical. I showed this to a couple of nonprofits about a month ago, and they were just super impressed with this.
These nonprofits cover the world, so they work in all different languages. You import a Google Sheet or any type of sheet, a CSV file, with the English and Spanish, and within probably about an hour it will build you a model. Not for prediction, like we said before, but a translation model, so that now you can send it an English sentence and it'll translate based on your domain, based on your taxonomy, based on the data that you've provided it.
That's pretty powerful.
It becomes super powerful. And AutoML is not limited to Translate; we have AutoML for others of our APIs. But it bridges the gap between our APIs and going out and building your own model. Now, say you're the data scientist. That's the data scientist persona. We have another great set of tools. Of course, everyone knows TensorFlow. We have open sourced it. It's one of the go-to machine learning frameworks, along with PyTorch. That's at the heart of Google. We built it, again, for our own consumption, and then we externalized and open sourced it. One of the things that I always have difficulty with is, I'm running a notebook on my local computer. Maybe it requires too many resources, or maybe I need to connect it to BigQuery, or I'm just working locally, right? Google offers AI Platform Notebooks, which allows me to take this Jupyter notebook and put it in the cloud. It's now there. I can back up the machine if I want to. I can use things like Git, and I know you could do that locally as well. But it now gives me the ability to use GPUs, which I might not have on my laptop. It asks me to choose how many GPUs I want and lets me customize the instance that's running that Jupyter notebook. It's hosted. It's protected.
And if you think about it from a business organization standpoint, that actually works out really well for enterprises, because now you don't have your data scientists working on Jupyter notebooks locally on their laptops. They're working with something in the cloud that's a little bit more secure, secure meaning that it's not on a local device in the event that the device gets lost or such. And now imagine you're in this notebook, and you can connect to different services within Google. I want to create a data frame off a dataset sitting in BigQuery. Okay, no...

...problem. Right? I can connect my Jupyter notebook to BigQuery, to a dataset in my project or any other project as long as I have access to it, and now incorporate that into whatever cool thing I'm doing with machine learning within my Jupyter notebook. That's just one. There are so many other pretty cool products within it. Because now you've built this model and you want to productionize it. We have tools for that as well. You take this model and we can serve it out for you. So now we have a platform for you to serve it, a really cool machine learning serving platform. So there's a lot within AI. I mean, you touched on two really important things for Google Cloud, data and now machine learning, so we could spend hours talking about it.
Oh, yeah. I mean, it's great when you start seeing it come together. Obviously, we want people to experiment. This is how I've learned how to use these things: getting in there, rolling up the sleeves, finding some sample datasets, maybe some labs out there, those types of things. And obviously there are a lot of training courses associated with this stuff, and certification. There's a big data certification that covers a lot of these things, a great way to learn it as well. But a lot of the promise that we give to people when we evangelize the cloud is the potential for cost savings. So looking at these data and AI platforms, how do you feel it lowers the cost of experimentation as well as the operating fees associated with it?
Yeah. So I think there are going to be times when you move to the cloud that there might not be a cost savings. It depends on what you're trying to do.
However, if you plan it right and use the right services, you absolutely can see a significant cost reduction compared to what you're potentially doing today. What I mean by that is, take it back to my Jupyter notebook use case. Let's take that on-prem. What do you have for your data warehousing on-prem? Whatever it is, you're paying infrastructure costs to host this data warehouse sitting on-prem. You have these servers or notebooks that are super powerful, as an example, to run your GPUs and run your models on your device. And you might actually be limited, because you're not providing the full scope to your data scientists, not giving them everything that they could potentially use. So they might even be limited in what they can experiment with, because you might not have the resources to give them. So now you have the cost of running this enterprise data warehouse on-prem. You have the associated costs of the data center, the associated cost of the power and cooling. We all know that, right? And then you're potentially hindering your data scientists, because you might not have all the resources to give them. You take that and you move it to the cloud. Take BigQuery, for example. With BigQuery, you pay for what you use. So when someone is experimenting and working, they're only paying for when they're experimenting and working with it. If I create a data frame off some table sitting in BigQuery, you're only going to pay for when you actually execute that query statement. So I run a query statement, I put whatever the response is into that data frame, and that's what I pay for. And now I can continue working in my Jupyter notebook, and that Jupyter notebook could be sitting on an instance in the cloud. And now you have the ability to scale that instance. You're testing it and you're like, wow, it's not working with this TensorFlow with just CPUs. Let me change my instance to TensorFlow with GPUs.
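The pay-for-what-you-use argument can be made concrete with a little arithmetic. The sketch below compares a per-query charge against an always-on system billed by the hour. Every number in it is a made-up placeholder, not a real price, and the function names are hypothetical; plug in your own figures.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def on_demand_cost(bytes_per_query: int, queries: int, price_per_tib: float) -> float:
    """Cost when you are billed only for the data your queries scan."""
    return bytes_per_query * queries / TIB * price_per_tib


def always_on_cost(hourly_rate: float, hours: float) -> float:
    """Cost of a system that is paid for whether or not anyone is using it."""
    return hourly_rate * hours


def break_even_queries(bytes_per_query: int, price_per_tib: float,
                       hourly_rate: float, hours: float) -> float:
    """Number of queries at which on-demand spend equals the always-on spend."""
    per_query = bytes_per_query / TIB * price_per_tib
    if per_query == 0:
        raise ValueError("a query that scans no data has no break-even point")
    return always_on_cost(hourly_rate, hours) / per_query


# Placeholder inputs: each query scans half a TiB, priced at 4.0 per TiB,
# versus a system costing 1.5 per hour for a 100-hour period.
print(on_demand_cost(TIB // 2, 10, 4.0))
print(always_on_cost(1.5, 100))
print(break_even_queries(TIB // 2, 4.0, 1.5, 100))
```

Below the break-even query count, the experimenter who pays per query comes out ahead; above it, the comparison flips, which matches the caveat that moving to the cloud does not save money in every case.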
Now you're not hindering that data scientist. You're giving them the resources and the tools that they need. So now you've saved costs, because you pay for what you use and pay for when you run it, versus having something sit there unused, or paying for something when it's not being used.
Yeah, and you're not over-provisioning; that's the other aspect. I've worked with some data scientists in the past, and they know the subject matter so well. But then they call me and go, why was my bill so high? I didn't even do anything this month. I'm like, let's take a look. I don't think you did it the right way. You should have used these services instead. So, yeah, it works that way. But obviously we want people to try these things out if they want to. How do you drive that change, to get to experimentation inside an organization?
Yeah. So you touched on a key word that I think people might have missed, which is labs. We have a platform called Qwiklabs. There are thousands of different labs on it. The big thing that we do is education: getting people educated on the platform, learning how to use the platform correctly. What I mean by correctly is that you don't do a SELECT *; you do a SELECT and just select the columns that you need. And if you're new to what I just said, just take a Qwiklabs BigQuery lab and you'll see why you don't want to do a SELECT * and you want to select only the columns that you want. Two things: one is cost, and the other, of course, is that it'll perform a lot better. Going back to that, education is the big thing. And that's dear to my heart, because that's what I do as a learning consultant. One of the things that we do is go into our customers, these large enterprises, where we're not teaching maybe 10 or 15 people; we might be teaching a thousand data analysts, and we're educating them on the platform and getting them comfortable with how to use it. That's the big key, and also to show them that it might be a change.
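Why selecting only the columns you need matters for cost can be shown with a toy model of a columnar table, where a query only reads the columns it names. The column names and byte sizes below are invented for illustration, and `bytes_scanned` is a hypothetical helper, not a BigQuery function.

```python
from typing import Dict, Iterable, Optional


def bytes_scanned(table: Dict[str, int], columns: Optional[Iterable[str]] = None) -> int:
    """Bytes read from a columnar table; columns=None models SELECT *.

    `table` maps each column name to the bytes stored for that column.
    """
    selected = list(table) if columns is None else list(columns)
    missing = [c for c in selected if c not in table]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return sum(table[c] for c in selected)


# Hypothetical table: two small columns and one very wide one.
events = {"user_id": 8_000, "country": 2_000, "payload": 990_000}

select_star = bytes_scanned(events)                      # SELECT *
narrow = bytes_scanned(events, ["user_id", "country"])   # SELECT user_id, country
print(select_star, narrow, select_star // narrow)
```

In this made-up table the narrow query reads a hundredth of the data that `SELECT *` does, so under scan-based billing it is both cheaper and has less work to do.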
The change might not be as significant as they think if they don't know. And second, the change can actually be a really good change, because it's going to make what they're doing today a lot more effective. So for me, that word lab is so important. First, you can take labs. You can experiment with the labs, and you're not running up a bill, because it is the lab environment. But it's really about education, right? It's about teaching the community about Google Cloud and teaching them that change, depending on the change, could be a good change.
Yeah, I'm a big advocate for Qwiklabs. I often go there when I need to either refresh my memory because I'm going to be speaking to somebody about something, or just, you know, how does this work exactly? Let me try that. I love that platform because I know exactly what I'm going to pay every month for it, and it's not very much money. And I don't accidentally leave some BigQuery system up and running and executing over and over again. So that's fantastic. I really appreciate your time today. This was fantastic. Again, I think one of the calls to action is, if you want to get into this, check out Qwiklabs. There are also some great books about data science on Google Cloud out there as well, fantastic reading material. And then what we also want to encourage people to do is find a business case inside your organization. If you can adapt that experimentation around it, that is a very useful exercise as well. We've talked about that before on the show. But Rui, thanks again. Thanks, everyone, for listening. We want to hear from you. Please email us at cloudcrunch@2ndwatch.com with comments, questions and ideas. Until the next episode, thank you for your time, and we'll talk to you then.
You've been listening to Cloud Crunch with Ian Willoughby and Skip Berry. For more information, check out the blog, 2ndwatch.com/company/blog, or reach out to 2nd Watch on Twitter.
