Dec 19, 2018 | 4 min read

Conversation with Bill Schmarzo

Podcast #40: From BI to Big Data to IoT Analytics and Beyond

Bill Schmarzo is CTO of IoT and Analytics at Hitachi Vantara and Executive Fellow at the University of San Francisco School of Management. Our discussion covered Bill’s origins in data warehousing and business intelligence, and how and why he had to “unlearn” traditional approaches to embrace Big Data techniques. The conversation explores some of the differences in approach between business intelligence, big data and IoT analytics, along with how Hadoop fundamentally changed the economics of data. He shares some of the vision of Hitachi Vantara and its application-first approach, along with the benefits that come with having Hitachi’s other businesses as customers. The companies he’s most interested in are those that are automating machine learning to bring it to the masses, including Big Squid.



For analytics, follow Twitter hashtags including #bigdata, #analytics, #designthinking 

Portlandia (TV show)




View Transcript

Good day everyone and welcome to another episode of the Momenta Edge podcast. I am Ed Maguire, Insights Partner at Momenta Partners, and today our guest is Bill Schmarzo, who is CTO of IoT and Analytics at Hitachi Vantara, and Executive Fellow at the University of San Francisco School of Management. Bill is also what we call an OG of analytics; he’s been a prominent leader in driving thought leadership and evolution around analytics. I first encountered Bill back when he was in the business intelligence world, and it’s been fascinating to see him end up in Connected Industry. Bill, it’s great to have you as a guest.

Thanks Ed, thanks for having me onboard. 

I’d like to dive into a bit of your background, can you share what has shaped your views of analytics technology, and ultimately led you to your current role at Hitachi Vantara? 

Well, if we go way back, I think it’s a situation where I’ve always been interested in numbers and analytics, and that probably goes back to my youth being a big fan of baseball, following baseball like I did back in the old days. I even got hooked on this game called Strat-O-Matic baseball, which is probably somewhat like what these sabermetric folks were doing around using analytics to help play games. I was really caught up in that fad or that fashion, and it really taught me that if you had a superior understanding of the analytics it would give you an unfair advantage in the game.

I remember studying the cards seriously, so I knew what players to play and in what situations; I even knew what players to trade for to fill gaps. So, at a very young age I learned that really having a good understanding of the data and the analytics gave you an advantage, and throughout my life I’ve always sought out opportunities to get involved with data. But data to me is just fuel. I spend a lot of time in data, but data by itself kind of lays there limp on the floor; it’s the analytics that gives it life. So to me it’s always been, what are the things that are buried in that data? What are the trends and patterns, associations and relationships that are buried in that data, that I can really use to my advantage?

When you started working with Strat-O-Matic, what led you to work with data analytics?

In college I got a degree in Math and Computer Science, then got my MBA in Information Systems, and all along I was really interested in this continuing search to find ways to apply data. Of course in the world of business there are opportunities everywhere, and so I got started out of college at Arthur Andersen, back when they used to be Arthur Andersen. I worked in their MICD, their Management Information Consulting Division, and I worked with databases; I wrote some database algorithms to help expedite the ability to pull data out of databases. That soon led to the Forrest Gump moment of my life; everybody’s life is full of Forrest Gump moments, right place, right time, not because you’re tall or from Iowa, sometimes you’re just lucky. My Forrest Gump moment was in ’84 when I stumbled upon this company called Metaphor Computers, who were really trying to define what in those days they called decision support systems: how could we help organizations better leverage all the data out there? It just happened that when Metaphor launched, it was also about the same time that electronic point-of-sale data started becoming available, and so we went from looking at bi-monthly Nielsen audit data, six observations a year across 13-15 markets, to also having all the detailed transaction data about what people are buying, and what they’re buying in combination.

We knew so much more about our customers and their buying patterns, and the effectiveness of campaigns, and new introductions. It was one of those moments where, if you were a data-analytics junkie, it was like being a kid in a candy shop; it was just unbelievable the stuff we could do with it. I was very fortunate that whilst I was at Metaphor we were working closely with Procter & Gamble, who were becoming our biggest shareholder. I was putting in decision support systems across all different parts of Procter & Gamble, and in the process of doing that I was being indoctrinated into Procter & Gamble’s data-driven decision making process. It all came together, so to be honest with you Ed, I got lucky. That really showed me the combination of Metaphor, with technology that could really exploit data, and how Procter & Gamble was trying to use data to transform its business. It was really one of those lucky moments for me.

How did the lessons that you learned early-on translate into an evolution through different types of technologies, as you evolve past executive information systems into early generations of business intelligence, and up through Big Data and AI? Were there some foundational principles that are still relevant today? 

Two have jumped out at me. First off, we were always business focused; that is, before we ever put analytics to the data, we knew what we were trying to do ahead of time. We’d gone through the process of trying to identify what it is we were trying to accomplish, what was important, what wasn’t important, and how we were going to use that information. So, before we ever started screwing around with the data, we had a really good idea of what we were trying to do and understood how we were going to measure progress and success, what were the KPIs we were going to use to measure success. So, to me one of the key fundamental things was that it wasn’t a science experiment, it was a business process; we were going to attack a particular business problem. In some cases it was new product introductions, in some cases it was promotional effectiveness, and in one case we actually did some detailed analysis on some potential acquisition candidates for them.

So, we were always around the business, and then we used that focus to figure out what data and analytics we were going to need. So, for me that’s what I learned from Procter & Gamble and those engagements: focus on the business, start with the business. If you don’t do that then you’ll go off and spin your tires, doing things that, whilst maybe interesting to you, aren’t strategic, actionable or material to the problem you’re trying to go after.

The second thing I think I learned, maybe over time, but it is certainly becoming evident, is that the quality of data is always going to be a challenge. Time and time again I see technologies come along and make this promise that they’re going to solve the data quality problems. I love the conversation from when technologies like Hadoop first came out and promised that ETL was dead: ‘It’s dead, it’s not going to be needed anymore’. Like, really? Really? ETL isn’t just about getting access to the data; ETL is about how do you integrate it, how do you normalize it, how do you cleanse it, augment it and enrich it. People really don’t think enough about how important data quality is, and in a day and age when we’re trying to make really important business or even social decisions, you can’t stress enough the importance of understanding what data you have, the quality of that data, and then what you can do to enrich that data in a manner that allows you to make better decisions.

It’s a really fundamental insight. People in technology are constantly focused on the latest new thing, or new technology innovations. I’d be interested to get a bit of perspective on your evolution, past analyzing business data, to looking at operational data and industrial data. Could you talk about the origins of what brought you to Hitachi Vantara? 

Well, what brought me to Hitachi Vantara probably had nothing to do with technology and data, and probably had everything to do with a personal friendship and another Forrest Gump moment. You mentioned that I teach at the University of San Francisco; I teach an MBA class, it’s called the Big Data MBA. One of our research projects was on triaging the digital transformation rise and fall of GE. To me GE was a poster child for digital transformation. It seemed to have everything going right for it, and then all of a sudden it couldn’t do anything right; it just seemed like the bottom fell out, the whole thing fell apart. So, I said to my students, ‘Do me some research and figure out from your perspective what did they get wrong, what did they get right, and what would you, if you were the head of that place, have done differently?’ The benefit of being a teacher is you get the students to do the grunt work for you. So, off they scattered, they did some research. I teach on a Thursday night, and while I’m getting ready to teach a class I run into my old friend Brad Surak. Brad and I used to work together back in the old days at Business Objects. Brad had been the Chief Operating Officer at GE Digital, so he knew the GE story quite well, and he had recently gone to Hitachi Vantara as their Chief Product & Strategy Officer.

We were talking, and I said, ‘Brad, I’ve got an assignment for you, would you like to come Thursday night to my class? The students are going to present their findings on GE, and the transformation there. I’d love to have you have a debate with them’. Brad was like, ‘Hey, it sounds great, I’m all in’. So, Brad came to the class, we had this very robust conversation, the students had done some very interesting research and came to some very interesting perspectives. It was a fun class, a lot of ideas passing back and forth. When the class is over Brad looks at me and smiles and says, ‘Bill, I have a job for you’. ‘What’s that?’ He says, ‘I need to have a Chief Technology Officer at Hitachi Vantara, somebody who can really guide our IoT and analytics initiatives’. I look at Brad and say, ‘Brad, I’m probably the world’s worst Chief Technology Officer. I really don’t know much about technology, I care even less about it, I’m not a technology guy’. I said, ‘I’m a customer guy, I think economics is more important than technology’. He smiles really big and goes, ‘That’s why I want you for the job’.

So, that’s how that came about. I’m probably the industry’s worst Chief Technology Officer, because technology is not my motivating factor. Like I said to Brad, I think economics is a much more powerful weapon for us as a company. As we look at all this IoT data, all this industrial data, and all the things that are going on, I think economics is more important than technology, because at the end of the day economics is about the creation of wealth. That’s what we’re trying to do. We have all these new devices, all this new data, and all the ability to do things at the edge, to take action at the edge, that we couldn’t do before. How are we going to leverage it to create wealth? I said, ‘That’s the key challenge’. So, I got the job, and I’m still there.

I’d like to come back and dive a bit deeper into Hitachi Vantara, but I’d love to get your perspective on how you see the current state of the market, and how it’s evolved over the past several years.

I’m of two minds on this. On the one hand there’s a growing awareness of the business impact that technology can have, especially around how technology can really help organizations re-engineer their business models around creating new sources of customer and market value, or wealth; but we still start the conversation with technology. Just this last week I had an internal debate. One of our product managers came to me and said, ‘We’ve got a customer. Here’s their data, go find cool things’. I’m like, ‘That’s not how the process works’. I explained to him all the work we do before we ever put science to data: ‘Here is all the work that has to take place first. Do you understand in detail the problem you’re solving, who the stakeholders are, and how it’s going to impact them? Do you understand the KPIs and the metrics against which you’re going to measure progress and success? Do you understand the potential impediments, and what you’re going to do about them? Do you understand the risks of false positives and false negatives? Have you gone through a process to identify, validate, value, and prioritize the different decisions that are required to support that particular…?’

So, there’s all these things that have to happen first, and it’s not hard. It’s not hard Ed, it just takes time, and we are impatient as a race. As a human race we are impatient, we want instant gratification: ‘I’ll give you my data, do stuff with it’. We’ve seen this; when the whole Hadoop data-science thing broke out, everybody thought if they just got a data scientist and gave them some data, great stuff was going to happen, and great stuff didn’t happen. It didn’t happen, and so immediately they blamed the data scientist, they blamed the technology, when the whole problem started with the fact that we start with technology instead of starting with the business problem and thoroughly understanding it.

You’ve got me on a rant here Ed, so it’s your fault! There’s one project I remember I was triaging, they were leveraging cell data from a cellular phone system to better predict when customers were going to leave. The model was really impressive, they noticed all kinds of things, they found all these key variables about it, and ultimately this data could have been very instrumental, not only in their customer acquisition and retention campaigns, but also in a strategy for where they were going to put cells. They did all this work, they sent it back to business users, and the business users looked at them as though they had lobsters crawling out of their ears, like, ‘This doesn’t make any sense to me, I don’t understand how I would use it. I don’t understand how this would support the decisions I’m trying to make’. So, they’d done all this great work, had built these fabulous models, really impressive, and the users looked at it and said, ‘Na, I can’t use it’, and it was done, conversation over! 

We don’t seem to ever learn from that, and here’s the reason why this is such a travesty: it’s the business users who really understand the decisions they’re trying to make. These business users have been making these decisions for years, maybe decades, and a lot of them use little heuristics to help them make decisions. If we can uncover those heuristics, we have a chance to turn those heuristics into math and scale it out, to prove out those heuristics. And so, by bringing the business users into the process right up front, not only do you build better analytics because you now have better insights into the kinds of things you’re trying to do with the data, and the kinds of decisions you’re trying to make, but equally importantly from a cultural perspective, the business users now feel like they’re part of the solution.

How does the process of advocating a business value approach differ, when you move from working with data that’s generated by business applications to industrial applications, or industrial data where you have a very different operative paradigm at work, at least around the systems that are designed for resiliency, and you may have very different cultures in operational technology versus information technology? How do you bridge that, and what are some of the lessons that may be applicable from working with business data? 

The operations personnel, they’re all from Missouri; they want to see how you can help them. Now, that requires a lot of work upfront to understand what it is they’re trying to accomplish, to really walk in their shoes. That means you’re going to get up at 3am and walk through the factory with them to understand what kind of maintenance problems they’re trying to solve, and how they’re going to do it. You really have to walk in their shoes, and unlike meeting with business people whose shoes are black wingtips in a fancy office, these people live in cornfields in the middle of Iowa trying to do maintenance on wind turbines, or they’re in a factory somewhere. So you have to get out and you have to be part of their process, you have to understand how they work.

By the way, design thinking, persona development, and customer journey maps are all wonderful ways to really understand what it is they’re trying to do, so that you can better understand where and how you can apply data and help them to do their jobs better. So, you have to really be involved with them, you’ve got to wear the hard hats and walk through the mud with them, to really understand what’s going on. So that’s step one, you’ve literally got to walk in their shoes. Now they see that you’re involved, you’re showing them that you care, and you’re also learning all these key ah-ha moments, all these key little heuristics that can actually make your analytics better.

The other thing I would say, and this is sort of not along the lines of the question you asked Ed, is that the thing that makes IoT so interesting for me isn’t just the data, it’s the ability to act on that data at the edge. I’m not just collecting data. If I was just collecting data to drop it into a data lake and do predictive maintenance and demand forecasts and stuff, I don’t even consider that IoT, I consider that Big Data. Where IoT becomes IoT is the ability at the edge to stream, aggregate, summarize, analyze, and act on the data at the edge; not having to bring it back to some data lake to produce the analysis, but at the edge itself, I can now make decisions. To me that is the most exciting part of the whole IoT conversation, because what is happening at the edge in these PLCs and these sensors is that more and more capabilities, from a processing and storage perspective, are being pushed out there. These are like mini data centers out there, and they’ve got the ability to house lots of data. I can process some pretty advanced analytics out there, and that’s only going to get more and more true as organizations start to converge on building intelligent products and smart spaces. When we see organizations trying to build intelligent products, intelligence happens at the edge.
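To make the "aggregate, analyze, and act at the edge" idea concrete, here is a minimal Python sketch. It is an illustration only, not anything Hitachi Vantara ships: the function name `edge_monitor`, the window size, and the temperature-style readings are all hypothetical. The loop keeps only a small rolling window in memory (as a constrained edge device might) and decides locally when to act, instead of shipping every reading to a data lake first.

```python
from collections import deque
from statistics import mean


def edge_monitor(readings, window=3, threshold=85.0):
    """Summarize a sensor stream locally and decide when to act.

    Hypothetical example: returns a list of (index, rolling_mean) pairs
    for every point where the rolling mean of the last `window` readings
    exceeds `threshold`. Only `window` values are held in memory.
    """
    recent = deque(maxlen=window)  # old readings fall off automatically
    alerts = []
    for index, value in enumerate(readings):
        recent.append(value)
        if len(recent) < window:
            continue  # not enough data yet to summarize
        rolling = mean(recent)
        if rolling > threshold:
            # In a real device this is where the local action would go,
            # e.g. throttling a motor or raising a maintenance flag.
            alerts.append((index, rolling))
    return alerts


# Made-up readings from a single sensor.
sample = [70.0, 72.0, 75.0, 78.0, 85.0, 90.0, 95.0, 99.0, 60.0]
alert_indexes = [index for index, _ in edge_monitor(sample)]
```

The design choice worth noticing is that the decision uses a summary (the rolling mean) rather than raw values, so one noisy spike does not trigger an action on its own.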

Are there some technologies that you would point to which are significant developments around data management and analysis, that have been meaningful in advancing capabilities and advancing the ability to generate business value? During your career, what are the pivotal technology changes or advancements that you’d point to?

Just generally speaking, I think it’s the whole idea around data enrichment: how do I take my datasets and make them more valuable? That’s more than just accuracy, granularity and latency of data; it’s what can I really do to transform that data, to give it more characteristics, more metadata about the data that really helps my models. I think that’s a really big thing. I think this whole area around feature engineering, and the role of that, is going to become more and more important. I think the role of a data engineer is going to become more and more important, because the more the data engineer can do to get the data and enrich it, the better the models my data scientists can eventually build. So, I’m not sure if it’s technology per se; it’s almost a discipline, data engineering and feature engineering, and the skills that are happening there I think are evolving pretty quickly. Of course, we’ve also got much more powerful technology.

On the technology side there’s processing power: we’ve got AI-specialized CPUs and machine-learning-specialized CPUs, and storage keeps getting cheaper and cheaper, smaller and smaller. So all of these different evolutionary technology things are happening at a point in time where I think they’re going to allow us to do more with those data quality and data enrichment challenges. Again, it gets back to that very fundamental question: how rich is your data, how accurate? If the data you have isn’t accurate, you’re not going to get good results, and I can dramatically improve the quality of my analytic model by enriching the data, bringing in other data sources and blending them together, twisting them together, meshing them together. So, to me that’s not specifically a technology per se, as much as it’s a discipline of folks who are leveraging advances in technology to make that data more valuable.

In one of our prior conversations you had mentioned that there was a really big difference in learning how to analyze data, and approach data analysis, using Big Data technologies, Hadoop specifically, compared to the other business intelligence technologies, the traditional business analytics based on data warehousing paradigms that you had worked with before. How was Big Data different for you, and what were some of the lessons that you had to learn when you started working with Hadoop?

Actually Ed, it was the unlearning that was the hard part for me. I’d started off at Metaphor and worked with Ralph Kimball; we were very strong star schema bigots, conformed dimensions, all that, and I wrote a couple of chapters in some of Ralph’s books. I’ve always been a star schema guy, and in the BI space that’s how things worked: before you could do anything you had to have a schema. So I was always a schema guy, everything happened with a schema. When I would talk to a customer, I was literally building the star schema in my head: what were the dimensions, what were the dimensional attributes, what were the facts… I was doing all that kind of stuff in my head. Then I go to Yahoo, I went there on another Forrest Gump moment. They were building out their Hadoop capabilities, and it allowed me to get access to data in a manner and at a speed that I never could before, because I didn’t need to build a schema first.

Yahoo is full of modifiers, semi-structured data, all separated by commas. So, I didn’t have to build a schema first to be able to analyze the data; the data came in as is. The ability to load data quickly and not have to build the schema first dramatically changes the kinds of stuff you can do, not only the speed of it but the kind of data I can get at, because now what I’m doing is basically building schemas on query. When I want to pull data, now I build a schema, and by the way, my schema is a flat file; it’s not a dimensional file, it’s a flat file, because flat files I can build lickety-split. Through all my data warehouse days I would have had a conniption, because flat files are a horrible waste of storage space; you’re repeating all those fields over and over and over again. I don’t care anymore. Storage is almost free, storage costs are dropping every day. The cost of storage isn’t my key factor; the time to make an accurate and predictive decision is more important. Storage isn’t my enemy, time is my enemy, and so I had to unlearn a lot of how I thought about data and analytics. It doesn’t start with tables and rows, and schemas; it really starts with what is the hypothesis I’m trying to prove out, and what data might I need? I don’t care if it’s structured, unstructured, video, audio, auto… I don’t care, I don’t care. Then once I know what data I might need, I can start going through this very highly innovative, fail-fast, learn-faster data science process.

That was another thing: being a data warehouse administrator, you spend a lot of time upfront interviewing everybody you can talk to, to capture every question you might ever want to ask, and to build a schema to support that. Your days are great until somebody knocks on your door and says, ‘Hey Schmarzo, I need to add a new data source, I want to add Facebook data to this data’. It’s like, holy cow, your world comes to a screeching halt. It’s like asking for new data was a hideous crime. So, in the data warehouse, adding a new data source took months. There’s a joke where we say, ‘Six months and a million dollars’; I don’t care what the data source is, it’s going to be six months and a million dollars. But now with this Hadoop schema-less environment, I can add a data source now and not need to worry about a schema. I only worry about the schema when I start constructing the queries with which I’m going to pull this data, and then I’m going to build a flat file.
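The schema-on-query idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of what Yahoo or Hadoop did: the raw text, the field names and the `query` helper are all hypothetical, and the standard-library `csv` module stands in for a real Hadoop toolchain. The point is that the comma-separated data is stored untouched, and column names and types are imposed only when a question is asked.

```python
import csv
import io

# Raw, comma-separated records loaded "as is", with no schema declared up front.
# The content is made up for illustration.
RAW_LOG = (
    "2018-12-01,search,us,120\n"
    "2018-12-01,mail,uk,45\n"
    "2018-12-02,search,us,98\n"
)


def query(raw_text, schema, casts=None):
    """Apply a schema at read time and return flat, denormalized rows.

    `schema` names the columns for this particular question; `casts`
    optionally maps a column name to a conversion function. A different
    query over the same raw text could use a different schema.
    """
    casts = casts or {}
    rows = []
    for record in csv.reader(io.StringIO(raw_text)):
        row = dict(zip(schema, record))
        for field, cast in casts.items():
            row[field] = cast(row[field])
        rows.append(row)
    return rows


# The schema exists only for the duration of this query.
rows = query(
    RAW_LOG,
    schema=["day", "product", "country", "clicks"],
    casts={"clicks": int},
)
total_search = sum(r["clicks"] for r in rows if r["product"] == "search")
```

Adding a new data source in this style means loading another raw file; nothing has to be remodeled until someone writes a query that needs it.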

So, the BI environment was very highly engineered, and we can go into a lot of detail about why we had to do that; it was highly engineered, and it was very brittle. It was also broader; BI was trying to address a much broader range of questions. Data science is very focused: you pick a hypothesis, like how do we predict customer attrition? You pick the problem and then you start bringing in data, you start testing it, you start enriching it, you try different combinations, you try other data, and you fail and fail… You have to have an environment that allows you to fail enough times before you feel comfortable in the end results, because one of the hardest things about doing predictive analytics, one of the hard things about doing predictions, is knowing when good enough is good enough. You’re making predictions, you’re never 100 percent certain. Yogi Berra famously said, ‘Making predictions is really hard, especially predictions about the future’, and it’s true. So you’re never 100 percent confident in what you’ve got; you’ve got to get to a comfort level that says, ‘Okay, I think these results are good enough’.

What are some of the skill sets that a data scientist needs to bring to bear that are unique, versus a traditional business intelligence report designer, or business analyst? 

Data scientists have to be able to build analytic and mathematical models. They’ve got to be able to codify cause and effect, that’s the heart of it; they’re building models that tell them what is likely to happen. They have to know goodness of fit, t-values, p-values and things like that, because they’ve got to understand, again, when good enough is good enough. What is the cost of false positives and false negatives, or type 1 and type 2 errors? The key thing is they actually have to codify cause and effect; at some point in time you’ve got to build a mathematical model. When you do BI, you don’t have to do that. You’re building reports and dashboards, you’re taking existing data; other than making sure the data is accurate, you’re not trying to predict, you’re trying to report on what happened.
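The question of what false positives and false negatives cost can be made concrete with a small Python sketch. Everything here is hypothetical (the scores, the labels, the candidate thresholds and the dollar-style costs are made up), and a simple cost sum stands in for whatever evaluation a real project would use. It shows why "good enough" depends on the business cost of each kind of error and not only on model accuracy.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of errors if we flag every score at or above `threshold`.

    A false positive (type 1 error) is a flagged case whose label is 0;
    a false negative (type 2 error) is an unflagged case whose label is 1.
    """
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return false_pos * cost_fp + false_neg * cost_fn


def best_threshold(scores, labels, candidates, cost_fp, cost_fn):
    """Pick the candidate threshold with the lowest total error cost."""
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Made-up model scores and true outcomes (1 = the event happened).
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
labels = [0, 0, 1, 1, 0, 1]
candidates = [0.3, 0.5, 0.7]

# When a missed event is expensive, a low threshold wins.
cautious = best_threshold(scores, labels, candidates, cost_fp=1, cost_fn=10)
# When a false alarm is expensive, a high threshold wins.
strict = best_threshold(scores, labels, candidates, cost_fp=10, cost_fn=1)
```

The same model and the same data give two different operating points, which is why the cost conversation has to happen with the business stakeholders before the model is judged.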

By the way I will say, though the data scientist is trying to codify cause and effect, the fastest way to screw up the productivity of the data scientist is to make them also do data engineering work, because then they’re wearing two hats and they’re trying to cycle between those two hats. We think in terms of data science pods: when we put together a project team, we have a pod that’s got a senior data scientist, it’s got a junior data scientist, it has a data engineer, it’s got business stakeholders, and this will probably surprise you, it has a design-thinking person involved in that pod. That’s how we attack projects, we use these pods, because I can’t afford to have the data scientist trying to do everything. I’ve got to be able to dole out capabilities, and by having these pods my productivity and effectiveness increases dramatically. By the way, it’s a lot easier to hire, so now I can start hiring people: ‘Oh, I need to have a couple of junior data scientists, and I need to have a design person who can help us to uncover these insights buried in the customer’s brains’.

I’m giving you a long-winded answer Ed, sorry, but the difference is, data scientists at some point have to be able to write code or math to codify cause and effect.

What are some of the business problems that you’ve been able to address, employing these techniques? Certainly, as you start to look at what business intelligence historically has been able to accomplish, you’re able to identify patterns in sales or inventory, and really see relationships and trends that might not necessarily be intuitive when you’re looking at, say, the raw data. But data science and Big Data have certainly opened up a lot of new types of business problems to be addressed. As you compare the most successful applications of BI to Big Data, and as we start to introduce machine learning and AI, what are some of the key differences and the most emblematic successful use cases of these different approaches?

Wow! You can pick almost any industry and I can recite to you situations where we’ve been able to have impact, especially over these past 10 or so years: customer retention, customer acquisition, customer maturation, likelihood to recommend. In hospitals we’ve done projects around unplanned re-admissions, hospital-acquired infections, asset utilization optimization. In education we’ve done things around student retention, and how you acquire students. In sports you do things around optimizing player performance whilst minimizing wear and tear. In the IoT space it’s around unplanned operational downtime, or reducing excess and obsolete inventory, or improving first-time fix. The challenge, Ed, is not the lack of opportunities, it’s that organizations have too many.

It’s almost impossible to walk into an organization, any organization, and not immediately within the first half hour identify all kinds of opportunities where you can leverage data and analytics to help improve operations, they’re everywhere, and so the challenge isn’t the lack of opportunities, it’s that you have too many. What happens when you have too many, it gets right into the crease that organizations do very poorly; organizations do a very poor job of prioritizing, and do a very poor job of focusing. No one wants to do just one, to build up their analytic capabilities and their data capabilities, they want to try to do three or four of these things, and what happens when you try to do three or four of these things simultaneously is, you dilute the capability of the organization. The part of the organization that sees that first are the business stakeholders, they’re only getting half-efforts from the IT and data science organizations to solve their problems. The problem is, you try to solve all the problems, you end up solving none of them. 

So, what we’ve found, and I’m sure Ed there’s many other ways and approaches you could do this, but what we’ve found works for us is, we go through an envisioning process led by our design-thinking organization where we identify, validate, vet, value and prioritize the use cases. We go through a very thorough process, sometimes it will take two weeks to really get the buy-in of the organization, and when we walk out of that, not only do we know our top priority use cases from both a business value and implementation feasibility perspective, but now we have a roadmap. We know that this use case gets done first, and it’s a precursor to these two use cases here; and now I have a roadmap that shows me how I’m going to start applying data and analytics across these really key business problems. And oh, by the way, each one of these use cases has a positive ROI. I can pay for the analytics projects use case by use case. I can’t pay for it if all I’m doing is buying technology and hoping that someone will come along and solve my business users’ problems.
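The prioritization step described above, ranking use cases on both business value and implementation feasibility, can be sketched as a simple scoring exercise in Python. This is an assumption-laden illustration, not the actual envisioning process: the 1 to 10 scales, the cut-offs, the product used for ranking, and the example scores are all hypothetical, and in practice the scores would come out of workshops with the stakeholders.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    business_value: int  # hypothetical scale: 1 (low) to 10 (high)
    feasibility: int     # hypothetical scale: 1 (hard) to 10 (easy)


def prioritize(use_cases, min_value=5, min_feasibility=5):
    """Shortlist use cases that clear both bars, best combined score first.

    Ranking by value * feasibility is one simple choice among many;
    it favors use cases that are strong on both dimensions.
    """
    shortlist = [
        u for u in use_cases
        if u.business_value >= min_value and u.feasibility >= min_feasibility
    ]
    return sorted(
        shortlist,
        key=lambda u: u.business_value * u.feasibility,
        reverse=True,
    )


# Made-up scores for use cases of the kind mentioned in the conversation.
candidates = [
    UseCase("reduce obsolete inventory", business_value=7, feasibility=8),
    UseCase("predict unplanned downtime", business_value=9, feasibility=7),
    UseCase("improve first-time fix", business_value=8, feasibility=4),
]
roadmap = [u.name for u in prioritize(candidates)]
```

Filtering before ranking reflects the point about focus: a use case that is valuable but not yet feasible stays off the near-term roadmap rather than diluting effort.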

That really hits at the heart of the issue, which is that all of these applications of technology really have to be tied to positive business outcomes. I’d love to hear a bit about what you’re doing at Hitachi Vantara and understand a little bit more about the portfolio of offerings, and the specific approach that you guys are bringing to bear in the market. 

Vantara is pretty fortunate in that we come to the game with some assets already in place. There’s Hitachi Data Systems, so we’ve already got a very solid data business; we also own Pentaho, which is a great data integration and, I’ll say, an adequate analytic capability. We’re building out this thing called Lumada, which is really an IoT capability that allows us to blend both the Pentaho and the HDS capabilities to become not just much more IoT-centric, but to improve our overall ability across the whole advanced analytics, machine learning, deep learning, neural networks, reinforcement learning sort of space. So, we’re building up capabilities, but we’ve torn a page out of what I hope is my book: we’re starting by focusing on problems, we’re leading with applications. Our first application is all around maintenance insights; think about how an organization can use the data and analytics, especially the IoT data, to reduce unplanned operational downtime. Whether you’re a fleet of trucks, an airport, an airline, an entertainment venue, whatever it might be, unplanned operational downtime costs you dollars, both hard dollars as well as soft dollars or unrelated dollars, things like overtime and inventory; it also costs end-user satisfaction. 

So, we’re focused on maintenance insights first to really help organizations to reduce unplanned operational downtime, and getting that sort of laser focus allows us to make certain that what we’re building, at least from my perspective, the data and analytics capabilities, are very much focused on that. We’re focused on what technologies, what analytics, what data do we need, what enrichment techniques do we need, what data engineering techniques do we need in order to really help to better predict and reduce unplanned operational downtime. Within that problem itself, of course, there are all kinds of sub-use cases that go underneath that operational downtime: what parts need to be fixed? What do you have to fix, and what’s the severity of that? What happens if you wait an extra day, who’s going to fix that? What inventory do I need, what consumables do I need? So, it has this cascading impact where solving just that one problem has a whole series of decisions that support it, and we want to basically improve the effectiveness of each and every decision we are making, because they’re all inter-related, and if we can do that, we get this huge synergy effect that really does help to reduce unplanned operational downtime. 

So, it’s an application-centric approach, really focused on how we help our clients drive value, and we let that focus drive what’s happening behind the products that help us get there. 

Are you focused on specific industries, or verticals? 

There’s obviously interest in transportation and utilities, oil and gas, fleet, but unplanned operational downtime is an issue for hospitals, sports stadiums, amusement parks, unplanned operational downtime is a problem across a lot of different industries, so what we’re finding is that our right conversations… again we’ve got to think about who we are, we’re Hitachi, we have business units that make trains, cranes, trucks, CAT scans, MRI machines, so we have a natural tendency to want to go towards those industries because we’ve got brethren already in that space that we can help immediately. But it isn’t just those industries, it’s a number of different industries that can benefit from how I can reduce unplanned operational downtime. 

The Hitachi relationship is also intriguing, and earlier on in the conversation you were talking about the lessons that we may be able to take from GE, but GE has this unique combination of industrial businesses, and a digital business that was really focused on data analytics in many similar capacities to what Hitachi is looking to do; what are some of the advantages or dynamics of being a part of Hitachi? 

The biggest advantage is we have ready access to subject matter expertise. If we’re trying to improve the unplanned operational downtime for trains, we have a whole group that manufactures trains, both the rolling stock as well as locomotives. We have a ready source that we can go to, and we can collaborate with to help build out maintenance insights around trains and rail service. The same thing, Hitachi has a healthcare unit that manufactures CAT scan machines and MRI machines, so again we have this ready source of subject matter expertise that can really help us to identify, value and prioritize the use cases we’re going after. It also gives us from a value creation perspective a very unique place to sit than probably… I don’t know GE that well, but I would imagine that it’s the same thing for GE, because all of these industrial devices are getting smarter. We’re moving towards an age where these devices are going to self-monitor, self-diagnose, and self-heal, and they’re going to use data and analytics to become much more intelligent. We have the opportunity by being inside of Hitachi to help them create smart devices, again, devices that get smarter through every interaction. 

So, to be honest with you, for me Ed it’s a great place to be. It’s a great place to be, and I’ve also got to say that in working with the Japanese (I spend a lot of time flying back and forth to Tokyo), they want a very practical approach; they’re not seduced by shiny new objects. When we think about a train, these trains will run for 20, 30, 40 years, and so they’re not shiny-new-object infatuated. They’re infatuated with results, ‘How do you improve the performance of my trains by two percent?’ That’s a lot of money, a two percent improvement, and so they too are very much focused not on the technology, but on the outcomes that we can drive with the technology. For me that is great, that’s where I live; starting back from my Procter & Gamble days that’s what I learnt, and so I’m really in my element here. I very much enjoy it, it’s a great cultural environment, they want to get things done, but you have to be very thorough, and so you don’t find yourself being rushed into technology projects for the sake of technology; you find that you have to do a lot of due diligence up front to make sure you know exactly what you’re going to do, and you have the buy-in of the key business stakeholders on what you’re going to do. 

That’s a lesson that every company should walk away from this conversation with: you’ve got to start with a really thorough understanding of what you’re trying to accomplish, what outcomes you’re trying to drive, and don’t get infatuated with shiny new technology objects; they’re great, but they’re enablers. I’m writing a blog right now about ‘Do or do not, there is no enabler’, to steal from Yoda. Enabling is not, or it is, you’re going to do something, so having these enabling technologies is not sufficient; it’s how you’re using these enabling technologies to drive business outcomes, that’s what’s significant. 

That’s such a key theme, it comes up in the conversations that we’ve been having on our podcasts. I do want to turn the conversation to technology once again, to get your thoughts on the potential business impact of, and misperceptions around, artificial intelligence, and machine learning in particular. 2018 in many respects has been a year where AI hype and fear have probably been over the top, but I’d love to get your take on the applicability and the value potential of the technologies, and some of the misperceptions that you see out there. 

I’ll start off by saying that I struggle with what AI is. I can’t find an AI algorithm; I can find machine learning algorithms, supervised, unsupervised or reinforcement, I can find neural networks or deep learning algorithms out there, but I can’t find an AI algorithm. Well, what is AI? To me AI is a categorization of all the other different technologies, whether they be supervised machine learning, or antagonistic machine learning, or blah-blah-blah machine learning; it’s an overarching category, and there’s a lot of things you can do with machine learning that aren’t going to lead to evil robots roaming around the streets of Palo Alto. We’ve already got all kinds of other problems in Palo Alto, we don’t need evil robots. 

So, we really over-dramatize what’s going to happen here, and the benefits that we’re going to see in the short-term will be impressive, but not the kinds of things that are going to make for good movies. The ability to use machine learning to identify when a part is likely to need to be replaced before it wears out, they’re probably not going to make any movies about that, but that’s a big benefit. The ability to flag potentially fraudulent activities or fake news, well, maybe fake news might be big news, but these are things that are going to happen with these technologies, and they’re not very awe-inspiring; again, they don’t sell a lot of tickets at movie theatres. But the practical story of what’s going to happen is, organizations are going to be using these technologies around certain areas of the business and start to see lots of benefits. What’s going to happen when you start seeing business benefits? When one organization sees somebody using machine learning to improve their operational performance, well, they’re going to raise their hand and say, ‘Me next, me next’. I think we’re going to start seeing a very evolutionary approach, with organizations adopting machine learning, deep learning, reinforcement learning, and integrating it into the organization to drive outcomes. 

I don’t think we’re going to get to a point where it’s going to be movie-worthy, but I do think eventually we will see a point in time where organizations will start creating smarter products; I like to use the terms self-monitor, self-diagnose, and self-heal. The tires you are driving on will start flagging, ‘This part is going to wear out’, and if it’s an autonomous vehicle it might drive itself when you’re sleeping at night down to the service station and have its part replaced, and then come back home. That’s kind of cool but I’m not going to make a movie about it, because it’s pretty blasé in what it’s trying to accomplish. So, I think the over-hype on AI does us a big disservice, but it sells papers, it drives eyeballs, I get people reading my blogs when I make some outrageous provocative claim. But in reality, nothing is going to happen in a state that’s going to be movie-worthy, in my opinion, at least for the next few years. 

It’s mostly about business value, and the predictive capabilities of applying algorithms. 

Yes. Think, Ed, how long as an industry we’ve been happy with retrospective reports on what happened; that’s the fame of our industry, that we always build better reports and dashboards that tell you what happened. We’ve not made a lot of progress on predicting what’s likely to happen. There are organizations out there that have done a nice job of it, Netflix telling you what movies you probably should be looking at, Amazon telling you what books you should read, eHarmony telling you who you should be marrying; there are some nice stories out there, but overall, they’re the exception, not the rule. So, we have a long way to go, we have to jump over this chasm where we’re satisfied with just getting reports on what happened; we need to demand more. We need to demand more from our data and from our analysts and say, ‘Don’t tell me what happened, that’s like driving my car with a rear-view mirror, tell me what’s likely to happen. Tell me what I should focus in on, become more predictive’. And so, I think as an industry, I hope we’re getting ready to jump over that analytic chasm and start becoming more predictive, more prescriptive, more real-time, but most organizations aren’t anywhere near that yet. 

I’d love to shift the focus to your predictive capabilities. As you look forward over the next several years, how do you see the industry evolving, and are there some outcomes that you’re particularly optimistic about? 

I think we’re going to start seeing more and more success stories. We’re seeing them already where organizations have taken what I consider the right approach around outcomes. I think we’ll hopefully… and I say this, and I know it’s not going to happen though, stop chasing the shiny object. The shiny object this past year was blockchain, everybody had to have blockchain, and then next year it’s going to be quantum computing, and that’s the technology that is going to solve all mankind’s problems. 

My prediction is we will continue to chase the shiny objects, but that smart organizations will realize that is only a means to the end, it’s not the end, and that one-by-one organizations will start focusing in on the business outcomes. Here is my prediction as I said, if I had to make one, I think the big dramatic improvements are not going to happen with the larger organizations, they’re going to happen at the medium sized organizations. The reason why there’s a better chance for medium sized organizations to have success with all these ‘AI technologies’, is because you have the ability to enact cultural change, to drive alignment and adoption around the use of analytics that drives business outcomes. It’s really hard for large organizations to do that because you have political silos. 

We know that data silos are still a problem; if you work in an organization, data silos are no longer a technology issue, that’s a cultural issue, that’s somebody who doesn’t want to share their data. So, I don’t think a large organization is going to take it to the promised land, I think it’s the small and medium sized organizations who realize that we’ve democratized AI and data science, and they have equal access to the same tools as anybody else does, and instead of worrying about, ‘Do I have a data scientist as Joe Blow?’ they’re more focused on, ‘How am I driving business value?’ So, I don’t think it’s a prediction, Ed, as much as a hope that organizations realize that in the end the conversations, the real value isn’t around the technology, it’s around the economics. 

That’s a great insight, and I think the point here is that these smaller to mid-sized organizations that have, I would say, fewer organizational barriers or obstacles are, in your view, in a much better position to see substantial gains than the companies that are so big that they trip over their own shoelaces, as it were. 

Yes, and I have experience, I’ve seen a lot of really interesting medium-sized organizations that had a visionary CEO and have done some very interesting things. We don’t make movies about them, but they’re driving the business and are having a lot of success, at a much higher rate than we see in very large corporations, which like you said, can’t seem to get out of their own way, especially where the political infighting stops these organizations from being successful. 

There’s also a lot of innovation amongst small companies, I’d love to get your thoughts on any interesting start-ups or smaller companies that you might have your eye on? 

The general category of AutoML really has me intrigued. I’m optimistic, and I’ve got a disclaimer here: I serve on the board of Big Squid, they’re an auto-machine learning company, and I really enjoy and am excited about what those companies are doing. I think, I hope, that they have the ability to do to machine learning what Tableau did to data visualization; Tableau was really successful, and continues to be successful, in taking and driving data visualization down to the common person. Any business person can use Tableau and get a good feel for what’s in the data. I’m hoping that auto machine learning can do the same sort of thing by introducing some ways to automatically do some machine learning. It’s not just about the machine learning, there’s going to have to be a lot of education in the results, so people know what they’re getting out of these machine learning algorithms. 

I think the AutoML thing is a very interesting space, and I expect to see more and more adoption and growth in that space, and probably the ultimate compliment for a lot of those companies may be when they get acquired and get stuck inside bigger technology organizations to transform how these organizations are taking the management data they’re collecting. So, to me that’s the one space I find very interesting. 

Great. My last question, I always like to ask for a recommendation of a resource, or something that you would share with a friend or colleague, it doesn’t necessarily have to be tech related. 

I’m not much of a book reader, I really like to get most of my learning from a lot of the research that’s published pretty freely on Twitter; in fact, in my spare time I watch the TV series Portlandia with my wife. We used to live in Portland, and parts of it are just so true to life it makes me laugh. But over the holidays I was given two books, one called ‘The Runaway Species’, which is a design-thinking kind of a book about how creativity is formed by taking existing ideas and bending them, breaking them, and blending them. I read the book and thought it was very applicable for the things we’re doing. I was also given ‘The Book of Why’, which is more of a data science kind of book that talks about causation versus correlation and things like that. I’ve been reading these books simultaneously and I found the overlap in how they think about creativity, one from a cultural perspective, the other one from a data perspective, to be very, very powerful. 

In fact, I’m in the process of writing a blog roughly titled, ‘Design thinking humanizes machine learning’. It talks about how these two areas really come together when you start blending them: data science discovers the criteria for success buried in the data, and design thinking discovers the criteria for success in the minds of the users. They use different techniques but are both very inner team focused, fail fast, try things, mock-ups, illustrations, those kinds of things, to really get the juice of the organization flowing, because a lot of the knowledge we need is already hinted at by what’s buried inside the tribal knowledge. 

So, I’m doing a weird thing, I’m reading these two books simultaneously, I read one chapter from one book, then a chapter from the other book, and it’s very, very unlike me, Ed, to read a book. I’ve done something different here, because two different friends, from two different perspectives, gave me two different books, and I found in the end that I’m blending them together, to steal one of the concepts from the book. I find these two books very much blend together. 

Fantastic, those are great recommendations Bill. Again, I want to thank you for joining us. 

This is Ed Maguire, Insights Partner at Momenta Partners, with another episode of our Edge podcasts, and our guest has been Bill Schmarzo, CTO of IoT and Analytics at Hitachi Vantara. Bill, thanks again for joining us. 

Ed, thanks very much. I’ve loved the conversation, a lot of fun. 


