The next BriefingsDirect big-data innovation case study highlights how Yellow Pages (YP) has developed a massive enterprise data warehouse with near real-time reporting capabilities that pulls oceans of data and information from across new and legacy sources.
We explore how YP then continuously delivers precise metrics to over half a million paying advertisers — many of them SMBs and increasingly through mobile interfaces — to best analyze and optimize their marketing and ad campaigns.
To learn more, BriefingsDirect recently sat down with Bill Theisinger, Vice President of Engineering for Platform Data Services at YP in Glendale, California. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.
Here are some excerpts:
Gardner: Tell us about YP, the digital arm of what people would have known as Yellow Pages a number of years ago. You’re all about helping small businesses become better acquainted with their customers, and vice versa.
Theisinger: YP is a leading local marketing solutions provider in the U.S., dedicated to helping local businesses and communities grow. We help connect local businesses with consumers wherever they are and whatever device they are on, desktop and mobile.
Gardner: As we know, the world has changed dramatically around marketing and advertising and connecting buyers and sellers. So in the digital age, being precise, being aware, being visible is everything, and that means data. Tell us about your data requirements in this new world.
Theisinger: We need to be able to capture how consumers interact with our customers, and that includes where they interact, whether it’s on a mobile device or a web device, and also within our network of partners. We reach about 100 million consumers across the U.S., and we do that through both our YP network and our partner network.
Gardner: Tell us too about the evolution. Obviously, you don’t build out data capabilities and infrastructure overnight. Some things are in place, and you move on, you learn, adapt, and you have new requirements. Tell us your data warehouse journey.
Needed to evolve
Theisinger: Yellow Pages saw the shift of their print business moving heavily online and becoming heavily digital. We needed to evolve with that, of course. In doing so, we needed to build infrastructure around the systems that we were using to support the businesses we were helping to grow.
And in doing that, we started to take a look at what the systems requirements were for us to be able to report and message value to our advertisers. That included understanding where consumers were looking, what we were impressing to them, what businesses we were showing them when they searched, what they were clicking on, and, ultimately, what businesses they called. We track all of those different metrics.
When we started this adventure, we didn’t have the technology and the capabilities to be able to do those things. So we had to reinvent our infrastructure. That’s what we did.
Gardner: And as we know, getting more information to your advertisers to help them in their selection and spending expertise is key. It differentiates companies. So this is a core proposition for you. This is at the heart of your business.
Given the mission criticality, what are the requirements? What did you need to do to get that reporting, that warehouse capability?
Theisinger: We need to be able to scale to the size of our network and the size of our partner network, which means no click left behind, if you will, no impression untold, no search unrecognized. That’s billions of events we process every day. We needed to look at something that would help us scale. If we added a new partner, if we expanded the YP network, if we added hundreds, thousands, tens of thousands of new advertisers, we needed the infrastructure to be able to help us do that.
Gardner: I understand that you’ve been using Hadoop. You might be looking at other technologies as they emerge. Tell us about your Hadoop experience and how that relates to your reporting capabilities.
Theisinger: When I joined YP, Hadoop was a heavy buzz product in the industry. It was a proven product for helping businesses process large amounts of unstructured data. However, it still poses a problem. That unstructured data needs to be structured at some point, and it’s that structure that you report to advertisers and report internally.
That’s how we decided that we needed to marry two different technologies — one that will allow us to scale a large unstructured processing environment like Hadoop and one that will allow us to scale a large structured environment like Hewlett Packard Enterprise (HPE) Vertica.
Gardner: How has this impacted your business, now that you’ve been able to do this and it’s been in the works for quite a while? Any metrics of success or anecdotes that can relate back to how the people in your organization are consuming those metrics and then extending that as service and product back into your market? What has been the result?
Theisinger: We have roughly 10,000 jobs that we run every day, both to process data and also for analytics. That data represents about five to six petabytes of data that we’ve been able to capture about consumers, their behaviors, and activities. So we process that data within our Hadoop environment. We then pass that along into HPE Vertica, structure it in a way that we can have analysts, product owners, and other systems retrieve it, pull and look at those metrics, and be able to report on them to the advertisers.
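As a rough illustration of the structuring step described above, and not YP's actual pipeline, raw event records can be rolled up into one fixed-column row per advertiser before being loaded into a reporting warehouse. The event fields, the `rollup` function, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical raw events; field names are assumptions for illustration only.
RAW_EVENTS = [
    {"advertiser_id": "a1", "type": "impression"},
    {"advertiser_id": "a1", "type": "click"},
    {"advertiser_id": "a2", "type": "impression"},
    {"advertiser_id": "a1", "type": "impression"},
    {"advertiser_id": "a2", "type": "call"},
]

# The metric columns every output row carries, even when the count is zero.
METRICS = ("impression", "click", "call")


def rollup(events):
    """Aggregate raw events into one structured row per advertiser."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["advertiser_id"]][event["type"]] += 1
    # Emit fixed-column rows, the shape a reporting table would expect.
    return [
        {"advertiser_id": adv, **{m: c[m] for m in METRICS}}
        for adv, c in sorted(counts.items())
    ]


rows = rollup(RAW_EVENTS)
```

At real scale this kind of aggregation would run as distributed jobs rather than in a single process; the sketch only shows the change in shape from loose events to reportable rows.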
Gardner: Is there an automation to this as you look to present more and better analytics on top of Vertica? What are you doing to make that customizable to people based on their needs, but at the same time, controlled and managed so that it doesn’t become unwieldy?
Theisinger: There is a lot of interaction between customers, both internal and external, when we decide how and what we’re going to present in terms of data, and there are a lot of ways we do that. We present data externally through an advertiser portal. So we want to make sure we work very closely with human factors and ergonomics (HFE) and the user experience (UX) designers as well as our advertisers, through focus groups, workshops, and understanding what they want to understand about the data that we present them.
Then, internally, we decide what would make sense and how we feel comfortable being able to present it to them, because we have a universe of a lot more data than what we probably want to show people.
We also do the same thing internally. We’ve been able to provide various internal teams, whether it’s sales, marketing, or finance, with insights into who’s clicking on various business listings, who’s viewing various businesses, who’s calling businesses, what their segmentation is, and what their demographics look like. That allows us a lot of analytical insight. We do most of that work through the analytics platform, which is, in this case, HPE Vertica.
Gardner: Now, that user experience is becoming more and more important. It wasn’t that long ago when these reports were going to people who were data scientists or equivalent, but now we’re taking them out to those 600,000 small businesses. Can you tell us a little bit about lessons learned when it comes to delivering an end analytics product, versus building out the warehouse? They seem to be interdependent, but we’re seeing more and more emphasis on that user experience these days.
Theisinger: You need to bridge the gap between analytics and just data storage and processing. So you have to present them in-state. This is what happens. It’s very descriptive of what’s going on, and we try to be a little bit more predictive when it comes to the way we want to do analysis at YP. We’re looking to go beyond just descriptive analytics.
What has also changed is the platform by which you present the data. It’s going highly mobile. Small businesses need to be able to just pick up their mobile device and look at the effectiveness of their campaigns with YP. They’re able to do that through a mobile platform we’ve built called YP for Merchants.
They can log in and see their metrics that are core to their business and how those campaigns are performing. They can even see some details, like if they missed a phone call and they want to be able to reach back out to a consumer and see if they need to help, solve a problem, or provide a service.
Gardner: And given that your developers had to go through the steps of creating that great user experience and taking it to the mobile tier, was there anything about HPE Vertica, your warehouse, or your approach to analytics that made that development process easier? Is there an approach to delivering this from a developer perspective that you think others might learn from?
Theisinger: There is, and it takes a lot more people than just the analytics team in my group or the engineers in my team. It’s a lot of other teams within YP that build this. But first and foremost, people want to see the data as real time and as near real time as they can.
When a small business relies on contact from customers, we track those calls. When a potential customer calls a small business and that small business isn’t able to actually get to the call or respond to that customer because maybe they are on a job, it’s important to know that that call happened recently. It’s important for that small business to reach back out to the consumer, because that consumer could go somewhere else and get that service from a competitor.
To be able to do that as quickly as possible is a hard-and-fast requirement. So processing the data as quickly as you can and presenting it as quickly as you can, on a mobile device in this case, is definitely paramount to making that a success.
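The missed-call scenario above can be sketched as a simple recency filter. This is a minimal illustration under assumed names, not how YP for Merchants is implemented: the call-log fields, the `recent_missed_calls` function, and the 30-minute window are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical call log; the field names are assumptions for illustration.
CALLS = [
    {"caller": "555-0101", "at": datetime(2016, 3, 1, 9, 40), "answered": False},
    {"caller": "555-0102", "at": datetime(2016, 3, 1, 9, 55), "answered": True},
    {"caller": "555-0103", "at": datetime(2016, 3, 1, 9, 58), "answered": False},
    {"caller": "555-0104", "at": datetime(2016, 3, 1, 8, 0), "answered": False},
]


def recent_missed_calls(calls, now, window=timedelta(minutes=30)):
    """Return unanswered calls placed within `window` before `now`, newest first."""
    missed = [
        c for c in calls
        if not c["answered"] and timedelta(0) <= now - c["at"] <= window
    ]
    return sorted(missed, key=lambda c: c["at"], reverse=True)


now = datetime(2016, 3, 1, 10, 0)
# Numbers the business should call back, most recent first.
callbacks = [c["caller"] for c in recent_missed_calls(CALLS, now)]
```

The value of such a list depends on how fresh the underlying data is, which is why the processing delay between the call and the report matters so much here.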
Gardner: I’ve spoken to a number of people over the years and one of the takeaways I get is that infrastructure is destiny. It really seems to be the case in your business that having that core infrastructure decision process done correctly has now given you the opportunity to scale up, be innovative, and react to the market. I think it’s also telling that, in this data-driven decade that we’ve been in for a few years now, the whole small business sector of the economy is a huge part of our overall productivity and growth as an economy.
Any thoughts, generally, about making infrastructure decisions for the long run, decisions you won’t regret, decisions that can scale over time and are future proof?
Theisinger: Yeah, speaking about what I’ve seen through the job that we’ve had here at YP, we reach over half a million paying advertisers. The shift is happening from just telling the advertisers what’s happened to helping them actually drive new business.
So it’s around the fact that I know who my customers are now, how do I find more of them, or how do I reach out to them, how do I market to them? That’s where the real shift is. You have to have a really strong scalable and extensible platform to be able to answer that question. Having the right infrastructure puts you in the position to be able to do that. That’s where businesses are going to end up growing, whether it’s ours or small businesses.
And our success hinges on whether or not we can get these small businesses to grow. So we are definitely 100 percent focused on trying to make that happen.
Gardner: It’s also telling that you’ve been able to adjust so rapidly. Obviously, your business has been around for a long time. People are very familiar with the Yellow Pages, the actual physical product, but you’ve gone on to make software so core to your value and your differentiation. I’m impressed and I commend you on being able to make that transition fairly rapidly.
Theisinger: Yeah, well thank you. We’ve invested a lot in the people within the technology team we have there in Glendale. We’ve built our own internal search capabilities, our own internal products. We’ve pulled a lot of good core talent from other companies.
I used to work at Yahoo with other folks, and YP is definitely focused on trying to make this transition a successful one, but we have our eye on our heritage. Over a hundred years of being very successful in the print business is not something you want to turn your back on. You want to be able to embrace that, and we’ve learned a lot from it, too.
So we’re right there with small businesses. We have a very large sales force, which is also very powerful and helpful in making this transition a success. We’ve leaned on all of that and we’ve become one big kind of happy family, if you will. We’ve all worked very closely together to make this transition successful.