Powerful reporting from YP’s data warehouse helps SMBs deliver the best ad campaigns

The next BriefingsDirect big-data innovation case study highlights how Yellow Pages (YP) has developed a massive enterprise data warehouse with near real-time reporting capabilities that pulls oceans of data and information from across new and legacy sources.

We explore how YP then continuously delivers precise metrics to over half a million paying advertisers — many of them SMBs and increasingly through mobile interfaces — to best analyze and optimize their marketing and ad campaigns.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

To learn more, BriefingsDirect recently sat down with Bill Theisinger, Vice President of Engineering for Platform Data Services at YP in Glendale, California. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about YP, the digital arm of what people would have known as Yellow Pages a number of years ago. You’re all about helping small businesses become better acquainted with their customers, and vice versa.

Hewlett Packard Enterprise
Vertica Community Edition

 Start Your Free Trial Now

Theisinger: YP is a leading local marketing solutions provider in the U.S., dedicated to helping local businesses and communities grow. We help connect local businesses with consumers wherever they are and whatever device they are on, desktop and mobile.


Gardner: As we know, the world has changed dramatically around marketing and advertising and connecting buyers and sellers. So in the digital age, being precise, being aware, being visible is everything, and that means data. Tell us about your data requirements in this new world.

Theisinger: We need to be able to capture how consumers interact with our customers, and that includes where they interact — whether it’s a mobile device or web device — and also within our network of partners. We reach about 100 million consumers across the U.S. and we do that through both our YP network and our partner network.

Gardner: Tell us too about the evolution. Obviously, you don’t build out data capabilities and infrastructure overnight. Some things are in place, and you move on, you learn, adapt, and you have new requirements. Tell us your data warehouse journey.

Needed to evolve

Theisinger: Yellow Pages saw the shift of their print business moving heavily online and becoming heavily digital. We needed to evolve with that, of course. In doing so, we needed to build infrastructure around the systems that we were using to support the businesses we were helping to grow.

And in doing that, we started to take a look at what the systems requirements were for us to be able to report and message value to our advertisers. That included understanding where consumers were looking, what we were impressing to them, what businesses we were showing them when they searched, what they were clicking on, and, ultimately what businesses they called. We track all of those different metrics.

When we started this adventure, we didn’t have the technology and the capabilities to be able to do those things. So we had to reinvent our infrastructure. That’s what we did.

Gardner: And as we know, getting more information to your advertisers to help them in their selection and spending expertise is key. It differentiates companies. So this is a core proposition for you. This is at the heart of your business.

Given the mission criticality, what are the requirements? What did you need to do to get that reporting, that warehouse capability?

Theisinger: We need to be able to scale to the size of our network and the size of our partner network, which means no click left behind, if you will, no impression untold, no search unrecognized. That’s billions of events we process every day. We needed to look at something that would help us scale. If we added a new partner, if we expanded the YP network, if we added hundreds, thousands, tens of thousands of new advertisers, we needed the infrastructure to be able to help us do that.

Gardner: I understand that you’ve been using Hadoop. You might be looking at other technologies as they emerge. Tell us about your Hadoop experience and how that relates to your reporting capabilities.

Theisinger: When I joined YP, Hadoop was a heavy buzz product in the industry. It was a proven product for helping businesses process large amounts of unstructured data. However, it still poses a problem. That unstructured data needs to be structured at some point, and it’s that structure that you report to advertisers and report internally.

That’s how we decided that we needed to marry two different technologies — one that will allow us to scale a large unstructured processing environment like Hadoop and one that will allow us to scale a large structured environment like Hewlett Packard Enterprise (HPE) Vertica.

Business impact

Gardner: How has this impacted your business, now that you’ve been able to do this and it’s been in the works for quite a while? Any metrics of success or anecdotes that can relate back to how the people in your organization are consuming those metrics and then extending that as service and product back into your market? What has been the result?

Theisinger: We have roughly 10,000 jobs that we run every day, both to process data and also for analytics. That data represents about five to six petabytes of data that we’ve been able to capture about consumers, their behaviors, and activities. So we process that data within our Hadoop environment. We then pass that along into HPE Vertica, structure it in a way that we can have analysts, product owners, and other systems retrieve it, pull and look at those metrics, and be able to report on them to the advertisers.
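As a rough illustration of the flow described here, the sketch below rolls loosely structured event lines up into one structured row per advertiser per day, the kind of table an analytics database could then serve to analysts. It is not YP’s actual pipeline: the JSON layout, the field names (`advertiser_id`, `date`, `type`), the sample data, and the `structure_events` helper are all assumptions made for the example, and it runs in plain Python rather than on Hadoop or Vertica.

```python
import json
from collections import Counter, defaultdict

# Hypothetical raw event log: one JSON object per line.
# Field names and values are invented for illustration.
RAW_EVENTS = """\
{"advertiser_id": "a1", "date": "2016-01-05", "type": "impression"}
{"advertiser_id": "a1", "date": "2016-01-05", "type": "click"}
{"advertiser_id": "a2", "date": "2016-01-05", "type": "impression"}
not valid json
{"advertiser_id": "a1", "date": "2016-01-05", "type": "call"}
"""

# The event kinds mentioned in the interview: searches, impressions, clicks, calls.
EVENT_TYPES = ("search", "impression", "click", "call")


def structure_events(lines):
    """Aggregate raw event lines into one row per (advertiser, day)."""
    counts = defaultdict(Counter)
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue  # skip valid JSON that is not an object
        key = (event.get("advertiser_id"), event.get("date"))
        event_type = event.get("type")
        if None in key or event_type not in EVENT_TYPES:
            continue  # skip events missing a key field or of an unknown type
        counts[key][event_type] += 1

    rows = []
    for (advertiser_id, date), counter in sorted(counts.items()):
        row = {"advertiser_id": advertiser_id, "date": date}
        # Counter returns 0 for event types that never occurred.
        row.update({t: counter[t] for t in EVENT_TYPES})
        rows.append(row)
    return rows


rows = structure_events(RAW_EVENTS.splitlines())
for row in rows:
    print(row)
```

Each resulting row has a fixed set of columns, which is what makes it straightforward to load into a columnar store and query per advertiser.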

Gardner: Is there an automation to this as you look to present more and better analytics on top of Vertica? What are you doing to make that customizable to people based on their needs, but at the same time, controlled and managed so that it doesn’t become unwieldy?

Theisinger: There is a lot of interaction between customers, both internal and external, when we decide how and what we’re going to present in terms of data, and there are a lot of ways we do that. We present data externally through an advertiser portal. So we want to make sure we work very closely with human factors and ergonomics (HFE) and the user experience (UX) designers as well as our advertisers, through focus groups, workshops, and understanding what they want to understand about the data that we present them.

Then, internally, we decide what would make sense and how we feel comfortable being able to present it to them, because we have a universe of a lot more data than what we probably want to show people.

We also do the same thing internally. We’ve been able to give various internal teams, whether it’s sales, marketing, or finance, insight into who’s clicking on various business listings, who’s viewing various businesses, who’s calling businesses, what their segmentation is, and what their demographics look like. That gives us a lot of analytical insight. We do most of that work through the analytics platform, which in this case is HPE Vertica.

Gardner: Now, that user experience is becoming more and more important. It wasn’t that long ago when these reports were going to people who were data scientists or equivalent, but now we’re taking them out to those 600,000 small businesses. Can you tell us a little bit about lessons learned when it comes to delivering an end analytics product, versus building out the warehouse? They seem to be interdependent, but we’re seeing more and more emphasis on that user experience these days.

Theisinger: You need to bridge the gap between analytics and just data storage and processing. So you have to present them in-state. This is what happens. It’s very descriptive of what’s going on, and we try to be a little bit more predictive when it comes to the way we want to do analysis at YP. We’re looking to go beyond just descriptive analytics.

What has also changed is the platform by which you present the data. It’s going highly mobile. Small businesses need to be able to just pick up their mobile device and look at the effectiveness of their campaigns with YP. They’re able to do that through a mobile platform we’ve built called YP for Merchants.

They can log in and see their metrics that are core to their business and how those campaigns are performing. They can even see some details, like if they missed a phone call and they want to be able to reach back out to a consumer and see if they need to help, solve a problem, or provide a service.

Developer perspective

Gardner: And given that your developers had to go through the steps of creating that great user experience and taking it to the mobile tier, was there anything about HPE Vertica, your warehouse, or your approach to analytics that made that development process easier? Is there an approach to delivering this from a developer perspective that you think others might learn from?

Theisinger: There is, and it takes a lot more people than just the analytics team in my group or the engineers in my team. It’s a lot of other teams within YP that build this. But first and foremost, people want to see the data in real time, or as near real time as they can.

When a small business relies on contact from customers, we track those calls. When a potential customer calls a small business and that small business isn’t able to actually get to the call or respond to that customer because maybe they are on a job, it’s important to know that that call happened recently. It’s important for that small business to reach back out to the consumer, because that consumer could go somewhere else and get that service from a competitor.

To be able to do that as quickly as possible is a hard-and-fast requirement. So processing the data as quickly as you can, and presenting it as quickly as you can, in this case on a mobile device, is paramount to making that a success.
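The missed-call scenario can be sketched in a few lines: given a list of tracked calls, pick out the unanswered ones placed within a recent window so the business can call back promptly. This is only an illustration of the idea. The record layout, the one-hour window, the sample data, and the `recent_missed_calls` helper are assumptions, not how YP for Merchants is implemented.

```python
from datetime import datetime, timedelta


def recent_missed_calls(calls, now, window=timedelta(hours=1)):
    """Return unanswered calls placed within `window` before `now`, newest first."""
    recent = [
        call
        for call in calls
        if not call["answered"]
        and timedelta(0) <= now - call["placed_at"] <= window
    ]
    return sorted(recent, key=lambda call: call["placed_at"], reverse=True)


# Hypothetical call records; numbers and timestamps are invented.
now = datetime(2016, 1, 5, 12, 0)
calls = [
    {"caller": "555-0101", "placed_at": datetime(2016, 1, 5, 11, 50), "answered": False},
    {"caller": "555-0102", "placed_at": datetime(2016, 1, 5, 11, 55), "answered": True},
    {"caller": "555-0103", "placed_at": datetime(2016, 1, 5, 9, 0), "answered": False},
    {"caller": "555-0104", "placed_at": datetime(2016, 1, 5, 11, 58), "answered": False},
]

to_return = recent_missed_calls(calls, now)
for call in to_return:
    print(call["caller"], call["placed_at"].isoformat())
```

The value of a list like this depends on how fresh the underlying call data is, which is why low-latency processing matters for this use case.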

Gardner: I’ve spoken to a number of people over the years and one of the takeaways I get is that infrastructure is destiny. It really seems to be the case in your business that having that core infrastructure decision process done correctly has now given you the opportunity to scale up, be innovative, and react to the market. I think it’s also telling that, in this data-driven decade that we’ve been in for a few years now, the whole small business sector of the economy is a huge part of our overall productivity and growth as an economy.

Any thoughts, generally, about making infrastructure decisions for the long run, decisions you won’t regret, decisions that can scale over time and are future proof?

Theisinger: Yeah, speaking from what I’ve seen through the work we’ve done here at YP, we reach over half a million paying advertisers. The shift is happening from just telling the advertisers what’s happened to helping them actually drive new business.

So it’s around the fact that I know who my customers are now, how do I find more of them, or how do I reach out to them, how do I market to them? That’s where the real shift is. You have to have a really strong scalable and extensible platform to be able to answer that question. Having the right infrastructure puts you in the position to be able to do that. That’s where businesses are going to end up growing, whether it’s ours or small businesses.

And our success hinges on whether or not we can get these small businesses to grow. So we are definitely 100 percent focused on trying to make that happen.

Gardner: It’s also telling that you’ve been able to adjust so rapidly. Obviously, your business has been around for a long time. People are very familiar with the Yellow Pages, the actual physical product, but you’ve gone to make software so core to your value and your differentiation. I’m impressed and I commend you on being able to make that transition fairly rapidly.

Core talent

Theisinger: Yeah, well thank you. We’ve invested a lot in the people within the technology team we have there in Glendale. We’ve built our own internal search capabilities, our own internal products. We’ve pulled a lot of good core talent from other companies.

I used to work at Yahoo with other folks, and YP is definitely focused on trying to make this transition a successful one, but we have our eye on our heritage. Over a hundred years of being very successful in the print business is not something you want to turn your back on. You want to be able to embrace that, and we’ve learned a lot from it, too.

So we’re right there with small businesses. We have a very large sales force, which is also very powerful and helpful in making this transition a success. We’ve leaned on all of that and we’ve become one big kind of happy family, if you will. We’ve all worked very closely together to make this transition successful.

Sponsor: Hewlett Packard Enterprise.

About Dana Gardner

Dana Gardner is president and principal analyst at Interarbor Solutions, an enterprise IT analysis, market research, and consulting firm. Gardner, a leading identifier of software and cloud productivity trends and new IT business growth opportunities, honed his skills and refined his insights as an industry analyst, pundit, and news editor covering the emerging software development and enterprise infrastructure arenas for the last 18 years. Gardner tracks and analyzes a critical set of enterprise software technologies and business development issues: Cloud computing, SOA, business process management, business intelligence, next-generation data centers, and application lifecycle optimization. His specific interests include Enterprise 2.0 and social media, cloud standards and security, as well as integrated marketing technologies and techniques. Gardner is a former senior analyst at Yankee Group and Aberdeen Group, and a former editor-at-large and founding online news editor at InfoWorld. He is a former news editor at IDG News Service, Digital News & Review, and Design News.