The next BriefingsDirect analytics innovation case study interview explores how Zynga in San Francisco exploits big-data analytics to improve its business via a culture of pervasive, rapid analytics and experimentation.
To learn more about how big data impacts Zynga in the fast-changing and highly competitive mobile gaming industry, BriefingsDirect sat down with Joanne Ho, Senior Engineering Manager at Zynga, and Yuko Yamazaki, Head of Analytics at Zynga. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.
Here are some excerpts:
Gardner: How important is big data analytics to you as an organization?
Ho: To Zynga, big data is very important. It’s a main piece of the company, and as part of the analytics department, big data serves the entire company as a source for understanding our users’ behavior, our players, what they like, and what they don’t like about games. We use this data to analyze users’ behavior, and we also personalize a lot of different game models to fit each user’s play pattern.
Gardner: What’s interesting to me about games is that people will not only download them, but that they’re upgradable, changeable. People can easily move. So the feedback loop between the inferences, information, and analysis you gain from your users’ actions is rather compressed, compared to many other industries.
What is it that you’re able to do in this rapid-fire development-and-release process? How is that responsiveness important to you?
Ho: Real-time analysis, of course, is critical, and we have our streaming system that can do it. We have our monitoring and alerting system that can alert us whenever we see any drops in users’ install rates or in daily active users (DAU). The game studio will be alerted and will take appropriate action on that.
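The kind of DAU drop alert Ho describes can be illustrated with a minimal sketch. This is not Zynga's system: the function name `dau_drop_alert`, the trailing-mean baseline, and the 20 percent threshold are all assumptions chosen for illustration.

```python
from statistics import mean


def dau_drop_alert(history, today, threshold=0.2):
    """Return True when today's DAU is more than `threshold` (as a fraction)
    below the mean of the trailing `history` values.

    Hypothetical helper: the baseline and threshold are illustrative
    assumptions, not details taken from the interview.
    """
    if not history:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(history)
    if baseline <= 0:
        return False  # avoid dividing by zero on an empty game
    drop = (baseline - today) / baseline
    return drop > threshold


# Made-up daily active user counts for the previous seven days.
last_week = [1000, 1020, 980, 1010, 990, 1005, 995]
print(dau_drop_alert(last_week, 700))  # a 30 percent drop triggers the alert
print(dau_drop_alert(last_week, 960))  # a 4 percent dip does not
```

A production alerting system would likely also account for weekly seasonality and noise, which this sketch ignores.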
Gardner: Yuko, what sort of datasets are we talking about? If we’re going to the social realm, we can get some very large datasets. What’s the volume and scale we’re talking about here?
Yamazaki: We get data on everything that happens in our games. Almost every single play gets tracked into our system. We’re talking about 40 billion to 60 billion rows a day, and that’s data that our game product managers and development engineers have already decided they want to analyze later. So it’s already being structured and compressed as it comes into our database.
Gardner: That’s very impressive scale. It’s one thing to have a lot of data, but it’s another to be able to make that actionable. What do you do once that data is assembled?
Yamazaki: The biggest success story that I will normally tell about Zynga is that we make data available to all employees. From day one, as soon as you join Zynga, you get to see all the data through our visualization tools, whatever we have. Even if you’re a FarmVille product manager, you get to see what Poker is doing, making it more transparent. There is an account report that you can just click and see how many people have done this particular game action, for example. That’s how we were able to create this data-driven culture for Zynga.
Gardner: And Zynga is not all that old. Is this data capability something that you’ve had right from the start, or did you come into it over time?
Yamazaki: Since we began Poker and Words With Friends, our cluster scaled 70 times.
Ho: It started off with three nodes, and we’ve grown to a 230-node cluster.
Gardner: So you’re performing the gathering of the data and analysis in your own data centers?
Gardner: When you realized the scale and the nature of your task, what were some of the top requirements you had for your cluster, your database, and your analytics engine? How did you make some technology choices?
Yamazaki: When Zynga was growing, our main focus was to build something that was going to be able to scale and provide the data as fast as possible. Those were the two biggest points that we had in mind when we decided to create our analytics infrastructure.
Gardner: And any other more detailed requirements in terms of the type of database or the type of analytics engine?
Yamazaki: Those are two big ones. As I mentioned, we wanted to have everyone be able to access the data. So SQL would have been a great technology to have. It’s easier to train PMs on SQL than on engineering-side tools, for example, MapReduce for Hadoop. Those were the three key points as we selected our database.
Gardner: What are the future directions and requirements that you have? Are there things that you’d like to see from HP, for example, in order to continue to be able to do what you do at increasing scale?
Ho: We’re interested in real-time analytics. There’s a function, aggregated projection, that we’re interested in trying. Also Flex Tables [in HP Vertica] sounds like a very interesting feature that we will also attempt to try. And cloud analytics is the third one that we’re interested in. We hope HP will mature it, so that we can also test it out in the future.
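To illustrate why SQL lowers the barrier for product managers, here is a minimal sketch of the kind of report mentioned earlier (how many people have done a particular game action). It uses Python's built-in `sqlite3` as a stand-in database; the table `game_events`, its columns, and the helper `count_players_by_action` are hypothetical and are not Zynga's schema or tooling.

```python
import sqlite3


def count_players_by_action(conn, game):
    """Return {action: distinct player count} for one game.

    The table and column names are illustrative assumptions.
    """
    rows = conn.execute(
        "SELECT action, COUNT(DISTINCT player_id) "
        "FROM game_events WHERE game = ? "
        "GROUP BY action ORDER BY action",
        (game,),
    ).fetchall()
    return dict(rows)


# Build a tiny in-memory table with made-up events.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE game_events (game TEXT, player_id INTEGER, action TEXT)"
)
conn.executemany(
    "INSERT INTO game_events VALUES (?, ?, ?)",
    [
        ("farm", 1, "plant"),
        ("farm", 1, "plant"),    # repeat by the same player counts once
        ("farm", 2, "plant"),
        ("farm", 2, "harvest"),
        ("poker", 3, "bet"),
    ],
)
print(count_players_by_action(conn, "farm"))
```

The whole report is a single declarative query, which is the property that makes it teachable to non-engineers compared with writing a MapReduce job.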
Gardner: While your analytics has been with you right from the start, you were early in using Vertica?
Gardner: So now that we’ve determined how important it is, do you have any metrics of what this is able to do for you? Other organizations might be saying that they don’t have as much of a data-driven culture as Zynga, but would like to, and they realize that the technology can now ramp up to such incredible volume and velocity. What do you get back? How do you measure success when you do big-data analytics correctly?
Yamazaki: Internally, we look at adoption of systems. We have 2,000 employees, and at least 1,000 are using our visualization tool on a daily basis. This is the way to measure adoption of our systems internally.
Externally, the biggest metric is retention. Are players coming back and, if so, was that through the data that we collect? Were we able to do personalization so that they’re coming back because of the experience they’ve had?
Gardner: These are very important to your business, obviously, and I’m curious about that buy-in. As the saying goes, you can lead a horse to water, but you can’t make him drink. You can provide data analysis and visualization to the employees, but if they don’t find it useful and impactful, they won’t use it. So it’s interesting that you use that as a key performance indicator.
Any words of advice for other organizations who are trying to become more data-driven, to use analytics more strategically? Is this about people, process, culture, technology, all the above? What advice might you have for those seeking to better avail themselves of big data analytics?
Yamazaki: A couple of things. One is to provide end-to-end. So not just data storage, but also visualization. We also have an experimentation system, where I think we have about 400-600 experiments running as we speak. We have a report that shows that you ran this experiment, that all these metrics moved because of your experiment, and that A is better than B.
We run this other experiment, and there’s a visualization you can use to see that data. So providing that end-to-end data and analytics to all employees is one of the biggest pieces of advice I would provide to any company.
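A report that says "A is better than B" needs some way to separate a real difference from noise. The interview does not say which method Zynga uses; the sketch below applies a standard two-proportion z-test as one common choice. The function name `compare_variants` and the sample numbers are hypothetical.

```python
from math import erf, sqrt


def compare_variants(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test for an A/B experiment.

    conv_* is the number of players who hit the target metric
    (for example, returned the next day) and n_* is the group size.
    Returns (difference in rates, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; test is undefined")
    z = (p_a - p_b) / se
    # Standard normal CDF via erf, then a two-sided tail probability.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return p_a - p_b, z, p_value


# Made-up numbers: 12% of group A returned versus 10% of group B.
lift, z, p_value = compare_variants(1200, 10_000, 1000, 10_000)
verdict = "A is better than B" if p_value < 0.05 and lift > 0 else "no clear winner"
print(f"lift={lift:.3f} z={z:.2f} p={p_value:.2g} -> {verdict}")
```

With hundreds of experiments running at once, a real system would also need to correct for multiple comparisons, which this sketch leaves out.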
One more thing is to try to get one good win. If you focus too much on technology or scalability, you might be building a battleship when you actually don’t need it yet. Incremental improvement is probably going to take you to the place that you need to get to. Just try to get a good big win of increasing installs or active users in one particular game or product and see where it goes.
Gardner: And just to revisit the idea that you’ve got so many employees and so many innovations going on, how do you encourage your employees to interact with the data? Do you give them total flexibility in terms of experiments? How do they start the process of some of those proof-of-concept type of activities?
Yamazaki: It’s all freestyle. They can log whatever they want. They can see whatever they want, except revenue type of data, and they can create any experiments they want. Her team owns this part, but we also make the data available. Some of the games can hit real time. We can do that real-time personalization using the data that you logged. It’s almost 360 degrees of data availability to our product teams.
Gardner: It’s really impressive that there’s so much of this data mentality ingrained in the company, from the start and also across all the employees, so that’s very interesting. How do you see that in terms of your competitive edge? Do you think the other gaming companies are doing the same thing? Do you have an advantage that you’ve created a data culture?
Yamazaki: Definitely, in online gaming you have to have big data to succeed. A lot of companies, though, are just getting whatever they can, then structuring it, and making it analyzable. One of the things that we’ve done well was to make a structure to start with. So the data is already structured.
Product managers are already thinking about what they want to analyze beforehand. It’s not like they just get everything in and then see what happens. They think right away about, “Is this analyzable? Is this something we want to store?” We’re a lot smarter about what we want to store. Cost-wise, it’s a lot more optimized.