Leveraging Data Science
In the last several years, many leading companies have successfully integrated data science into their product operations and are reaping the benefits. But for most companies, these techniques remain foreign and intimidating. Some leaders at these companies understand the potential of the technology but don’t know how to start incorporating the thinking into their organization. Most, however, haven’t given it the consideration they should.
Data science is about harvesting the data we collect and using specific statistical techniques to glean new, previously unrealized insights. These techniques go beyond traditional data analysis in that the insights they provide are more often about the future than the past. Many data science techniques, such as clustering, regression, and machine learning, have been around for a while but are now becoming very powerful as they are combined with the tools of big data.
In building a data science capacity, it’s first helpful to realize that there are two primary ways to leverage the techniques within a product company: internal insights and customer value.
Data Science for Internal Insights: Data science techniques can help a company tune its product experience or other aspects of the business. In this way, it builds on a traditional data capability to provide insights that are often predictive. For example, a company with a freemium business model may use regression and clustering techniques to segment its users by their likelihood of becoming paying customers. Initially, product teams don’t know which user characteristics or behaviors are correlated with the desired outcome, but given enough data, the regression and clustering techniques may discover relationships. Armed with this information, a product team may tailor the user experience or marketing messages for each user based on their segment in order to drive more users to the desired behavior.
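To make the freemium example concrete, here is a minimal sketch of the regression half of that idea. It is an illustration under stated assumptions, not a recommended implementation: the two features (sessions per week, teammates invited), the eight sample users, the 0.3 and 0.7 segment cutoffs, and every function name are hypothetical and invented for this example. It fits a small logistic regression (one common choice of regression for a yes/no outcome) with plain gradient descent and then buckets users by their predicted likelihood to convert. Clustering is left out to keep the sketch short.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def standardize(rows):
    """Return (scaled rows, means, stds) so each feature has mean 0 and std 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds

def fit_logistic(rows, labels, lr=0.5, epochs=1000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def segment(probability):
    """Bucket a conversion probability; the cutoffs are arbitrary assumptions."""
    if probability >= 0.7:
        return "likely"
    if probability >= 0.3:
        return "undecided"
    return "unlikely"

# Made-up history: [sessions per week, teammates invited] and whether the user paid.
users = [[1, 0], [2, 0], [1, 1], [3, 0], [6, 2], [7, 3], [8, 2], [9, 4]]
converted = [0, 0, 0, 0, 1, 1, 1, 1]

scaled_users, means, stds = standardize(users)
weights, bias = fit_logistic(scaled_users, converted)

def conversion_probability(user):
    """Score a user with the fitted model, applying the same feature scaling."""
    scaled = [(v - m) / s for v, m, s in zip(user, means, stds)]
    return sigmoid(sum(w * xi for w, xi in zip(weights, scaled)) + bias)

for new_user in ([9, 4], [5, 1], [1, 0]):
    print(new_user, segment(conversion_probability(new_user)))
```

In practice a team would more likely reach for an established statistics library and far more data, but the shape is the same: learn from past behavior which signals go with conversion, score each user, and then treat each segment differently.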
Data Science for Customer Value: More and more, data science is helping companies provide actual customer-facing or customer-enabling value within their product. Here, it augments a company’s engineering capability by providing statistical and big-data functionality that becomes part of the actual product experience. An example here is an engine that uses historical data together with matching techniques to provide high-quality, personalized, in-product recommendations. Another is an anti-spam product that uses deep learning to categorize email.
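As a rough illustration of the recommendation example, the sketch below matches a user to other users with similar purchase histories and suggests items those similar users rated highly. This is one simple matching approach (cosine similarity between users, sometimes called user-based collaborative filtering), chosen here for brevity; it is not a claim about how any particular product does it. The user names, items, ratings, and function names are all hypothetical.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two {item: rating} dictionaries."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(history, user, top_n=2):
    """Suggest items the user has not seen, weighted by how similar each rater is."""
    scores = defaultdict(float)
    for other, items in history.items():
        if other == user:
            continue
        similarity = cosine(history[user], items)
        if similarity <= 0:
            continue  # ignore users with nothing in common
        for item, rating in items.items():
            if item not in history[user]:
                scores[item] += similarity * rating
    # Highest score first; break ties alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]

# Made-up historical ratings on a 1-5 scale.
history = {
    "ana": {"tent": 5, "stove": 4, "lantern": 5},
    "ben": {"tent": 5, "stove": 5, "sleeping bag": 4},
    "cy": {"novel": 5, "lantern": 2, "bookmark": 4},
    "dee": {"tent": 4, "stove": 4},
}

print(recommend(history, "dee"))  # prints ['lantern', 'sleeping bag']
```

Here "dee" shares tastes with "ana" and "ben" but has nothing in common with "cy", so only the camping items are suggested. A production engine would need to handle scale, sparse data, and new users with no history, which is where the engineering side of the capability comes in.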
When building out a new data science capability, here are a few tips to keep in mind:
- Organizationally, data scientists can sit anywhere. The two most common homes are the data analytics team and the engineering team.
- In the case of internal insights, data science builds on data analytics. If you don’t have a basic data infrastructure and analysis capability in place (data warehouse, product instrumentation, data analysts), start there before worrying about data science. Chances are that you are missing out on critical insights and basic reporting that don’t require the expertise, technology or data volumes of data science.
- In the case of customer value, if you suspect your product has an opportunity to take advantage of data science techniques but you’re not clear on what’s possible or what it will take, bring that expertise into the company as fast as possible. This technology is quickly becoming critical across a broad range of products and if you don’t already have a plan as to how it can be used with your own product, you’re probably already falling behind your competition.
- When bringing in a new data science hire, it’s critical that they demonstrate a deep interest in the product or business problems they’re solving. This is true for any product, design, or senior engineering role, but given the highly technical and sometimes arcane nature of data science, it can be tempting to hire someone who just knows the math. It’s true that you need someone who knows the techniques, but if they’re not passionate about the business or product problems they’re solving, you can end up with high-precision models that don’t contribute useful insights or customer value.
- It may not be necessary to hire a data scientist to get the job done. These skills may be learned by other roles like data analysts, engineers, or product managers. That said, someone trained in data science will have a broad knowledge of the tools and techniques out there and can spot opportunities that others may not.
- There is a difference between a data scientist and a data infrastructure engineer. While an early hire may do both, these are generally separate roles. The data scientist creates statistical models and the code that implements them. The data infrastructure engineer maintains the big data storage and tools used to operate the models (e.g. Hadoop, Hive). Data infrastructure engineers are often part of the site operations organization, as they are concerned with things like uptime, security, access, etc.
- Don’t silo the expertise. Regardless of your approach to getting the capability into the organization, find ways to spread the thinking throughout the product organization. That could mean regular company all-hands, write-ups, chalk talks, or embedding expertise on cross-functional teams. Resist the idea of “data science as a service team” and promote the idea of “data IQ” across the whole organization.
These are some of the most important considerations in bringing this expertise into your organization. Hopefully they will give you a place to start as you set your course.