Building a data science team may seem like a daunting task, especially in a market where talent with practical experience is scarce but interest and buzz in the field are extremely high. Here are a few tips for building and running a successful data science team.

Find the Right People

What roles must you fill for a complete data science team? You will need to have a variety of people with different types of skills:

  • Data scientists who can work with large datasets and understand the theory behind the science. Most importantly, they need to be capable of developing predictive models that fit your business context.
  • Data engineers and software developers who understand architecture, infrastructure, and distributed programming.
  • Other roles include a data solutions architect, data platform administrator, full-stack developer, and designer.

Build the Right Processes

The key thing to consider with data science workflows is agility. The team needs the ability to access and work with data in real time. The team then needs to be able to understand business problems and opportunities in the company and implement data solutions that solve those problems or facilitate growth. Make sure they are not handcuffed to slow and tedious processes, as this will limit effectiveness and make it harder to retain top talent.

Finally, the team will need to have a good working relationship with heads of other departments, and clear executive support, so they can work together in agile multi-disciplinary teams to deploy solutions that really benefit the business and will ultimately be adopted by business users.

Choose the Right Platforms

When building a data science competency, it is essential to consider the platform your company is using. A range of options is available, from open source tools to paid services from major cloud providers and innovative startups.

We recommend you maintain some flexibility in your platforms because business and technology move fast, and you don’t want to tether your team to a tech stack that could limit their growth and flexibility. Hybrid architectures that use the right technologies for the right applications are ideal. Talented architects should be familiar with many different technologies and methods and understand how to select the right components for current and future use cases.

Take Your Time

Most importantly, don’t rush. Hurrying tends to lead to the wrong people, the wrong platforms, or missing processes. Take your time to create a team that will work well together, has complementary skills, understands your business, and can deliver successful outcomes that get adopted by the business.

Ensure the Team’s Success

Once you have assembled the right team, here are five things to keep in mind to maximize the impact they can have as they start building data-driven solutions that give you a competitive advantage:

Discoverability

Data science teams that are not practicing discoverability write scripts to solve different problems and do not publish them in a common place. Anyone who wants access to this work usually has to contact one of the data scientists directly and have them send it over in a presentation or Excel sheet. This wastes time for both the person asking and the data scientist, who has to devote time to re-delivering rather than innovating. A team that is successfully practicing discoverability publishes its work in a central location where everyone in the organization has access to it.

Automation

The difference between a data science team that does not focus on automation and one that does is quite simple: the team that does not focus on automation continuously produces results by hand instead of letting its models do the work. The team that focuses on automation spends its time maintaining the pipeline instead of manually re-running its workflow. While automation can take more time up front, it pays off in multiple ways when done successfully. Automated pipelines make it much easier to build the insights and outcomes from your team’s efforts into business processes, continuously increasing the ROI on your data science endeavors.
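As a rough illustration of what "maintaining the pipeline" can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the step names (drop_missing, add_margin, summarize), the revenue and cost fields, and the idea that the workflow is a simple chain of functions. A real team would more likely use a scheduler or workflow tool, but the principle is the same: the steps are written down once and re-run without manual work.

```python
from statistics import mean

# Hypothetical pipeline steps. Each one takes a list of row dicts
# and returns a new list, so steps can be chained in any order.

def drop_missing(rows):
    """Remove rows that have no revenue value."""
    return [r for r in rows if r.get("revenue") is not None]

def add_margin(rows):
    """Add a derived 'margin' field (revenue minus cost) to every row."""
    return [{**r, "margin": r["revenue"] - r["cost"]} for r in rows]

def summarize(rows):
    """Collapse the rows into a single summary record."""
    return [{"avg_margin": mean(r["margin"] for r in rows), "n": len(rows)}]

# The pipeline is just an ordered list of steps. Changing the workflow
# means editing this list, not repeating manual work.
PIPELINE = [drop_missing, add_margin, summarize]

def run_pipeline(rows, steps=PIPELINE):
    """Run every step in order, feeding each step's output to the next."""
    for step in steps:
        rows = step(rows)
    return rows

if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"revenue": 120.0, "cost": 80.0},
        {"revenue": None, "cost": 10.0},
        {"revenue": 50.0, "cost": 30.0},
    ]
    print(run_pipeline(sample))
```

When new data arrives, the same run_pipeline call produces the updated summary, so the team's effort goes into improving the steps rather than repeating them.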

Collaboration

A data science team that focuses on collaboration and consistency has a significant advantage over one that does not. Collaboration allows the strengths of individuals to help the group as a whole, and it is much easier to achieve when everyone writes code in a consistent way. Teams that lack a shared set of standards will have trouble collaborating and end up with individual quality standards, versioning habits, and coding styles. Collaborating with business stakeholders and users is also an important component of successful data science deployments. Great models are useless if no one can use them, users don’t trust them, or they were developed without the correct business context.

Empowerment

Data science teams that agree to use the same stack of tools are also better at discoverability and collaboration. The trick is to get the right tech stack for the needs of everyone on the team. A team that does not have a cohesive tech stack will suffer from an overabundance of data storage and analysis tools and a lack of collaborative cohesion. Empowering your teams with tools that make their jobs easier and facilitate collaboration and automation will set them up for success and improve job satisfaction.

Deployment

There is a big difference between a workflow being “in production” and “produced.” Work that is “in production” is still in progress, so failure is acceptable. Work that is “produced,” or finished, is work where failure is not acceptable. A good data science team will make sure the tools it releases can be trusted and used to benefit the stakeholders. They will not create things just because they can, instead focusing on the problems that actually need to be solved and making the results digestible and usable by the business.

Data Science as a Service

There are also many options for engaging external expert teams that can accelerate adoption of data science while also preparing your organization to grow its in-house capabilities.

The same principles apply to service providers and consulting teams. Make sure they are equipped to build continuous value for your organization, not just deliver one-time results.

Sources:

https://mapr.com/blog/how-build-data-science-team/

http://lineardigressions.com/episodes/2017/9/24/disciplined-data-science