Tesla

Software Engineer / Data Scientist, Fleet Analytics

April 1, 2021

Shanghai, CN


The Role
Data is deeply embedded in the product and engineering culture at Tesla. We rely on data, and lots of it, to improve Autopilot, optimize hardware designs, proactively detect faults, and balance load on the electrical grid. We collect data from each of our cars, Superchargers, and stationary batteries, and we use it to make these products better and our customers safer.
We're the Fleet Analytics team, a small but fast-growing central team that helps many other teams leverage the data we collect. We support engineers directly by doing data analysis for them, and indirectly by building applications and tools so they can self-serve those analyses in the future. To do so, we leverage the internal big data platform built on HDFS, Spark, Presto, and HBase, along with data science tools such as Jupyter notebooks, Pandas, Bokeh, Superset, and Airflow.
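To give a flavor of the kind of ad-hoc analysis described above, here is a minimal sketch using Pandas. The data frame, column names, and fault-rate metric are purely illustrative; in practice the data would come from the internal platform (e.g., a Presto query) rather than an inline sample.

```python
import pandas as pd

# Hypothetical fleet telemetry sample (illustrative only).
telemetry = pd.DataFrame({
    "vehicle_id": ["V1", "V1", "V2", "V2", "V3"],
    "component": ["battery", "charger", "battery", "battery", "charger"],
    "fault": [0, 1, 1, 0, 0],
})

# Ad-hoc analysis: observed fault rate per component across the fleet.
fault_rate = telemetry.groupby("component")["fault"].mean()
print(fault_rate)
```

An analysis like this would typically start in a Jupyter notebook and later be promoted into a scheduled pipeline (e.g., via Airflow) feeding a dashboard.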
We're looking for a talented engineer to join us as a foundational member of the team and provide leadership in defining and implementing the processes and tools that enable data science at Tesla. Your work will affect many hundreds of Tesla engineers daily and improve the functionality of our cars, chargers, and batteries worldwide.
Half of your time will be dedicated to hands-on data analysis for the Reliability Engineering team, while the other half will be spent building data pipelines, tools, and applications to automate those analyses. The Reliability Engineering team works with many stakeholders to ensure the reliability of our cars: during the design phase, they help build test plans based on damage models; once cars are out in the real world, they evaluate actual reliability and inform the next generation of car components. By working with them, you will have direct input into design decisions and risk assessment for some of the most important products driving the world's renewable energy transition.
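As a rough illustration of the life-data analysis this role supports, a two-parameter Weibull fit on component lifetimes might look like the sketch below. The simulated data, parameter values, and B10 metric are illustrative assumptions, not Tesla's actual data or methods; real fleet life data would also involve censored observations, which this sketch omits.

```python
import numpy as np
from scipy import stats

# Hypothetical component lifetimes in days (simulated, illustrative only).
rng = np.random.default_rng(42)
lifetimes = stats.weibull_min.rvs(c=1.5, scale=500.0, size=200, random_state=rng)

# Fit a two-parameter Weibull (location fixed at 0), a common model
# for time-to-failure data in reliability engineering.
shape, loc, scale = stats.weibull_min.fit(lifetimes, floc=0)

# B10 life: the time by which 10% of units are expected to have failed.
b10 = stats.weibull_min.ppf(0.10, shape, loc=loc, scale=scale)
print(f"shape={shape:.2f} scale={scale:.1f} B10={b10:.1f} days")
```

Estimates like the B10 life are the kind of output that can feed directly into test plans and risk assessments during the design phase.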
Responsibilities
  • Work with stakeholders to take a vague problem statement, refine the scope of the analysis, and use the results to drive informed decisions
  • Write reproducible data analysis over petabytes of data using cutting-edge open source technologies
  • Understand and apply reliability concepts in your data analysis
  • Summarize and clearly communicate data analysis assumptions and results
  • Build data pipelines to promote your ad-hoc data analyses into production dashboards that engineers can rely on
  • Design and implement metrics, applications and tools that will enable engineers by allowing them to self-serve their data insights
  • Work with engineers to drive usage of your applications and tools
  • Write clean and tested code that can be maintained and extended by other software engineers
  • Operate and support your production applications
  • Keep up to date on relevant technologies and frameworks, and propose new ones that the team could leverage
  • Identify trends, invent new ways of looking at data, and get creative in order to drive improvements in both existing and future products
  • Give talks, contribute to open source projects, and advance data science on a global scale

Requirements
  • Strong proficiency in Python and SQL
  • Strong foundation in statistics
  • Experience building data visualizations
  • Experience writing software in a professional environment
  • Strong verbal and written communication skills
  • Strong problem-solving skills to help refine problem statements and figure out how to solve them with the available data
  • Smart but humble, with a bias for action

Preferred Qualifications
  • Experience with data science tools such as Pandas, NumPy, R, MATLAB, or Octave
  • Experience building data pipelines
  • Experience building web applications
  • Experience building machine learning models in a professional environment
  • Experience with continuous integration and continuous deployment (CI/CD)
  • Experience in DevOps (e.g., Linux, Ansible, Docker, Kubernetes)
  • Understanding of reliability concepts (Weibull, Lognormal, Exponential, etc.), life data (or survival) analysis, and reliability modeling
  • Understanding of distributed computing, e.g., how HDFS, Spark, and Presto work
  • Proficiency in Scala