A sizable portion of a data scientist's day is often spent fetching and cleaning the data they need to train their algorithms. In this course, learn how to use Python tools and techniques to get the relevant, high-quality data you need. Instructor Miki Tebeka covers reading files, including how to work with CSV, XML, and JSON files. He also discusses calling APIs, web scraping (and why it should be a last resort), and validating and cleaning data. Plus, discover how to establish and track key performance indicators (KPIs) that help you monitor your data pipeline.

Topics include:
  • Describe the characteristics of different data types and the work of data scientists.
  • Describe different data serialization formats and explain how to use them in Python.
  • Define APIs and explain how to use them with Python to make HTTP calls, interpret JSON, and utilize message queues.
  • Explain what web scraping is and describe ways to do it.
  • Define what a schema is and describe characteristics of schemas and how they influence operations.
  • Describe the characteristics of different types of databases.
  • Categorize types of errors and explain how to correct them.
  • Explain design criteria for data systems and describe how to monitor performance using KPIs.

This course is available in English only. If that is not a problem for you, feel free to submit your request.