Data Science Digest — 28.04.21


Time Series Anomaly Detection with PyCaret

In this step-by-step tutorial, Moez Ali explains how to detect anomalies in time series data using PyCaret’s Unsupervised Anomaly Detection Module. He covers what anomaly detection is and its business use cases, then demonstrates how to train and evaluate anomaly detection models with PyCaret, label anomalies, and analyze the results.
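For a flavor of the workflow, here is a rough sketch on made-up data. It uses scikit-learn’s IsolationForest directly, which is the algorithm behind PyCaret’s 'iforest' model, rather than PyCaret itself; the feature engineering step is a simplified stand-in for what the tutorial does in PyCaret’s setup:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic time series with a few injected spikes (hypothetical data).
rng = np.random.default_rng(0)
values = rng.normal(loc=10.0, scale=1.0, size=200)
values[[50, 120, 180]] += 8.0  # inject obvious anomalies

df = pd.DataFrame({"value": values})
# A simple lag feature, loosely mirroring PyCaret's setup step.
df["lag_1"] = df["value"].shift(1).bfill()

# IsolationForest backs PyCaret's 'iforest' anomaly model.
model = IsolationForest(contamination=0.02, random_state=0)
# fit_predict returns -1 for anomalies, 1 for normal points.
df["anomaly"] = (model.fit_predict(df[["value", "lag_1"]]) == -1).astype(int)

print(df["anomaly"].sum())  # number of points flagged as anomalous
```

In PyCaret the same steps collapse into `setup`, `create_model('iforest')`, and `assign_model`, with plotting and comparison utilities on top.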


Zero-Shot Learning: Can You Classify an Object Without Seeing It Before?

Developing machine learning models that can make predictions on data they have never seen before has become an important research area called zero-shot learning. Humans tend to be remarkably good at recognizing things they have never encountered, and zero-shot learning offers a possible path toward mimicking this powerful capability.
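One common formulation is attribute-based zero-shot classification: a model predicts semantic attributes, and the class whose attribute description best matches the prediction is chosen, even if that class contributed no training examples. A minimal NumPy sketch, with entirely made-up attribute vectors:

```python
import numpy as np

# Hypothetical semantic descriptions: (has_stripes, has_four_legs, can_fly).
class_attributes = {
    "zebra": np.array([1.0, 1.0, 0.0]),  # assume: unseen at training time
    "horse": np.array([0.0, 1.0, 0.0]),
    "eagle": np.array([0.0, 0.0, 1.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def zero_shot_classify(predicted_attributes):
    # Pick the class whose description best matches the predicted attributes,
    # regardless of whether that class appeared in the training set.
    return max(class_attributes,
               key=lambda c: cosine(predicted_attributes, class_attributes[c]))

# Suppose an attribute predictor trained only on horses and eagles
# outputs these scores for a photo of a zebra:
pred = np.array([0.9, 0.8, 0.1])
print(zero_shot_classify(pred))  # → zebra
```

Modern approaches replace the hand-built attribute table with learned embeddings (e.g. of class names), but the matching step is the same idea.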


Shedding Light on Fairness in AI with a New Data Set

Bias and fairness in AI are hotly debated topics. To address the problem, Facebook AI has created Casual Conversations, a new dataset of 45,186 videos of participants having unscripted conversations, to help AI researchers evaluate the fairness of their computer vision and audio models across subgroups of age, gender, apparent skin tone, and ambient lighting.


Maximum Likelihood vs. Bayesian Estimation

If you have ever looked for a hands-on tutorial comparing Maximum Likelihood and Bayesian Estimation, Lulu Ricketts has done the job for you. In this article, she provides a clear comparison of the two parameter estimation methods, from basic theory to prediction specifics to major use cases and scenarios.
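The core contrast fits in a few lines with a coin-flip example (made-up data, not taken from the article): maximum likelihood gives a single point estimate, while a conjugate Beta prior yields a full posterior in closed form.

```python
import numpy as np

# Hypothetical data: 10 coin flips, 7 heads.
flips = np.array([1, 1, 1, 0, 1, 0, 1, 1, 0, 1])
n, heads = len(flips), int(flips.sum())

# Maximum likelihood: a point estimate, here just the sample proportion.
theta_mle = heads / n  # 7/10 = 0.7

# Bayesian estimation with a conjugate Beta(alpha, beta) prior:
# the posterior is Beta(alpha + heads, beta + tails), in closed form.
alpha, beta = 2.0, 2.0  # weakly informative prior
post_a, post_b = alpha + heads, beta + (n - heads)
theta_posterior_mean = post_a / (post_a + post_b)  # (2+7)/(2+7+2+3) = 9/14

print(theta_mle, round(theta_posterior_mean, 3))
```

Note how the prior pulls the Bayesian estimate (≈0.643) toward 0.5 relative to the MLE (0.7), and how the effect shrinks as more data arrives.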


Face Detection Tips, Suggestions, and Best Practices

In this tutorial, Adrian Rosebrock and the PyImageSearch team continue to explore the topic of face detection. You will learn their tips, suggestions, and best practices for achieving high face detection accuracy with OpenCV and dlib. Though the tutorial is mostly theoretical, it includes code samples and plenty of useful links.


Transferable Visual Words: Exploiting the Semantics of Anatomical Patterns for Self-supervised Learning

In this paper, Fatemeh Haghighi and her co-authors introduce "transferable visual words" (TransVW), a concept designed to improve annotation efficiency for deep learning in medical image analysis. Learn about the team’s extensive experiments and the advantages TransVW has demonstrated. The code, pre-trained models, and curated visual words have all been released.


VideoGPT: Video Generation using VQ-VAE and Transformers

In this research paper, Wilson Yan et al. present VideoGPT, a simple architecture for scaling likelihood-based generative modeling to natural videos. Despite its simplicity, it can generate samples competitive with advanced GAN models for video generation, as well as high-fidelity natural images from UCF-101 and the Tumblr GIF Dataset (TGIF). The code is available on GitHub. You can also check out the demo here.


Towards Open-World Text-Guided Face Image Generation and Manipulation

In this work, Weihao Xia et al. propose a unified framework for both face image generation and manipulation that produces diverse, high-quality images at an unprecedented 1024×1024 resolution from multimodal inputs. The method supports open-world scenarios, covering both image and text inputs, without any re-training, fine-tuning, or post-processing. You can review the code on GitHub and look into the dataset here.


FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

In this paper, Tai Wang et al. tackle monocular 3D object detection for autonomous driving. Their fully convolutional single-stage detector, FCOS3D, took first place among vision-only methods in the nuScenes 3D detection challenge at NeurIPS 2020.


MBRL-Lib: A Modular Library for Model-based Reinforcement Learning

In this research work, Luis Pineda et al. present MBRL-Lib, a new PyTorch-based machine learning library for model-based reinforcement learning in continuous state-action spaces. MBRL-Lib serves both researchers, who can easily develop, debug, and compare new algorithms, and non-expert users, for whom it lowers the bar to deploying existing ones.


Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

This book by Christoph Molnar is a deep dive into making supervised machine learning models interpretable. You’ll start with the concepts of interpretability and simple, interpretable models such as decision trees, decision rules, and linear regression. Then you’ll move on to general model-agnostic methods for interpreting black box models, such as feature importance and accumulated local effects, and to explaining individual predictions with Shapley values and LIME. The book focuses on ML models for tabular data, with less coverage of computer vision and natural language processing tasks. It is recommended for machine learning practitioners, data scientists, statisticians, and anyone else interested in making machine learning models interpretable.
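As a taste of the model-agnostic methods the book covers, permutation feature importance can be computed in a few lines with scikit-learn. This is a sketch on synthetic data, not an example from the book: shuffle one column at a time and measure how much the model’s score drops.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic tabular data: 3 informative features, 2 pure noise.
X, y = make_classification(n_samples=300, n_features=5, n_informative=3,
                           n_redundant=0, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X, y)

# Permute each feature in turn; a large drop in score means the model
# relies on that feature. Model-agnostic: works for any fitted estimator.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for i, imp in enumerate(result.importances_mean):
    print(f"feature {i}: {imp:.3f}")
```

The noise columns should score near zero, while the informative ones dominate; Shapley values and LIME, also covered in the book, refine this global picture down to individual predictions.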