Data Science Digest — 21.04.21

Hi All,

I’m pleased to invite you all to enroll in the Lviv Data Science Summer School and delve into advanced methods and tools of Data Science and Machine Learning across domains such as CV, NLP, Healthcare, Social Network Analysis, and Urban Data Science. The courses are practice-oriented and geared towards undergraduates, Ph.D. students, and young professionals (intermediate level). The school runs July 19–30 and will be hosted online. Make sure to apply; spots are filling up fast!

If you prefer daily updates, follow us on social media.


Dmitry Spodarets.


Paper Review: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Microsoft Research Asia has presented a new vision Transformer called Swin Transformer that can serve as a general-purpose backbone, much as CNNs do in computer vision and Transformers do in natural language processing. The author provides a detailed review of the paper, exploring the strengths and limitations of the new approach and the possibilities it offers for a unified architecture across CV and NLP tasks.


Weekly Awesome Tricks And Best Practices From Kaggle

Kaggle is a go-to destination for data scientists and ML engineers for a reason: it offers a wealth of resources and hosts competitions covering virtually every topic in the industry. But how do you get the most out of the platform? Check out this article for tips, tricks, and best practices on using Kaggle in a typical data science workflow.


Using PyTorch + NumPy? You’re Making a Mistake!

A revealing take on why bugs in open-source ML projects happen more often than they should. The author analyzed a hundred thousand GitHub repositories that import PyTorch to pin down a problem that plagues data scientists and ML engineers without their knowledge, one that even appears in PyTorch’s official tutorial, OpenAI’s code, and NVIDIA’s projects. Interested? Well, keep reading then!
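A well-known pitfall in this area, and likely the one the article dissects, is that a forked PyTorch DataLoader worker inherits a copy of NumPy’s global RNG state, so every worker produces identical “random” augmentations. A minimal pure-NumPy simulation of the effect (no PyTorch needed; the function and variable names here are ours, for illustration only):

```python
import numpy as np

def simulate_workers(num_workers=4, seed_per_worker=False):
    """Simulate forked DataLoader workers that each draw a 'random' augmentation.

    fork() copies the parent's global NumPy RNG state into every child,
    so without per-worker seeding all workers draw identical numbers.
    """
    np.random.seed(0)                      # parent process seeds once
    parent_state = np.random.get_state()   # the state that fork() would copy
    draws = []
    for worker_id in range(num_workers):
        np.random.set_state(parent_state)  # each child starts from the same copy
        if seed_per_worker:
            # The usual fix: re-seed inside each worker
            # (worker_init_fn in a real PyTorch DataLoader)
            np.random.seed(worker_id)
        draws.append(np.random.randint(0, 1_000_000))
    return draws

buggy = simulate_workers()                      # all workers draw the same value
fixed = simulate_workers(seed_per_worker=True)  # distinct draws per worker
```

In actual PyTorch code the standard remedy is passing a `worker_init_fn` that re-seeds NumPy per worker, or sticking to torch’s own RNG for augmentations.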


How Graph Neural Networks (GNN) Work: Introduction to Graph Convolutions from Scratch

The title of this one is quite self-explanatory: the author explores graph neural networks and graph convolutions, explaining how they work and how you can apply them in your own projects, both in theory and in practice. All points are illustrated with code for convenience.
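As a taste of what the article builds up to, a single graph-convolution layer reduces to a few matrix products: add self-loops to the adjacency matrix, normalize it by node degree, and mix neighbor features through a weight matrix. A minimal NumPy sketch of this standard formulation (names and the toy graph are ours, not the article’s):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph convolution: H = ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1)                 # node degrees
    D_inv_sqrt = np.diag(deg ** -0.5)       # symmetric degree normalization
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ X @ W, 0)    # aggregate neighbors, then ReLU

# Toy graph: 3 nodes in a path 0-1-2, with 2 features per node
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.random.randn(3, 2)   # node features
W = np.random.randn(2, 4)   # learnable weights, 2 -> 4 dims
H = gcn_layer(A, X, W)      # new node embeddings, shape (3, 4)
```

Stacking such layers lets each node aggregate information from progressively larger neighborhoods.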


Loading SQL Data into Pandas Without Running Out of Memory

How often do you run out of memory when processing a relational database with Pandas? In this article, the author explains what causes the issue and how to efficiently process larger-than-memory queries with Pandas. All problems and solutions are illustrated with code so that you can start playing with it immediately.
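The core trick such workflows rely on is chunked reads: `pd.read_sql_query` returns an iterator when you pass `chunksize`, so each chunk can be aggregated and discarded before the next is loaded. A minimal sketch against an in-memory SQLite database (the table and column names are illustrative, not from the article):

```python
import sqlite3
import pandas as pd

# Illustrative in-memory database standing in for a large table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?)",
                 [(float(i),) for i in range(10_000)])
conn.commit()

# chunksize turns read_sql_query into an iterator of DataFrames,
# so only one chunk is resident in memory at a time
total = 0.0
for chunk in pd.read_sql_query("SELECT amount FROM sales", conn, chunksize=1_000):
    total += chunk["amount"].sum()   # aggregate, then let the chunk be freed

conn.close()
```

The same pattern works for any aggregation that can be computed incrementally; for operations that need the whole result at once, the article’s other techniques come into play.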


Multi-Template Matching with OpenCV

In this step-by-step tutorial, you’ll learn how to perform multi-template matching with OpenCV, from configuring your development environment to analyzing the results of matching. The tutorial features a video so that you can easily follow the author and tweak your code as required to complete your first multi-template matching with OpenCV. 
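At its core, template matching slides each template over the image, scores every position, and keeps the locations that score well; the multi-template variant simply repeats this for several templates. A stripped-down NumPy version of the scoring loop, to show the mechanics (OpenCV’s `cv2.matchTemplate` does this far faster and offers several similarity metrics; the function names and the toy image below are ours):

```python
import numpy as np

def match_template(image, template):
    """Score map of summed squared differences: low score = good match."""
    ih, iw = image.shape
    th, tw = template.shape
    scores = np.empty((ih - th + 1, iw - tw + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            window = image[y:y + th, x:x + tw]
            scores[y, x] = np.sum((window - template) ** 2)
    return scores

def find_matches(image, templates, threshold=1e-6):
    """Multi-template matching: collect positions where any template fits."""
    hits = []
    for t_id, template in enumerate(templates):
        scores = match_template(image, template)
        ys, xs = np.where(scores <= threshold)
        hits.extend((t_id, int(y), int(x)) for y, x in zip(ys, xs))
    return hits

# Toy image containing two exact copies of a 2x2 pattern
image = np.zeros((6, 6))
patch = np.array([[1.0, 2.0], [3.0, 4.0]])
image[0:2, 0:2] = patch
image[4:6, 4:6] = patch
hits = find_matches(image, [patch])   # [(0, 0, 0), (0, 4, 4)]
```

A real pipeline would threshold the `cv2.matchTemplate` score map and then de-duplicate overlapping hits with non-maximum suppression, which is the part the tutorial walks through in detail.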


OpenCV Face Detection with Haar Cascades

Face detection is one of the most popular Computer Vision use cases (at least, as perceived by the general public). Learning how to use OpenCV and Haar Cascades can be essential if you want to go deeper into the field, and this detailed tutorial provides a fresh and easy start for new learners. Just follow the instructions step by step and see the results in action.


OpenCV Haar Cascades

Now that you know the basics of face detection using OpenCV and Haar Cascades, let’s look into it in more detail and find out how to apply OpenCV Haar Cascades to real-time video streams. Just follow the steps and start using Haar Cascades in your own applications where you can tolerate some false-positive detections and a bit of parameter tuning.


Transferable Visual Words: Exploiting the Semantics of Anatomical Patterns for Self-supervised Learning

In this paper, Fatemeh Haghighi and co-authors introduce a new concept called "transferable visual words" (TransVW), designed to improve annotation efficiency for deep learning in medical image analysis. Learn about the team’s extensive experiments and the advantages TransVW has demonstrated. The code, pre-trained models, and curated visual words have been released publicly.


Machine Learning with Graphs

A short course by Jure Leskovec, Professor of Computer Science at Stanford University. Learn why using graphs makes sense and how to use them; explore multiple ways of choosing a specific graph representation; and look into traditional feature-based methods at the node, link, and graph level for comparison.