Ready for Round 2? There are a lot of interesting articles in this roundup.
Data Science Methods
When we have a huge sample set to analyze, correlation between variables can be a tricky concept to interpret. Here are 7 ways to view correlation, compiled by SAS:
- The sum of crossproducts
- The inner product of standardized vectors
- Angle between two vectors
- The standardized covariance
- Slope of the regression line between two standardized variables
- Geometric mean of regression slopes
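The views listed above all describe the same number, which a few lines of Python can illustrate. This is a minimal sketch, not code from the SAS article: the function name `correlation_views` and the sample data are made up for illustration, and each dictionary entry computes Pearson's r in the way its key describes.

```python
import numpy as np

def correlation_views(x, y):
    """Compute Pearson's r in several equivalent ways (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()                  # centered vectors
    zx, zy = xc / x.std(ddof=1), yc / y.std(ddof=1)      # standardized vectors

    slope_y_on_x = np.polyfit(x, y, 1)[0]                # regress y on x
    slope_x_on_y = np.polyfit(y, x, 1)[0]                # regress x on y

    return {
        # Sum of crossproducts, scaled by the sums of squares.
        "crossproducts": np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)),
        # Inner product of the standardized vectors, divided by n - 1.
        "inner_product": (zx @ zy) / (n - 1),
        # Cosine of the angle between the centered vectors.
        "cos_angle": (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc)),
        # Covariance divided by the product of the standard deviations.
        "std_covariance": np.cov(x, y)[0, 1] / (x.std(ddof=1) * y.std(ddof=1)),
        # Slope of the regression line between the standardized variables.
        "std_slope": np.polyfit(zx, zy, 1)[0],
        # Geometric mean of the two regression slopes, with the slope's sign.
        "geometric_mean": np.sign(slope_y_on_x) * np.sqrt(slope_y_on_x * slope_x_on_y),
    }

views = correlation_views([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
for name, value in views.items():
    print(f"{name:15s} {value:.4f}")
```

Every printed value should agree with `np.corrcoef(x, y)[0, 1]` up to floating-point error.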
Another study comes from Storybench, which has been interviewing data specialists for three years and drew the following conclusion from those interviews: “Based on the interviews analyzed, three recurring themes emerged: Collaborative, open and mobile.” – link.
- Collaborative, team-based story-building
- The new open-source ethos
- Mobile-focused ideation
Spotify’s Discover Weekly algorithm is very complex and based on three machine learning models (link):
- Collaborative Filtering models: analyze your and others’ behavior (if two people listen to many of the same songs, chances are each will enjoy the other person’s playlist)
- Natural Language Processing: analyzing text (analyze lyrics, blogs, and articles to determine which adjectives are used to describe the song and how to categorize it)
- Audio models: take raw audio, transform it into a matrix, then use a CNN to categorize the song.
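To make the collaborative filtering idea concrete, here is a toy sketch. It is not Spotify's actual model: the play matrix, the `recommend` function, and the choice of cosine similarity between listeners are all assumptions made for illustration. It only shows the intuition that listeners with overlapping histories can lend each other songs.

```python
import numpy as np

def recommend(plays, user, top_n=2):
    """Suggest unheard songs for `user`, weighted by listener similarity.

    `plays` is a users x songs matrix (1 = listened). Assumes every user
    has listened to at least one song, so no row has zero length.
    """
    plays = np.asarray(plays, dtype=float)
    norms = np.linalg.norm(plays, axis=1)
    # Cosine similarity between the target user and every user.
    sims = plays @ plays[user] / (norms * norms[user])
    sims[user] = 0.0                      # ignore the user's own row
    scores = sims @ plays                 # similarity-weighted plays per song
    scores[plays[user] > 0] = -np.inf     # hide songs already heard
    ranked = np.argsort(scores)[::-1]
    # Keep only songs that at least one similar listener has played.
    return [int(s) for s in ranked[:top_n] if scores[s] > 0]

# Hypothetical listening history: rows are users, columns are songs.
plays = [
    [1, 1, 1, 0, 0],   # user 0
    [1, 1, 1, 1, 0],   # user 1 shares three songs with user 0
    [0, 0, 0, 0, 1],   # user 2 shares nothing with user 0
]
print(recommend(plays, user=0))
```

Here user 0 is offered song 3, because the only listener with a similar history (user 1) has played it, while song 4 is skipped because its only listener has nothing in common with user 0.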
A computer was asked to predict which start-ups would be successful. The results were astonishing. In this article, we see that an eight-year-old computer-generated prediction accurately spotted startups like Evernote, Spotify, Etsy, Zynga, Palantir and more as future tech giants. They are now looking to create a new list. The methodology this time will use data from up to 50,000 private companies and categorize them by investors and theme. (link) Some interesting findings:
- Augmented reality will be far more significant than virtual reality because it will shape the way we look at and interact with the world around us.
- Image recognition and mapping technologies will be deployed across the auto industry as traditional car manufacturers adapt to self-driving vehicles.
- There continue to be major market opportunities in e-commerce as fashion becomes increasingly mobile and social.
Learning Things We Already Know About Stocks: this project, done in R, groups stocks together in a network that highlights associations within and between the groups, using only historical price data. As expected, the stocks cluster into business sectors.
“We downloaded daily closing stock prices for 100 stocks from the S&P 500, and, using basic tools of statistics and analysis like correlation and regularization, we grouped the stocks together in a network that highlights associations within and between the groups. The structure teased out of the stock price data is reasonably intuitive.” – (link)
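The general idea can be sketched in Python, with some loud caveats: the original project is written in R and also uses regularization, while the sketch below simply thresholds the correlation matrix of daily log returns and treats connected stocks as one group. The prices are synthetic, and `group_by_correlation`, the 0.5 threshold, and the two made-up "sectors" are assumptions for illustration only.

```python
import numpy as np

def group_by_correlation(prices, threshold=0.5):
    """Group price columns whose daily log returns correlate above `threshold`."""
    returns = np.diff(np.log(prices), axis=0)      # daily log returns
    corr = np.corrcoef(returns, rowvar=False)      # stocks are columns
    linked = corr > threshold                      # adjacency matrix of the network
    n = corr.shape[0]
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if linked[i, j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups

# Synthetic closing prices: three stocks driven by each of two sector factors.
rng = np.random.default_rng(0)
days = 500
sector_a = rng.normal(0, 0.01, days)
sector_b = rng.normal(0, 0.01, days)
noise = rng.normal(0, 0.003, (days, 6))
daily = np.column_stack([sector_a] * 3 + [sector_b] * 3) + noise
prices = 100 * np.exp(np.cumsum(daily, axis=0))

print(group_by_correlation(prices))
```

With these synthetic inputs the first three columns and the last three columns should come out as two separate groups, mirroring the "stocks cluster into sectors" result on a much smaller scale.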
Deep Reinforcement Learning
With the continuous progress of DRL, here are a few new projects I found especially interesting:
- Unity Machine Learning Agents: Unity introduces new agents to simulate a game playing environment for research in RL. (link)
- Machine Learning for Flappy Bird: uses a neural network and a genetic algorithm (link)
- Sudoku Solver: uses ARKit and a CNN to recognize the sudoku shown on camera and automatically display the answers on screen (link)
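For a feel of how a genetic algorithm can train a network in the Flappy Bird style, here is a deliberately tiny sketch. It is not the linked project's code: the "network" is a single neuron, the game is replaced by a made-up rule (flap whenever the bird is below the gap), the algorithm is mutation-only with no crossover, and every name and parameter is hypothetical.

```python
import numpy as np

def decide(weights, states):
    """One-neuron 'network': flap when the weighted input is positive."""
    return states @ weights[:2] + weights[2] > 0

def evolve(generations=30, pop_size=20, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical game state: (bird height, gap height), both in [0, 1].
    states = rng.uniform(0, 1, (200, 2))
    target = states[:, 0] < states[:, 1]       # ideal move: flap when below the gap

    def fitness(w):
        # Fraction of states where the network picks the ideal move.
        return np.mean(decide(w, states) == target)

    population = rng.normal(0, 1, (pop_size, 3))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(w) for w in population])
        order = np.argsort(scores)[::-1]
        elite = population[order[: pop_size // 4]]     # best quarter survives
        history.append(float(scores[order[0]]))
        # Next generation: keep the elite, fill up with mutated copies of it.
        parents = elite[rng.integers(0, len(elite), pop_size - len(elite))]
        children = parents + rng.normal(0, 0.1, parents.shape)
        population = np.vstack([elite, children])
    return population[0], history

best_weights, history = evolve()
print(f"best fitness: first generation {history[0]:.2f}, last generation {history[-1]:.2f}")
```

Because the elite is carried over unchanged, the best fitness recorded in `history` can never decrease from one generation to the next; how quickly it improves depends on the random seed and the mutation size.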
Unsurprisingly, technology and social media are changing the nature of society. This study shows how online dating is becoming one of the dominant ways to meet a soul mate in the digital age. (link)
What happens when algorithms design a concert hall? The stunning Elbphilharmonie (link)
What New York Subway Stations Actually Look Like (link)
‘220 Mini Metros’ Illustrates Metro and Train Networks from Around the World (link)
Software Engineers Map All the Buildings in the Netherlands (link)
Machine Visions: Exploring Visual Motifs in Wes Anderson Films. I love a well-made presentation with interactive data visualization to go with it. This machine-visions webpage does exactly that, and it is fun to go through again and again. (link)
A quick explanation of what the open-source project Semiotic can do: transform data plots into a “sketchy version”! (link)