Beware of Deepfake audio scams

As AI advances, misuse of the technology will become more and more common. Recently, an audio deepfake of a CEO’s voice was used in a $243,000 scam.

Here is a sample of what audio synthesis can achieve (the example is not related to the scam):

In 2018, Google’s Tacotron team created a text-to-speech synthesizer that can copy your voice after listening to just 5 seconds of it. https://google.github.io/tacotron/publications/speaker_adaptation/

Reference:

Synthesized:

Berkeley Deep Unsupervised Learning

CS294-158 Deep Unsupervised Learning Spring 2019

Lectures, papers and assignments for Berkeley’s Deep Unsupervised Learning course are now available here:
https://sites.google.com/view/berkeley-cs294-158-sp19/home.

The course covers two areas of deep learning: deep generative models and self-supervised learning.

Topics are:

  • Generative adversarial networks
  • Variational autoencoders
  • Autoregressive models
  • Flow models
  • Energy-based models
  • Compression
  • Self-supervised learning
  • Semi-supervised learning

The course is currently ongoing, so not all lectures are available yet.

Creating a Dataset from Google Image Search Results

If you want to build an image classifier from Google Image Search results but need to exclude some of the images, you can use the gi2ds bookmarklet: drag it to your bookmarks bar and click it after running your search. Then click the images you want to exclude. A list of the relevant image URLs is generated for you to process further.

gi2ds is intended to help you create an image dataset based on a Google Images query. It lets you exclude images that are not relevant by clicking them to toggle them off and on; by default, all images are included. The URLs appear in a popup at the bottom right. To get all available images, scroll all the way down so that more images load, press the "show more results" button when it appears, and keep scrolling until no more pictures load.
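As a minimal sketch of the "process further" step, the Python below assumes the URL list from the popup has been pasted into a text file with one URL per line. The function names `parse_url_list` and `download_images`, the numbered file naming, and the `.jpg` fallback extension are illustrative assumptions, not part of gi2ds.

```python
from http.client import HTTPException
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def parse_url_list(text):
    """Return the unique http(s) URLs in text (one URL per line), keeping order."""
    seen = set()
    urls = []
    for line in text.splitlines():
        url = line.strip()
        # Skip blank lines and anything that is not a web URL.
        if not url or urlparse(url).scheme not in ("http", "https"):
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def download_images(urls, dest_dir, timeout=10):
    """Save each URL as a numbered file in dest_dir; return the URLs that failed."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    failed = []
    for i, url in enumerate(urls):
        # Assumption: fall back to ".jpg" when the URL path has no extension.
        suffix = Path(urlparse(url).path).suffix or ".jpg"
        try:
            with urlopen(url, timeout=timeout) as response:
                (dest / f"{i:05d}{suffix}").write_bytes(response.read())
        except (OSError, ValueError, HTTPException):
            failed.append(url)  # dead link, timeout, or malformed URL
    return failed
```

A typical call would be `download_images(parse_url_list(Path("urls.txt").read_text()), "data/teddy_bears")`, where both paths are placeholders. Some downloaded files may not be valid images, so it is worth checking them before training.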

For more info and the code, see GitHub.

Inspiration comes from this year’s fast.ai course (v3), which I am attending as an International Fellow. The course will be available to the public in January 2019.

Artificial Intelligence Competition Leaderboard

I have not seen this cool leaderboard for AI challenges before.
https://leaderboard.allenai.org/

There are a few similar, very interesting competition leaderboards for machine learning, such as Kaggle and Numerai. AllenAI currently hosts 4 interesting NLP challenges.

Here is the description of one of the challenges:

OpenBookQA: Open Book Question Answering

OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject. It consists of 5,957 multiple-choice elementary-level science questions (4,957 train, 500 dev, 500 test), which probe the understanding of a small “book” of 1,326 core science facts and the application of these facts to novel situations. For training, the dataset includes a mapping from each question to the core science fact it was designed to probe. Answering OpenBookQA questions requires additional broad common knowledge, not contained in the book. The questions, by design, are answered incorrectly by both a retrieval-based algorithm and a word co-occurrence algorithm. Strong neural baselines achieve around 50% on OpenBookQA, leaving a large gap to the 92% accuracy of crowd-workers.

Google Dataset Search

This is quite cool. Google has released a search tool for finding datasets! https://toolbox.google.com/datasetsearch

You can, for instance, find world surface temperature data, a real-time assessment of hybridization between wolves and dogs, lots of X-ray datasets, or data from breast cancer screenings.

The data seems to come from a lot of research projects where they have used different machine learning techniques to analyse the data.

Now that we have much better machine learning tools, easy access to a lot of related data, and dramatically more compute power, we may see quite a few improvements on older research results. I welcome this initiative and believe that the world will become a better place as we collectively solve the world’s many problems using AI.