[ISSUE #2] TextToSQL, Philosophical GPT-3, VirTex, Programming Language Translation
Transformers for the win!
In this newsletter issue, I've tried to cover a mix of everything I came across in the past week. I've also changed the organization style a bit, since I thought it would be more appealing than a single wall of text. Let me know what you think of it :)
Articles 📄
[Text-to-SQL] Learning to query tables with natural language 👀
The author narrows the problem down to its major parts (the SELECT statement, WHERE conditions, aggregations and joins) and, following recent research, frames it as a classification task using attention-based models like BERT on WikiSQL, a crowd-sourced dataset released by Salesforce in 2017.
She also discusses real-life problems that arise in production, such as multiple ways to phrase the same natural language query (NLQ), enterprise databases requiring multiple JOIN statements, and mismatches between the NLQ and database values. She then goes on to propose concrete solutions to these problems.
Attention helps the model focus on the query words (or subwords) that are most relevant to each label during the training.
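To make the "classification over SQL components" framing concrete, here's a minimal toy sketch (my own, not the article's actual architecture) using HuggingFace's BERT: the question and the table's column headers are encoded together, and two small heads pick the SELECT column and the aggregation operator. WHERE-clause prediction, which the article also covers, is left out to keep the sketch short.

```python
# Toy sketch of WikiSQL-style "SQL as classification" (not the article's exact model):
# a BERT encoder with one head choosing the SELECT column and one choosing the
# aggregation operator. The head layout and names are my own assumptions.
import torch.nn as nn
from transformers import BertModel, BertTokenizer

AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]  # WikiSQL aggregation operators

class TinySQLClassifier(nn.Module):
    def __init__(self, max_columns=10):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.encoder.config.hidden_size
        self.select_head = nn.Linear(hidden, max_columns)  # which column to SELECT
        self.agg_head = nn.Linear(hidden, len(AGG_OPS))    # which aggregation to apply

    def forward(self, input_ids, attention_mask):
        # Question and column headers are packed into one sequence;
        # the [CLS] vector summarises it for both classification heads.
        cls = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.select_head(cls), self.agg_head(cls)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
question = "How many gold medals did Italy win?"
columns = "country [SEP] gold [SEP] silver [SEP] bronze"
enc = tokenizer(question, columns, return_tensors="pt")
select_logits, agg_logits = TinySQLClassifier()(enc["input_ids"], enc["attention_mask"])
print(select_logits.shape, agg_logits.shape)  # (1, max_columns), (1, len(AGG_OPS))
```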
The Obligatory GPT-3 Post 🗨️
This detached, practical and non-mathematical take on GPT-3 really realigns your perspective on advances in the NLP domain. It's a rather philosophical piece that poses some really good questions and tickles your mind from time to time. A must-read for everyone!
The author discusses the coherency of the model in terms of poetry, writing samples and the ability to perform basic addition. I'll just quote two paragraphs which should grind your gears and personally made me laugh out loud -
For me the scary part isn’t the much larger GPT we’ll probably have in a few years. It’s the discovery that even very complicated AIs get smarter as they get bigger. If someone ever invented an AI that did do more than text prediction, it would have a pretty fast takeoff, going from toy to superintelligence in just a few years.
Speaking of which – can anything based on GPT-like principles ever produce superintelligent output? How would this happen? If it’s trying to mimic what a human can write, then no matter how intelligent it is “under the hood”, all that intelligence will only get applied to becoming better and better at predicting what kind of dumb stuff a normal-intelligence human would say.
Papers 📝
Short summaries of interesting papers I read over the last week.
🌀 VirTex: Learning Visual Representations from Textual Annotations [Desai and Johnson]
Many downstream computer vision tasks (object detection, instance segmentation, image captioning, etc.) rely on visual representations encoded in networks pretrained on the vast ImageNet dataset. The authors believe that mapping an image to its well-defined caption can generate richer visual representations, and on a comparatively smaller dataset (10x fewer images).
One cool thing the authors provide is a visualization of where the model is focusing during prediction. The focus rightly shifts downwards when it encounters "on a surfboard", suggesting a richer understanding of the image than a sparse label like "dog" or "surfboard".
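Here's a rough sketch of the captioning-as-pretraining idea, heavily simplified from the paper (the layer sizes, vocabulary and single forward-only decoder are my own stand-ins): a ResNet-50 backbone produces a spatial feature map, and a small transformer decoder predicts caption tokens conditioned on it. After pretraining, only the backbone would be kept for downstream tasks.

```python
# Simplified VirTex-style pretraining model (image -> caption), not the authors' exact setup.
import torch
import torch.nn as nn
import torchvision

class TinyVirTex(nn.Module):
    def __init__(self, vocab_size=10000, dim=512):
        super().__init__()
        backbone = torchvision.models.resnet50(weights=None)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])  # keep the 7x7 feature map
        self.project = nn.Conv2d(2048, dim, kernel_size=1)
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=1)
        self.to_vocab = nn.Linear(dim, vocab_size)

    def forward(self, images, caption_tokens):
        feats = self.project(self.backbone(images))          # (B, dim, 7, 7)
        memory = feats.flatten(2).transpose(1, 2)            # (B, 49, dim) visual "memory"
        tgt = self.embed(caption_tokens)                      # (B, T, dim)
        T = caption_tokens.size(1)
        causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)  # causal mask
        out = self.decoder(tgt, memory, tgt_mask=causal)
        return self.to_vocab(out)                             # next-token logits per position

logits = TinyVirTex()(torch.randn(2, 3, 224, 224), torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # (2, 12, 10000)
```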
🔄 Unsupervised Translation of Programming Languages [Researchers from Facebook AI Research]
[Paper]
This paper takes advantage of recent advances in unsupervised machine translation. A single model (seq2seq with attention) is used to encode the three programming languages (C++, Python and Java) into the same latent space, in the hope that semantically similar constructs lie close to each other. The model is trained on code scraped from GitHub for each of the three languages and evaluated on a test set scraped from GeeksForGeeks.
It'll be interesting to try this translation out on "code in the wild", since I'm sure we'll encounter list comprehensions and a lot of "less clean", unorganized code. At the rate progress is happening in the NLP field, I'm sure we'll be able to tackle that in the future 😉
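One ingredient that makes a single shared latent space plausible is that the three languages share a lot of surface tokens (keywords, digits, common identifiers), which the paper exploits via a common BPE vocabulary. The toy script below (using my own naive regex tokenizer, not the authors' BPE) just shows how much token overlap exists between two equivalent snippets.

```python
# Illustration of cross-language token overlap; the regex tokenizer is a crude
# stand-in for the shared BPE vocabulary used in the paper.
import re

def tokenize(code: str):
    # split into identifiers/keywords, numbers, and individual symbols
    return set(re.findall(r"[A-Za-z_]\w*|\d+|\S", code))

cpp = """
int sum = 0;
for (int i = 0; i < n; i++) { sum += v[i]; }
return sum;
"""

python = """
sum = 0
for i in range(n):
    sum += v[i]
return sum
"""

shared = tokenize(cpp) & tokenize(python)
print(sorted(shared))  # tokens common to both snippets, e.g. 'for', 'return', 'sum', 'i', 'n'
```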
Interesting things around the world! 🌏
OpenAI API: OpenAI has released an API to access the new natural language models they have developed - beta.openai.com
Summary of Transformers: If you are baffled by the numerous types of transformers available (BERT, ALBERT, etc.) and want some clarification, your search ends here. HuggingFace has included a summary of the models in their documentation.
Deepsounds AI: It provides AI-generated piano pieces which you can download as MIDI and sheet music. It is based on Google's Music Transformer, released in 2018. Here's an eerie sample it generated for me - https://www.deepsoundsai.com/song/8f3tiw
Facial Recognition in trouble: IBM will no longer offer, develop, or research facial recognition technology, citing concerns about racial profiling and mass surveillance. On a similar note, Amazon will halt police use of its facial recognition tool, Rekognition, for one year.
If you would like a copy of this issue in your mailbox every week, consider subscribing 🤗 You can unsubscribe just as easily if you feel it's overwhelming.