SuperGlue, the forgotten Leaderboard

Since October 2022, no new model has been submitted to SuperGlue. What happened?

  • Not applicable: The current models, especially the multimodal ones, may have already surpassed the capabilities of the benchmark, making it less relevant.
  • Limited energy: Research teams may be focusing their resources on developing new versions of their models that are more competitive, or building ecosystems around their AI solutions.
  • Interest-driven: The current AI landscape is highly competitive and there is a trend towards quickly monetizing AI technology through user-friendly and practical applications, rather than solely focusing on technical research.
  • Other reasons…

Looking back at this forgotten Leaderboard, there are still many interesting insights.

1. SuperGlue Leaderboard Version 2.0

SuperGLUE, a new benchmark designed to pose a more rigorous test of language understanding. SuperGLUE has the same high-level motivation as GLUE: to provide a simple, hard-to-game measure of progress toward general-purpose language understanding technologies for English. We anticipate that significant progress on SuperGLUE should require substantive innovations in a number of core areas of machine learning, including sample-efficient, transfer, multitask, and unsupervised or self-supervised learning. https://super.gluebenchmark.com

SuperGlue Leaderboard Version: 2.0, Until 2023-05-12

I have shared this database on Notion:

At first, I believed that model performance would improve continuously over time, resulting in a steady increase in scores. The graph shows otherwise: scores rise and fall from one submission to the next, and the highest scores stay around 90 without any breakthrough beyond that level.

It appears that two factors affect this:

  • Team strength: Teams from technology giants tend to consistently achieve high scores, while other teams with less stable organizations tend to have more unstable scores.
  • Similar models: Many of the submitted models are variants of the same model, optimized and improved upon, but there are no significant technological advancements.

On the SuperGlue Leaderboard, before October 2022, researchers tended to prefer fine-tuning the BERT and ALBERT models, with ALBERT-xxlarge-v2 being the most commonly used version. The scores of the three models based on T5 are close to one another, all around 90, and quite stable.


2. Teams

27 teams submitted models to the SuperGlue competition. Most of these teams were AI/Research Teams affiliated with enterprises, while others were university research teams or independent teams.

The enterprise teams were mainly well-known technology giants, forming two major camps: the US and China. Notably, Infosys is the sole Indian tech company on the leaderboard. Is it because many Indian technology companies lack strong financial resources and a mature local business ecosystem that they have not joined this AI race? Yet, as we know, many Indian technical talents work in the AI field for American companies such as Google. This seems to be a form of brain drain.

The papers corresponding to the models show that many of the authors are university researchers. In addition to their own research teams, companies need scholars to provide professional academic support.

The universities and research institutions that collaborate with JD.com are the most numerous, and they include not only domestic universities but also some well-known universities overseas.

  • Wuhan University – China
  • Shanghai AI Lab – China
  • Chongqing University of Posts and Telecommunications – China
  • The University of Sydney – Australia
  • Nanyang Technological University – Singapore
  • Washington University in St. Louis – US

In China, people tend to pay more attention to the AI research of the B.A.T. companies (Baidu, Alibaba, Tencent). Perhaps because JD.com's business scope is relatively narrow, few people notice its investment in the AI field. However, as of May 12th, 2023, JD.com is ranked first on the list.

With more universities comes more power? Google, Microsoft, and Baidu still achieved high rankings without collaborating with universities. Perhaps the teams that do collaborate need more external resources and support because their own technological capabilities are not enough. 🤔️

The papers corresponding to the models show no correlation between the number of authors and the SuperGlue score. However, some teams with around 10 members achieved higher rankings. The outlier, team Yi Tay (Google), has the highest number of authors and ranks 5th. This suggests that with an open-source model, it is possible to fine-tune or modify the model with minimal time and R&D cost, requiring only a small team of researchers. The model can also be forked to create a more robust version for the SuperGlue ranking. As mentioned earlier, teams tend to prefer ALBERT-xxlarge-v2. While these variants may not contain breakthrough research innovations on their own, the various optimization methods they use can provide good inspiration.
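A claim like "no correlation between author count and score" can be checked with a Pearson correlation coefficient. The sketch below is only an illustration: the function name `pearson` is my own, and the two lists are hypothetical placeholder values, not the real leaderboard data, so replace them with the actual author counts and scores from the database.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL placeholder values, NOT the real leaderboard data.
author_counts = [3, 5, 8, 10, 12, 40]
scores = [88.0, 90.5, 89.2, 91.0, 90.1, 90.4]

r = pearson(author_counts, scores)
print(f"Pearson r = {r:.3f}")
```

A value of r near 0 would support the "no correlation" reading, while values near 1 or -1 would contradict it. With a sample this small, a single outlier such as a team with a very large author list can move r noticeably, so it is worth computing it both with and without that point.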

However, it is necessary to mention this news:

Google “We Have No Moat, And Neither Does OpenAI”

https://www.semianalysis.com/p/google-we-have-no-moat-and-neither?utm_source=/search/google&utm_medium=reader2

Perhaps the open-source policy for future models will change, and small teams may no longer have an advantage in this AI competition.


3. People

After searching for the authors’ information on LinkedIn and personal websites, I found that many of them are Asian researchers. Their current nationalities are not clear, but based on their names and experience, many of them are of Chinese or Indian descent.

The Chinese teams have no European or American researchers. Although the JDExplore d-team collaborates with many overseas universities, the people involved in these collaborations are mostly of Chinese descent.

The US teams have more international talent. A significant proportion of the two Microsoft teams are Asian researchers, and the same is true of Google’s teams.

Interestingly, US AI companies such as OpenAI actively block Chinese users, yet many of their core researchers are Chinese. Whether the reasons are political or commercial, voluntarily giving up the Chinese market will only accelerate the development of AI in China, as JD.com's first place on the SuperGlue leaderboard illustrates.

US companies should also pay attention to internal risks, because the loss of research talent can create direct and formidable competitors. As the chain Google > OpenAI > Anthropic shows, a company's strongest competitor can come from within itself.

So far, many of the authors (researchers) have left their original research teams for new positions or startups. Google and OpenAI are especially notable: both have seen a significant number of departures, releasing AI talent into the market.

In 2020, OpenAI reached a peak, and many senior members left to found a new AI startup, Anthropic.

In 2022, many of Google’s researchers left. Google remains the top talent pool for AI and has attracted the attention of numerous companies. When ChatGPT took off in the AI field, Google did not manage to retain this talent effectively enough to compete with OpenAI. It has since merged the Google Brain team and the DeepMind team.

Where they went

The three main destinations for departing employees:

  1. Universities: Some researchers return to universities to continue their studies (as postdocs) or to take teaching positions (as assistant professors), engaging in pure academic research.
  2. Startups: Most researchers decided to start their own ventures or join early-stage startups. The success of companies like OpenAI, Anthropic, and Character.AI has attracted more people to take this path.
  3. Joining other companies: A small percentage of people still choose to join technology giants.

My online searches for the authors’ information showed that Google, Meta, and Microsoft have dedicated pages introducing their research staff. These pages provide comprehensive details about the researchers, including their backgrounds and areas of expertise. In China, however, it is hard to find similar pages for researchers at tech giants like Baidu and Alibaba. There appears to be a trend in China to emphasize company achievements while downplaying individual accomplishments. It is possible that Chinese companies consider talent their greatest secret weapon and treasure, leading them to protect and conceal their researchers’ information.

Company               People page
Meta                  https://research.facebook.com/people/
Google                https://research.google/people/
Microsoft             https://www.microsoft.com/en-us/research/people/
Alibaba               none found
Baidu                 none found
Huawei                none found
JD Explore Academy    none found

As we know, LinkedIn is a business- and employment-focused social media platform. In Chinese culture, however, professional relationships tend to rely more on personal connections and reputation than on a business social network.

LinkedIn will shut down InCareer service in China on 2023-08-09:

After careful consideration, we’ve made the decision to discontinue InCareer effective August 9, 2023. Despite our initial progress, InCareer faced fierce competition and a challenging macroeconomic climate, which ultimately led us to the decision of discontinuing the service. https://www.linkedin.com/help/linkedin/answer/a534125?lang=en

The share of Chinese team authors with LinkedIn profiles is lower than average, even though many Chinese researchers work abroad.

Furthermore, some senior professors or scientists who are older do not have LinkedIn profiles. It seems that they deem social platforms unnecessary for their professional requirements.

Google Scholar Profiles provide a simple way for authors to showcase their academic publications. It is an important platform for researchers, and most of the researchers have verified their profiles there. However, the share of Chinese team authors with Google Scholar Profiles is also lower than average.

Almost half of the authors have their own personal websites where they present their biographies and research work. In addition to creating personal websites with custom domains, many researchers use GitHub to build profiles and share their research projects. Chinese researchers rarely have personal websites, which makes it difficult to find their contact information.

In conclusion, the easy accessibility of information regarding relevant researchers may contribute to the higher rate of talent mobility and susceptibility to being poached by competitors among researchers in Europe and America.

Summary

SuperGlue has not received new models since October 2022. Possible reasons include the benchmark’s decreasing relevance, limited resources, and a trend towards practical applications. The leaderboard shows that teams from technology giants tend to achieve higher scores, and many submitted models are variants of the same model. Most teams are affiliated with enterprises, with US and Chinese teams dominating. Talent competition remains present, especially in European and American companies, where entrepreneurship continues to be a mainstream trend. However, in China, non-compete agreements restrict talent mobility while simultaneously ensuring competitiveness for companies.

P.S.

  1. Creating Sankey diagrams in Tableau is complex. Although a beta version of the Sankey chart is available on Tableau Public for testing, it is still challenging to use, so I had to use alternative platforms like PlotDB to create the Sankey diagrams.
  2. Notion AI is very useful.