    4 Reasons Transformer Models are Optimal for NLP

    By getting pre-trained on massive volumes of text, transformer-based AI architectures become powerful language models capable of accurately understanding text and making predictions based on text analysis.

    By eWEEK EDITORS - December 8, 2021

      Since their introduction in the seminal AI research paper “Attention Is All You Need” (Vaswani et al., 2017), transformer-based architectures have completely redefined the field of Natural Language Processing (NLP) and set the state of the art for numerous AI benchmarks and tasks. 

      What are transformer models? They are advanced artificial intelligence models that have benefited from an “education” the likes of which some dozen humans might gain in a lifetime.

      Transformer architectures are typically pre-trained in a self-supervised manner on a massive amount of text—think English Wikipedia, thousands of books, or even the entire Internet. By digesting these massive corpora, transformer-based architectures become powerful language models (LMs) capable of accurately understanding text and performing predictive analytics based on textual analysis. 

      In essence, this exhaustive training allows transformer models to approximate human text cognition – reading – at a remarkable level: not merely surface comprehension but, at their best, higher-level connections about the text.

      Recently, it has been shown that these impressive models can also quickly be fine-tuned for downstream tasks such as sentiment analysis, duplicate question detection, and other text-based cognitive tasks. Training the model further on a dataset or task separate from the one it was originally trained on allows the network's parameters to be slightly modified for the new task. 

      More often than not, this results in better performance and faster training than if the same model had been trained from scratch on the new dataset and task alone. 
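The fine-tuning idea can be sketched with a toy example: treat a frozen "pretrained" network as a fixed feature extractor and train only a small task-specific head on the new labeled data. Everything below is illustrative (random stand-in weights and synthetic data, not any real pre-trained model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pre-trained encoder: a fixed random projection.
# In real fine-tuning this would be a pre-trained transformer's layers.
W_frozen = rng.normal(size=(8, 4))

def encode(x):
    """Frozen 'pretrained' features - never updated during fine-tuning."""
    return np.tanh(x @ W_frozen)

# Tiny synthetic labeled dataset for the new task.
X = rng.normal(size=(64, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Only the small task head (5 parameters) is trained, not the encoder.
w_head = np.zeros(4)
b_head = 0.0

feats = encode(X)                          # computed once; encoder is frozen
for _ in range(500):                       # plain gradient descent on log-loss
    p = sigmoid(feats @ w_head + b_head)
    grad = p - y                           # d(log-loss)/d(logits)
    w_head -= 0.1 * feats.T @ grad / len(y)
    b_head -= 0.1 * grad.mean()

acc = ((sigmoid(feats @ w_head + b_head) > 0.5) == y).mean()
```

Because only the head is trained, far fewer parameters need updating than in training from scratch, which is the source of the speed and compute savings the article describes.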

      Also see: Top 10 Text Analysis Solutions 

      Benefits of Transformer Models

      1) Great with Sequential Data 

      Transformer models excel at handling sequential data such as text. Many operate as an encoder-decoder framework: the encoder maps the input sequence to a representation space, and the decoder maps that representation to the output. Crucially, unlike recurrent models, transformers process every position in a sequence at once rather than one step at a time, which lets them scale well on parallel processing hardware such as GPUs.  
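As a rough illustration of why this parallelizes so well, the core operation – scaled dot-product self-attention – reduces to a few matrix multiplications over the whole sequence at once. This is a minimal sketch with random toy tensors, not any particular model's weights:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a whole sequence at once.

    X: (seq_len, d_model) token representations.
    The projections produce queries, keys, and values; every position
    attends to every other position in one batched matmul, which is
    exactly the shape of work that maps well onto GPUs.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))   # (seq_len, seq_len)
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 16, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))

out, attn = self_attention(X, Wq, Wk, Wv)
```

Note that nothing in `self_attention` loops over positions: the whole sequence is handled in parallel, in contrast to an RNN's step-by-step recurrence.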

      2) Pre-Trained Transformers

      Pre-trained transformers can quickly be adapted to perform related tasks. Because they already have a deep understanding of language, training can focus on learning whatever goal you have in mind, such as named-entity recognition, language generation, or a particular conceptual focus. Their pre-training makes them particularly versatile and capable. 

      3) Gain Out-of-the-Box Functionality

      By fine-tuning pre-trained transformers, you can achieve high performance out of the box without enormous investment. By comparison, training from scratch would take longer and use orders of magnitude more compute and energy just to reach the same performance metrics. 

      4) Sentiment Analysis Optimization

      Transformer models enable you to take a large-scale language model (LM) trained on a massive amount of text (say, the complete works of Shakespeare) and update it for a specific conceptual task far beyond mere “reading,” such as sentiment analysis or even predictive analysis. 

      This tends to yield significantly better performance because the pre-trained model already understands language well; it only has to learn the specific task, rather than learning both language and the task at the same time.

      Looking Ahead: Redefining the Field of NLP

      Since their early emergence, transformers have become the de facto standard for tasks like question answering, language generation, and named-entity recognition. Though it is hard to predict the future when it comes to AI, it is reasonable to assume that transformer models bear close watching as a next-generation emerging technology. 

      Most significant, arguably, is their ability to allow machine learning models not only to approximate the nuance and comprehension of human reading, but to surpass human performance on many levels – far beyond mere gains in quantity and speed.

      About the Author: 

      Dylan Fox is the CEO of AssemblyAI.
