TigerBot Technology Releases Multi-Modal Large-Scale Model - 2023

On June 7, Hubo Technology officially released and open-sourced its self-developed large model TigerBot. At the same time, it released the full set of APIs needed to develop large-model applications, as well as professional data from multiple fields.




NLP (Natural Language Processing)

Founded in 2017, Hubo Technology is an AI company focused on deep learning and NLP (natural language processing) and has been covered by 36Kr many times. Through years of technological exploration and accumulation, Hubo Technology had already built key technologies including intelligent search, intelligent recommendation, machine reading comprehension, summarization, translation, public-opinion analysis, and writing, along with professional data accumulated across industries worldwide.

In November 2022, OpenAI released ChatGPT, setting off a global wave of large-model AI. The Chinese market has also seen many attempts to develop large AI models independently, and Hubo Technology is one of them.

The TigerBot released this time is a self-developed, multilingual, multi-task large language model. After three months of closed-door development and more than 3,000 experimental iterations, the team delivered its first MVP version.

Functionally, TigerBot already covers most generation and comprehension capabilities, in several parts:

  1. Content generation: helps users solve creative problems by quickly generating marketing copy, comments, press releases, and more. TigerBot also supports image generation: the model can do text-to-image generation, illustration creation, etc.
  2. Open Q&A: users can ask TigerBot questions covering cooking guides, long-text summarization, text understanding, character dialogue, polishing, and so on.
  3. Information extraction: purposefully retrieving key information, such as extracting numbers, main content, etc.



3-Month Closed Development

Behind the three-month closed-door development is a capable team. Hubo Technology founder and CEO Chen Ye told 36Kr that in developing the large model, the Hubo team paid tribute to the classic Silicon Valley "garage startup" model: the team started with only five people, with the CEO serving as both chief programmer and chief AI scientist.

"In large-model R&D, we firmly believe a top-tier team makes the difference. The team does not need to be large, but its technology must be strong. Throughout our zero-to-one development, the core R&D team stayed at four to five people working in close cooperation," Chen Ye said.

In terms of model performance, TigerBot was evaluated on the public NLP datasets used in OpenAI's InstructGPT paper. TigerBot-7B corresponds to OpenAI's 6B version at the same scale, and its overall performance reaches 96% of OpenAI's.



Hubo Technology

Over the past three months, Hubo Technology has made a series of optimizations to the model architecture and algorithms, building mainly on the two open-source model families GPT and BLOOM. Chen Ye told 36Kr that Hubo's technical breakthroughs focus mainly on original supervised fine-tuning methods. "From an overall technical standpoint, supervised fine-tuning is the core of a large model; it can account for 70% to 80% of the model's effectiveness."
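As a rough illustration of what supervised fine-tuning optimizes (a toy sketch, not Hubo's actual implementation), the snippet below computes the training loss only over the response tokens of an instruction-response pair, masking out the prompt so the model is rewarded only for the answer:

```python
import math

def sft_loss(token_probs, prompt_len):
    """Mean negative log-likelihood over response tokens only.

    token_probs: the model's probability for the correct next token at
    each position. The first `prompt_len` positions belong to the prompt
    and are masked out, so the loss comes only from the response.
    """
    response = token_probs[prompt_len:]
    return -sum(math.log(p) for p in response) / len(response)

# Toy sequence: 3 prompt tokens followed by 2 response tokens.
probs = [0.9, 0.8, 0.7, 0.6, 0.5]
loss = sft_loss(probs, prompt_len=3)
```

In real SFT pipelines the same idea appears as a label mask that zeroes the prompt positions in the cross-entropy loss.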

For example, after the large-model wave arrived, one problem that plagued the industry was the "hallucination" of large models: their output reads as naturally as human speech, yet can be nonsense at the factual level.

Chen Ye gave a practical example. To solve this problem, Hubo applied classic supervised learning methods such as ensembling and probabilistic modeling, and incorporated them into the large model.

"Suppose a person asks the model a factual question. TigerBot will not simply generate natural language; it will simultaneously use a smaller amount of data to learn human intent, and in its answers it strikes a better balance between facts and creativity," Chen Ye added.

As a result, the compute and data consumed in training the model are lower than for models of the same grade.
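As a minimal illustration of how ensembling can help with factual answering (a toy sketch under assumed answer distributions, not Hubo's actual method), the snippet below averages several models' per-answer probabilities, so a single confidently wrong model gets outvoted:

```python
def ensemble_answer(distributions):
    """Average per-answer probabilities across models and return the
    answer with the highest combined probability."""
    combined = {}
    for dist in distributions:
        for answer, p in dist.items():
            combined[answer] = combined.get(answer, 0.0) + p
    n = len(distributions)
    return max(combined, key=lambda a: combined[a] / n)

# Hypothetical distributions: one model is confidently wrong,
# two agree on the factual answer.
models = [
    {"Paris": 0.1, "Lyon": 0.9},
    {"Paris": 0.7, "Lyon": 0.3},
    {"Paris": 0.8, "Lyon": 0.2},
]
best = ensemble_answer(models)  # best == "Paris"
```

Probabilistic modeling enters the same way: instead of trusting one sampled output, the system reasons over the probabilities the models assign to each candidate.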

For the Chinese context, Hubo Technology has made targeted algorithmic optimizations from the tokenizer through to the training algorithm, so that the model understands Chinese instructions better and produces answers more attuned to Chinese culture.
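A toy sketch of why tokenizer choices matter for Chinese (hypothetical vocabulary, not Hubo's actual tokenizer): a vocabulary that contains common Chinese words halves the sequence length compared with falling back to single characters, which directly cuts compute and lengthens the effective context:

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization; unknown spans fall back to
    single characters (a toy stand-in for byte/char fallback)."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):
            if text[i:j] in vocab or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens

text = "自然语言处理"
char_level = tokenize(text, vocab=set())               # 6 tokens
word_level = tokenize(text, {"自然", "语言", "处理"})    # 3 tokens
```

Real tokenizers learn such a vocabulary from data (e.g. BPE or unigram models) rather than hard-coding it, but the effect on sequence length is the same.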




Parallel Training

In parallel training, Hubo's large-model team has also worked around memory and communication bottlenecks in mainstream frameworks such as DeepSpeed, making it possible to train uninterrupted for months on a thousand-GPU cluster.
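One standard ingredient of months-long training runs is periodic checkpointing, so a crashed job resumes from its last saved state instead of starting over. A minimal sketch (a toy stand-in, not Hubo's DeepSpeed modifications):

```python
def train(steps, state=None):
    """Toy training loop with resumable checkpoints: `state` carries the
    step counter and a running 'weight', so an interrupted run can
    restart from the last checkpoint."""
    state = state or {"step": 0, "weight": 0.0}
    checkpoints = []
    while state["step"] < steps:
        state["step"] += 1
        state["weight"] += 0.1          # stand-in for a gradient update
        if state["step"] % 100 == 0:    # periodic checkpoint
            checkpoints.append(dict(state))
    return state, checkpoints

# Uninterrupted run vs. a run that "crashes" at step 250 and resumes
# from the checkpoint taken at step 200; both reach the same state.
full, _ = train(500)
_, ckpts = train(250)
resumed, _ = train(500, state=dict(ckpts[-1]))
```

In a real thousand-GPU job the checkpoint also includes optimizer state and is written to shared storage, and every rank must save and load consistently.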




Open-Source Route

Hubo Technology has chosen the open-source route for its large-model development. The open-sourced content covers three parts: models, code, and data. It includes model versions such as TigerBot-7B-sft, TigerBot-7B-base, and TigerBot-180B-research; code covering basic training and inference, including quantization and inference code for running the 180B model on dual GPUs; up to 100 GB of pre-training data; and 1 GB (about 1 million entries) of supervised fine-tuning data.
All of this has been published on GitHub (link here). On why Hubo chose the open-source route, Chen Ye said that technological changes that advance human civilization often spring from instinct, intuition, and chance, and that a free, innovative spirit is fundamental.
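Quantization is what makes dual-GPU inference of a 180B model plausible: weights are stored as 8-bit integers instead of 16- or 32-bit floats, cutting memory several-fold. A toy sketch of symmetric int8 quantization (illustrative only, not TigerBot's actual scheme):

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to [-127, 127] with a
    single per-tensor scale factor."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [x * scale for x in q]

weights = [0.42, -1.27, 0.08, 0.91]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)  # close to the original weights
```

Production schemes typically quantize per channel or per group and keep sensitive layers in higher precision, but the memory arithmetic is the same: one byte per weight instead of two or four.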


"Large-model technology is like an emerging discipline: subversive and long-term, with future possibilities exceeding those of the PC and the Internet. At this stage it may be unnecessary to discuss products, applications, scenarios, and commercialization too early or too rationally. What matters more is driving this original breakthrough in artificial intelligence infrastructure and pushing the technology forward."

Based on these considerations, Hubo has also opened up its systematic methodology for collecting and cleaning Chinese data, in addition to part of its accumulated pre-training datasets.

Chen Ye does not think that data will become a barrier: "What is more important is the team's theoretical and systematic level of data cleaning. This is a long-term systematic project."




Chinese NLP technology

Since its establishment, Hubo Technology has focused on Chinese NLP technology and product development and has accumulated a large amount of high-quality Chinese pre-training data. The 100G pre-training data released this time is part of it. In the future, Hubo will also open a large amount of professional data in the fields of finance, law, and encyclopedias for use by application developers.


Pan-Financial Field

Over the past few years, Hubo Technology has built NLP products mainly for the pan-financial field, such as public-opinion monitoring, search, and knowledge graphs, and has served B-end customers through APIs. The release of this large model will also be combined with Hubo's existing business; Hubo Technology has already provided existing customers with functional modules including content generation.

Chen Ye said that since the wave of large-model technology arrived, the market side has felt that "customers decide faster than before, and products land faster too."




Implementation of Large Models

Going forward, Hubo Technology will continue to invest in developing and deploying large models. Chen Ye mentioned features under development or being improved, such as the research assistant TigerDoc, cultural-creation and marketing tools, and more. Hubo Technology is also internally testing some personal-assistant products.