TigerBot Technology Releases Multi-Modal Large-Scale Model - 2023
On June 7, Hubo Technology officially released and open-sourced its self-developed large model, TigerBot. At the same time, the company released a full set of APIs needed to build large-model applications, along with professional data from multiple domains.
NLP (Natural Language Processing)
Founded in 2017, Hubo Technology is an AI company focused on deep learning and NLP (Natural Language Processing) that has been covered by 36Kr many times. Through years of technological exploration and accumulation, Hubo Technology had already built key capabilities including intelligent search, intelligent recommendation, machine reading comprehension, summarization, translation, public-opinion analysis, and writing, as well as professional data accumulated across industries worldwide.
In November 2022, OpenAI released ChatGPT, setting off a global wave of large-model AI. Many players in the Chinese market have since attempted to develop their own large models, and Hubo Technology is one of them.
The newly released TigerBot is a multi-language, multi-task large language model. After three months of closed-door development and more than 3,000 experimental iterations, the team has shipped the first MVP version.
Functionally, TigerBot already covers most generation and understanding capabilities, including:
- Content generation: helps users solve creative problems and quickly produce marketing copy, comments, press releases, etc. In addition, TigerBot supports image generation - the model can handle text-to-image generation, illustration creation, and more.
- Open Q&A: users can ask TigerBot questions spanning cooking tips, long-text summarization, text understanding, character dialogue, text polishing, etc.
- Information extraction: purposefully pulling out key information, such as numbers or the main content of a text.
3-Month Closed Development
Behind the three months of closed-door development is a capable team. Hubo Technology founder and CEO Chen Ye told 36Kr that in developing the large model, the Hubo team paid tribute to Silicon Valley's classic "garage startup" model: the team started with only five people, with the CEO also serving as chief programmer and chief AI scientist.
"In large-model R&D, we firmly believe that a top team makes the difference. The team does not need to be large, but it must be technically strong. During our zero-to-one development, the core R&D team stayed at 4-5 people, working in close cooperation," Chen Ye said.
In terms of model performance, TigerBot was evaluated on the public NLP datasets used in OpenAI's InstructGPT paper. TigerBot-7B, comparable in scale to OpenAI's 6B model, achieves 96% of OpenAI's overall performance.
Hubo Technology
Over the past three months, Hubo Technology has made a series of optimizations to the model architecture and algorithms, building mainly on the two open-source model families GPT and BLOOM. Chen Ye told 36Kr that Hubo's technical breakthroughs center on original supervised fine-tuning methods. "From an overall technical point of view, supervised fine-tuning is the core of a large model; it can account for 70% to 80% of the model's effect."
For example, one problem that has plagued the industry since the large-model wave arrived is "hallucination": a model's output reads as naturally as human speech but can be nonsense at the factual level.
Chen Ye gave a practical example. To address this problem, Hubo applied classic supervised learning methods such as ensembling and probabilistic modeling, combining them into the large model.
"Suppose a human asks the model a factual question. TigerBot will not simply generate natural language; it simultaneously uses a smaller amount of data to understand human intent, so the answer better balances facts and creativity," Chen Ye added.
As a result, training the model consumes less compute and data than models of the same grade.
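The article does not disclose how TigerBot actually combines ensembling with probabilistic modeling, but one classic way to apply these ideas to factual Q&A is a self-consistency-style vote: sample several candidate answers, sum the probability mass of candidates that agree, and return the answer with the most mass. The sketch below is purely illustrative; the function name and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def select_answer(candidates):
    """Aggregate sampled (answer, log_prob) pairs: sum the probability
    mass of identical answers (after normalization) and return the
    answer with the highest total mass - an ensemble-style vote."""
    mass = defaultdict(float)
    for answer, log_prob in candidates:
        mass[answer.strip().lower()] += math.exp(log_prob)
    return max(mass, key=mass.get)

# Three samples agree on "paris"; a single higher-scoring outlier says "lyon".
samples = [
    ("Paris", math.log(0.30)),
    ("paris", math.log(0.25)),
    ("Lyon",  math.log(0.35)),
    ("Paris", math.log(0.10)),
]
print(select_answer(samples))  # paris (0.65 total mass beats lyon's 0.35)
```

Voting across samples tends to suppress one-off hallucinated answers, because a fabricated fact is unlikely to recur consistently across independent samples.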
For the Chinese context, Hubo Technology made targeted algorithmic optimizations, from the tokenizer to the training algorithm, so that the model better understands Chinese instructions and produces answers with stronger Chinese cultural attributes.
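The article does not detail the tokenizer changes, but the motivation is easy to see: a vocabulary that lacks common Chinese words falls back to one token per character, inflating sequence length. The toy greedy longest-match tokenizer below (entirely hypothetical, not TigerBot's tokenizer) shows the effect of adding Chinese entries to the vocabulary.

```python
def tokenize(text, vocab, max_len=4):
    """Greedy longest-match tokenization: prefer multi-character
    vocabulary entries, falling back to single characters."""
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocab:
                tokens.append(piece)
                i += length
                break
    return tokens

# A small vocabulary of common Chinese words versus none at all.
zh_vocab = {"模型", "大模型", "中文", "理解"}
text = "大模型理解中文"
print(tokenize(text, zh_vocab))  # ['大模型', '理解', '中文'] - 3 tokens
print(tokenize(text, set()))     # one token per character    - 7 tokens
```

Fewer tokens per sentence means more effective context per forward pass and lower training cost on Chinese text, which is one reason a Chinese-aware tokenizer matters.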
Parallel Training
In terms of parallel training, Hubo's large-model team also worked around memory and communication bottlenecks in mainstream frameworks such as DeepSpeed, making it possible to train uninterrupted for months in a thousand-GPU environment.
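The article does not say which DeepSpeed settings were involved, but memory pressure in large multi-GPU runs is typically managed through a ZeRO configuration. The fragment below is a generic illustration of such a config (expressed as a Python dict); every value is hypothetical and none of it reflects TigerBot's actual setup.

```python
# Hypothetical DeepSpeed-style config for a long multi-GPU training run;
# all values are illustrative, not TigerBot's actual configuration.
ds_config = {
    "train_batch_size": 1024,
    "gradient_clipping": 1.0,
    "fp16": {"enabled": True},                  # mixed precision to cut memory
    "zero_optimization": {
        "stage": 3,                             # partition params, grads, optimizer state
        "offload_optimizer": {"device": "cpu"}, # trade communication for GPU memory
    },
}
```

ZeRO stage 3 shards parameters, gradients, and optimizer state across workers, which is exactly the kind of memory/communication trade-off the team describes tuning.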