Deep Cogito has announced the release of Cogito v2, a suite of open-source AI models designed to improve reasoning by internalizing the reasoning process itself. The Cogito v2 family comprises four hybrid reasoning models, ranging from mid-sized versions with 70B and 109B parameters to large-scale models with 405B and 671B parameters. The 671B model, built on a Mixture-of-Experts (MoE) architecture, is positioned as one of the most capable open-source models currently available.
Cogito v2 uses Iterated Distillation and Amplification (IDA) to refine the model’s internal understanding. IDA distills the insights gained from search at inference time directly into the model’s parameters, building up a more robust “intuition.” As a result, the models’ reasoning chains are reportedly 60% shorter than those of comparable models. Deep Cogito says the entire model family was developed for less than $3.5 million.
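To make the idea concrete, the following is a minimal, heavily simplified sketch of what an IDA-style amplify-then-distill loop can look like in principle. Every name here (`Model`, `amplify`, `distill`, `ida_round`) is an illustrative stand-in and does not reflect Deep Cogito’s actual implementation, which is not described in the announcement.

```python
# Conceptual sketch of an Iterated Distillation and Amplification (IDA) loop.
# All names are illustrative stand-ins; this is NOT Deep Cogito's training code.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Model:
    """Placeholder for a policy/LLM; `answer` maps a prompt to a reasoning trace."""
    answer: Callable[[str], str]


def amplify(model: Model, prompt: str, num_candidates: int = 8) -> str:
    """Amplification: spend extra inference-time compute (sampling or search)
    to produce a better answer than a single forward pass would give.
    Toy selection criterion: keep the shortest candidate trace."""
    candidates = [model.answer(prompt) for _ in range(num_candidates)]
    return min(candidates, key=len)


def distill(model: Model, examples: list[tuple[str, str]]) -> Model:
    """Distillation: fold the amplified behaviour back into the model itself
    (in practice, fine-tuning on the improved traces). Here we simply
    memorise the traces so the sketch stays runnable."""
    lookup = dict(examples)
    base = model.answer
    return Model(answer=lambda p: lookup.get(p, base(p)))


def ida_round(model: Model, prompts: list[str]) -> Model:
    """One IDA iteration: amplify every prompt, then distill the results."""
    amplified = [(p, amplify(model, p)) for p in prompts]
    return distill(model, amplified)


if __name__ == "__main__":
    model = Model(answer=lambda p: f"step-by-step reasoning for {p}")
    for _ in range(3):  # each round starts from the previously distilled model
        model = ida_round(model, ["2+2", "17*3"])
```

The key design point is the loop itself: each round distills the behaviour uncovered by expensive search back into the model, so the next round of search starts from a stronger prior, which is the “intuition” Deep Cogito describes.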
A key focus during the training of the 671B model was optimizing the thinking process itself: the model was trained to cut back on unfocused exploration and instead take direct, efficient paths to a solution. The models also showed an unexpected aptitude for image reasoning, despite receiving no specific training in that area. Deep Cogito intends to improve its models further through continued self-improvement and reaffirms its commitment to open-source AI.