Local AI Processing Breaks New Ground with Qwen 3.6:35b-a3b on 3090 GPU

After trials with models from Anthropic and OpenAI, and disappointing results with other local options, the Qwen 3.6:35b-a3b model has stood out as a strong performer for local AI processing on consumer hardware.

On a secondhand gaming rig with a 3090 GPU, a user set up the Qwen model, which had been optimized down to 20GB, small enough to fit within the 24GB of VRAM on an RTX 3090. With the model held in system RAM, it generated 15 tokens per second (tps). Once loaded into VRAM, throughput rose to 160tps, roughly a tenfold improvement.
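To put the two throughput figures in perspective, here is a minimal Python sketch of the arithmetic. The 15tps, 160tps, and 20GB values come from the report above, and 24GB is the RTX 3090's memory capacity. The 500-token response length is a hypothetical value chosen only for illustration, and the calculation assumes a steady generation rate, which real workloads will not match exactly.

```python
# Figures reported in the article.
RAM_TPS = 15    # tokens per second with the model in system RAM
VRAM_TPS = 160  # tokens per second with the model in VRAM
MODEL_GB = 20   # reported size of the optimized model

# RTX 3090 memory capacity.
VRAM_GB = 24

# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 500


def seconds_to_generate(tokens: int, tps: float) -> float:
    """Time to generate `tokens` at a steady rate of `tps` tokens per second."""
    return tokens / tps


speedup = VRAM_TPS / RAM_TPS
ram_seconds = seconds_to_generate(RESPONSE_TOKENS, RAM_TPS)
vram_seconds = seconds_to_generate(RESPONSE_TOKENS, VRAM_TPS)
headroom_gb = VRAM_GB - MODEL_GB

print(f"Speedup from moving to VRAM: {speedup:.1f}x")
print(f"{RESPONSE_TOKENS} tokens from system RAM: {ram_seconds:.1f}s")
print(f"{RESPONSE_TOKENS} tokens from VRAM: {vram_seconds:.1f}s")
print(f"VRAM left after loading the model: {headroom_gb}GB")
```

Under these assumptions, a response that takes about half a minute from system RAM arrives in a few seconds from VRAM. The leftover memory is only a rough upper bound on what remains for context and other GPU work.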

The model was then given an image, which it processed in 75 seconds. It did so while the same machine was streaming a transcoded movie via Plex, showing that the 3090 GPU can handle multiple demanding tasks at once.

The result matters for anyone interested in local AI processing: a 3090 GPU can deliver usable performance even when paired with older hardware, and it may encourage others to experiment with similar setups.

Photo by Anoop VS on Pexels