These startups are building cutting-edge AI models without the need for a data center
Researchers have used GPUs scattered around the world, combined with private and public data, to train a new kind of large language model (LLM)—a move that suggests the dominant approach to building artificial intelligence could be disrupted.
Two unconventional AI-building startups, Flower AI and Vana, collaborated to develop this new model, named Collective-1.
Flower developed technology that allows training to be spread across hundreds of computers connected over the internet; its tech is already used by some firms to train AI models without centralized computing resources or data. Vana, for its part, provided data sources, including private messages from X, Reddit, and Telegram.
By modern standards, Collective-1 is relatively small, with 7 billion parameters—the adjustable values that collectively give a model its abilities—compared with today's most advanced models, such as those powering ChatGPT, Claude, and Gemini, which have hundreds of billions of parameters.
Nic Lane, a computer scientist at the University of Cambridge and co-founder of Flower AI, stated that this distributed approach is expected to scale well beyond Collective-1. Lane added that Flower AI is currently training a 300 billion parameter model with conventional data and plans to train a 1 trillion parameter model later this year—approaching the scale offered by industry leaders. "This could fundamentally change people's perception of AI, so we are going all-in," Lane said. He also mentioned that the startup is incorporating images and audio into training to create multimodal models.
Distributed model building may also shake up the power dynamics shaping the AI industry.
Currently, AI companies construct models by combining massive training data with large-scale computing resources centralized in data centers. These data centers are equipped with cutting-edge GPUs and interconnected via ultra-high-speed fiber-optic cables. They also heavily rely on datasets created by scraping public (though sometimes copyrighted) materials such as websites and books.
This approach means that only the wealthiest companies, and nations with access to large numbers of powerful chips, can realistically develop the most capable and valuable models. Even open-source models, like Meta's Llama and DeepSeek's R1, are built by companies with large data centers. A distributed approach could let smaller companies and universities build advanced AI by pooling disparate resources. Alternatively, it could let countries lacking conventional infrastructure build stronger models by networking several data centers together.
Lane believes that the AI industry will increasingly move towards allowing training in novel ways that break out of a single data center. The distributed approach "allows you to scale computation in a more elegant way than a data center model," he said.
Helen Toner, an AI governance expert at the Center for Security and Emerging Technology, said Flower AI's approach is "interesting and potentially quite relevant" to AI competition and governance. "It may be hard to keep up at the cutting edge, but it may be an interesting fast-follower approach," Toner said.
Divide and Conquer
Distributed AI training involves rethinking how the computation used to build powerful AI systems is divided up. Creating an LLM means feeding a model huge amounts of text and adjusting its parameters so that it produces useful responses to prompts. Inside a data center, the training workload is split so that parts of it run on different GPUs, and the partial results are periodically consolidated into a single master model.
The new approach allows work typically done in large data centers to be performed on hardware potentially miles apart and connected by relatively slow or unreliable internet connections.
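The split-and-consolidate idea described above can be sketched in a few lines. The following is a minimal illustration of periodic parameter averaging—the names and structure are hypothetical, not Flower AI's actual API—in which each worker takes a gradient step on its own data shard and a coordinator then averages the workers' parameters back into one master model:

```python
import numpy as np

def local_step(params, grad, lr=0.1):
    """One gradient step on a worker's own data shard (illustrative)."""
    return params - lr * grad

def consolidate(worker_params):
    """Average the workers' parameters into a single master model."""
    return np.mean(worker_params, axis=0)

# Three workers start from the same master model; each sees a
# different gradient because each trains on different data.
master = np.zeros(4)
grads = [np.array([1.0, 0.0, 0.0, 0.0]),
         np.array([0.0, 1.0, 0.0, 0.0]),
         np.array([0.0, 0.0, 1.0, 0.0])]

# Each worker takes a local step, then the results are averaged.
workers = [local_step(master.copy(), g) for g in grads]
master = consolidate(workers)
```

In a data center the consolidation step happens over ultra-fast interconnects; the distributed approach has to tolerate the same exchange happening over slow, unreliable links, which is why it runs fewer, cheaper synchronization rounds.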
Some major companies are also exploring distributed learning. Last year, Google researchers demonstrated a new scheme called DIstributed PAth COmposition (DiPaCo) for segmenting and integrating computation to make distributed learning more efficient.
To build Collective-1 and other LLMs, Lane worked with academic partners in the UK and China to develop a new tool called Photon, which makes distributed training more efficient. Lane said Photon improves on Google's approach with a more efficient way of representing a model's data and a more efficient scheme for sharing and consolidating training work. The process is slower than conventional training but more flexible, allowing new hardware to be added to speed up training, Lane said.
Photon was developed through a collaboration between researchers at Beijing University of Posts and Telecommunications and Zhejiang University. The team released the tool under an open-source license last month, allowing anyone to use this approach.
As part of the Collective-1 effort, Flower AI's partner Vana developed a new way for users to share personal data with AI builders. Vana's software lets users contribute private data from platforms like X and Reddit to the training of large language models, specify how it may ultimately be used, and even receive financial rewards for their contributions.
Anna Kazlauskas, co-founder of Vana, said the idea is to make untapped data available for AI training while giving users more control over how their information is used. "This data usually can't be included in AI models because it isn't public," Kazlauskas said. "This is the first time that data contributed directly by users is being used to train foundation models, with users owning the AI model created from their data."
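The mechanism Kazlauskas describes—data that carries its owner's permissions with it—can be sketched as a record that a trainer filters on before building a training set. This is an illustrative toy schema, not Vana's actual software:

```python
from dataclasses import dataclass, field

@dataclass
class Contribution:
    """A user-contributed record with its own usage permissions (hypothetical)."""
    user_id: str
    source: str                       # e.g. "reddit", "x"
    text: str
    allowed_uses: set = field(default_factory=set)

def filter_for_training(records, purpose):
    """Keep only records whose owners permitted this purpose."""
    return [r for r in records if purpose in r.allowed_uses]

records = [
    Contribution("u1", "reddit", "post A", {"llm-training"}),
    Contribution("u2", "x", "post B", {"analytics"}),
]

usable = filter_for_training(records, "llm-training")
```

Only the first record survives the filter, since its owner opted into LLM training; the second, permitted only for analytics, never enters the training set.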
University College London computer scientist Mirco Musolesi has suggested that a key benefit of distributed AI training approaches may be unlocking novel data. "Extending this to cutting-edge models will allow the AI industry to leverage vast amounts of distributed and privacy-sensitive data, such as in healthcare and finance, for training without the risks of centralization," he said.