These startups are building cutting-edge AI models without the need for a data center
Researchers have used GPUs scattered around the world, together with private and public data, to train a new kind of large language model (LLM), suggesting that the dominant approach to building artificial intelligence could be disrupted.
Two unconventional AI-building startups, Flower AI and Vana, collaborated to develop this new model, named Collective-1.
Flower developed technology that allows training to be spread across hundreds of computers connected over the internet; its tools are already used by some firms to train AI models without needing to pool computing resources or data in one place. Vana provided data sources, including private messages from X, Reddit, and Telegram.
By modern standards, Collective-1 is small: it has 7 billion parameters, the values that collectively give the model its abilities, compared with the hundreds of billions of parameters in today's most advanced models, such as those powering ChatGPT, Claude, and Gemini.
Nic Lane, a computer scientist at the University of Cambridge and co-founder of Flower AI, said the distributed approach is expected to scale well beyond Collective-1. Lane added that Flower AI is currently training a 300-billion-parameter model using conventional data and plans to train a 1-trillion-parameter model later this year, approaching the scale of the industry leaders' offerings. "This could fundamentally change people's perception of AI, so we are going all-in," Lane said. He added that the startup is also incorporating images and audio into training to create multimodal models.
Distributed model building may also shake up the power dynamics shaping the AI industry.
Currently, AI companies build their models by combining massive amounts of training data with huge quantities of computing power concentrated in data centers packed with cutting-edge GPUs and stitched together with ultra-high-speed fiber-optic cables. They also rely heavily on datasets created by scraping publicly accessible, though sometimes copyrighted, material such as websites and books.
This approach means that only the wealthiest companies, and nations with access to large numbers of powerful chips, can feasibly develop the most powerful and valuable models. Even open-source models, such as Meta's Llama and DeepSeek's R1, are built by companies with large data centers. A distributed approach could let smaller companies and universities build advanced AI by pooling disparate resources, or allow countries that lack conventional infrastructure to network several data centers together to build a stronger model.
Lane believes the AI industry will increasingly look to novel training methods that break out of the single data center. The distributed approach "allows you to scale computation in a more elegant way than a data center model," he said.
Helen Toner, an expert on AI governance at the Center for Security and Emerging Technology, said Flower AI's approach is "interesting and potentially quite relevant" to AI competition and governance. "It may be hard to keep up at the cutting edge, but it may be an interesting fast-follower approach," Toner said.
Divide and Conquer
Distributed AI training requires rethinking how the computation used to build powerful AI systems is divided up. Creating an LLM involves feeding a model huge amounts of text while adjusting its parameters so that it produces useful responses to a prompt. Inside a data center, the training process is divided so that parts of it run on different GPUs, and the results are periodically consolidated into a single master model.
The new approach allows work typically done in large data centers to be performed on hardware potentially miles apart and connected by relatively slow or unreliable internet connections.
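At its core, the pattern is straightforward to illustrate. Below is a minimal sketch, in plain Python with NumPy, of round-based training with periodic parameter averaging, the general pattern such systems build on; the function names, the placeholder gradient, and the simple averaging scheme are illustrative assumptions, not Flower AI's Photon or Google's DiPaCo.

```python
# Illustrative sketch of distributed training with periodic aggregation.
# The worker/coordinator structure here is hypothetical: it shows the
# general pattern, not any specific system's implementation.

import numpy as np

def local_training_step(params, data_shard, lr=0.01):
    """Each worker updates its own copy of the parameters on its local data."""
    # Placeholder gradient: in a real system this would come from
    # backpropagation on the worker's shard of text.
    fake_gradient = np.random.randn(*params.shape) * data_shard.std()
    return params - lr * fake_gradient

def aggregate(worker_params):
    """Periodically merge worker copies into a single master model
    by simple parameter averaging (one common aggregation choice)."""
    return np.mean(worker_params, axis=0)

# Toy setup: one small parameter vector, three "machines" with local data.
master = np.zeros(8)
shards = [np.random.randn(100) for _ in range(3)]

for round_ in range(5):                                # communication rounds
    worker_copies = []
    for shard in shards:                               # runs on separate machines in practice
        local = master.copy()
        for _ in range(10):                            # many local steps between syncs
            local = local_training_step(local, shard)
        worker_copies.append(local)
    master = aggregate(worker_copies)                  # consolidate into the master model
```

Because the machines only exchange parameters at the end of each round rather than after every step, the scheme can tolerate the slower, less reliable links between far-flung hardware.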
Some major companies are also exploring distributed learning. Last year, Google researchers demonstrated a new scheme called DIstributed PAth COmposition (DiPaCo) for segmenting and integrating computation to make distributed learning more efficient.
To build Collective-1 and other LLMs, Lane worked with academic collaborators in the UK and China to develop a new tool called Photon that makes distributed training more efficient. Lane said Photon improves on Google's approach with a more efficient way of representing a model's data and of sharing and consolidating training. The process is slower than conventional training but more flexible, allowing new hardware to be added to speed up training, Lane said.
Photon was developed through a collaboration between researchers at Beijing University of Posts and Telecommunications and Zhejiang University. The team released the tool under an open-source license last month, allowing anyone to use this approach.
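To illustrate the flexibility Lane describes, here is a hedged sketch of how a coordinator might fold newly arrived hardware into an ongoing run; the worker-pool logic and re-sharding below are assumptions made for the example and say nothing about how Photon actually implements this.

```python
# Hypothetical sketch of elastic distributed training: machines can join
# between communication rounds, and the coordinator re-divides the work
# across whoever is currently available. Not Photon's real API.

import numpy as np

def train_locally(params, shard, steps=10, lr=0.01):
    """Stand-in for local training on one machine's slice of the data."""
    for _ in range(steps):
        params = params - lr * np.random.randn(*params.shape) * shard.std()
    return params

master = np.zeros(8)
dataset = [np.random.randn(50) for _ in range(12)]   # 12 shards of toy data
workers = 2                                          # start with two machines

for round_ in range(4):
    if round_ == 2:
        workers += 2                                 # new hardware comes online mid-training
    # Split the shards across however many machines are available this round.
    assignments = np.array_split(dataset, workers)
    results = []
    for shards in assignments:
        combined = np.concatenate(list(shards))
        results.append(train_locally(master.copy(), combined))
    master = np.mean(results, axis=0)                # consolidate into the master model
```

The design choice being illustrated is that no round depends on how many machines took part in earlier rounds, so capacity can grow whenever new hardware becomes available.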
As part of the effort to build Collective-1, Flower AI's partner Vana has been developing new ways for users to share personal data with AI builders. Vana's software lets users contribute private data from platforms such as X and Reddit to the training of large language models, specify how that data may ultimately be used, and even receive financial rewards for their contributions.
Anna Kazlauskas, co-founder of Vana, said the idea is to make otherwise unused data available for AI training while giving users more control over how their information is used in AI. "This data usually can't be included in AI models because it isn't public," Kazlauskas said. "This is the first time that data contributed directly by users is being used to train a foundation model, with users owning the AI model created from their data."
University College London computer scientist Mirco Musolesi has suggested that a key benefit of distributed AI training approaches may be unlocking novel data. "Extending this to cutting-edge models will allow the AI industry to leverage vast amounts of distributed and privacy-sensitive data, such as in healthcare and finance, for training without the risks of centralization," he said.