In recent months, the race for generative artificial intelligence has heated up, and now a new player with a Cupertino accent enters the scene: Apple's FS-DFM, a model designed to generate long text at blistering speed. While Microsoft is strengthening Copilot with new models and Google continues to accelerate its development, Apple is trying to close the gap with a technology that, on paper, promises to be a true revolution.
This context is significant: we're talking about a sector where every second counts and where text quality, the ability to follow instructions, and integration with everyday tools are key. In this scenario, it's worth calmly asking ourselves: can FS-DFM really compete with Copilot and the models that power it? Or does it arrive too late to a party that's already well underway?
What is FS-DFM and why is it generating so much buzz?
A group of Apple engineers, in collaboration with Ohio State University, has presented FS-DFM (Few-Step Discrete Flow-Matching), a new language model designed specifically to generate long text extremely quickly without degrading quality. According to the published study, FS-DFM is capable of generating long sequences up to 128 times faster than traditional autoregressive models in the style of ChatGPT, while maintaining a comparable level of quality.
This model focuses on efficient and stable text generation, even in long passages. This makes it one of the most interesting approaches we've seen for tasks where latency is critical: real-time conversational assistants, document writing, long responses in productivity apps, and generally any case where the user doesn't want to wait for the text to trickle in.
How does FS-DFM work: from autoregressive approach to diffusion in a few steps?
Classical language models of the GPT type use an autoregressive scheme: they generate text token by token, word by word. At each step, this approach calculates the most probable next unit of text. It offers very good control and quality, but it is inherently sequential: each new token depends on the previous one, which limits speed, especially when dealing with long texts.
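That token-by-token loop can be sketched in a few lines. This is a toy illustration, not Apple's or OpenAI's actual code: the "model" here just returns random scores, but the structure shows why generation is sequential — every new token requires a full pass that depends on everything produced so far.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 50  # toy vocabulary size

def toy_next_token_logits(context: list[int]) -> np.ndarray:
    """Stand-in for a trained model: returns random scores.
    A real model would condition these on the full context."""
    return rng.normal(size=VOCAB)

def autoregressive_generate(prompt: list[int], n_new: int) -> list[int]:
    """Generate one token at a time. Each step depends on all prior
    tokens, so the n_new steps cannot run in parallel."""
    tokens = list(prompt)
    for _ in range(n_new):
        logits = toy_next_token_logits(tokens)
        probs = np.exp(logits - logits.max())  # softmax
        probs /= probs.sum()
        tokens.append(int(rng.choice(VOCAB, p=probs)))
    return tokens

out = autoregressive_generate([1, 2, 3], n_new=10)
print(len(out))  # 13 tokens: 3 from the prompt + 10 generated
```

For a 1,000-token answer, this loop runs 1,000 model passes in strict sequence, which is exactly the latency bottleneck FS-DFM targets.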
FS-DFM breaks with that philosophy by drawing inspiration from diffusion models, which have become famous in image generation. Instead of going word by word, the model generates several text fragments in parallel and refines them in successive iterations. The initial text may be noisy, incoherent, or incomplete, but in each round of improvement, the model corrects, reorganizes, and polishes the result to bring it closer to a high-quality final output.
Apple doesn't just copy traditional diffusion; it applies a technique known as flow-matching, which speeds up the process by eliminating a large part of the intermediate iterations. Instead of making hundreds or thousands of small refinement steps, FS-DFM learns to take longer "strides" in the space of possible texts, so that it converges in very few rounds towards a coherent and fluid text.
The key to this few-step approach is that the model learns to navigate directly between the initial "noise" and the final text without having to go through all the intermediate stages typical of standard diffusion. This drastically reduces generation time while maintaining a solid overall text structure, which is especially valuable when producing long paragraphs or entire documents.
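To make the contrast with the autoregressive loop concrete, the few-step refinement idea can be mocked up as follows. This is a hedged sketch with made-up sizes — the "denoiser" is a random stub, not FS-DFM — but it shows the shape of the algorithm: every position of the draft is updated in parallel, and only a handful of rounds are run.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LENGTH, STEPS = 50, 16, 8  # FS-DFM reportedly needs ~8 rounds

def toy_denoiser(seq: np.ndarray) -> np.ndarray:
    """Stub standing in for the learned model: returns random
    per-position distributions over the vocabulary. A real denoiser
    would condition them on the entire current draft `seq`."""
    logits = rng.normal(size=(LENGTH, VOCAB))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    return probs / probs.sum(axis=1, keepdims=True)

# Start from pure "noise": a random token at every position.
draft = rng.integers(0, VOCAB, size=LENGTH)

for step in range(STEPS):
    probs = toy_denoiser(draft)
    # Refine every position in parallel; a trained model sharpens
    # these distributions so a few rounds are enough to converge.
    draft = np.array([rng.choice(VOCAB, p=p) for p in probs])

print(draft.shape)  # (16,) — a full sequence after only 8 rounds
```

The cost is STEPS model passes regardless of sequence length, instead of one pass per token, which is where the claimed latency gains come from.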
The three phases of FS-DFM training
For this scheme to work, the Apple and Ohio State University team has designed a three-stage training process that seeks to balance speed, stability, and linguistic accuracy. It's not just a matter of launching a diffusion model and hoping for the best, but of carefully guiding it so that it learns to write well in very few steps.
In the first phase, the model learns to operate with different numbers of refinement iterations. This allows FS-DFM to adapt to scenarios where extreme speed is paramount, as well as to those where an extra step can be taken to further refine the result, training its ability to progressively improve the text according to the number of steps available.
The second phase introduces a teacher model: a larger and more powerful system that guides FS-DFM. This teacher acts as a quality benchmark, providing examples and corrections that help FS-DFM fine-tune more subtle details: word choice, semantic coherence, style, consistency in long texts... In this way, the lighter model learns to approximate the performance of a much larger system.
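This teacher-guidance phase is essentially a distillation objective. The paper's exact loss is not reproduced here; the sketch below only illustrates the common approach of pushing the student's per-position token distributions toward the teacher's with a KL divergence, using toy random distributions in place of real model outputs.

```python
import numpy as np

def kl(p: np.ndarray, q: np.ndarray, eps: float = 1e-9) -> float:
    """Mean KL divergence KL(p || q) across positions: how far the
    student distribution q is from the teacher distribution p."""
    p, q = p + eps, q + eps
    return float(np.mean(np.sum(p * np.log(p / q), axis=-1)))

rng = np.random.default_rng(0)
# Toy distributions over a 10-token vocabulary at 4 text positions.
teacher = rng.dirichlet(np.ones(10), size=4)  # large "teacher" output
student = rng.dirichlet(np.ones(10), size=4)  # few-step model output

# Minimizing this loss drives the student toward the teacher's choices.
loss = kl(teacher, student)
print(loss > 0)  # True: the distributions differ
```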
Finally, in the third phase, the team optimizes the individual refinement steps to achieve faster and more stable convergence. The idea is that, in each of those few iterations, the model makes the most of the available information, intelligently reduces "noise," and maintains the text structure without introducing sudden errors or strange topic jumps.
Speed without sacrificing quality: tests against Dream and LLaDA
One of the most striking aspects of the study is that FS-DFM is capable of generating a complete text in just eight very fast iterations, while other diffusion language models can require over a thousand steps to refine the content to a reasonable level. The difference in latency is enormous, especially when considering personal devices like an iPhone or a Mac.
In comparisons, FS-DFM faces off against larger diffusion models such as Dream (with 7 billion parameters) or LLaDA (with 8 billion parameters). Despite having fewer parameters, Apple's model achieves better results in two fundamental metrics in language processing: perplexity and entropy, which serve to measure how natural and stable the generated text is.
A lower perplexity indicates that the model better predicts actual word sequences; that is, the language it produces is more similar to human language. At the same time, a more stable entropy suggests that the model maintains a healthy balance between creativity and consistency, without becoming chaotic or excessively repetitive. On both fronts, FS-DFM shows a clear advantage over these more voluminous diffusion models.
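Both metrics have simple definitions. The minimal, self-contained illustration below shows how perplexity and entropy are computed from a model's predicted probabilities; it is not FS-DFM's evaluation code, just the standard formulas.

```python
import math

def perplexity(token_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-likelihood of the
    probabilities the model assigned to the actual next tokens.
    A perfect model (all probs 1.0) has perplexity 1."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

def entropy(dist: list[float]) -> float:
    """Shannon entropy (in nats) of one predicted next-token
    distribution: higher means more uncertainty/creativity."""
    return -sum(p * math.log(p) for p in dist if p > 0)

# Higher probability on the true tokens -> lower perplexity.
print(perplexity([0.5, 0.25, 0.5]))      # ~2.52
# A uniform guess over 4 tokens is maximally uncertain: ln(4) ~ 1.386.
print(entropy([0.25, 0.25, 0.25, 0.25]))
```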
This result is especially relevant considering that, on devices with limited resources, a smaller, faster model that maintains competitive quality could be much more useful than a giant that can only run in large data centers. That's where Apple wants to make a difference with its products and services.
Publication of the study and opening to the research community

The work describing FS-DFM has been published on arXiv under the title "FS-DFM: Fast and Accurate Long-Text Generation with Few-Step Diffusion Language Models" or similar, detailing the architecture and the training process, as well as the evaluations against other models. The article also includes concrete examples showing how the generated text changes and improves throughout the different iterations.
Apple has expressed its intention to release the model's code and checkpoints. This would allow researchers and developers to experiment with FS-DFM, adapt it to new domains, or integrate it into their own applications. For a traditionally closed environment like Apple's, this move can be interpreted as a way to gain relevance within the AI scientific community.
If this few-step diffusion approach takes hold, it could become a de facto standard for text generation in systems where response time is critical. It is precisely in these types of scenarios that Apple wants to shine with its own assistants and productivity tools.
FS-DFM in the Apple ecosystem: Siri, Apple Intelligence and iWork in the spotlight
Apple's big bet is to integrate FS-DFM (or derivative models) into its ecosystem, especially in Siri, in Apple Intelligence features, and in the iWork suite. If a model capable of writing quickly and well runs with low latency on iPhone, iPad, and Mac, the user experience could take a considerable leap forward compared to what we have today.
In Siri, a model of this type would allow more elaborate, contextual, and faster responses, moving away from the limited and somewhat clunky assistant that many users currently perceive. Apple Intelligence could also benefit greatly from features like assisted writing, style correction, and summary generation, thanks to the ability to produce long paragraphs without annoying wait times.
Within iWork, tools like Pages, Numbers, or Keynote could receive a significant boost if they integrate generative functions that help draft documents, propose presentations, or structure data. In fact, Apple's recent purchase of the iWork.ai domain fuels speculation about a massive influx of generative AI features into this office suite.
How Microsoft is positioning itself: Copilot and its new MAI-Voice-1 and MAI-1-preview models
While Apple is perfecting FS-DFM, Microsoft isn't standing still. The company has announced two new AI models closely linked to Copilot: MAI-Voice-1 and MAI-1-preview. Their goal is to keep Copilot as a benchmark in productivity assistants and further expand its capabilities.
MAI-Voice-1 is a voice generation model that stands out for its extreme speed. Microsoft claims it can produce one minute of audio in less than one second using a single GPU, making it especially attractive for real-time applications: text narration, email reading, on-the-fly podcasts, or more natural voice assistants.
This voice model is already available, integrated into Copilot Daily and the Podcasts feature, and Microsoft has also begun rolling it out in Copilot Labs, so users can try it directly from existing tools without waiting for future mass releases.
On the other hand, MAI-1-preview is a MoE (Mixture of Experts) model trained on around 15,000 NVIDIA H100 GPUs. It is designed to follow instructions accurately and provide helpful, direct answers to everyday questions, very much in line with Copilot's main use as an assistant for tasks and information queries.
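MAI-1-preview's internals haven't been published beyond the MoE label, but the Mixture-of-Experts idea itself is easy to sketch: a router activates only a few "expert" sub-networks per token, so a very large model executes only a fraction of its parameters on each step. The code below is a toy illustration with invented sizes, not Microsoft's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, TOP_K = 8, 4, 2  # made-up dimensions for illustration

# Each "expert" is a small linear layer; a router scores experts per token.
experts = [rng.normal(size=(D, D)) for _ in range(N_EXPERTS)]
router = rng.normal(size=(D, N_EXPERTS))

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route a single token vector x through only its top-k experts."""
    scores = x @ router                   # one score per expert
    top = np.argsort(scores)[-TOP_K:]     # indices of the top-k experts
    weights = np.exp(scores[top])         # softmax over the chosen experts
    weights /= weights.sum()
    # Only TOP_K of the N_EXPERTS matrices are ever multiplied.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_layer(rng.normal(size=D))
print(y.shape)  # (8,)
```

This sparsity is what lets MoE models scale total parameter count without a proportional increase in per-token compute.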
Microsoft plans to partially integrate this model into Copilot in the coming weeks, thus reinforcing the assistant's capabilities both in understanding user requests and in generating responses better adapted to the context and the requested tone.
Copilot: business model, advantages and limitations compared to Apple
A key piece of the puzzle is the access model. Although Copilot has a free version accessible to any user, its true potential is unlocked with the Copilot Pro subscription, which costs 22 euros per month.
With Copilot Pro, users get priority access to the latest AI models and deep integration with Microsoft 365: Word, Excel, PowerPoint, Outlook, and other tools. For businesses, this means being able to automate tasks such as writing, data analysis, presentation creation, and email management much more efficiently.
In parallel, Google offers its own paid service, Google AI Pro, priced at €21.99 per month, which also adds 2 TB of storage for Photos, Drive, and Gmail. The competition, therefore, is not only technological but also about business models, pricing, and added value for the end user.
Faced with this, Apple finds itself in a delicate situation: it does not yet have a public generative assistant on the scale of Copilot, and its traditionally more closed approach complicates the rapid adoption of similar subscription models. Even so, native integration into devices and systems could be its greatest strength if it manages to offer useful AI features without forcing the user to subscribe to yet another monthly service.
Microsoft Copilot in the Apple world: an unwelcome guest
The relationship between Microsoft and Apple has been, for years, a mixture of rivalry and pragmatic collaboration. Office has long been a full-fledged citizen on macOS and iOS, and now Copilot has also landed on Mac and iPhone as a fully functional application.
This move turns Copilot into a very powerful competitive weapon within the Apple ecosystem, precisely where the Cupertino company is weakest in terms of visible generative AI. Copilot is presented as a powerful assistant and, in its basic version, is free, which puts clear pressure on Apple, which until now has not offered anything equivalent at that scale.
It is true that Apple has been using AI techniques for years under the umbrella of "machine learning" to improve the experience on its devices. Smart app suggestions when opening search, phrase auto-completion, and pattern recognition in daily iOS and macOS use all rely on machine learning models, although Apple rarely labels them as "artificial intelligence."
That AI is also noticeable in more visible aspects, such as automatic photo enhancement, sticker creation from images, and background removal in seconds. However, a model that can compete head-to-head with GPT-4, DALL·E 3, or the full functionality of Copilot remains the major gap in its generative technology.
What does Copilot offer today that Apple hasn't yet matched?
Copilot can be described as an advanced chatbot with generative capabilities for both text and image, powered by OpenAI technology (GPT-4 and DALL·E 3) but packaged and extended by Microsoft with its own layer of services and functionalities.
One of its great advantages is that it is permanently connected to the Internet and can access up-to-date information. In this way, it can answer questions about recent topics, offer updated data, and link to relevant sources, which is highly valued when seeking specific information and not just static content generation.
In the visual realm, Copilot allows users to generate images from natural language prompts. Just like the better-known image generators, the system suggests variations and modifications to the created images, such as changing elements of the scene, adjusting the visual style, or transforming specific details (for example, changing a dragon's fire to water or placing the scene in a castle).
As for text, Copilot is capable of writing stories, summarizing documents, answering complex queries, or generating code in multiple programming languages. It also provides examples, additional explanations, and links to resources like Wikipedia for further exploration of the concepts covered.
All of this means that, today, many Apple device users turn directly to Copilot or ChatGPT for tasks that Apple does not yet cover with its own tools, creating the feeling that the company is somewhat behind in this particular race.
The future of Apple: AppleGPT, iOS 18, and the bet on generative AI
Rumors and leaks suggest that Apple is preparing a major leap with iOS 18 and macOS 15 in terms of generative artificial intelligence. There is internal talk of a project nicknamed AppleGPT or Ajax, a proprietary language model that could be integrated into multiple layers of the operating system.
The great unknown is to what extent this future model will be able to compete with OpenAI, Microsoft, or Google in quality, versatility, and speed. The AI race didn't start yesterday: it has been underway for years and took off in 2023, so Apple is entering a market where the competition already has solid products available.
What does seem clear is that Apple intends to flood its platforms with AI-powered features. Craig Federighi, the company's top software executive, has reportedly ordered the inclusion of "every AI-powered feature possible" in future versions of iOS, iPadOS, and macOS, including both end-user utilities and advanced developer tools.
Xcode with AI: the “Copilot” that Apple wants for its developers
One of the areas where Apple plans to place a lot of emphasis is its development platforms, especially Xcode. According to information revealed by Mark Gurman, the company has been internally testing generative AI features within Xcode for about a year, which would allow it to automatically generate blocks of source code, very much in line with what GitHub Copilot already does in other environments.
The idea is not for the assistant to do all the programmer's work, but to streamline repetitive tasks, suggest solutions, and help those who are learning. Many developers already use tools like ChatGPT to resolve specific questions, and Apple doesn't want its development environment to be left out of this trend.
The goal would be to showcase these new features at an upcoming WWDC, opening the door for the developer community to start testing them. If FS-DFM or related models integrate well with Xcode, they could offer very low-latency code generation and refactoring, which is very valuable when working intensively on large projects.
Meanwhile, Apple's purchase of the iWork.ai domain reinforces the idea that Pages, Numbers, and Keynote will also receive generative features that take advantage of internal advances in AI, thus closing the circle between productivity, development, and user experience in the brand's ecosystem.
Given this whole picture, the feeling is that FS-DFM is a key technical piece in Apple's strategy to "catch up" in generative AI. Meanwhile, Microsoft and Google continue to move forward with already established products like Copilot and their new supporting models, so the big battle will be not only who has the fastest model, but who integrates it best into the daily lives of users.