jz's blogs

Windows Copilot: An AI Assistant at the OS Level?

AI-generated summary
The article highlights Microsoft’s Windows Copilot, an AI-integrated feature for Windows 11, exploring its functionalities and potential. Copilot, akin to a Bing Chat-based chatbot, offers diverse capabilities, yet its AI performance seems constrained. Windows 11’s update extends AI assistance to apps like drawing and photos. Contrasted with competitors’ AI products, Copilot aims for OS-level integration. Challenges like export controls are discussed, while its potential as a seamless AI companion within operating systems remains to be seen. In essence, the article examines Copilot’s features, significance, and the future of AI-OS integration.

At the Microsoft event on September 21st, Microsoft unveiled a range of new hardware and software products and announced that Windows 11 would be updated on September 26th. The update refines Windows’ modern UI and introduces over 150 changes, including Windows Backup, Dev Home, an improved volume mixer, and more. The most noteworthy additions, however, are Windows Copilot and AI features for apps such as Paint and Photos. Microsoft’s ongoing relationship with OpenAI suggests that it is committed to continuing to integrate AI capabilities into its products.

Windows Copilot
cover image, Windows Copilot

Preliminary Introduction

In the updated Windows 11, Microsoft has integrated AI capabilities throughout the entire system. Users no longer need to open Bing AI in Edge; instead, they can access it directly by clicking the Copilot icon in the taskbar or pressing Win + C. The displayed content scales automatically, and the split-screen layout adjusts to the current screen size. Overall, the application feels like the Bing Chat webpage wrapped up and presented to users as an Electron app.

Windows Copilot Preview
Copilot will automatically occupy the rightmost side of the screen and will always be set to stay on top, while the desktop portion will also adjust its size automatically.

Key Features

Windows Copilot Preview Video

Copilot has been available for some time, and the previous logo differed from the one released this time. In the current preview version I’m using, Copilot can perform some basic interactions with the system, such as taking screenshots and opening certain applications. It feels somewhat similar to invoking Cortana (Windows’ previous voice assistant) to open applications. It does not yet fully utilize advanced language model technology.

I also tried summarizing a webpage in Edge, and it seemed to work. In Chrome or other browsers, however, Copilot sometimes says it cannot access your screen, or gives incorrect answers. When you copy a webpage address from the browser’s address bar, Copilot detects this automatically and offers to send the selected or copied text to the chat. Once the address is sent, it summarizes the page like any other chatbot. Like other “ChatGPT” services, Copilot inevitably makes mistakes. In terms of functionality, then, Windows Copilot is more like an AI assistant that combines the services of Cortana and Bing Chat.

Windows Copilot Preview-2
Copilot can interact with the content in Microsoft Edge.
Windows Copilot Preview-3
Just like Bing Chat, it can be seamlessly invoked at any time.
Windows Copilot Preview-4
The functionality of Bing Image Creator has also been incorporated into Windows Copilot.
Windows Copilot Preview-5
Copilot made a mistake in its summary; Yuchang PENG did not appear in that Japanese drama.

Conversing like Cortana

Previously, using the AI assistant on Windows was as simple as saying “Hey, Cortana” to get help with tasks related to the operating system. Now, Copilot has to be opened with a keyboard shortcut or a mouse click. Thanks to advancements in NLP technology, the accuracy of voice recognition seems to have improved in my experience. When users use voice input, Copilot prompts them with “Voice input is processed by Microsoft online services and will not be collected or stored.”

Windows Copilot Preview-6
After voice input, Copilot will also verbally announce the response.

In terms of supported languages, when we use English as the input language, Copilot also responds in English. Depending on the language packs downloaded on Windows accounts, Copilot supports languages such as Japanese, Spanish, French, and German, in addition to English and Chinese. In the current preview version I’m using, unlike Cortana, when I try to get Copilot to perform tasks controlling my computer, it responds with “Hello, this is Bing. I’m sorry, I can’t help you minimize all windows. That’s because I’m just a chat mode Bing; I don’t have permission to control your computer or browser. I can only chat with you in different languages or generate some interesting content with my knowledge and creativity.”

Chatting like Bing Chat

As the image examples above show, Copilot can chat, write, and provide insights just like Bing Chat. Compared with ChatGPT or with aggregator products such as Poe, the only difference lies in the method of access: Windows Copilot tries to offer a seamless experience that is available at any time and in any scenario while using a PC. The quality of its responses, however, depends more on the underlying language model. As for image creation, Copilot can only generate images for users logged in with personal accounts. In the preview version, therefore, it behaves more like Bing Chat, with weak integration with the system.

Judging by what was shown at the Microsoft event, the official version of Windows Copilot can do much more. According to Microsoft’s blog, Copilot can work with the page currently on screen in more ways, such as capturing the current page or capturing an image directly and editing it through Copilot. Bing Chat in Edge and Microsoft 365 Copilot are also being unified with Windows Copilot as part of Microsoft’s AI era.

Simple Comparison

Bing Chat helps users quickly find search results from Bing Search; Bard enhances Google’s search functionality, sparking creativity and improving efficiency; GitHub Copilot assists programmers in writing code through pair programming with AI; Notion AI helps users create their Notion content; Adobe Firefly brings AI drawing capabilities to Adobe Photoshop, allowing users to manipulate images with just a sentence; Office Chat assists office users in efficiently handling Word, Excel, and PowerPoint documents, and more. These AI tools all seem to be designed and developed for specific applications, aiming to improve the user experience of the original application.

In contrast, Windows Copilot seems to unify these different “AI companions” with various functionalities into one AI companion. When users work with Windows, whether it’s searching, programming, office work, artistic creation, etc., they can easily access AI assistance through Windows Copilot.

As mentioned earlier, the main difference between Windows Copilot and other large language model products is the entry point. What, then, distinguishes Copilot from the traditional voice assistants built into operating systems? Both are products of artificial intelligence. Apple’s Siri and Microsoft’s Cortana can interact with the system through specific commands, such as “Call XX”, “Set an alarm for XX o’clock”, “Remind me to finish XX in 15 minutes”, “Play music”, etc. What sets Windows Copilot apart is its integration of powerful AI capabilities, which should let it better understand and interpret user commands and give more intelligent and personalized responses. In my actual experience, however, its AI capabilities seem limited when it is asked to handle these tasks. Entrusting Copilot with more tasks also means that users need to hand more private data to the language model.

Beyond Copilot

In addition to Windows Copilot, the latest Windows update brings AI capabilities to applications such as Paint and Photos. In earlier beta preview versions, Paint had already gained features such as layers and a dark theme. In upcoming updates, Paint will also use Cocreator, an AI image generator based on DALL-E, to assist users with digital drawing. The Photos app has gained AI capabilities as well, helping users edit photos more easily, for example by cropping and highlighting subjects. Moreover, the Microsoft Store has a new category in its sidebar called AI Hub, where Microsoft gathers most applications with AI functionality so that users can conveniently find tools that suit their needs.

Windows Copilot Preview-7
The AI Hub in the app store gathers several applications with AI functionalities.

Back in March of this year, Microsoft introduced Microsoft 365 Chat in Office to help users organize various work documents, emails, meetings, and more. Microsoft also stated that new Copilot features would be introduced in Outlook, Word, Excel, Loop, OneNote, and OneDrive. There is even a meme circulating that humorously depicts Microsoft’s relationship with AI.

Windows Copilot Preview-7
Microsoft is ‘sending’ AI to its various products, image credits

Some Reflections

But how should AI work together with operating systems? I still remember that generative AI began to gain popularity in December last year. By February, all the major companies were talking about launching their own chatbots, and everyone was looking forward to seeing how Apple would incorporate AI into its systems. At WWDC in June, however, what caught everyone’s attention was not the updates to iOS, iPadOS, and macOS, but the sudden emergence of Vision Pro. Apple, which has always emphasized user privacy and security, has tried to keep its machine learning models on the device as much as possible, bringing users optimized functions such as transcription, image recognition, and input prediction. Yet despite the continuous improvement in the neural network computing capabilities of Apple’s chips, the computing power required to train large language models is something personal devices find difficult to provide. On the other hand, Apple may still be waiting for regulations on generative AI to mature.

Of course, due to the U.S. Export Administration Regulations and other reasons, Copilot can only be used in unrestricted regions. Users in China (mainland, Hong Kong, and Macau) cannot access this service.

It has been almost two years since Windows 11 was launched to all users. For most of that time, Windows updates seemed to be mainly about changing the interface or “skin” of the system. Then, last year, OpenAI introduced ChatGPT and gained unprecedented attention, and application developers everywhere began thinking about how to incorporate ChatGPT into their products. I initially thought that integrating Copilot into the system would be the biggest functional update to Windows 11 in these two years. After trying it, however, it feels more like the launch of a new application. Perhaps seamless access in any scenario, coupled with the ability to adapt to various usage scenarios, is already enough to count as a true daily AI companion at the operating system level.