The AI Agents That Went Beyond Our Imagination

07/16/2025

"In the mid-5th century AD, an unknown Christian poet passed away, and this year happened to be the cutoff year for an ancient environmental reconstruction chronology. What is the name of this scientific chronology?"

Faced with such an obscure question, even the most seasoned scholars may be stumped. Without the poet's name or the chronology's title, traditional search engines are of little use. The two seemingly unrelated pieces of information are like two grains of sand in the ocean, and it is hard to know where to start.

Yet this is exactly the kind of puzzle that an agent named WebSailor can solve quickly through cross-verification: the poet is Synesius of Cyrene, the scientific chronology is "PAGES 2k", and the year is 414 AD.

This inevitably raises the question: When did AI evolve to such a degree?

It should be noted that just six months ago, agents were generally considered more toy than tool. Beta access to most products was highly sought after, but actual performance frequently fell short.

Despite initially underwhelming results, the evolution of agents has been rapid. Today, in professional fields such as marketing and healthcare, Agents are even outperforming humans.

Let's take a look at which agents from the first half of the year have exceeded our expectations.

Faced with financial modeling questions at the World Championships level, even experienced analysts often need several hours to deduce and verify. But what if I told you that someone can give an accurate answer in just 10 minutes? Would you believe it?

Such a complex task may be beyond the capabilities of even the best large models on the market. However, an agent named Shortcut completed it in just 10 minutes with an accuracy rate exceeding 80%, roughly 10 times faster than human contestants.

How difficult is the Excel World Championships?

Endorsed by Microsoft and operated by the FMWC Committee, the competition covers complex functions, Power Query, dynamic arrays, Monte Carlo simulations, and more, and is described by contestants as the "cruelest function battlefield." Participants come from all over the world. Most are investment bank data analysts, financial modeling directors from the Big Four, or former Microsoft MVPs, with impressive academic and professional qualifications.

This year's challenge, which was also Shortcut's debut, was themed around the 30th anniversary of "World of Warcraft" and required contestants to complete over 20 related table operations within 40 minutes. Participants needed to manually build formulas such as VLOOKUP and INDEX-MATCH to establish precise links within the complex data maze.
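To make the two formulas concrete, here is a minimal Python sketch of what an exact-match VLOOKUP and an INDEX-MATCH pair do. The table contents and the function names `vlookup` and `index_match` are hypothetical, chosen only for illustration; this is not Shortcut's implementation or the actual competition data.

```python
# Hypothetical table: each row is (item_id, item_name, gold_value).
ITEMS = [
    (101, "Iron Sword", 25),
    (102, "Healing Potion", 5),
    (103, "Dragon Scale", 400),
]


def vlookup(key, table, col_index):
    """Mimic VLOOKUP(key, table, col_index, FALSE).

    Searches the first column for an exact match and returns the value
    from the 1-based column col_index. Returns None where Excel would
    show #N/A.
    """
    for row in table:
        if row[0] == key:
            return row[col_index - 1]
    return None


def index_match(key, lookup_col, return_col):
    """Mimic INDEX(return_col, MATCH(key, lookup_col, 0)).

    Unlike VLOOKUP, the lookup column does not have to be the leftmost one.
    """
    try:
        position = lookup_col.index(key)
    except ValueError:
        return None
    return return_col[position]


# Split the table into columns for the INDEX-MATCH style lookup.
ids = [row[0] for row in ITEMS]
names = [row[1] for row in ITEMS]

print(vlookup(103, ITEMS, 3))                     # look up gold value by id
print(index_match("Healing Potion", names, ids))  # look up id by name
```

The second call shows why contestants often prefer INDEX-MATCH: it can look "leftward" from a name back to an id, which a plain VLOOKUP cannot do.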

In this regard, Shortcut not only overcame the limits of traditional AI models on data volume but also avoided the pain point of hallucinated output. Faced with massive unordered data and highly deterministic function rules, it can quickly understand task requirements and provide precise solutions, just like an experienced analyst. A task that takes human contestants 1-2 hours to complete, Shortcut finished in just 10 minutes.

According to the development team, Shortcut supports natural language commands and can handle complex tasks such as financial modeling, analysis of 5000-row CSV files, data visualization, and even pixel art creation. Its core capabilities include smart fill, automatic error detection, and multi-table correlation analysis, making it an all-rounder in the Excel domain.

Faced with such a capable financial assistant, finance teams might well exclaim that they have found a savior.

Finance departments are buried in data, spreadsheets, and documents. Early AI tools, however, were limited by token constraints and hallucinations and could not reliably handle hundreds or thousands of data points. A single misplaced decimal point or punctuation mark could cost a company dearly. This left the public with the impression that AI could not solve practical problems.

The emergence of Shortcut has broken this deadlock and brought new possibilities to this pain point.

After all, entering and proofreading 5000 rows of CSV data line by line could take nearly a week of work. Although Shortcut can still make errors in complex function graphing, taking over the single task of organizing information is already enough to spare finance staff some of their thinning hair.

In the foreign trade industry, sales teams may struggle to push the conversion rate from 10% to 15%. However, one company quietly raised this number to 50%, not by working overtime or relying on sheer headcount, but with an invisible sales ace.

Competitors may think the company has hired a master, and customers may think they are deciding for themselves. In fact, they may have already fallen into the gentle trap carefully designed by the agent.

Data shows that the conversion rate of a traditional salesperson is generally 10%-15%. However, an agent named Agentforce has achieved a conversion rate of 50%. Since its launch in 2024, it has closed over 8000 deals.

What is most painful for salespeople is that this agent not only converts at a high rate but also closes large deals, often reaching seven figures in US dollars. Had they signed these deals themselves, the commission would have been at least four figures. Instead, even the most seasoned sales champions are left wondering why the skills and scripts they painstakingly cultivated have been outdone by an agent that emerged out of nowhere.

First, humans who need rest cannot compete with machines that run continuously. In international trade, there is a saying that whoever stays up later makes more money. Time differences force foreign trade teams into day and night shifts, but even so, no one can stay on duty 24 hours a day to persuade customers at the exact moment they decide to place an order. Agentforce can. It is like a tireless digital sales system that handles thousands of conversations concurrently, 24/7, reducing the number of human seats by 30-60%.

Second, uniform and rigid scripts cannot compete with the all-around "flattery" of an agent. Why do customers often not realize that it was AI that swayed them when placing an order? Because in the 21st century, it is hard to find anyone better at flattery than AI. Traditional sales depend on manpower: salespeople judge customer intent from experience, and that judgment is affected by personal emotions and fatigue, which makes it hard to find the right words. Agentforce, by contrast, can analyze behavioral traces in real time, such as official website browsing and email interactions, lock onto high-intent targets, and automatically adjust scripts through sentiment analysis to improve subsequent conversion rates.
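The loop described above (score intent from behavioral traces, then adjust the script based on sentiment) can be sketched roughly as follows. Everything here is an assumption made for illustration: the signal names, weights, word lists, threshold, and script labels are hypothetical, and a real system would use trained models rather than keyword counts. It is not Agentforce's actual logic.

```python
# Hypothetical weights for behavioral signals (assumed values).
SIGNAL_WEIGHTS = {"pricing_page_view": 3, "email_reply": 2, "homepage_view": 1}

# Toy sentiment lexicons (assumed; a stand-in for a real sentiment model).
POSITIVE_WORDS = {"great", "interested", "ready"}
NEGATIVE_WORDS = {"expensive", "delay", "unhappy"}


def intent_score(events):
    """Sum the weights of the observed behavioral events."""
    return sum(SIGNAL_WEIGHTS.get(event, 0) for event in events)


def sentiment(message):
    """Return positive-word count minus negative-word count."""
    words = [w.strip(".,!?").lower() for w in message.split()]
    positives = sum(w in POSITIVE_WORDS for w in words)
    negatives = sum(w in NEGATIVE_WORDS for w in words)
    return positives - negatives


def choose_script(events, last_message, threshold=4):
    """Pick a script label from intent score and message sentiment."""
    if intent_score(events) < threshold:
        return "nurture"  # low intent: keep the lead warm
    if sentiment(last_message) < 0:
        return "reassure"  # high intent but negative tone: address concerns
    return "close"  # high intent and neutral or positive tone


print(choose_script(["pricing_page_view", "email_reply"], "This looks too expensive."))
```

Separating the intent score from the sentiment check keeps the two judgments independent, so a high-intent customer who sounds worried gets a different script from one who sounds ready to buy.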

Third, people who speak only their native language cannot compete with an AI that is fluent in foreign languages and has encyclopedic knowledge. With AI around, speaking a foreign language is no longer much of an advantage. It is reported that the training corpus of Agentforce spans 17 languages and covers 740,000 Salesforce official documents and metadata. Relying on Salesforce's industry-level data lake, with a total volume of 200-300PB, Agentforce achieves far greater contextual depth and domain precision than similar products, thereby significantly reducing the risk of hallucinations and providing more reliable results.

We have reason to believe that in the future, Agent salespeople will invade every trading field, whether it is for bulk commodities or small businesses. Their conversion rate will continue to increase, and their scope of transactions will become broader and broader.

Would you dare to take medication prescribed by AI?

We all know that AI has entered various fields, and healthcare is no exception. However, most people would still be nervous about taking medication prescribed directly by AI. After all, slight differences in dosage can lead to addiction, and minor deviations in a medication plan may trigger severe side effects. In medicine, the smallest error can have outsized consequences.

But what if I told you that the diagnostic accuracy of AI doctors even surpasses that of professional doctors? Would you believe it?

In the United States, a medical agent named Polaris can provide patients with genuine medication advice, with a medical advice accuracy rate exceeding 99%, far higher than the average level of 81% for registered nurses in the US. Moreover, the drugs and follow-up opinions recommended by this agent have a patient satisfaction rate approaching 90%. This means that AI is not only more accurate than humans but also more trusted by patients.

How does it achieve this? The answer lies in the collaboration and cross-verification of multiple agents.

In Polaris, each case is handled by three agents rather than by a single model making independent decisions. For example, when a patient asks about the side effects of a certain medication, the Laboratory Agent retrieves the latest clinical trial data for the drug to ensure that the information is based on authoritative medical research; the Drug Agent checks the patient's medication history and allergy records to avoid potential drug interaction risks; and the Primary Agent integrates the analysis of the first two to generate a final recommendation and annotate its confidence level.
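The division of labor among the three agents can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not Polaris's implementation: the data tables, the `Finding` and `Patient` types, and the confidence formula (the fraction of checks that passed) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    current_medications: list
    allergies: list


@dataclass
class Finding:
    agent: str
    ok: bool
    note: str


# Hypothetical stand-ins for clinical trial data and interaction records.
TRIAL_SIDE_EFFECTS = {"drug_a": ["nausea", "headache"]}
INTERACTIONS = {frozenset({"drug_a", "drug_b"})}


def laboratory_agent(drug):
    """Look up reported side effects in the (assumed) trial data."""
    effects = TRIAL_SIDE_EFFECTS.get(drug)
    if effects is None:
        return Finding("laboratory", False, "no trial data found")
    return Finding("laboratory", True, "reported side effects: " + ", ".join(effects))


def drug_agent(drug, patient):
    """Check the patient's allergies and current medications."""
    if drug in patient.allergies:
        return Finding("drug", False, "patient is allergic")
    for med in patient.current_medications:
        if frozenset({drug, med}) in INTERACTIONS:
            return Finding("drug", False, f"interacts with {med}")
    return Finding("drug", True, "no allergy or interaction on record")


def primary_agent(drug, patient):
    """Combine both findings and annotate a confidence level.

    Confidence here is simply the share of checks that passed (an
    assumption made for this sketch).
    """
    findings = [laboratory_agent(drug), drug_agent(drug, patient)]
    confidence = sum(f.ok for f in findings) / len(findings)
    return {
        "findings": findings,
        "confidence": confidence,
        "escalate_to_human": confidence < 1.0,
    }


result = primary_agent("drug_a", Patient(current_medications=["drug_b"], allergies=[]))
print(result["confidence"], result["escalate_to_human"])
```

The point of the structure is that no single check decides alone: any failed check lowers the confidence and, in this sketch, hands the case to a human.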

To further ensure medication safety and patient welfare, over 6500 nurses and 500 doctors participated in the final safety assessment, helping the system obtain an FDA-approved medical AI patent.

It is reported that in the UAE, Polaris has been integrated into the digital system of the Burjeel Healthcare Group. In over 1.85 million real patient interaction tests, Polaris 3.0 achieved a clinical accuracy rate of 99.38% and a patient satisfaction rate of 8.95/10.

However, it should be noted that Polaris can currently only provide consultation plans and medical advice for diseases with clear solutions and documented cases, and it cannot directly participate in drug research and development. In other words, medical agents emphasize diagnostic accuracy in routine cases rather than research and innovation. To a certain extent, therefore, they can only play a role in clinical practice and cannot take part in cutting-edge work such as developing drugs for rare diseases. In hospitals, where life takes priority, safety must come first, and agents still have a long way to go before they can rival professional doctors.

It is not hard to see that in just one year, agents have gradually exceeded people's imaginations. From the development trajectory of these agents, we can clearly see a trend: agents are moving from concept to practical use, from the laboratory to our daily work and life. They are not cold machines but are gradually becoming valuable assistants for professionals in various fields. WebSailor keeps researchers from being overwhelmed by massive literature, Shortcut frees the hands of financial professionals, Agentforce becomes the secret weapon of sales teams, and Polaris serves as a second brain for healthcare workers.

The most valuable aspect of these agents is that they do not seek to replace humans but rather compensate for human limitations in efficiency, memory, and computational ability, allowing us to devote more energy to areas that truly require human wisdom. Just as telescopes extend human vision, these agent tools are expanding our cognitive boundaries.

In the foreseeable future, each of us may have one or even multiple agents as assistants: Agent mentors to help us learn new knowledge, Agent secretaries to manage our schedules, Agent doctors to take care of our health, Agent partners to create content... But just like all great tools in history, they will not replace us but will make us stronger, ultimately becoming a part of human capabilities.
