{"id":2419,"date":"2026-05-08T06:15:13","date_gmt":"2026-05-07T23:15:13","guid":{"rendered":"https:\/\/daiilynews.cu.ma\/ai-bots-auditioning-for-wall-street-trading-are-mostly-losing\/"},"modified":"2026-05-08T06:15:13","modified_gmt":"2026-05-07T23:15:13","slug":"ai-bots-auditioning-for-wall-street-trading-are-mostly-losing","status":"publish","type":"post","link":"https:\/\/daiilynews.cu.ma\/?p=2419","title":{"rendered":"AI Bots Auditioning For Wall Street Trading Are Mostly Losing"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<p>                        &#13;<br \/>\n\tAI isn\u2019t ready to replace your fund manager \u2014 and the public experiments testing it are showing why.&#13;<br \/>\n&#13;<br \/>\n\tAcross a series of new trading contests between the world\u2019s leading AI models, the verdict so far is unflattering. Most of the systems lose money. They trade too much. They make wildly different decisions when given identical instructions. And no one yet knows if these shortcomings will fade with more powerful iterations \u2014 or if they reveal something fundamental about the gap between large language models and how markets actually work.&#13;<br \/>\n&#13;<br \/>\n\tTake\u00a0Alpha Arena, run by tech startup Nof1. It pitted eight major frontier AI systems \u2014 including Anthropic\u2019s Claude, Google\u2019s Gemini, OpenAI\u2019s ChatGPT and Elon Musk\u2019s Grok \u2014 against each other in four separate competitions. Each was handed $10,000 per contest before being turned loose on US tech stocks for two weeks. The challenges involved trading on a variety of signals, acting defensively, reacting to the competition, and using high leverage.\u00a0                <\/p>\n<p>&#13;<br \/>\n\tThe portfolio as a whole lost about a third of its capital. Across all 32 sets of results, a model finished in profit only six times. Grok 4.20 delivered the best performance during the challenge in which it was aware of its rivals\u2019 performance. 
It placed only 158 trades; under the same prompt, Alibaba\u2019s Qwen traded 1,418 times.<br \/>\n&#13;<br \/>\n\tAlpha Arena is one of a growing number of experiments testing whether LLMs can do the hardest job in finance: beat the market. While these\u00a0contests\u00a0are far from academically rigorous, they\u2019re the most public demonstration yet of what happens when the systems try to take on some of the most lucrative and high-stakes work on Wall Street.<br \/>\n&#13;<br \/>\n\tThe early results matter because trading is one job the financial industry has been cautious about handing entirely to AI. Over the past few years, heavyweights from JPMorgan Chase &#038; Co. to Balyasny Asset Management have put the technology to work nearly everywhere else. LLMs now\u00a0parse news\u00a0at quant shops,\u00a0draft\u00a0memos at hedge funds, and detect fraud at big banks, among other tasks. But \u201chuman in the loop\u201d remains the motto when it comes to trading real money. Perhaps for good reason.<br \/>\n&#13;<br \/>\n\t\u201cLLMs can\u2019t really make money by themselves,\u201d said Jay Azhang, founder of Nof1. \u201cYou need basically a very sophisticated harness and scaffolding and data platform in order to even give them a chance.\u201d<br \/>\n&#13;<br \/>\n\tLLMs are good at doing research and finding and deploying the correct tools for certain tasks, he said. But they don\u2019t yet know how much each of the many variables that swing stocks \u2014 including things like analyst ratings, insider transactions, and sentiment shifts \u2014 actually matters. They tend to mistime their trades, incorrectly size positions and buy and sell too often.<br \/>\n&#13;<br \/>\n\tThe AI\u00a0blog\u00a0Flat Circle tracked 11 markets-related arenas, and all had at least one model that made money. 
But in only two of the arenas was the median model profitable, showing how most struggled to beat the market.\u00a0<br \/>\n&#13;<br \/>\n\tThat outcome mirrors human performance, since a majority of actively managed funds famously also lag the broad market. And just like people, the models can be prone to obvious bias. The arenas show the AI systems making very different decisions with identical instructions, which has big implications for any firm deploying them. For instance, Azhang said that in Alpha Arena\u2019s latest run, Claude mostly wanted to go long, Gemini had no problem being short, and Qwen was comfortable taking risks with big leverage.\u00a0<br \/>\n&#13;<br \/>\n\t\u201cThey have personalities that you have to manage almost like a human analyst,\u201d said Doug Clinton, who runs Intelligent Alpha, a firm with an LLM-driven fund that publishes its own\u00a0benchmark\u00a0for how well AI predicts corporate earnings. Results can be improved by letting the model know it\u2019s showing some bias, he said.<br \/>\n&#13;<br \/>\n\tIntelligent Alpha\u2019s benchmark gives 10 AI models access to financial filings, analyst forecasts, earnings transcripts, macroeconomic data and up to 10 web searches. With its narrower focus, the results are more positive for LLMs. In the fourth quarter of 2025, OpenAI\u2019s ChatGPT correctly predicted the direction of earnings estimates 68% of the time \u2014 the best results yet. And the models, Clinton said, tend to improve with every new release.<br \/>\n&#13;<br \/>\n\tHedge Fund Secrets<br \/>\n&#13;<br \/>\n\tEvaluating any of this is hard. Design choices in everything from how often the models run to what assets they trade make a big difference. 
And the default test for a trading strategy \u2014 running it backward through history to see how it would have performed \u2014 doesn\u2019t really work for AI.\u00a0<br \/>\n&#13;<br \/>\n\tA model asked in 2026 how it would have traded in March 2020 already knows what March 2020 looked like. That contamination, known as\u00a0lookahead bias, has challenged the frameworks underlying academic and quantitative finance for decades. LLMs have to be assessed in live markets instead, hence the proliferation of benchmarks and arenas.<br \/>\n&#13;<br \/>\n\tPerhaps because they mostly lose money, AI trading arenas tend to run for only short periods of time. With the low barriers to entry, many are set up by individuals or startups using the platforms as a launchpad for other products.<br \/>\n&#13;<br \/>\n\tNof1 is preparing season two of Alpha Arena, which will give each AI model the ability to search the web, ponder for longer, access more data sources and take multiple steps. But ultimately the firm\u2019s business is a system enabling retail traders to build AI trading\u00a0agents\u00a0for their own strategies.<br \/>\n&#13;<br \/>\n\t\u201cGiving an LLM money right now and just having it go \u2014 that\u2019s not a thing yet,\u201d said Azhang.<br \/>\n&#13;<br \/>\n\tMost of the public experiments are still too short and too noisy to support firm conclusions, reckons Jim Moran, who writes the Flat Circle blog and who previously co-founded alternative-data provider YipitData. 
These arenas also have natural disadvantages, including limited access to proprietary stock research and inferior execution.<br \/>\n&#13;<br \/>\n\t\u201cIf you took one of these agents from one of these arenas and you just moved it over to operate inside of a high-end hedge fund, they should perform better,\u201d he said.<br \/>\n&#13;<br \/>\n\tAlexander Izydorczyk, formerly head of data science at the hedge fund Coatue Management and now at NX1 Capital, recently wrote that no AI trading bot he tracks has yet shown a lasting edge. He argued the arenas are limited by what they cannot see in their training data: the practical quant techniques used inside secretive trading shops.<br \/>\n&#13;<br \/>\n\tHe suggested the same secrecy is also a preview of where any AI that does begin to work will eventually go.<br \/>\n&#13;<br \/>\n\t\u201cBut beginners sometimes see things incumbents cannot,\u201d Izydorczyk wrote on his personal\u00a0blog. \u201cThe outsiders, if successful, will also learn quickly that success in liquid, competitive markets pays better than the marginal X follower. When LLM agent trading strategies start working, you will not hear about it for a while.\u201d\u00a0<br \/>\n&#13;<br \/>\n\tThis article was provided by Bloomberg News.<br \/>\n&#13;<br \/>\n\t\u00a0<\/p>\n<p><br \/>\n<br \/><a href=\"https:\/\/www.fa-mag.com\/news\/ai-bots-auditioning-for-wall-street-trading-are-mostly-losing-86902.html\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#13; AI isn\u2019t ready to replace your fund manager \u2014 and the public experiments testing it are showing why.&#13; &#13; Across a series of new trading contests between the world\u2019s leading AI models, the verdict so far is unflattering. Most of the systems lose money. They trade too much. 
They make wildly different decisions when [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2420,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[676],"tags":[835,996,991,990,992,998,997,994,995,993],"class_list":["post-2419","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-ai","tag-ai","tag-are","tag-auditioning","tag-bots","tag-for","tag-losing","tag-mostly","tag-street","tag-trading","tag-wall"],"_links":{"self":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/2419","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=2419"}],"version-history":[{"count":0,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/posts\/2419\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=\/wp\/v2\/media\/2420"}],"wp:attachment":[{"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2419"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2419"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/daiilynews.cu.ma\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2419"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}