Welcome to AI Decoded, Fast Company's weekly newsletter that breaks down the most important news in the world of AI. You can sign up to receive this newsletter every week here.

After a week of DeepSeek freakout, doubts and mysteries remain

The Chinese company DeepSeek sent shockwaves through the AI and investment communities this week as people learned that it created state-of-the-art AI models using far less computing power and capital than anyone thought possible. The company then showed its work in published research papers and by making its models available to other developers. This raised two burning questions: Has the U.S. lost its edge in the AI race? And will we really need as many expensive AI chips as we've been told?

How much computing power did DeepSeek really use?

DeepSeek claimed it trained its most recent model for about $5.6 million, and without the most powerful AI chips (the U.S. barred Nvidia from selling its powerful H100 graphics processing units in China, so DeepSeek made do with 2,048 H800s). But the information it provided in research papers about its costs and methods is incomplete. "The $5 million refers to the final training run of the system," points out Oregon State University AI/robotics professor Alan Fern in a statement to Fast Company. "In order to experiment with and identify a system configuration and mix of tricks that would result in a $5M training run, they very likely spent orders of magnitude more." He adds that based on the available information it's impossible to replicate DeepSeek's $5.6 million training run.

How exactly did DeepSeek do so much with so little?

DeepSeek appears to have pulled off some legitimate engineering innovations to make its models less expensive to train and run. But the techniques it used, such as mixture-of-experts architecture and chain-of-thought reasoning, are well known in the AI world and generally used by all the major AI research labs. The innovations are described only at a high level in the research papers, so it's not easy to see how DeepSeek put its own spin on them. "Maybe there was one main trick, or maybe there were lots of things that were just very well engineered all over," says Robert Nishihara, cofounder of the AI run-time platform Anyscale. Many of DeepSeek's innovations grew from having to use less powerful GPUs (Nvidia H800s instead of H100s) because of the Biden Administration's chip bans. "Being resource limited forces you to come up with new innovative efficient methods," Nishihara says. "That's why grad students come up with a lot of interesting stuff with far less resources; it's just a different mindset."

What innovation is likely to influence other AI labs the most?

As Anthropic's Jack Clark points out in a recent blog post, DeepSeek was able to use a large model, DeepSeek-V3 (~700B parameters), to teach a smaller R1 model to be a reasoning model (like OpenAI's o1) with a surprisingly small amount of training data and no human supervision. V3 generated 800,000 annotated text samples showing questions and the chains of thought it followed to answer them, Clark writes. DeepSeek showed that after processing the samples for a time, the smaller R1 model spontaneously began to "think" about its answers, explains Andrew Jardine, head of go-to-market at Adaptive ML. "You just say, 'Here's my problem,' create some answers to that problem, and then, based on the answers that are correct or incorrect, you give it a reward [a binary code that means good] and say try again. And eventually it starts going, 'I'm not sure; let me try this new angle or approach,' or, 'That approach wasn't the right one, let me try this other one,' and it just starts happening on its own. There's some real magic there." DeepSeek's researchers called it an "aha moment."
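To make the reward loop Jardine describes concrete, here is a toy Python sketch. It is emphatically not DeepSeek's code: the STRATEGIES table, the sample_strategy() helper, and the multiplicative weight update are invented stand-ins for a neural network and its gradient updates. What it shares with the real technique is the mechanic: the system proposes answers, a checker assigns a binary reward for correct or incorrect, and whatever earned reward gets reinforced, with no human-labeled examples.

```python
import random

# Toy illustration of reinforcement learning from verifiable rewards.
# "Strategies" stand in for different chains of thought a model might try;
# a real system would update billions of network weights instead.
STRATEGIES = {
    "guess": lambda a, b: random.randint(0, 20),  # occasionally right by luck
    "add": lambda a, b: a + b,                    # always right
    "off_by_one": lambda a, b: a + b + 1,         # never right
}

weights = {name: 1.0 for name in STRATEGIES}  # start with a uniform policy

def sample_strategy():
    """Pick a strategy with probability proportional to its current weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    for name, w in weights.items():
        r -= w
        if r <= 0:
            break
    return name

for step in range(2000):
    a, b = random.randint(0, 10), random.randint(0, 10)
    name = sample_strategy()
    answer = STRATEGIES[name](a, b)
    reward = 1.0 if answer == a + b else 0.0  # binary, automatically checkable
    weights[name] *= 1.0 + 0.1 * reward       # reinforce what earned reward

total = sum(weights.values())
print({name: round(w / total, 3) for name, w in weights.items()})
```

After a few thousand rounds, nearly all of the probability mass sits on the correct strategy. The right behavior emerged from rewards alone, which is the "magic" Jardine is pointing at.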
Why haven't U.S. AI companies already been doing what DeepSeek did?

"How do you know they haven't?" asks Jardine. "We don't have visibility into exactly the techniques that are being used by Google and OpenAI; we don't know exactly how efficient the training approaches are." That's because those U.S. AI labs don't describe their techniques in research papers and release the weights of their models, as DeepSeek did. There's a lot of reason to believe they do have at least some of these efficiency methods already. It should come as no surprise if OpenAI's next reasoning model, o3, is less compute-intensive, more cost-effective, and faster than DeepSeek's models.

Is Nvidia stock still worth 50X earnings?

Nvidia provides up to 95% of the advanced AI chips used to research, train, and run frontier AI models. The company's stock lost 17% of its value on Monday when investors interpreted DeepSeek's research results as a signal that fewer expensive Nvidia chips would be needed in the future than previously anticipated. Meta's Yann LeCun says Monday's sell-off grew from a major misunderstanding about AI infrastructure investments. The Turing Award winner says that while DeepSeek showed that frontier models could be trained with fewer GPUs, the main job of the chips in the future will be during inference: the reasoning work the model does when it's responding to a user's question or problem. (Actually, DeepSeek did find a novel way of compressing context window data so that less compute is needed during inference.) He says that as AI systems process more data, and more kinds of data, during inference, the computing costs will continue to increase. As of Wednesday night, the stock had not recovered.

Did DeepSeek use OpenAI models to help train its own models?

Nobody knows for sure, and disagreement remains among AI experts on the question. The Financial Times reported Wednesday that OpenAI believes it has seen evidence that DeepSeek used content generated by OpenAI models to train its own models, a practice known as distillation, which would violate OpenAI's terms. Distillation refers to saving time and money by feeding the outputs of larger, smarter models into smaller models to teach them how to handle specific tasks.
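For illustration only, here is a minimal sketch of what a distillation pipeline can look like. The teacher_generate() function is a hypothetical placeholder for a call to a large model; nothing below is DeepSeek's or OpenAI's actual code. The point is the shape of the recipe: harvest the bigger model's annotated outputs, then use them as fine-tuning data for a smaller model.

```python
import json

def teacher_generate(question: str) -> dict:
    """Hypothetical teacher call: a real pipeline would prompt a large model
    here and capture its chain of thought along with its final answer."""
    return {
        "question": question,
        "chain_of_thought": "placeholder reasoning steps from the teacher",
        "answer": "placeholder final answer",
    }

# Stands in for the ~800,000 prompts described in DeepSeek's recipe.
questions = [f"problem {i}" for i in range(3)]

# Write the teacher's outputs as JSONL, a common format for fine-tuning data.
with open("distillation_data.jsonl", "w") as f:
    for q in questions:
        f.write(json.dumps(teacher_generate(q)) + "\n")

# A smaller "student" model would then be fine-tuned on this file,
# learning to imitate the teacher's reasoning at far lower cost.
```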
We've just experienced a moment when the open-source world produced some models that equaled the current closed-source offerings in performance. The real cost of developing the DeepSeek models remains an open question. But in the long run, the AI companies that can marshal the most cutting-edge chips and infrastructure will very likely have the advantage as fewer performance gains can be wrung from pretraining and more computing power is applied at inference, when the AI must reason toward its answers. So the answers to the two burning questions raised above are "probably not" and "likely yes."

The DeepSeek breakthroughs could be good news for Apple

The problem of finding truly useful ways of using AI in real life is becoming more pressing as the cost of developing models and building infrastructure mounts. One big hope is that powerful AI models will become so small and efficient that they can run on devices like smartphones and AR glasses. DeepSeek's engineering breakthroughs to create cheaper and less compute-hungry models may breathe new life into research on small models that live on edge devices. "Dramatically decreased memory requirements for inference make edge inference much more viable, and Apple has the best hardware for exactly that," says tech analyst Ben Thompson in a recent Stratechery newsletter. "Apple Silicon uses unified memory, which means that the CPU, GPU, and NPU (neural processing unit) have access to a shared pool of memory; this means that Apple's high-end hardware actually has the best consumer chip for inference." Stability AI founder Emad Mostaque says that reasoning models like OpenAI's o1 and DeepSeek's R1 will run on smartphones by next year, performing PhD-level tasks with only 20 watts of electricity, equivalent to the human brain.

OpenAI releases an AI agent for government workers

OpenAI this week announced a new AI tool called ChatGPT Gov that's designed specifically for use by U.S. government agencies. Since sending sensitive government data out through an API to an OpenAI server presents obvious privacy and security problems, ChatGPT Gov can be hosted within an agency's own private cloud environment. "[W]e see enormous potential for these tools to support the public sector in tackling complex challenges, from improving public health and infrastructure to strengthening national security," OpenAI writes in a blog post. The Biden Administration in 2023 directed government agencies to find productive and safe ways to use new generative AI technology (Trump recently revoked the executive order). The Department of Homeland Security, for example, built its own AI chatbot, which is now used by thousands of DHS workers. OpenAI says 90,000 users within federal, state, and local government offices have already used the company's ChatGPT Enterprise product.

More AI coverage from Fast Company:

Microsoft posts 10% growth for Q4 as it plans to spend $80 billion on AI
AI assistants for lawyers are a booming business, with big risks
Why we need to leverage AI to address global food insecurity
Alibaba rolls out AI model, claiming it's better than DeepSeek-V3

Want exclusive reporting and trend analysis on technology, business innovation, future of work, and design? Sign up for Fast Company Premium.
Dallas is prepared to spend big to protect its logo. In fact, the Dallas City Council voted last week to spend up to $200,000 as part of a federal lawsuit to cancel the trademark of Triple D Gear, a Dallas apparel company that the city argues uses a logo so similar to its own that it causes confusion.

One sign of a good civic mark, whether it's a logo or a flag, is whether it becomes a symbol of popular expression. People get tattoos of the Chicago flag, for example, but not the flag of Illinois (hence the state's efforts to redesign it). The Dallas logo, then, has done its job. Maybe too well.

The city's logo, which has been in use since 1972, features concentric D shapes made from three stripes with a stylized tree in the center. It's a great logo: minimalist and contemporary, even at more than 50 years old. The city considered scrapping the logo in 2015, but ultimately decided against it.

[Image: From left: the Triple D Gear logo; the city of Dallas logo]

According to the city, Dallas registered its logo as a trademark in 1972 in the state of Texas. It wasn't until 2020 that the city filed for federal trademark protection. By that time, Triple D Gear had already registered its own logo with the U.S. Patent and Trademark Office (USPTO). Its logo, three concentric Ds with a star in the middle, had been trademarked since 2014 for use on apparel and athletic gear. The company later filed for another trademark in 2020 for a logo showing just the concentric Ds.

When Triple D Gear founder Turo Sanchez appeared on Good Morning Texas in 2018, he said, "Basically, we took the city of Dallas logo and we just put a star in it and tilted it." When the show's host joked that they'd get sued by the city, Sanchez's co-owner noted that the company's logo was trademarked.

"It's a David and Goliath story here," Sanchez tells Fast Company about the legal fight. "It's the big man going against the small business and trying to overpower, especially when the small businesses have been doing everything by the book." He says a trial date is scheduled for May 5.

The city of Dallas, which tells Fast Company it does not comment on pending litigation, filed a complaint over the apparel company's logo in 2021. The USPTO denied Dallas's petition in 2023, according to the Dallas Morning News, which first reported the legal battle, and the city then filed suit. At issue in the denied petition is whether the city has the right to use the logo on clothing, which Triple D Gear specifically included in its trademark application. The USPTO said the city had not provided enough evidence to prove that it had established use of its logo for apparel. In a brief filed last week, the city of Dallas argued that the USPTO's Trademark Trial and Appeal Board committed "clear legal error" in requiring it to show evidence that it specifically used its logo on shirts, noting that its use of the logo on other goods and services, including uniforms for employees, should be enough to prove its case.

Perhaps the best example of an often-imitated and beloved city logo is the I ❤ NY mark, designed by Milton Glaser. New York has taken great pains to protect the mark, with the New York State Department of Economic Development sending out countless cease-and-desist letters over knockoff versions and requiring prior approval and a license agreement to use it. Dallas has a license agreement with Southern Methodist University (SMU), which has its own concentric D logo that includes its mustang mascot; Triple D Gear filed suit against SMU in 2023.
Dallas has such an iconic logo that it has inspired imitators. Whether the city holds the trademark to it, though, remains to be seen. A federal court could decide.
Peavey Industries LP, Canada's largest farm and ranch retailer, has announced the shuttering of all its locations across the country following its filing for creditor protection under the Companies' Creditors Arrangement Act (CCAA), granted by the Court of King's Bench of Alberta. The closures will impact 90 Peavey Mart stores and six MainStreet Hardware locations, with liquidation sales starting immediately, marking the end of nearly six decades of operations for the Alberta-based company, which has long been a key player in Canada's rural and suburban retail landscape.

'A profoundly difficult decision'

The news comes as Canada's retail industry faces unprecedented challenges, including low consumer confidence, inflation, rising costs, supply disruptions, and a tough regulatory environment, factors that have significantly impacted businesses like Peavey. "This was a profoundly difficult decision, but one that allows us to explore the best possible alternatives for the future of the Company," said Doug Anderson, president and CEO of Peavey Industries LP, in a statement. "For nearly six decades, our customers' loyalty, employees' dedication, and the resilience of the communities we serve have been the cornerstone of our business. We remain focused on working with our partners and stakeholders to preserve the Peavey brand and the value it represents."

Gordon Brothers is managing closing sales and liquidation. The investment and restructuring firm recently managed affairs for bankrupt American retailer Big Lots and made a bid for embattled fabrics chain Joann. Store closing sales will offer discounts of up to 30% off the original ticketed prices on agricultural supplies, farm and ranch supplies, workwear, lawn and garden essentials, hardware, and homesteading merchandise, according to Gordon Brothers.

A loss to rural communities

The closing of the trusted chain will be particularly felt in rural areas. In the city of Weyburn, Saskatchewan, city councillor Laura Morrissette told CBC its local Peavey Mart was seen as an asset for bringing business to the core area. Before being elected as councillor for the city of approximately 11,000 people, Morrissette even worked for the retailer. "[We] had a mantra at Peavey Mart," she said to CBC. "'If you can't find it at Peavey Mart, you probably didn't need it anyway.'"