Twenty-four hours before the White House and Silicon Valley announced the $500 billion Project Stargate to secure the future of AI, China dropped a technological love bomb called DeepSeek. DeepSeek R1 is a whole lot like OpenAI's top-tier reasoning model, o1. It offers state-of-the-art artificial thinking: the sort of logic that doesn't just converse convincingly, but can code apps, calculate equations, and think through a problem more like a human.

DeepSeek largely matches o1's performance, but it runs at a mere 3% of the cost, is open source, can be installed on a company's own servers, and allows researchers, engineers, and app developers to look inside, and even tune, the black box of advanced AI. In the two weeks since it launched, the AI industry has been supercharged with fresh energy around the products that could be built next. Through a dozen conversations with product developers, entrepreneurs, and AI server companies, it's clear that the worried narratives most of us have heard about DeepSeek (its Chinese propaganda, its techie hype) don't really matter to a free market.

"Everyone wants OpenAI-like quality for less money," says Andrew Feldman, CEO and cofounder of the AI cloud hosting service Cerebras Systems, which is hosting DeepSeek on its servers. DeepSeek has already driven down OpenAI's own pricing on a comparable model by 13.6x. Beyond cost, DeepSeek is also demonstrating the value of open technologies versus closed, and wooing interest from Fortune 500s and startups alike. (OpenAI declined an interview for this piece.)

"Not to overstate it, but we've been in straight-up giddy mode over here since [DeepSeek] came out," says Dmitry Shevelenko, chief business officer at Perplexity, which integrated DeepSeek into its search engine within a week of its release. "We could not have planned for this. We had the general belief this is the way the world could go. But when it actually starts happening, you obviously get very excited."

Looking back five years from now, DeepSeek may or may not still be a significant player in AI, but its arrival will be considered a significant chapter in accelerating our era of AI development.

The new era of low-cost thought, powered by interoperability

Krea, an AI-based creative suite, had long considered adding a chatbot to the heart of its generative design tools. When DeepSeek arrived, the decision was made. Krea went from R1's announcement to integrating it as a chat-based system controlling its entire editing suite in 72 hours. R1 was released on a Monday, and by that afternoon the team realized that DeepSeek's APIs worked with its existing tools, and that the model could even be hosted on Krea's own machines. By Tuesday, they were developing a prototype, coding and designing the front end at the same time. By 3 a.m. Wednesday, they were done, so they recorded a demo video and shipped it by 7 a.m.

"That's part of our culture; every Wednesday we ship something and do whatever it takes to get it done," says cofounder Victor Perez. "But it's a type of marketing that's actually usable. People want to play with DeepSeek, and now they can do it with Krea."

[Source Images: Gunes Ozcan/Getty Images]

Krea's story illustrates how fast AI is moving, and how product development in the space largely hinges on whatever model can deliver on speed, accuracy, and cost. It's the sort of supply-meets-demand moment that's only possible because of a shift underway in AI development. The apps we know are increasingly powered by AI engines.
But something most people don't realize about swapping one large language model for another (R1 for o3, say, or ChatGPT for Claude) is that it's remarkably easy on the back end.

"It would literally be a one-line change for us," says Sam Whitmore, cofounder of New Computer. "We could switch from o3 to DeepSeek in, like, five minutes. Not even a day. Like, it's one line of code."

A developer only needs to point a URL from one AI host to another, and more often than not, they're discovering the rest just works. The prompts connecting software to AI engines still return good, reliable answers. This is a phenomenon we predicted two years ago with the rise of ChatGPT, but even Perez admits his pleasant surprise. Developers of [all] the models "are taking a lot of care for this integration to be smooth," he says, and he credits OpenAI for setting API standards for LLMs that have been adopted by Anthropic, DeepSeek, and a host of others. "But the [AI] video and image space is still a fucking mess right now," he laughs. "It's a completely different situation."
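That compatibility is concrete: most hosts expose OpenAI-style endpoints, so "pointing a URL from one AI host to another" is often literally a one-line edit. Here is a minimal sketch using the openai Python client; the base URL and model names are illustrative, not any particular company's production setup:

    from openai import OpenAI

    # Default: talk to OpenAI's hosted models.
    # client = OpenAI(api_key="...")

    # The "one-line change": point the same client at another
    # OpenAI-compatible host (URL and model name are examples).
    client = OpenAI(
        base_url="https://api.deepseek.com",  # was https://api.openai.com/v1
        api_key="...",
    )

    response = client.chat.completions.create(
        model="deepseek-reasoner",  # was, say, "o3-mini"
        messages=[{"role": "user", "content": "Summarize this bug report."}],
    )
    print(response.choices[0].message.content)

Because the request and response shapes match, the surrounding prompt logic and parsing code usually carry over unchanged.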
Why DeepSeek is so appealing to developers

In its simplest distillation, DeepSeek R1 gives the world access to AI's top-tier thinking machine, one that can be installed and tuned on local computers or cloud servers rather than connecting to OpenAI's models hosted by Microsoft. That means developers can touch and see inside the code, run it at a fixed cost on their own machines, and have more control over the data.

Called inference models, this generation of reasoning AI works differently from large language models like ChatGPT. When presented with a question, they follow several logical paths of thought to attempt to answer it. That means they run far slower than your typical LLM, but for heavy reasoning tasks, that time is the expense of thinking.

Developing these systems is computationally immense. Even before the advanced programming methods were involved, DeepSeek's creators fed the model 14.8 trillion pieces of information known as tokens, which constitute a significant portion of the entire internet, notes Iker García-Ferrero, a machine learning researcher at Krea. From there, reasoning models are trained with psychological rewards. They're asked a simple math problem. The machine guesses answers. The closer it gets to right, the bigger the treat. Repeat countless times, and it learns math. R1 and its peers also have an additional step known as instructional tuning, which requires all sorts of handmade examples to demonstrate, say, a good summary of a full article, and makes the system something you can talk to.
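That reward loop is easy to caricature in code. The toy sketch below is a hill-climbing stand-in for the idea (the "model" here is just a number being nudged toward bigger treats; it implies nothing about DeepSeek's actual reinforcement-learning pipeline):

    # Toy illustration of reward-guided learning: the model "guesses,"
    # and closer answers earn a bigger treat. A caricature, not
    # DeepSeek's actual training code.
    import random

    def reward(guess: float, correct: float) -> float:
        """Return 1.0 for an exact match, shrinking as the guess drifts."""
        return 1.0 / (1.0 + abs(guess - correct))

    correct_answer = 21.0  # "What is 3 * 7?"
    best_guess, best_reward = 0.0, 0.0

    for _ in range(10_000):
        guess = best_guess + random.uniform(-5, 5)  # explore near the best guess
        r = reward(guess, correct_answer)
        if r > best_reward:                         # keep whatever earned the bigger treat
            best_guess, best_reward = guess, r

    print(round(best_guess, 2))  # converges toward 21.0

Repeat countless times, across countless problems, and the "treats" shape behavior; the real systems do this over model weights rather than a single number.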
"Some of their optimizations have been overhyped by the general public, as many were already well known and used by other labs," concedes García-Ferrero, who notes the biggest technological breakthrough was actually in an R1-Zero sub-model few people in the public are talking about, because it was built without any instructional tuning (or expensive human intervention). But the reason R1 took off with developers was the sheer accessibility of high-tech AI.

"[Before R1], there weren't good reasoning models in the open source community," says Feldman, whose company Cerebras has constructed the world's largest AI processing chip. "They built upon open research, which is what you'd want from a community, and they put out a comprehensive, or a fairly comprehensive, paper on what they did and how." A few beats later, Feldman echoes doubt shared by many of his peers: "[The paper] included some things that are clearly bullshit . . . they clearly used more compute [to train the model] than they said."

Others have speculated R1 may have queried OpenAI's models to generate otherwise expensive data for its instructional tuning steps, or queried o1 in such a way that they could deconstruct some of the black box logic at play. But this is just good old reverse engineering, in Feldman's eyes.

[Source Images: Gunes Ozcan/Getty Images]

"If you're a car maker, you buy the competitor's car, and you go, 'Whoa, that's a smooth ride. How'd they do that? Oh, a very interesting new type of shock!' Yeah, that's what [DeepSeek] did, for sure."

China has been demonized for undercutting U.S. AI investment with a free DeepSeek, but it's easy to forget that, two years ago, Meta did much the same thing when, trailing Microsoft and Google in the generative AI race, it released LLaMa as one of the first open source LLMs. There was one difference, however: The devil is in the details with open source agreements, and while LLaMa still includes provisions stopping its commercial use by Meta's competitors, DeepSeek used MIT's gold-standard license, which blows it wide open for anything.

Now that R1 is trained and in the wild, the how, what, and why matter mostly to politicians, investors, and researchers. It's a moot point to most developers building products that leverage AI engines. "I mean, it's cool," says Jason Yuan, cofounder of the AI startup New Computer. "We're painters, and everyone's competing over giving you better and cheaper paints."

A wave of demand for DeepSeek

Feldman describes the last two weeks at Cerebras as overwhelming, as engineers have been getting R1 running on their servers to feed clients looking for cheap, smart compute. "It's like, every venture capitalist calls you and says, 'I got a company that can't find supply. Can you help out?' I'm getting those three, four times a day," says Feldman. "It means you're getting hundreds of requests through your website. Your sales guys can't return calls fast enough. That's what it's like."

These sentiments are shared by Lin Qiao, CEO and cofounder of the cloud computing company Fireworks, which was the first U.S.-based company to host DeepSeek R1. Fireworks has seen a 4x increase in user signups month-over-month, which it attributes to offering the model.

Qiao agrees that part of the appeal is price. I've heard estimates that R1 is about 3% of the cost of o1 to run, and Qiao notes that on Fireworks, they're tracking it as 5x cheaper than o1. Notably, OpenAI responded to DeepSeek with a new model released last week called o3-mini. According to Greg Kamradt, the founder of ARC Prize, a nonprofit AI benchmarking competition, o3-mini is 13.6x cheaper than o1 at processing tasks. Cerebras admits o3 is all-around more advanced than DeepSeek's R1, but claims the pricing is comparable. Fireworks contends o3-mini is still less expensive to query than R1. The truth is that costs are moving targets, but the bigger takeaway should be that R1 and o3-mini are similarly cheap.

And developers don't need to bet on either horse today to take advantage of the new competition. "Our philosophy is always to try all models," writes Ivan Zhao, founder and CEO of Notion, over email. "We have a robust eval system in place, so it's pretty easy to see how each model performs. And if it does well, is cost effective, and meets our security and privacy standards, then we'll consider it."

DeepSeek offers transparent thought for the first time

Shevelenko insists that integrating DeepSeek into Perplexity was more than a trivial effort. "I wouldn't put it in the mindless bucket," he says. But the work was still completed within a week. In many ways, the larger concern for integration was not whether it would function, but whether Perplexity could mitigate R1's censorship on some topics as it leveraged the AI for real-time internet queries.

"The real work was we quickly hired a consultant that's an expert in Chinese censorship and misinformation, and we wanted to identify all the areas in which the DeepSeek model was potentially being censored or propagating propaganda," says Shevelenko. "And we did a lot of post-training in a quick time frame . . . to ensure that we were answering any question neutrally." But that work was worth it because "it just makes Perplexity better," he says.

Shevelenko is not talking in platitudes; with DeepSeek, Perplexity can do something the world has never seen before: offer a peek inside how these AIs are actually thinking through a problem. This feature of R1 is called chain-of-thought. Perplexity always offered some transparency in its front end, listing the websites it was crawling on your behalf to answer a question. But now, it will list the prompts as R1 literally talks to itself, step by step, as it reasons through an answer.

[Screenshot: courtesy of the author]

"OpenAI, for competitive purposes, never exposed [chain-of-thought]. One of Perplexity's strengths is UI; we are able to quickly figure out an elegant way of showing you how the model is thinking in real time," says Shevelenko. "There's a curiosity and a utility to it. You can see where the thinking may have gone wrong and reprompt, but more than anything, part of the whole product law at Perplexity is not that you always get the best answer in one shot, it's that you're guided on the way to ask better and better questions. It makes you think of other questions."

Seeing AI reasoning laid bare also creates more intimacy with the user. "The biggest problem of AI right now [is], how can we trust it? Because we all know AI can hallucinate," says Qiao. However, if transparent thought can bridge this gap of trust, then she imagines developers will begin to do a lot more we can't think of yet with all of this thinking data.

[Screenshot: courtesy of the author]

"There may be products built directly on top of chain-of-thought. Those products could be general search, or all kinds of assistants: coding assistant, teaching assistant, medical assistants." She also believes that, while AI has been obsessed with the assistant metaphor since the launch of ChatGPT, transparent thought will actually give people more faith in automated AI systems, because it will leave a trail that humans (or more machines!) can audit.
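For developers, that running commentary arrives as ordinary text in the model's output. R1 as commonly served wraps its reasoning in <think> tags ahead of the final answer, so separating the two takes only a few lines. A simplified sketch, not Perplexity's implementation:

    import re

    def split_reasoning(raw_output: str) -> tuple[str, str]:
        """Separate R1-style <think>...</think> reasoning from the final answer."""
        match = re.search(r"<think>(.*?)</think>", raw_output, re.DOTALL)
        reasoning = match.group(1).strip() if match else ""
        answer = re.sub(r"<think>.*?</think>", "", raw_output, flags=re.DOTALL).strip()
        return reasoning, answer

    reasoning, answer = split_reasoning(
        "<think>The user wants a total. First check the units...</think>The answer is 42."
    )
    print("Chain-of-thought:", reasoning)
    print("Final answer:", answer)

A product can then render the reasoning stream in the UI, log it for auditing, or discard it, which is exactly the design space Perplexity and Fireworks describe.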
Buying breathing room for the future

Even as debates about Chinese vs. U.S. innovation rage on, the biggest single impact of DeepSeek will be giving developers more autonomy and capability. Some, like Anthropic CEO Dario Amodei, argue that we are simply witnessing the known pricing and capability curve of AI play out. Others recognize the kick in the ass that DeepSeek offered an industry hooked on fundraising and opaque profit margins.

"There's no way OpenAI would have priced o3 as low as they had it not for R1," says Shevelenko. "It's a bit of a moving target; once you have an open source drop, it dramatically curves down the pricing for closed models, too."

While nothing is to say that OpenAI or Anthropic won't release a far more cutting-edge model tomorrow that puts these systems to shame, this moving target is providing confidence to developers, who now see a path toward realizing implementations they'd only fantasized about, especially now that they can dip their own fingers into advanced AI.

R1 on its own is still relatively slow for many tasks; a question might take 30 seconds or more to answer, as it has a habit of obsessively double-checking its own thinking, perhaps even burning more energy than it needs to in order to give you an answer. But since it's open source, the community can distill R1 (think of it like a low-cost clone) to run faster and in lower-power environments. Indeed, developers are already doing this. Cerebras demonstrated a race between its own distilled version of R1 and o3-mini to code a chess game. Cerebras's model completed the task in 1.2 seconds, versus 22 seconds for o3. Efficiencies, fueled by both internal developers and the open source community, will only make R1 more appealing. (And force proprietary model developers to offer more for less.)

At Krea, the team is most excited about the same thing that's exciting the big AI server companies: They can actually task an engineer with adjusting the weights of this AI (essentially tuning its brain like a performance vehicle). This might allow them to run an R1 model on a single GPU themselves, sidestepping cloud compute altogether, and it can also let them mix homebuilt AI models with it. Being able to run models locally on office workstations, or perhaps even distilling them to run right on someone's phone, can do a lot to reduce the price of running an AI company.
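Running one of those distilled variants locally is already a few-line affair. Here is a sketch using Hugging Face's transformers library; the checkpoint named below is one of the distilled models DeepSeek published alongside R1, and whether it fits on a single GPU depends on the hardware and quantization used:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # One of DeepSeek's published distilled R1 checkpoints (example choice).
    model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Local inference: no cloud API, no per-query bill, full control of the data.
    inputs = tokenizer("Plan a three-step test for this function:", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The MIT license is what makes this legal for commercial products; the open weights are what make it possible at all.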
Right now, developers of AI products are torn between short-term optimizations and long-term bets. While they charge $10 to $30 a month, those subscriptions make for a bad business today; they are really bets on the future. It's really hard for any of those apps to be profitable because of the cost of running intelligent workflows per person.

"There's always this calculus you're doing where it's like, OK, I know that it's going to be cheap, long, long term. But if I build the perfect architecture right now with as much compute as I need, then I may run out of money if a lot of people use it in a month," says Whitmore. "So the pricing curve is difficult, even if you believe that long term, everything will be very cheap."

What this post-DeepSeek era will unlock, Whitmore says, is more experimentation from developers to build free AI services, because they can do complicated queries for relatively little money. And that trend should only continue.

"I mean, the price of compute over the past 50 years has [nosedived], and now you have 30 computers in your house. Each of your kids has toys with it. Your TVs have computers in them. Your dishwashers have computers in them. Your fridges probably have five. If you look around, you got one in your pocket," says Feldman. "This is what happens when the price of compute drops: You buy a shitload of it."

And what this will mean for the UX of AI will naturally change, too. While the way most of us use AI is still based in metaphors of conversation, when it can reason ahead faster than we can converse, the apps of tomorrow may feel quite different, even living steps ahead of where we imagine going next. As humans, even the smartest of us take time to reason.

"And right now, we're used to reasoning models taking a bit of time," says Yuan of New Computer. "But swing your eyes just a few months or even a year [ahead], and imagine thinking takes one second or less, or even microseconds. I think that's when you'll start seeing the quote-unquote AI-native interfaces beyond chat. I think it's even hard to kind of imagine what those experiences will feel like, because you can't really simulate it. Even with science fiction, there's this idea that thinking takes time," he continues. "And that's really exciting. It feels like this will happen."
Activists have decried Shein for years, calling out its devastating impact on the environment and its exploitation of workers. But with a stroke of his pen, President Donald Trump appears to have upended Shein's business model, making it harder for the Chinese fast-fashion brand to keep selling clothes at rock-bottom prices.

During the pandemic, Shein and online marketplace Temu exploded in popularity in the United States. Both companies manufacture low-quality goods in Chinese factories using cheap labor, then sell them to American consumers at extremely low prices. But Shein and Temu also had a distinct advantage over their competition. While American companies like Gap ship large quantities of inventory from overseas factories into U.S. warehouses, paying all the requisite taxes and tariffs, these Chinese companies ship products directly from factories to consumers' houses. This allows them to take advantage of an obscure loophole in the U.S. tax code called de minimis, which allows packages containing less than $800 of merchandise to ship duty-free. In 2022, Shein and Temu paid $0 in import taxes, whereas Gap paid $700 million and H&M paid $205 million. American and European brands pass their costs on to customers in the form of higher-priced goods, which has sent many shoppers looking for deals elsewhere.

"This was their basic advantage," says Kinshuk Jerath, a Columbia Business School professor. "[Shein and Temu] built their entire business model on de minimis."

This week, that competitive advantage vanished, as Trump imposed a 10% tariff on all goods imported from China and also ended the de minimis tax exemption. Given how quickly these tariffs were rolled out, there's a lack of clarity about exactly how much companies will have to pay. Logistics agents who import goods to the U.S. are already asking vendors to pay an extra 30% on the retail price of goods shipped from Hong Kong and China. Depending on the actual tariffs U.S. Customs imposes, these agents will either return part of that fee or seek additional payments.

Experts say Shein and Temu will have to raise their prices in response. And since these companies' main selling point is their low prices, consumers may be less inclined to shop with them. This, in turn, could shake up the retail landscape in the U.S.
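The arithmetic behind the lost advantage is simple. A quick sketch using the figures above (the order value is hypothetical, and the 30% collection is provisional until Customs settles actual duties):

    # Landed cost of a $40 order shipped directly from China to a U.S.
    # consumer, before and after the end of the de minimis exemption.
    # The 30% figure is the provisional surcharge logistics agents are
    # collecting; part may be refunded, or more billed, once duties settle.

    retail_price = 40.00

    # Before: under the $800 de minimis threshold, the package shipped duty-free.
    cost_before = retail_price  # $40.00

    # After: agents collect an extra 30% of the retail price up front.
    provisional_surcharge = 0.30 * retail_price  # $12.00
    cost_after = retail_price + provisional_surcharge  # $52.00

    print(f"Before repeal: ${cost_before:.2f}")
    print(f"After repeal (provisional): ${cost_after:.2f}")

On razor-thin fast-fashion margins, a surcharge of that size either comes out of the company's pocket or lands on the price tag.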
Garment packages at a textile factory that supplies clothes to fast-fashion e-commerce company Shein in Guangzhou, China [Photo: Jade Gao/AFP/Getty Images]

Could Shein and Temu find a way to keep costs down?

While many were taken aback by how quickly Trump abolished the de minimis exemption, lawmakers from both parties have been trying to get rid of the loophole for several years. The rule first came about in 1930, when most small-value packages were sent between individuals, and the government didn't think the tax revenue was worth the administrative cost of collecting it. In 2016, the exemption limit was raised from $200 worth of goods to $800, to further reduce the administrative burden. But then Shein and Temu entered the scene. They were almost single-handedly responsible for increasing the number of such shipments from 140 million in 2014 to 1 billion in 2023.

Many experts believe that Shein and Temu will have no choice but to raise their prices. These companies have already found ways to slash costs throughout the supply chain, leaving little room for cuts elsewhere. Shein has been accused of exploiting workers, forcing them into 17-hour shifts to make hundreds of garments a day at a base salary of $20, which would then be slashed by $14 if they made any mistakes.

While it is possible for Shein and Temu to absorb the cost of these taxes for a short time, Jerath doesn't believe that's a viable long-term strategy. "These companies do have big pockets," he says. "But they will not be able to absorb costs indefinitely while keeping the company profitable. And the question is, if you have to keep selling at a loss forever, at what point is the business model no longer successful?"

Giacomo Santangelo, a senior economics lecturer at Fordham University, says American companies like Amazon and Uber were willing to take a loss for some time in order to put their competitors out of business. This strategy won't work for Shein and Temu, because "there's no chance they will be able to successfully wipe out their competitors," he says.

[Photo: Nikos Pekiaridis/NurPhoto/Getty Images]

Shaking up consumer preferences

Forcing Shein and Temu to pay taxes levels the playing field for other brands. But perhaps more important, it changes the market, says Itamar Zur, CEO of Veho, a shipping company that serves brands like Macy's, Sephora, and Stitch Fix. With these ultracheap Chinese players on the market, many brands felt forced to compete on price. But as prices even out, brands can start competing across other dimensions, like speed of delivery and quality.

Shipping products from China took seven to ten days, but consumers were willing to wait to get their products at such low prices, Zur says. But if it costs the same or just a little more to buy from a U.S. brand that can ship the products in two days, many consumers might opt for the U.S. brand.

Shein does have some warehouse space in the U.S. that allows it to ship products faster. It has already started driving American customers to buy products shipped locally by prioritizing these items in search results. Zur says that Shein may shift more of its inventory to U.S. warehouses, but this would effectively upend its business model. Until now, Shein has made products on demand based on consumer preferences. It adds 2,000 to 10,000 new items to its website every day, and mass-produces only the items that consumers seem to like. But if the company chooses to warehouse clothes in the U.S., it will need to predict what consumers will want to buy weeks in advance and send that inventory over. And, of course, this inventory would be taxed.

Brands might also start to compete on the make and longevity of a product, Zur says. Shein and Temu are known for selling very low-quality products, and if they raise their prices, consumers may opt to shop for items that are more durable. As a consumer, your entire calculation changes when these ultralow prices are off the table, he says. "If you're going to spend more money, you might not want your T-shirt or dress to be disposable. In fact, you might choose to spend a few dollars more for a shirt you can wear for years."

While Trump's tariffs have the capacity to transform the market and consumer behavior, Fordham's Santangelo warns that things are changing quickly, and it's still possible that the administration will reverse its decisions. Trump was going to impose a 25% tariff on Mexico and Canada on Tuesday, but decided to postpone it for 30 days; it's unclear whether those tariffs will eventually take effect.

"It's a very fluid situation," Santangelo says.
"We can't really make predictions, because everything might change again overnight."
Imagine you're an academic researcher. You're writing a pitch for funding to the National Science Foundation (NSF), the independent agency of the federal government that funds projects designed to advance our understanding of the world. But you can't use the words "excluded," "historically," "socioeconomic," "systemic," or "women." Go.

That's the quandary researchers across the country find themselves in, thanks to a misguided attempt to eliminate what President Donald Trump and lackey Elon Musk would likely describe as "woke" research. The Trump administration's drive to tamp down studies that promote an agenda pushing diversity, equity, and inclusion (DEI) involves hitting small nails with very big, very blunt hammers, with all ongoing and future research projects reportedly being analyzed to see whether they contain any number of newly forbidden words. Among them are terms that Trump and others might dislike, such as "diversity," "inequities," or "multicultural." But there are also words that almost certainly get caught in the dragnet inadvertently, including "women" and "historically."

"The mood is pretty glum here," says one academic, granted anonymity because of a fear of reprisals. "While my work has implications for DEI, it's not explicitly DEI in writing. For academics who do work in this space, it's a death knell."

"It really seems like a huge mess," says a second academic researcher, also granted anonymity over fear of reprisals or of their research being targeted as a result of speaking out. "The list is long and vague enough that all kinds of research will potentially be harmed. Everything from biomedical research to engineering to research in the social sciences." That researcher believes the guidelines have been drawn vaguely by design, not by accident, in order to give the government, through the NSF, enough leeway to block anything it wants to.

The Institute of Electrical and Electronics Engineers, the world's largest technical professional organization for tech research in academia, declined to comment for this story. "ACM supports technology research in a wide array of areas and understands that priorities for funding of research can shift for a variety of reasons," Jody Westby, vice chair of the Association for Computing Machinery's US Technology Policy Committee, wrote in a statement to Fast Company. "ACM hopes, when this happens, that funding from other sources also shifts to fill gaps so needed research can continue."

Researchers will still likely pursue their work under the current administration, even if the NSF's list of forbidden words stymies them. It just means they may have to take a page out of the book of online content creators and learn to deploy "algospeak": the rephrasing of words in order to avoid blocks put in place by online platforms, most commonly found on social media.

"There have been many examples of researchers using different terms to try to get their work funded by different organizations, particularly private philanthropic foundations, which often have an only slightly hidden political or ideological alignment," the anonymous researcher says.

Euphemistically referring to subjects that a crude check of content would otherwise flag as sensitive could well be a path researchers have to follow, fears Carolina Are, a researcher at the Center for Digital Citizens at Northumbria University. Are has studied platform censorship and how rank-and-file users avoid its clutches.
"With the broligarchs in power greatly affecting and influencing the way the U.S. is run, bolstered by Trump's politics, [what content creators had to do] is being broadened out to research," Are says. It's possible to use doublespeak or euphemism to dance around contentious phrasing, Are explains, but doing so significantly impacts the ability to disseminate that content, and finding workarounds taxes thinking that could otherwise be put toward the broader problem that needs addressing.

Are also worries that researchers will end up in the same cat-and-mouse game with the NSF that creators face, where terms and words being used euphemistically are added to block lists and the effect is nullified, meaning people have to find new ways of subverting barriers. "It's not a long-term solution for people," she says.

One of the researchers Fast Company spoke to is more pessimistic than that. "I'm not sure we're going to see people successfully using different terms for the banned research terms, because when you can't even use words like 'female' or 'systemic,' there's not really a way to get around that," they say. "All biomedical research that looks at more than men is potentially out of bounds, and that's by design."
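That dragnet effect is easy to demonstrate. A literal word screen flags a proposal no matter the context, as in this hypothetical sketch (not the NSF's actual tooling; the word list is a sample of the terms reported above):

    # Hypothetical sketch of a crude keyword screen, to show how
    # innocuous proposals get flagged. Not the NSF's actual tooling.

    FORBIDDEN = {"excluded", "historically", "socioeconomic", "systemic", "women"}

    def flag(proposal: str) -> set[str]:
        """Return any forbidden words found, regardless of context."""
        words = {w.strip(".,;:()").lower() for w in proposal.split()}
        return words & FORBIDDEN

    # A proposal with no DEI agenda still trips the filter:
    print(flag("Historically, lupus has been understudied in women."))
    # {'historically', 'women'}

A match-the-string filter has no notion of intent, which is exactly why biomedical studies of women, or any sentence that happens to begin with "historically," land in the same bucket as the research the administration says it is targeting.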