DeepSeek’s R1 Model and Its Impact on Travel

The Rise of Computer Use Agents

This post looks at the impact of DeepSeek's R1 on travel and what it means for how AI will shape future travel products and strategy.

Last week, after OpenAI released its new Operator product, we talked about the potential of computer use agents (CUA) in travel. As a reminder, CUA products (Anthropic has one, too) follow a user’s instructions in a browser to navigate to websites and interact with them.

We talked about how a "deep research agent" like Google's could generate and execute search plans on a topic, a reasoning model like OpenAI's o1 could think through the response to optimize the results, and a CUA could then act on those results. Combined into a highly personalized trip-planning experience, they could give the best human agents a run for their money.

But that was a whole week ago and so much has happened since then!

Enter DeepSeek and the “Move 37” Moment

This week we'll take a look at the new models from DeepSeek and some implications for travel. The biggest implication of DeepSeek isn't its pricing or reasoning capabilities but something that happened as it was being built. Hint: you might be hearing a lot more about "Move 37 moments" in the days to come.

Forgive me a dive this week into the subject of model types but it’s necessary to understand the implications of what DeepSeek has done and why it’s so impactful.

Next week we’ll talk about why it’s so important for the travel community to jump into GAI with both feet, and not in a haphazard “marketing checklist item” way, for the benefit of both the organization and for its employees.

In the following weeks, we’ll begin talking about how to take that jump: a protocol of GAI adoption for companies that helps you streamline the process across the entire organization.

DeepSeek R1: What Was Announced

The new DeepSeek R1 models are dominating the news lately, and now that we’ve had time to see how markets and pundits have analyzed them, it seems like a good moment to recap what R1 is, what it does, and some possible impacts on travel.

Let’s start with a little Q&A.

What did DeepSeek announce?

In December, they announced their V3 model, roughly comparable to OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5. It was certainly newsworthy, but was drowned out by a rush of other announcements from leading AI companies like Google and OpenAI.

In January, they released DeepSeek R1, a reasoning model (aka "reasoner") roughly on par with OpenAI's o1. Like o1, R1 can deliver better answers by thinking more carefully through a request before responding. A reasoner creates a plan for answering once you ask a question and iterates on that plan until it's satisfied with the result. That means it can take a few minutes to think before responding.
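That plan-then-iterate behavior can be sketched as a simple loop. This is an illustrative toy, not DeepSeek's actual mechanism; `generate_plan`, `critique`, and `revise` are hypothetical stand-ins for calls a reasoner makes internally.

```python
# Toy sketch of a reasoner's inner loop: draft a plan, critique it,
# and revise until the critique passes or an iteration budget runs out.
# The three helpers are hypothetical stand-ins for model calls.

def generate_plan(question):
    return ["restate the question", "list sub-problems", "solve each", "combine"]

def critique(plan):
    # A real reasoner scores its own chain of thought; here we just
    # check whether the plan ends with a verification step.
    return "verify the combined answer" in plan

def revise(plan):
    return plan + ["verify the combined answer"]

def reason(question, max_iters=5):
    plan = generate_plan(question)
    for _ in range(max_iters):
        if critique(plan):
            break
        plan = revise(plan)
    return plan

final = reason("Plan a 5-day Lisbon itinerary under $2,000")
```

The extra compute spent in that loop is exactly why a reasoner's answers arrive slower but land better on hard, multi-step questions.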

R1 was built on the somewhat enigmatic DeepSeek R1-Zero model, which exhibited some unique characteristics and capabilities we’ll discuss below.

OK, but if OpenAI already has o1 and o3 reasoning models, why is R1 such a big deal?

Lower model build cost

R1’s final training cost was under $6 million—an order of magnitude less than competitors have spent on even simpler models. While that’s not the entire cost of creating R1, it’s still a big slice of what model development has historically required.

Lower cost to run

It’s also cheaper to operate than o1 or o3, meaning more businesses can afford to put reasoning capabilities to work.

Small model size

This is a relatively small model, so devices like laptops and phones can run it more easily—a particular boon for Apple’s unified-memory architecture. Today, you can already run R1 on a robust laptop, and future versions may run natively on phones.

Dubious “parentage”

While DeepSeek did introduce some interesting innovations with R1 and R1-Zero, emerging evidence suggests they “distilled” OpenAI’s o1 and possibly Anthropic’s Claude Sonnet 3.5 to create them. (Distillation means using outputs from existing models to train a new, smaller model.) This is, of course, concerning for the original model creators.
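Mechanically, distillation is straightforward: a smaller "student" model is trained to match the output distribution of a larger "teacher." Here is a minimal pure-Python sketch with toy numbers, not any real model's outputs; real distillation minimizes a cross-entropy or KL loss over teacher and student logits at enormous scale.

```python
# Toy distillation: a "student" probability table is nudged toward the
# "teacher's" soft output distribution for a single prompt.

teacher = {"Lisbon": 0.7, "Porto": 0.2, "Faro": 0.1}   # teacher's soft labels
student = {"Lisbon": 0.4, "Porto": 0.4, "Faro": 0.2}   # student before training

def distill_step(student, teacher, lr=0.5):
    # Move each student probability a fraction of the way toward the teacher's,
    # then renormalize so the values remain a probability distribution.
    updated = {k: student[k] + lr * (teacher[k] - student[k]) for k in student}
    total = sum(updated.values())
    return {k: v / total for k, v in updated.items()}

for _ in range(20):
    student = distill_step(student, teacher)
```

After a few iterations the student reproduces the teacher's preferences without ever seeing the teacher's training data, which is what makes distillation both cheap and contentious.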

What This Means for Travel Companies

For starters, a reasoner available at a much lower price point makes advanced capabilities more accessible for designing and building GAI-powered applications.

Reasoners aren’t better at everything, but they’re excellent for tasks requiring iterative thinking—such as competitive strategy assessment, sentiment analysis, legal contract review, coding, and go-to-market strategy. They really shine where there’s a clearly verifiable right or wrong answer (like coding). Think about each function in your business: there’s likely a spot where a reasoner could add real value.

From a product perspective, this could be groundbreaking for optimization problems—like generating highly personalized itineraries—when combined with CUA, reasoning models, and deep research agents as we discussed last week.

Reasoners could also be helpful in assessing supply chains and creating/maintaining all kinds of descriptive content (especially accommodations and activities). Product managers appear to be some of the earliest beneficiaries, using reasoners to challenge assumptions and expand their thinking for product enhancements and new product design.

Finally, consider “big picture” questions in your company. Imagine how a PhD-level strategist might help you think through mission, strategy, product value propositions, or customer segments. A capable reasoning model could serve as a constantly available consulting partner, ready to supercharge your thinking whenever you need it.

The Self-Learning Breakthrough

There are some implications you might not have heard about that could be the most impactful this year.

DeepSeek may be a leader, but it's not the leader among model developers. OpenAI still offers the newer (and arguably more performant) o3, with o4 already in training and GPT-5 under development. They've responded to DeepSeek's R1 by moving some product releases forward, reinforcing their GAI leadership. With new capabilities hitting the market practically every week, the competitive imperative to bring GAI into organizations is accelerating, not slowing.

Expect a fierce contest among AI providers as they rush to unveil new model and product innovations at a frenzied pace.

Distillation and licensing concerns aside, the biggest deal about DeepSeek models might not be about costs at all. DeepSeek’s models appear able to teach themselves.

Traditionally, large language models have relied on reinforcement learning with human feedback (RLHF), which is expensive and slow because humans are in the loop. DeepSeek’s approach removes humans for a growing set of tasks, letting the models teach themselves to solve problems, optimize their methods, and reinforce what works.
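For tasks with a checkable answer, the human grader can be replaced by an automatic verifier. The toy loop below illustrates the idea under heavy simplification: the "policy" is just a weighted random sampler, not a real language model, and the verifier checks a piece of arithmetic rather than a full reasoning trace.

```python
# Toy self-training loop: instead of a human rating answers (RLHF),
# an automatic verifier rewards candidates it can check, and the
# "policy" reinforces whatever scored well.
import random

random.seed(0)

def verifier(answer):
    # Verifiable reward: is the arithmetic actually correct?
    return 1.0 if answer == 17 + 25 else 0.0

# Candidate answers the sampler can produce, initially equally likely.
weights = {40: 1.0, 41: 1.0, 42: 1.0, 43: 1.0}

for _ in range(200):
    answer = random.choices(list(weights), weights=list(weights.values()))[0]
    reward = verifier(answer)
    weights[answer] *= (1.0 + reward)   # reinforce rewarded answers

best = max(weights, key=weights.get)
```

Because the reward is computed, not judged, this loop can run millions of times with no human in it, which is the property that makes self-taught reasoning so consequential.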

That’s a ‘TrulyBigDeal’ because it allows for self-improvement over millions or billions of iterations—a glimpse of AI both training another AI and training itself. In many ways, it’s a potential on-ramp to far greater intelligence in the near future.

“Move 37” and the AlphaGo Connection

Let’s go “Back to the Future” on this concept.

In 2016, Google DeepMind's AlphaGo shocked the Go world by defeating Lee Sedol, a top professional believed to be beyond the model's reach. AlphaGo had beaten strong human players before, but much of its strength came from playing millions of games against another instance of itself rather than just imitating human strategies.

By running games endlessly against itself, AlphaGo discovered wholly new ways to play. The classic example came in Game Two with “Move 37,” an unconventional maneuver that no human would likely have tried—but which was clearly instrumental in winning the game.

During training, DeepSeek’s R1-Zero reportedly experienced a series of these “Move 37” moments—which DeepSeek referred to as “aha moments,” breakthroughs in which it taught itself to think outside its prior training set.

AI researchers are still grappling with what it means for a reasoning model to train itself.

If you don’t yet feel a tingle up your spine, consider two more tidbits:

  • In the past week, one researcher asked R1 to find a way to speed itself up. R1 wrote all the code to double its own speed, leaving out only the testing function (less than 5% of the total work).

  • Another researcher asked R1-Zero to communicate with another model, and they ended up creating their own symbolic language to do it. I wonder if that’s the language SkyNet used to talk to Terminators…


Key Takeaways

  • Lower-cost reasoning models like DeepSeek’s R1 make advanced AI accessible to far more travel companies.

  • Innovation is accelerating—expect faster, bigger leaps across reasoning, research, and action capabilities.

  • Self-learning models (like R1-Zero) may represent the first true step toward AGI.

If you want a deeper dive into the new releases, Matthew Berman did a nice recap of some of the implications of DeepSeek's new models and OpenAI's Deep Research function here.
