In the tech world, we’re regularly faced with decisions that will have a major impact. What if every quick decision cost you thousands of euros and months of work?

This is especially true in Data Engineering, MLOps, and AI, where decisions mix seemingly technical questions (which database to choose? which tech stack?) with business challenges (how to meet a business need? what are the data freshness requirements?).

Consider these scenarios:

  • You’re asked to choose an MLOps stack to match your company’s AI ambitions. How do you choose without regretting it?
  • A new business line launches, and you need to set up a critical data ingestion pipeline with a 30-minute SLA. The PoC worked fine in dev. Is that enough for production?
  • A business change requires modifying a critical component of a recommendation API. You manage it, but performance drops drastically, and the component was untested. What’s the impact on the project?

These examples illustrate a recurring problem: decisions made under pressure to meet short-term needs can have disastrous long-term consequences. Responding too quickly to an immediate need is essentially shooting yourself in the foot.

The goal of this article is to share three situations I’ve experienced, their impact, and most importantly how we could have done things differently.


The hidden costs of short-termism

Rushed decisions often lead to hidden costs, budget surprises, and worse, a loss of user trust.

Making strategic decisions quickly can seem smart in the short term, but it can lead to costly long-term consequences. Here are three concrete examples I’ve observed:

1. A poorly chosen MLOps stack

A strategic Machine Learning project had been developed in Python with open-source libraries. Following a change in direction, a proprietary no-code stack (Dataiku) was chosen to facilitate development and deployment. Result: a code rewrite, external consulting, and ultimately a costly migration to a third, hybrid solution (Vertex AI) less than a year later.

Impact:

  • 2 costly migrations
  • Over 100,000 euros spent on ill-fitting solutions
  • Several months of lost productivity

2. A catastrophic production launch

A data ingestion pipeline based on Parquet files on Google Cloud was set up with a 30-minute SLA. The PoC worked in dev, but it had never been tested at production volume. Despite the team’s warnings, the decision was made to go live. Result: ingestion failure, emergency debugging, and damaged user trust.

Impact:

  • Ingestion failure from day one
  • Production debugging on the critical data path
  • Degraded user trust
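
A quick back-of-the-envelope check before go-live could have raised the alarm. Here is a minimal sketch in Python, assuming throughput scales linearly from a dev run to production volume (an optimistic assumption). The names `SampleRun`, `projected_minutes`, and `fits_sla`, as well as all row counts and timings, are hypothetical; only the 30-minute SLA comes from the situation described above.

```python
from dataclasses import dataclass


@dataclass
class SampleRun:
    """One measured ingestion run. The figures used below are made up."""
    rows: int
    seconds: float


def projected_minutes(sample: SampleRun, production_rows: int) -> float:
    """Extrapolate a sample run linearly to the production volume.

    Linear scaling is the optimistic case: real pipelines often slow down
    faster than that, so this is a lower bound, not a substitute for a
    full-volume test.
    """
    rows_per_second = sample.rows / sample.seconds
    return production_rows / rows_per_second / 60


def fits_sla(sample: SampleRun, production_rows: int,
             sla_minutes: float = 30.0, headroom: float = 0.5) -> bool:
    """Return True if the projection uses at most (1 - headroom) of the SLA."""
    budget_minutes = sla_minutes * (1 - headroom)
    return projected_minutes(sample, production_rows) <= budget_minutes


# Hypothetical numbers: the dev PoC ingested 1M rows in 2 minutes.
dev_run = SampleRun(rows=1_000_000, seconds=120)
print(f"{projected_minutes(dev_run, 50_000_000):.0f} min")  # 100 min for 50M rows
print(fits_sla(dev_run, 50_000_000))                        # False
```

Even under the optimistic linear assumption, a PoC that looks fast in dev can land far outside the SLA at production volume. A projection like this only tells you when a load test is mandatory; it never replaces one.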

3. A rushed algorithm change

A well-tuned recommendation API had to undergo an algorithm change to meet a new business requirement. For lack of time, the new component wasn’t tested, and response time rose from 100 ms to over 10 seconds. Result: a damaged product image, and the algorithm was disabled shortly after.

Impact:

  • Response time multiplied by 100
  • Damaged product image at launch
  • Algorithm disabled due to lack of business relevance
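
Even a very small latency check run before release would have caught a regression of this size. The sketch below is one possible way to do it in Python, using only the standard library. The 100 ms budget mirrors the original response time mentioned above; `fake_recommend` is a hypothetical stand-in for the real recommendation call, and the helper names are mine.

```python
import statistics
import time
from typing import Callable, List


def measure_latencies_ms(handler: Callable[[], object], runs: int = 50) -> List[float]:
    """Call handler `runs` times and return each call's duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        handler()
        samples.append((time.perf_counter() - start) * 1000)
    return samples


def p95(samples: List[float]) -> float:
    """95th percentile: the last of the 19 cut points that split the data into 20 groups."""
    return statistics.quantiles(samples, n=20)[-1]


def within_budget(samples: List[float], budget_ms: float = 100.0) -> bool:
    """True if the p95 latency stays within the budget."""
    return p95(samples) <= budget_ms


def fake_recommend() -> list:
    """Stand-in for the real recommendation call (hypothetical)."""
    return sorted(range(1000), reverse=True)[:10]


samples = measure_latencies_ms(fake_recommend, runs=20)
print(f"p95 = {p95(samples):.3f} ms, within budget: {within_budget(samples)}")
```

Wired into CI as a failing check, this kind of guard turns "we didn’t have time to test" into a visible trade-off instead of a production surprise.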

My framework for avoiding these mistakes

1. Protect user trust at all costs

User trust is the key to any tech project’s success. Start simple, avoid complex solutions, and ship a product that meets a basic business need. Then evolve it through successive iterations.

2. Think long-term

Why distinguish between PoC and production? Before even testing an idea, define the foundations of the solution. Ask the right questions:

  • Does the solution meet our needs?
  • Is it extensible and scalable?
  • What is the total cost of ownership (TCO)?

Prefer robust, proven solutions like Airflow, Docker, Kubernetes, Terraform, or FastAPI. For MLOps, go with MLflow or Kubeflow.

3. Involve your team

Bring your technical experts in from the design phase. Their feedback can prevent costly mistakes and ensure the solution meets real needs. Their buy-in is crucial to the project’s success.

4. Adopt a FinOps approach

Integrate cost thinking from the start. Compare the economic models of solutions (e.g., Snowflake vs PostgreSQL) and plan for easy migration if needed. An extra week of evaluation can save you thousands of euros and keep you from launching a product before it’s ready.
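
To make the comparison concrete, here is a minimal sketch of a total cost of ownership calculation in Python. Every figure is invented for illustration, including the assumed hourly rate, and the two options are deliberately generic rather than real vendor prices. The `Option` class and `cheapest` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Option:
    name: str
    setup_cost: float          # one-off: migration, training, consulting
    monthly_fees: float        # licences plus infrastructure
    monthly_ops_hours: float   # engineering time spent running it
    hourly_rate: float = 80.0  # assumed loaded cost of one engineering hour

    def tco(self, months: int) -> float:
        """Total cost of ownership over the given horizon, in euros."""
        monthly = self.monthly_fees + self.monthly_ops_hours * self.hourly_rate
        return self.setup_cost + monthly * months


def cheapest(options: List[Option], months: int) -> Option:
    """Return the option with the lowest TCO over the horizon."""
    return min(options, key=lambda option: option.tco(months))


# Illustrative numbers only.
managed = Option("managed service", setup_cost=5_000, monthly_fees=2_000, monthly_ops_hours=5)
self_hosted = Option("self-hosted", setup_cost=15_000, monthly_fees=600, monthly_ops_hours=20)

for horizon in (12, 60):
    best = cheapest([managed, self_hosted], horizon)
    print(f"{horizon} months: {best.name} ({best.tco(horizon):,.0f} EUR)")
# 12 months: managed service (33,800 EUR)
# 60 months: self-hosted (147,000 EUR)
```

With these made-up inputs, the cheapest option flips depending on the horizon, which is exactly why the time frame needs to be part of the decision.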

5. Document and communicate

Record every decision in an Architecture Decision Record (ADR) and share it with the team. Clear documentation saves time and prevents mistakes. And if you leave tomorrow, your project should be able to continue without you.
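
An ADR doesn’t need tooling: a short text file per decision is enough. As an illustration, here is a minimal Python sketch that renders one, assuming the common layout of title, status, context, decision, and consequences. The `DecisionRecord` class, the file naming scheme, and the example content are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DecisionRecord:
    number: int
    title: str
    status: str    # e.g. proposed, accepted, superseded
    context: str   # the forces at play when the decision was made
    decision: str  # what was decided
    consequences: List[str] = field(default_factory=list)

    def filename(self) -> str:
        """Zero-padded number plus a slug of the title, e.g. 0001-some-title.md."""
        slug = "-".join(self.title.lower().split())
        return f"{self.number:04d}-{slug}.md"

    def render(self) -> str:
        """Render the record as a short text document."""
        lines = [
            f"# {self.number}. {self.title}",
            "",
            f"Status: {self.status}",
            "",
            "## Context",
            self.context,
            "",
            "## Decision",
            self.decision,
            "",
            "## Consequences",
        ]
        lines += [f"- {item}" for item in self.consequences]
        return "\n".join(lines) + "\n"


# Hypothetical example record.
record = DecisionRecord(
    number=1,
    title="Load-test ingestion before go-live",
    status="accepted",
    context="The pipeline has a 30-minute SLA and was only validated in dev.",
    decision="No go-live without a test at production volume.",
    consequences=["Go-live may slip", "SLA risk is known before launch"],
)
print(record.filename())
print(record.render())
```

Keeping these files in the same repository as the code means the reasoning travels with the project, whoever maintains it next.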

6. Stay flexible in the face of business demands

Technical teams and business teams often have distinct priorities:

  • Technical teams focus on designing, implementing, and maintaining robust, long-lasting tools. They tend to think long-term and can be resistant to rapid change.
  • Business teams aim to capitalize on concrete commercial opportunities and respond quickly to market changes.

So don’t be rigid about the above. Find a balance between the long-term stability of a solution and responsiveness to business needs.


Conclusion

The goal of this article was to share ways of thinking that help you avoid the mistakes I’ve observed over the years.

This brings me to one last point: how do we have the best impact on a company in these situations? Since a company’s objective is to generate value, we can rephrase the question: how do we maximize the value we bring to the company in these decisions?

By saying yes to every business request, even when it goes against the company’s interests? By seizing an opportunity that will cost more than it brings in? I don’t think that’s the right approach.

Sometimes you have to know how to say no, set expectations, and think first and foremost about the value delivered as a whole. But do it in partnership with business teams, to find the solution that fits best and has the best long-term impact.

Taking time to think through your technology choices is an investment that pays off. Avoid regrets by prioritizing quality over speed.


Further reading

As the founder of a startup, I face these questions on an even larger scale. How do you lay solid, scalable, performant foundations while staying agile?

I’ll document my thinking, decisions, and especially my mistakes on this blog.

A teaser: a containerized architecture with Docker, CI/CD tools for rapid iteration, all of it laying the groundwork for a migration from Cloud Run to Kubernetes to reduce costs over time.

Stay tuned.


About this article

This article addresses a recurring challenge in tech companies: technical decisions. These come up particularly in Data Engineering, DevOps, and AI.

If you’d like to know more about these topics, contact me directly by email.