Embeddings, Vector Search & BM25

A computer cannot understand text, nor the semantic relationships or meanings between words. It can only work with numbers. Embeddings solve this.

An embedding is a representation of text as numbers in a vector space. This lets AI models compare and operate on the meaning of words.

flowchart TD
    A["dog"] --> B["embedding model"]
    B --> C["[-0.003, 0.043, ..., -0.01]"]

    N1["(text we want to convert)"]:::note --> A
    N2["(vectors carrying semantic content)"]:::note --> C

    classDef note fill:none,stroke:none,color:#777;

The vectors for each word or document capture the semantic meaning of the text.

  • dog will be close to pet
  • contract will be far from beach
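"Close" and "far" are usually measured with cosine similarity. Here is a minimal sketch for the words in the list above, assuming made-up 3-dimensional vectors (real embeddings have far more dimensions, and these numbers do not come from any model):

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up vectors, for illustration only (not produced by a real model).
toy_vectors = {
    "dog": [0.9, 0.8, 0.1],
    "pet": [0.8, 0.9, 0.2],
    "contract": [0.1, 0.1, 0.9],
    "beach": [0.9, 0.2, 0.1],
}

dog_pet = cosine_similarity(toy_vectors["dog"], toy_vectors["pet"])
contract_beach = cosine_similarity(toy_vectors["contract"], toy_vectors["beach"])
print(f"dog vs pet: {dog_pet:.2f}")
print(f"contract vs beach: {contract_beach:.2f}")
```

A value near 1 means the vectors point in almost the same direction; a value near 0 means they are unrelated.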

Vector vs SQL databases

The problem with typical databases is that they only look for exact matches. If I search for car, I only get the entries that contain car.

Vector databases, in contrast, can capture the semantics of words through their vectors, so a search for car can also return values such as sedan, SUV, Land Rover, etc.

Vector databases shine when we need to find similar items by their proximity to one another. One use case is finding similar movies (Netflix). Another is recommending similar items in online stores (Amazon).

How to run a search (query) using vectors

(You can see the code here)

We need:

  • A vector database (CosmosDB)
  • A model to turn text into embeddings (text-embedding-3-large)

The full flow is as follows:

  1. Use an embedding model to get the vectors for the content we want to index
  2. Insert the original text and its vectors into a vector database
  3. To run a query, pass the query text through the same embedding model as before. With the resulting embedding, search the database for similar vectors and read the original text from original_text

    Inserting vectors into CosmosDB

    Before we can search, we need to fill the database with content. We keep it simple and store:

    • an ID set by hand
    • the original text
    • the vectors produced by embedding the original text

The pseudocode looks like this and runs one item at a time:

text = "A shiba walks alone in the park"
# this sends the text to the model text-embedding-3-large 
vectors = createEmbeddingsForText(text)
item = {
	"id": "1",
	"original_text": text,
	"vectors": vectors
}
uploadToCosmosDB(item)

Example of the data I store:

{
	"id": "1",
	"original_text": "A shiba walks alone in the park",
	"vectors": [-0.003, 0.043, ..., -0.001]
}
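With items stored in that shape, step 3 of the flow (the query) boils down to ranking the stored vectors by similarity to the query vector. This is a rough in-memory sketch, not the CosmosDB API: the `search` helper, the extra items and all vector values are hypothetical and only there to show the idea.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def search(query_vector, items, top_k=2):
    """Rank stored items by similarity to the query vector, best first."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_vector, item["vectors"]),
        reverse=True,
    )
    return [item["original_text"] for item in ranked[:top_k]]

# Hypothetical stored items with made-up 3-dimensional vectors.
items = [
    {"id": "1", "original_text": "A shiba walks alone in the park", "vectors": [0.9, 0.1, 0.2]},
    {"id": "2", "original_text": "The contract was signed on Monday", "vectors": [0.1, 0.9, 0.1]},
    {"id": "3", "original_text": "A dog plays in the garden", "vectors": [0.8, 0.2, 0.3]},
]

# In a real system this vector would come from embedding the query text
# with the same model used at indexing time.
query_vector = [0.9, 0.1, 0.25]
results = search(query_vector, items)
print(results)
```

A real vector database does the same ranking on the server side, typically with an index so it does not have to compare the query against every stored vector.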

Interact with GitHub Copilot

Inline Chat

  1. Select the code you want to ask about
  2. alt + ç
  3. Ask the question

Some useful commands for inline chat are:

  • /doc - adds comments to the code
  • /explain - explains the code
  • /fix - proposes fixes for problems in the selected code
  • /generate - generates code to answer a specific question
  • /optimize - analyzes and optimizes the selected code
  • /tests - generates unit tests for the selected code
  • /comment - converts comments into code snippets
  • /suggest - offers code suggestions based on the current context

Comments to code

Write a comment plus the function name and parameters; when you hit Enter, Copilot completes the body with code.

# function to iterate all prompts and print them
def iterate_and_print(prompts):

Copilot then completes it to this:

# function to iterate all prompts and print them
def iterate_and_print(prompts):
    for idx, prompt in enumerate(prompts):
        var_name = chr(ord('A') + idx)
        print(f"Prompt {var_name}:\n{prompt}\n")

Concentration Methods

Pomodoro

25 min of work + 5 min of rest; on the third round, rest for 15 min. Ideal for short, varied tasks. Keeps a steady pace and avoids fatigue.

52/17 rule

52 min of work + 17 min of rest. Ideal for medium-length tasks. Balances productivity and rest, but is less flexible than Pomodoro.

Python's Poetry

Prerequisites

(!) TODO: review (!)

First of all install pip and use pip to install pipx. From then on, use only pipx

install pip tools

py -m pip install --user pip-tools

# upgrade pip
py -m pip install --upgrade pip

install pipx

py -m pip install --user pipx

# adds executables to global path so you can call them without py -m ...
py -m pipx ensurepath
# close and reopen console

install poetry through pipx

py -m pipx install poetry

Ollama & Open WebUI (local LLMs)

Ollama’s GitHub repository (to check for updates)
Ollama’s website (to check for models)
Open WebUI (to check for Docker commands)

Install locally

Prerequisites

I’m running Open WebUI through Docker.

First, check that Docker Desktop is open. It may ask you to update WSL. Then check that Docker can run containers:

docker run hello-world

Ollama

ollama ls # see local models
ollama run gpt-oss # run model
ollama rm gemma3 # delete model

inside a model

/? # see help

# this creates a 'blueprint' you can save and load multiple times to give the LLM some context
/save <model>
/load <model>

/clear
/bye # or Ctrl+D

Screaming architecture

It is a principle for organizing projects. It prescribes how to structure the code within a project.

A concept proposed by Robert C. Martin: the architecture should scream the business domain, through domain modules, rather than the technical details, through technical layers.

Typical layout

Controllers/
Repositories/
Data/
Services/

screaming architecture

Invoices/
  CreateInvoice/
  PayInvoice/
  CancelInvoice/
Customers/
  RegisterCustomer/
  UpdateCustomer/

The drawback is that it can lead to code duplication and requires advanced technical knowledge.
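To make the domain-first layout a bit more concrete, here is a rough sketch of what a single use-case module such as Invoices/CreateInvoice could hold. It is written in Python for brevity, and every name in it (`Invoice`, `CreateInvoiceRequest`, `CreateInvoiceHandler`, the in-memory storage) is hypothetical. The point is only that the request, the rule and the handler for one business action sit together instead of being spread over Controllers/, Services/ and Repositories/.

```python
from dataclasses import dataclass
from itertools import count

@dataclass
class Invoice:
    id: int
    customer_id: str
    amount: float
    status: str = "pending"

@dataclass
class CreateInvoiceRequest:
    customer_id: str
    amount: float

class CreateInvoiceHandler:
    """Everything needed to create an invoice, kept in one place."""

    def __init__(self):
        self.invoices = {}      # in-memory storage stands in for a database
        self._ids = count(1)    # simple incremental id generator

    def handle(self, request):
        # Business rule lives next to the use case it belongs to.
        if request.amount <= 0:
            raise ValueError("amount must be positive")
        invoice = Invoice(next(self._ids), request.customer_id, request.amount)
        self.invoices[invoice.id] = invoice
        return invoice

handler = CreateInvoiceHandler()
invoice = handler.handle(CreateInvoiceRequest(customer_id="c-1", amount=120.0))
print(invoice)
```

PayInvoice and CancelInvoice would get their own modules of the same shape, which is also where the duplication mentioned above tends to appear.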

C# User Secrets

Never store passwords or sensitive data in source code or configuration files. Production secrets shouldn’t be used for development or test. Secrets shouldn’t be deployed with the app. Production secrets should be accessed through a controlled means like Azure Key Vault.

Secret manager

This tool hides implementation details. The secret values are stored in a JSON file in the local machine’s user profile folder.

This tool operates on project-specific configuration settings and (!) it’s only meant for local development (!). Don’t use it for production as it’s not encrypted.

To use user secrets, run the following command in the project directory

dotnet user-secrets init

You can also do this through Visual Studio: right-click your project > Manage User Secrets

Set a new secret

Define an app secret as a key and its value:

dotnet user-secrets set "OpenAI:ApiKey" "sk-xxxx"

.NET AI integration

Today’s AI landscape moves so fast, and providers differ so much, that vendor lock-in can become expensive. You need a clean, testable way to add AI without tying your architecture to one SDK.

The answer is a model-agnostic abstraction.

NuGet packages to use (you need to tick “Include prerelease”):

  • Microsoft.Extensions.AI - This package provides the IChatClient interface, an abstraction over several LLM providers, from ChatGPT to Ollama.
  • Microsoft.Extensions.AI.OpenAI
  • OllamaSharp (replaces Microsoft.Extensions.AI.Ollama)

You’ll need to go to the OpenAI platform to set up a project and billing, and to get an OpenAI API key.

This repository is a test implementation that connects to OpenAI’s ChatGPT and can send prompts.

Best Practices

  • Keep inputs short and specific
  • Validate outputs with regex/JSON schema. Reject or re-ask when invalid
  • Log prompts, token counts, latency and provider responses
  • Control costs: cache results, batch requests and prefer smaller models by default
  • Don’t commit or send secrets or personal information
  • Plan for failover: implement timeouts, retries, and fallback models
  • LLMs are stateless; maintaining and reconstructing conversational context is a developer’s responsibility (chat history or memory abstractions)
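Two of the practices above (validate outputs and re-ask, plus retries with fallback models) combine naturally. Below is a minimal, provider-agnostic sketch in Python. The model functions are stand-ins that return canned strings, and `ask_with_fallback` and `is_valid_sentiment` are hypothetical helpers, not part of any SDK.

```python
import json

def ask_with_fallback(prompt, models, validate, max_attempts=2):
    """Try each model in order; re-ask up to max_attempts times when the
    call fails or the output is invalid, then fall back to the next model."""
    for name, call_model in models:
        for attempt in range(1, max_attempts + 1):
            try:
                raw = call_model(prompt)
            except Exception as error:  # e.g. timeout or provider failure
                print(f"{name} attempt {attempt} failed: {error}")
                continue
            if validate(raw):
                return name, raw
            print(f"{name} attempt {attempt} returned invalid output")
    raise RuntimeError("no model produced a valid answer")

def is_valid_sentiment(raw):
    """Accept only JSON objects like {"sentiment": "positive"}."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    allowed = {"positive", "negative", "neutral"}
    return isinstance(data, dict) and data.get("sentiment") in allowed

# Stand-in functions; real ones would call an LLM provider.
def flaky_small_model(prompt):
    return "I think it is positive!"  # free text, not valid JSON

def reliable_large_model(prompt):
    return '{"sentiment": "positive"}'

models = [("small", flaky_small_model), ("large", reliable_large_model)]
used_model, answer = ask_with_fallback("Classify: 'great product'", models, is_valid_sentiment)
print(used_model, answer)
```

Listing the smaller model first keeps the cheap path as the default and only pays for the larger one when validation keeps failing.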

Security

  • Prompt injection: beware of malicious prompts that try to subvert model guardrails, steal data or execute unintended actions
  • LLMs may leak private or internal data via crafted prompts
  • Malicious actors may poison training data
  • DoS and rate limiting: prevent overuse / abuse
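For the rate-limiting point, one common approach is a sliding window per client. This is a minimal single-process sketch in Python; the class name, the limits and the client id are assumptions for illustration, and a production setup would more likely rely on the web framework or an API gateway.

```python
import time
from collections import defaultdict, deque

class SlidingWindowRateLimiter:
    """Allow at most max_requests per window_seconds for each client."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.requests = defaultdict(deque)  # client id -> request timestamps

    def allow(self, client_id):
        now = self.clock()
        timestamps = self.requests[client_id]
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False
        timestamps.append(now)
        return True

limiter = SlidingWindowRateLimiter(max_requests=3, window_seconds=60)
decisions = [limiter.allow("user-1") for _ in range(4)]
print(decisions)  # [True, True, True, False]
```

Rejected calls never reach the LLM provider, which protects both availability and the bill.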

Reference(s)

https://roxeem.com/2025/09/04/the-practical-net-guide-to-ai-llm-introduction/
https://roxeem.com/2025/09/08/how-to-correctly-build-ai-features-in-dotnet/

EF Core multithreading

I’ve had issues with EF Core when several threads, or several concurrent calls, use it at the same time.

The most important things to check are:

  1. The DbContext is not being shared between calls or threads
  2. All classes that have the context injected must be scoped (not singleton)
  3. If working with async methods, you need to await calls

I have the following service

public class PersonService(AppDbContext context) : IPersonService
{
	public async Task<Person?> GetPerson(string id)
	{
		// FindAsync looks the entity up by primary key without blocking the thread
		return await context.Persons.FindAsync(id);
	}
}

which I may configure as follows

// registering it as a singleton would cause exceptions under concurrent calls,
// because every call would share the same DbContext
services.AddSingleton<IPersonService, PersonService>();

// register it as scoped so each request (scope) gets its own context
services.AddScoped<IPersonService, PersonService>();

Caching in .NET (IMemoryCache)

.NET offers several cache types. Here I explore IMemoryCache, which stores data in the memory of the web server. It’s simple but not suitable for distributed scenarios.

First of all, we need to register the services:

builder.Services.AddMemoryCache();

GetOrCreateAsync

Here’s how you can inject and use it without manipulating the cache directly:

public class PersonService(IMemoryCache _cache)
{
	private const string CACHE_PERSON_KEY = "PersonService:GetPerson:";

	public async Task<Person> GetPerson(string id)
	{
		return await _cache.GetOrCreateAsync(CACHE_PERSON_KEY + id, async entry =>
		{
			entry.AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5);
			return await GetPersonNoCache(id);
		});
	}

	public Task<Person> GetPersonNoCache(string id)
	{
		// do operations to get a person here
		throw new NotImplementedException();
	}
}
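The get-or-create pattern with an absolute expiration is easy to see in isolation. The following is a language-neutral sketch written in Python, not the IMemoryCache API: `TtlCache`, `get_or_create` and the sample person are hypothetical, and the class is not thread-safe.

```python
import time

class TtlCache:
    """Minimal in-memory cache with an absolute expiration per entry."""

    def __init__(self, clock=time.monotonic):
        self._entries = {}  # key -> (expires_at, value)
        self._clock = clock

    def get_or_create(self, key, factory, ttl_seconds):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        value = factory()  # cache miss or expired entry: compute again
        self._entries[key] = (now + ttl_seconds, value)
        return value

calls = []

def get_person_no_cache(person_id):
    calls.append(person_id)  # track how often the slow path runs
    return {"id": person_id, "name": "Ada"}

cache = TtlCache()
key = "PersonService:GetPerson:42"
first = cache.get_or_create(key, lambda: get_person_no_cache("42"), ttl_seconds=300)
second = cache.get_or_create(key, lambda: get_person_no_cache("42"), ttl_seconds=300)
print(len(calls))  # 1
```

The second call is served from the cache, so the slow lookup runs only once until the entry expires.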
