
Embeddings and Vector Search

A computer cannot understand text, semantic relationships, or the meaning of words. It only understands numbers. Embeddings solve this problem.

An embedding is a numeric representation of text in a vector space. This lets AI models compare and operate on the meaning of words.

flowchart TD
    A["dog"] --> B{{Embedding model}}
    B --> C["[-0.003, 0.043, ..., -0.01]"]

    N1["(text we want to convert)"]:::note --> A
    N2["(vectors with semantic content)"]:::note --> C

    classDef note fill:none,stroke:none,color:#777;

The vector for each word or document captures the semantic meaning of the text.

  • dog will be close to pet
  • contract will be far from beach
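
To make "close" and "far" concrete, here is a minimal sketch that compares vectors with cosine similarity, one common way to measure closeness. The three-dimensional vectors are invented for illustration only; a real embedding model returns hundreds or thousands of dimensions, and the helper name CosineSimilarity is an assumption, not part of any library.

```csharp
using System;

// Cosine similarity: values near 1.0 mean the vectors point the same way,
// values near 0.0 mean they are unrelated.
static double CosineSimilarity(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Made-up toy vectors (assumption): not the output of a real model.
double[] dog = { 0.9, 0.1, 0.0 };
double[] pet = { 0.8, 0.2, 0.1 };
double[] contract = { 0.0, 0.1, 0.9 };
double[] beach = { 0.1, 0.9, 0.0 };

double dogPet = CosineSimilarity(dog, pet);
double contractBeach = CosineSimilarity(contract, beach);

Console.WriteLine($"dog vs pet: {dogPet:F2}");
Console.WriteLine($"contract vs beach: {contractBeach:F2}");
```

With these toy values the first score is high and the second is low, which is the behavior the two bullets describe.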

Vector vs SQL databases

The problem with typical databases is that they only look for exact matches. If I search for car, I only get the entries that contain car.

Vector databases, by contrast, can interpret the semantics of words through vectors. If I search for car, I may also get results such as sedan, SUV, Land Rover, etc.

Vector databases work very well when we need to find similar items by how close they are to each other.
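
The difference between the two kinds of lookup can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions: the catalog and its two-dimensional vectors are made up, and the search is a brute-force scan, whereas real vector databases typically rely on approximate nearest-neighbor indexes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static double CosineSimilarity(double[] a, double[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical catalog: names mapped to made-up embedding vectors.
var catalog = new Dictionary<string, double[]>
{
    ["car"] = new[] { 0.9, 0.1 },
    ["sedan"] = new[] { 0.85, 0.2 },
    ["SUV"] = new[] { 0.8, 0.3 },
    ["beach"] = new[] { 0.1, 0.9 },
};

// Exact match, as a plain WHERE name = 'car' filter would behave.
List<string> exact = catalog.Keys.Where(name => name == "car").ToList();

// Vector search: rank every entry by similarity to the query vector, keep the top 3.
double[] query = catalog["car"];
List<string> similar = catalog
    .OrderByDescending(entry => CosineSimilarity(query, entry.Value))
    .Take(3)
    .Select(entry => entry.Key)
    .ToList();

Console.WriteLine(string.Join(", ", exact));   // prints "car"
Console.WriteLine(string.Join(", ", similar)); // prints "car, sedan, SUV"
```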

Some example use cases are:

  • finding similar movies (Netflix)
  • recommending similar items in online stores (Amazon)
  • finding similar songs (Spotify)


C# How to get headers

This is how to retrieve headers from any call.

// how to retrieve a mandatory header
if (Request.Headers.TryGetValue("mandatory-header", out var mandatoryHeader))
{
	// this one may be either filled or empty
	string optionalHeader = Request.Headers["optional-header"];
	var result = await _service.DoWork(mandatoryHeader, optionalHeader);
}
else
{
	// log error as mandatory-header isn't included in the call
}

C# Task async programming (TAP) and parallel code

The core of asynchronous programming is the Task and Task<T> types. Both of them work with the async and await keywords.

First of all we need to identify whether the code is I/O-bound or CPU-bound.

  • I/O-bound: the code is limited by external operations and spends a lot of time waiting for something. Examples are database calls or a server’s response. In this case we use async/await to free the thread while we wait.
  • CPU-bound: the code does a CPU-intensive operation. Then we move the work to another thread using Task.Run() so we don’t block the main thread.
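
A minimal sketch of both cases follows. FetchFromServerAsync and SumSquares are hypothetical names, and Task.Delay only stands in for a real database or HTTP call.

```csharp
using System;
using System.Threading.Tasks;

// I/O-bound stand-in: Task.Delay simulates waiting on a database or server.
static async Task<string> FetchFromServerAsync()
{
    await Task.Delay(100); // no thread is blocked while we wait
    return "server response";
}

// CPU-bound stand-in: a loop that keeps a core busy.
static long SumSquares(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++) total += (long)i * i;
    return total;
}

// I/O-bound: just await the operation.
string response = await FetchFromServerAsync();

// CPU-bound: push the work to a thread-pool thread with Task.Run.
long sum = await Task.Run(() => SumSquares(1_000));

Console.WriteLine($"{response}, {sum}"); // prints "server response, 333833500"
```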

async code vs parallel code

(!) Asynchronous code is not the same as parallel code (!)

  • In async code you are trying to make your threads do as little work as possible. This keeps your app responsive, able to serve many requests at once, and scaling well.
  • In parallel code you do the opposite. You use and keep hold of a thread to do CPU-intensive calculations.

async code

The important part of async programming is that you choose when to wait on a task. This way, you can start other tasks concurrently.

In async code, one single thread can start the next task concurrently before the previous one completes.
(!) async code doesn’t cause additional threads to be created because an async method doesn’t run on its own thread. (!) It runs on the current synchronization context and uses time on the thread only when the method is active.
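
As a rough illustration of choosing when to wait, the sketch below starts two tasks first and awaits them together afterwards. MakeAsync is a hypothetical helper and the delays are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical async operation: waits, then returns the item name.
static async Task<string> MakeAsync(string item, int milliseconds)
{
    await Task.Delay(milliseconds);
    return item;
}

var stopwatch = Stopwatch.StartNew();

// Start both operations without awaiting them yet.
Task<string> coffeeTask = MakeAsync("coffee", 300);
Task<string> toastTask = MakeAsync("toast", 300);

// Now choose to wait; results come back in the order the tasks were passed.
string[] breakfast = await Task.WhenAll(coffeeTask, toastTask);
stopwatch.Stop();

Console.WriteLine(string.Join(" + ", breakfast)); // prints "coffee + toast"
Console.WriteLine($"elapsed: {stopwatch.ElapsedMilliseconds} ms");
```

Because both delays overlap, the elapsed time should be close to the longest single delay rather than the sum of the two.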

parallel code

For parallelism you need multiple threads, where each thread executes a task and all of those tasks run at the same time.
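
One way to sketch this is with Parallel.For, which splits a range across thread-pool threads. Counting primes is only a stand-in for CPU-intensive work, and IsPrime is a hypothetical helper.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Naive primality check, used here only as CPU-heavy work.
static bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; d * d <= n; d++)
    {
        if (n % d == 0) return false;
    }
    return true;
}

int primeCount = 0;

// The iterations run on several threads at the same time,
// so the shared counter is updated with Interlocked.
Parallel.For(0, 100_000, n =>
{
    if (IsPrime(n)) Interlocked.Increment(ref primeCount);
});

Console.WriteLine(primeCount); // prints 9592
```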


Azure AD B2C - Notes

In Azure AD B2C you have two ways to provide identity UX:

  • user flows - predefined, built-in, configurable policies so you can create sign-up/in and policy editing UX in minutes.
  • custom policies - enable you to create your own user journeys for complex identity experience scenarios.

User Flows

For the most common identity tasks. Things like:

  • Account types used for sign-in, such as social accounts or local accounts.
  • Attributes to be collected from the consumer.
  • MFA
  • User interface customization
  • Set of claims in a token your app receives

Custom Policies

They’re config files that define the behaviour of your Azure AD B2C tenant UX. They can be fully edited by an identity developer to complete many different tasks.

They’re fully configurable and policy-driven. You can use them to:

  • Federate with other identity providers
  • Third party MFA
  • Collect any user input
  • Integrate with external systems using REST API communication

Each user journey is defined by a policy. You can build as many or as few policies as you need.

A custom policy is defined by several XML files that refer to each other in a hierarchical chain.

Starter Pack

The starter pack comes with pre-built policies.

  • LocalAccounts - Enables the use of local accounts only.
  • SocialAccounts - Enables the use of social (or federated) accounts only.
  • SocialAndLocalAccounts - Enables the use of both local and social accounts. Most of our samples refer to this policy.
  • SocialAndLocalAccountsWithMFA - Enables social, local, and multi-factor authentication options.

Each starter pack includes:

  • a base file that contains most definitions. To help with troubleshooting and long-term maintenance of your policies, try to minimize the number of changes you make to this file.
  • an extensions file that holds the unique config changes for your tenant. This file is derived from the base file. Use it to add new functionality or override existing functionality, e.g. to federate with new identity providers.
  • a relying party (RP) file, the single task-focused file that’s invoked directly by the relying party application, such as a web, mobile or desktop app. Each unique task, such as sign-up/in or password reset, requires its own relying party policy file. This file is derived from the extensions file. You can add more relying party policies (e.g. delete my account, change a phone number…).

The basics

Claim

A claim provides temporary storage of data during an Azure AD B2C policy execution. It can store info about the user, such as first name or last name, or any other claim obtained from the user or other systems. The claim schema is the place where you declare your claims.

When a policy runs, B2C sends and receives claims to and from internal and external parties, then sends a subset of these claims to your relying party app as part of the token.

  • Claims are saved, read, or updated against the directory user object.
  • Claims are received from an external identity provider.
  • Claims are sent or received using a custom REST API service.
  • Data is collected as claims from the user during sign-up or edit profile flows.

Claims transformations are predefined functions that convert a given claim into another one, evaluate a claim, or set a claim value.

User Journey

User journeys let you define business logic as a path that users follow to gain access to your application. The user is taken through this journey to retrieve the claims that are to be presented to your app.

A user journey is built from a sequence of orchestration steps.

You can add your own orchestration steps to the social and local accounts starter pack.

Orchestration Steps

A user must reach the last step to acquire a token. Orchestration steps can be conditionally executed based on preconditions. After a step completes, B2C stores the outputted claims in the claims bag. This bag can be used by any further orchestration steps in the user’s journey.

Reference(s)

https://docs.microsoft.com/en-us/azure/active-directory-b2c/user-flow-overview
https://docs.microsoft.com/en-us/azure/active-directory-b2c/custom-policy-overview

Azure AD B2C

Azure AD B2C is Identity as a Service. It lets users log in to your apps with a local account or with Twitter, Google, or other identity providers. Users can sign up or sign in. It supports OpenID Connect, OAuth, and SAML.

Customers get to use their preferred accounts to sign up, or create a regular username and password. You can customize the login experience.

App Types

It works best with the following app types:

  • Server-based web apps work perfectly. They use OpenID Connect for all UX.
  • Mobile apps. They use OAuth2 auth code flow.
  • Web services and APIs. OAuth2.

The following don’t work together with Azure AD B2C:

  • Daemons.
  • Web API chains.

Configure Azure Active Directory B2C

Create new resource -> B2C -> Create new tenant -> enter general information

Notice for the initial domain name, the full domain will be *.onmicrosoft.com

Once a tenant is provisioned, click to manage it. A new tab opens because the B2C tenant is separate from your Azure subscription’s tenant. Your Azure subscription has an Azure Active Directory tenant associated with it, and this new B2C one is distinct and separate. It’s its own directory.

Inside we have the following modules or tabs:

  • Applications: websites or web APIs or Apps
  • Identity providers: lists all identity providers, such as Facebook or GitHub, that your users may log in through.
  • User attributes: you can see all metadata you collect from users.
  • Users: you can see and edit all users who have created accounts.
  • User flows: Heart and soul of Azure AD B2C as it guides users through the sign in or sign up, or password resets.
  • Audit Logs: Authentication of various apps or administrative events.

Before being able to use the B2C instance, you also have to link the B2C directory back to the main subscription. This makes it a service for you to use.

Create new resource -> Azure Active Directory B2C -> Create -> Link existing Azure AD B2C Tenant to my Azure subscription -> select the one you just created -> put it into a resource group

User flow

User flows control:

  • which account types, such as social accounts, can be used to sign in
  • the attributes to collect from the user (standard or custom ones)
  • whether to use MFA
  • user interface customization
  • the information inside the token that’s returned from B2C

A user flow can be used across B2C apps. They’re entirely reusable.

Azure AD B2C Application - it’s not your web app or API. It models apps that have authentication added, and it makes sure only your users can sign in.

Redirect URI - where to redirect response after the user’s been authenticated. As a test, put https://jwt.ms to see what’s inside the JWT token.

Built-in User Flows

Three built-in user flows are recommended, but you can also create custom ones:

  • Sign-in/up.
  • Profile editing.
  • Password reset.

Everything that controls the journeys of the users is located under Policies.

Social Providers

Social network providers are called identity providers (IdP). To use them, you have to configure them first. You will get a key from the provider, which you enter into B2C, and then authentication will happen seamlessly.

B2C Application Deep Dive

A B2C app models the real world. Every real-world app needs a B2C app.

  • Reply URL: the URL where responses will be returned to your web app.
  • App ID: the unique ID that identifies your B2C app.

B2C apps obey standards like OAuth 2.0 or OpenID Connect. When you want to interact with one, you specify your user flow.

Everything from Sign in to get Resource token happens inside Azure, not your app.

Tokens

Tokens are how B2C transmits claims about a user to the apps being called. They’re all JWT tokens.

  • ID token: contains claims used to identify the user (e.g. the user’s object ID in AD). You must always validate it.
  • Access token: contains claims used to identify API permissions. Validate it as well.
  • Refresh token: it should be a black box to your app. It is used to get new tokens, because both other tokens expire.
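
To see what "claims inside a JWT" means in practice, the sketch below decodes the payload segment of a token using only the standard library. The token is a fake, unsigned one built inside the example, the claim values are made up, and the helpers EncodeSegment and DecodeSegment are hypothetical. It does not validate anything, so it is only suitable for inspection (much like pasting a token into jwt.ms), never for authentication.

```csharp
using System;
using System.Text;
using System.Text.Json;

// Turns JSON into a base64url segment (used only to build the fake token).
static string EncodeSegment(string json)
{
    return Convert.ToBase64String(Encoding.UTF8.GetBytes(json))
        .TrimEnd('=').Replace('+', '-').Replace('/', '_');
}

// Decodes one base64url segment of a JWT back into JSON text.
static string DecodeSegment(string segment)
{
    string base64 = segment.Replace('-', '+').Replace('_', '/');
    int padding = (4 - base64.Length % 4) % 4;
    base64 = base64.PadRight(base64.Length + padding, '=');
    return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
}

// Fake token (assumption): header.payload.signature, with an empty signature.
string header = EncodeSegment("{\"alg\":\"none\",\"typ\":\"JWT\"}");
string payload = EncodeSegment("{\"oid\":\"00000000-0000-0000-0000-000000000000\",\"name\":\"Jane\"}");
string token = header + "." + payload + ".";

// The claims live in the second segment.
string[] parts = token.Split('.');
string name;
using (JsonDocument claims = JsonDocument.Parse(DecodeSegment(parts[1])))
{
    name = claims.RootElement.GetProperty("name").GetString() ?? "";
}

Console.WriteLine(name); // prints "Jane"
```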


How server-based app authentication works with B2C: the underlying authentication protocol is OpenID Connect.

Custom Policies

Custom policies define the individual steps in a user’s journey. They’re XML files that contain:

  • Schema definition - which will be used to return the tokens
  • Content definition - things like how to render pages
  • Technical profiles - the endpoints and how to communicate with identity providers
  • Orchestration - the steps that are contained within the custom policies

Policy Files

In order to work with custom policies we’ll need to handle three files:

  • Base file - contains most definitions. Make minimal changes; you shouldn’t ever have to change this file. It’s the starting point that every B2C tenant would use and contains the common elements for everything.
  • Extensions file - unique config for the tenant. Make changes here that override anything in the base file. These changes apply to the entire tenant.
  • Relying party file - the single task-focused file that your app invokes. Use this file to make any final tweaks to the user’s journey.

Reference(s)

User flows and custom policies in Azure Active Directory B2C - Azure AD B2C | Microsoft Learn

Mock multiple calls with same params

This is an example on how to mock a call when it’s called multiple times, and with the same parameter type every time.

Setup

I have the following class…

public class ConnectorRequest
{
	public string Query { get; set; }
}

… which will be consumed by the following service

public interface IConnectorService
{
	Task<string> Execute(ConnectorRequest request);
}

Then I have a class which calls IConnectorService multiple times

public class ConnectorConsumerService
{
	private IConnectorService _service;
	
	// ...
	
	public async Task<string> Process() 
	{
		// ... does whatever
		var response1 = await _service.Execute(request1);
		// ... does whatever with that information
		var response2 = await _service.Execute(request2);
		// ... does whatever with that information
		var response3 = await _service.Execute(request3);
		// ... does whatever with that information
		// ... does whatever else
	}
	
	// ...
	
}

Test

Test which mocks multiple calls

using System.Threading.Tasks;
using FluentAssertions;
using Moq;
using Xunit;

public class ConnectorConsumerServiceTest
{
	// all mocks and stubs
	private Mock<IConnectorService> _dependencyMock;

	// service under test
	private ConnectorConsumerService _service;

	public ConnectorConsumerServiceTest()
	{
		_dependencyMock = new Mock<IConnectorService>();
		_service = new ConnectorConsumerService(_dependencyMock.Object);
	}

	[Fact]
	public async Task ProcessXXX_CaseXXX_ShouldReturnOkay()
	{
		// ARRANGE
		// example starts here! ->
		// Execute returns Task<string>, so each queued response is a string
		var responseToExecution1 = "some response";
		var responseToExecution2 = "another response";
		var responseToExecution3 = "oh no! a response";

		// each call to Execute returns the next response in the sequence
		_dependencyMock.SetupSequence(mock => mock.Execute(It.IsAny<ConnectorRequest>()))
			.ReturnsAsync(responseToExecution1)
			.ReturnsAsync(responseToExecution2)
			.ReturnsAsync(responseToExecution3);
		// <- example ends here

		// ACT
		var result = await _service.Process();

		// ASSERT
		result.Should().NotBeNull();
		// ... assert whatever
	}
}

XUnit test examples

Controller test example

This includes how to mock a request’s header, but it’s a bogus test. It only shows how to use XUnit and its structure.

using System.Threading.Tasks;
using FluentAssertions;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Moq;
using Xunit;

public class XXXControllerTest
{
	// all mocks and stubs
	private Mock<IXXXService> _serviceMock;

	// controller under test
	private XXXController _controller;

	public XXXControllerTest()
	{
		_serviceMock = new Mock<IXXXService>();
		_controller = new XXXController(_serviceMock.Object);
	}

	[Fact]
	public async Task CallXXX_ShouldCall_Service()
	{
		// ARRANGE

		// mock header
		var httpContext = new DefaultHttpContext();
		httpContext.Request.Headers["someHeader"] = "my-mocked-value";
		_controller.ControllerContext = new ControllerContext
		{
			HttpContext = httpContext
		};

		// mock request
		string mockedValue = "someInputValueTo_serviceMock";
		string mockedResponse = "someResponseValueFrom_serviceMock";
		_serviceMock.Setup(mock => mock.SomeMethodCall(mockedValue)).ReturnsAsync(mockedResponse);

		// ACT
		var response = await _controller.CallSomething(mockedValue) as OkObjectResult;

		// ASSERT
		response.Should().NotBeNull();
		response.StatusCode.Should().Be(200);
		response.Value.Should().BeEquivalentTo(mockedResponse);
	}
}

Basic service test example

public class XXXServiceTest
{
	// all mocks and stubs
	private Mock<IXXXDependency> _dependency;

	// service under test (a real instance, not a mock)
	private XXXService _service;

	public XXXServiceTest()
	{
		_dependency = new Mock<IXXXDependency>();
		_service = new XXXService(_dependency.Object);
	}

	[Fact]
	public async Task ProcessXXX_CaseXXX_ShouldReturnOkay()
	{
		// ARRANGE
		string paramX = "something";
		string responseX = "some response";
		// set up the mocked dependency, not the service under test
		_dependency.Setup(mock => mock.SomeMethodCall(paramX)).ReturnsAsync(responseX);

		// ACT
		var result = await _service.ProcessXXX(paramX);

		// ASSERT
		result.status.Should().NotBeNull();
		// ... assert whatever
	}
}

C# Async await with lambdas

If we want to use a method that’s marked as async inside a lambda expression, we have to split it into two steps:

  • task declaration
  • (async/await) task execution

example 1

var adminUserTask = users
	.Where(user => "admin".Equals(user.type.ToLower()))
	.Select(async user => { return await ProcessAdmin(user);});
List<UserResults> results = (await Task.WhenAll(adminUserTask)).ToList();

example 2

// task declaration
var mapTask = animals.Select(Map).ToList();

// task execution
var animalsMapped = (await Task.WhenAll(mapTask)).ToList();

// mapping method
private async Task<Animal> Map(Animal animal)
{
	// ... do whatever mapping is needed
}

example 3

private async Task<List<AnimalDTO>> MapAnimalsToDto(List<Animal> animals)
{
	var dtos = await Task.WhenAll(animals.Select(a => MapSingleAnimal(a)));
	return dtos.ToList();
}

// method signature
private async Task<AnimalDTO> MapSingleAnimal(Animal animal);

Bash scripts for port-forwards

The following is an example of the .sh scripts I use to forward and debug pods or features.

#!/bin/bash
# to use: bash portforward_microservice_xxx.sh env-dev | env-pre | env-pro
# (run it with bash, not sh: the script uses arrays)
# it accepts additional params such as no-connectors to ignore some port forwards
# 	no-connectors - ignores forwards to XXX
args=("$@")
echo "INFO: using namespace ${args[0]}"

# only needed if we have several clusters for each env
if [ "$1" == "env-pre" ]; then
	kubectl config use-context context-for-pre
elif [ "$1" == "env-pro" ]; then
	kubectl config use-context context-for-pro
else
	# default value - always DEV
	kubectl config use-context context-for-dev
fi

# add here new variables or cases to omit
no_connectors=false
for arg in "$@"; do
	case $arg in
		no-connectors)
			no_connectors=true
			;;
	esac
done

if [ "$no_connectors" = false ]; then
	kubectl port-forward -n "${args[0]}" svc/connector1 9210:9210 &
	kubectl port-forward -n "${args[0]}" svc/connector2 8000:8000 &
else
	echo "INFO: ignoring connectors"
fi

# common pods we always need to call
kubectl port-forward -n "${args[0]}" svc/service1 5050 &
# the last forward runs in the foreground and keeps the script alive
kubectl port-forward -n "${args[0]}" svc/service2 5060

EF Core Global Filters

(TODO: add link -> working code inside this project)

Consider a case where we have the following User class and we want to soft delete users, because we want to keep the deleted records.

public class User
{
	public int Id { get; set; }
	public string Name { get; set; }
	public bool Active { get; set; }
}

In many cases we don’t care about “deleted” records so most times we will filter out deleted records like this

// DON'T DO THIS
public async Task<List<User>> GetUsers()
{
	// ToListAsync (Microsoft.EntityFrameworkCore) is the awaitable version of ToList
	return await _context.Users.Where(user => user.Active).ToListAsync();
}

Instead of always doing this, which is too verbose, we may use Global Query Filters. This way we apply the filter globally.

public class ApDbContext : DbContext
{
	// ... more code
	
	protected override void OnModelCreating(ModelBuilder modelBuilder)
	{
		modelBuilder.Entity<User>().HasQueryFilter(user => user.Active);
	}
}

From now on, every time we retrieve something from the User table, the deleted records are filtered out automatically.


Testing. Code example for c#

The following are code examples to test several scenarios.

(check this project to see more testing code examples)

Test a controller

  • GivenCorrectDate_WhenGetAvailability_ThenAssertCorrectReturnValue asserts that a controller returns 200 when everything goes right
  • GivenWrongDate_WhenGetAvailability_ThenAssert400ReturnValue asserts that a controller returns 400 with a specific error message
  • GivenServiceThrowsException_WhenReserveSlot_ThenAssertExceptionCaught asserts that a method throws an exception, but it is correctly caught

using System;
using System.Net.Http;
using System.Threading.Tasks;
using FluentAssertions;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Options;
using Moq;
using NUnit.Framework;

public class SlotsControllerTest
{
	// class we're testing
	private SlotsController _controller;

	private Mock<ISlotsService> _slotsServiceMock;

	private Mock<IOptions<CoreConfig>> _iOptConfigMock;
	private Mock<CoreConfig> _configMock;

	[SetUp]
	public void SetUp()
	{
		_configMock = new Mock<CoreConfig>();
		_iOptConfigMock = new Mock<IOptions<CoreConfig>>();
		// Value must return the mocked object, not the Mock wrapper
		_iOptConfigMock.Setup(iOpt => iOpt.Value).Returns(_configMock.Object);

		_slotsServiceMock = new Mock<ISlotsService>();
		_controller = new SlotsController(_slotsServiceMock.Object, _iOptConfigMock.Object);
	}

	[Test]
	public async Task GivenWrongDate_WhenGetAvailability_ThenAssert400ReturnValue()
	{
		// given
		string date = "";
		string errorMessage = "oh no! something went wrong!";

		_configMock.Setup(conf => conf.GeneralErrorMessage).Returns(errorMessage);

		// when
		var result = await _controller.GetAvailability(date) as BadRequestObjectResult;

		// then
		result.Should().NotBeNull();
		result.StatusCode.Should().Be(400);
		result.Value.ToString().Should().Contain(errorMessage);
	}

	[Test]
	public async Task GivenCorrectDate_WhenGetAvailability_ThenAssertCorrectReturnValue()
	{
		// given
		string date = "20241012";
		var parsedDate = new DateOnly(2024, 10, 12);

		string dateFormat = "yyyyMMdd";
		_configMock.Setup(conf => conf.InputDateFormat).Returns(dateFormat);

		var dto = new WeekAvailabilityResponse();
		_slotsServiceMock.Setup(service => service.GetWeekSlotsAsync(parsedDate)).ReturnsAsync(dto);

		// when
		var result = await _controller.GetAvailability(date) as OkObjectResult;

		// then
		result.Should().NotBeNull();
		result.StatusCode.Should().Be(200);
		result.Value.Should().Be(dto);
	}

	[Test]
	public async Task GivenServiceThrowsException_WhenReserveSlot_ThenAssertExceptionCaught()
	{
		// given
		var request = new ReserveSlotRequest();

		string errorMessage = "error when throw exception";
		_configMock.Setup(conf => conf.GeneralErrorMessage).Returns(errorMessage);

		// assumption: the service receives the same request object as the controller
		_slotsServiceMock.Setup(service => service.ReserveSlotsAsync(request)).ThrowsAsync(new HttpRequestException(errorMessage));

		// when
		var result = await _controller.ReserveSlot(request) as BadRequestObjectResult;

		// then
		result.Should().NotBeNull();
		result.StatusCode.Should().Be(400);
		result.Value.ToString().Should().Contain(errorMessage);
	}
}
