diff --git a/05_OPERACIONES/runpod-serverless.md b/05_OPERACIONES/runpod-serverless.md
new file mode 100644
index 0000000..8d81334
--- /dev/null
+++ b/05_OPERACIONES/runpod-serverless.md
@@ -0,0 +1,169 @@
# RunPod Serverless - GPU Services

## Overview

RunPod Serverless provides on-demand GPUs for the GRACE, PENNY, and FACTORY services.

**Account**: rpd@tzr.systems
**Balance**: ~$69 USD
**Maximum quota**: 5 concurrent workers

---

## Active Endpoints

| Service | Endpoint ID | Max Workers | Modules |
|---------|-------------|-------------|---------|
| **GRACE** | `rfltzijgn1jno4` | 2 | ASR, OCR, TTS, Face, Embeddings, Avatar |
| **PENNY** | `zsu7eah0fo7xt6` | 2 | TTS (voice) |
| **FACTORY** | `hffu4q5pywjzng` | 1 | Embeddings, processing |

**Supported GPUs**: RTX 3090, RTX 4090, RTX A4000, NVIDIA L4

---

## Critical Configuration

### Problem Solved (2025-12-27)

Serverless workers would not start when endpoints were created through the GraphQL API. The fix was to create them through the **REST API** with the specific parameters shown below.

### Correct API: REST (not GraphQL)

```bash
# CORRECT - REST API
curl -X POST "https://rest.runpod.io/v1/endpoints" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "mi-endpoint",
    "templateId": "TEMPLATE_ID",
    "gpuTypeIds": ["NVIDIA GeForce RTX 3090", "NVIDIA GeForce RTX 4090"],
    "scalerType": "QUEUE_DELAY",
    "scalerValue": 4,
    "workersMin": 0,
    "workersMax": 2,
    "idleTimeout": 60,
    "executionTimeoutMs": 600000,
    "flashboot": true
  }'
```

**Required fields**:
- `gpuTypeIds`: array of strings (NOT a comma-separated string)
- `scalerType`: "QUEUE_DELAY"
- `scalerValue`: 4 (seconds of queue delay before scaling up)
- `flashboot`: true (fast worker startup)

---

## API Usage

### Submit a Job

```bash
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"module": "ASR_ENGINE", "audio_base64": "..."}}'
```

### Check Job Status

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}" \
  -H "Authorization: $RUNPOD_API_KEY"
```

### Health Check

```bash
curl "https://api.runpod.ai/v2/{ENDPOINT_ID}/health" \
  -H "Authorization: $RUNPOD_API_KEY"
```

Expected response:
```json
{
  "jobs": {"completed": 5, "failed": 0, "inQueue": 0},
  "workers": {"idle": 1, "ready": 1, "running": 0}
}
```

---

## GRACE Modules

| Module | Description | Model |
|--------|-------------|-------|
| `ASR_ENGINE` | Speech-to-text | Faster Whisper Large V3 |
| `OCR_CORE` | Text recognition | GOT-OCR 2.0 |
| `TTS` | Text-to-speech | XTTS-v2 |
| `FACE_VECTOR` | Face vectors | InsightFace |
| `EMBEDDINGS` | Text embeddings | BGE-Large |
| `AVATAR_GEN` | Avatar generation | SDXL |

---

## Docker Image

**Temporary registry**: `ttl.sh/tzzr-grace:24h` (expires after 24 h)

### Rebuild

```bash
# On the deck server (72.62.1.113)
cd /tmp/docker-grace
docker build -t ttl.sh/tzzr-grace:24h .
docker push ttl.sh/tzzr-grace:24h
```

### Key Dockerfile Lines

```dockerfile
FROM runpod/pytorch:2.1.0-py3.10-cuda11.8.0-devel-ubuntu22.04
# Fix blinker before installing requirements
RUN pip install --no-cache-dir --ignore-installed blinker
RUN pip install --no-cache-dir -r requirements.txt
```
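
### Smoke Test

After rebuilding and pushing the image, the quickest check is to submit a trivial job to the GRACE endpoint and poll it to completion. The sketch below is illustrative, not canonical: it assumes `jq` is installed, and the `EMBEDDINGS` payload (a plain `text` field) is a guess at the handler's input schema. If the job never leaves `IN_QUEUE`, see the troubleshooting section below.

```bash
# Load the API key (path documented under "Credentials" below)
export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)
ENDPOINT_ID="rfltzijgn1jno4"   # GRACE

# Submit a minimal job; the "text" field is an assumed input for EMBEDDINGS
JOB_ID=$(curl -s -X POST "https://api.runpod.ai/v2/${ENDPOINT_ID}/run" \
  -H "Authorization: $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input": {"module": "EMBEDDINGS", "text": "hello world"}}' | jq -r '.id')
echo "Submitted job: $JOB_ID"

# Poll /status until the job reaches a terminal state
while true; do
  STATUS=$(curl -s "https://api.runpod.ai/v2/${ENDPOINT_ID}/status/${JOB_ID}" \
    -H "Authorization: $RUNPOD_API_KEY" | jq -r '.status')
  echo "Status: $STATUS"
  case "$STATUS" in
    COMPLETED|FAILED|CANCELLED|TIMED_OUT) break ;;
  esac
  sleep 5
done
```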
---

## Troubleshooting

### Workers Do Not Start (0 workers)

1. Use the REST API, not GraphQL
2. Verify that `gpuTypeIds` is an array
3. Include `flashboot: true`
4. Check that the worker quota has not been exceeded

### Job Stuck in Queue Indefinitely

```bash
# Purge the queue
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/purge-queue" \
  -H "Authorization: $RUNPOD_API_KEY"
```

### Model Error (TOS)

Add the environment variable `COQUI_TOS_AGREED=1` so the XTTS-v2 (Coqui) model does not block waiting for interactive terms-of-service confirmation.

---

## Credentials

Location: `/home/orchestrator/.secrets/runpod_api_key`

```bash
export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)
```

---

## References

- [RunPod REST API](https://docs.runpod.io/api-reference)
- [Serverless Endpoints](https://docs.runpod.io/serverless/endpoints/manage-endpoints)
- [Status Page](https://uptime.runpod.io)
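
---

## Appendix: Endpoint Health Sweep

A small convenience script, sketched under the assumption that `jq` is installed and that the endpoint IDs in the table at the top of this document are current. It runs the documented `/health` call against GRACE, PENNY, and FACTORY in one pass.

```bash
#!/usr/bin/env bash
# Health sweep across the three serverless endpoints (requires bash 4+ and jq)
set -euo pipefail

export RUNPOD_API_KEY=$(cat ~/.secrets/runpod_api_key)

declare -A ENDPOINTS=(
  [GRACE]="rfltzijgn1jno4"
  [PENNY]="zsu7eah0fo7xt6"
  [FACTORY]="hffu4q5pywjzng"
)

for name in "${!ENDPOINTS[@]}"; do
  id="${ENDPOINTS[$name]}"
  echo "== ${name} (${id}) =="
  # Same /health endpoint as documented above; prints job and worker counters
  curl -s "https://api.runpod.ai/v2/${id}/health" \
    -H "Authorization: $RUNPOD_API_KEY" | jq '{jobs, workers}'
done
```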